Custom Check states "UNKNOWN - not implemented" when called by Nagios but not when run manually

Hello everyone,

I have a problem with a custom check in check_mk which I just do not understand.

I am monitoring a MySQL (actually MariaDB) Galera cluster. The mk_mysql plugin is placed on the hosts and successfully reports its information. Part of the plugin output covers the Galera cluster state, but check_mk provides no predefined check for it.
So I downloaded a Galera check from here (https://github.com/HeinleinSupport/check_mk/blob/master/mysql/checks/mysql.galera) and put it in my “~/local/check_mk/checks” folder.
So far so good: the inventory run in WATO shows 5 new checks (all green), which I then add to the host before activating the changes. But as soon as Nagios next triggers Check_MK, these 5 services go into “UNKNOWN - Check not implemented” state.

Running “cmk -v database.host” from the command line shows the 5 services in OK state, and refreshing the service view in the Nagios/WATO web interface after that also shows them in OK state. Running Check_MK for the host once more from Nagios (either by forcing an active check or by waiting for the next scheduled check) results in UNKNOWN again. Manually running “python ~/var/check_mk/precompiled/database.host” also sets the services in Nagios to UNKNOWN.

Shouldn’t “cmk -v database.host” lead to exactly the same results as “python ~/var/check_mk/precompiled/database.host”? I also recreated the precompiled check using “cmk -R” (verified by checking the timestamp of the precompiled check), but it makes no difference.

I have some more custom checks in “~/local/check_mk/checks” which all work flawlessly, and there is no difference in ownership or permissions between the checks. I could not see any obvious errors in the check file itself; the inventory and check functions are declared correctly. Parsed input should also work, as it is a sub-check of the pre-existing mysql check, which declares the parsing function.

For good measure I performed an “omd restart”, but nothing changed.

Any hints are greatly appreciated.

Tobias

I am not sure whether cmk registers the checks correctly when they are all provided in one single file.
Try splitting the Python code you have in mysql.galera into multiple check plugins, like so:

  • create a file named mysql.galerasync and paste the corresponding code into it
#!/usr/bin/env python
# -*- encoding: utf-8; py-indent-offset: 4 -*-

#
# Galera Sync Status
#

def inventory_mysql_galerasync(parsed):
    for instance, values in parsed.items():
        if values.get('wsrep_provider', 'none') != 'none' and u'wsrep_local_state_comment' in values:
            yield instance, {}

def check_mysql_galerasync(item, _no_params, parsed):
    if item in parsed:
        values = parsed[item]
        if u'wsrep_local_state_comment' in values:
            if values[u'wsrep_local_state_comment'] == 'Synced':
                return (0, 'wsrep_local_state_comment is %s' % values[u'wsrep_local_state_comment'])
            else:
                return (2, 'wsrep_local_state_comment is %s' % values[u'wsrep_local_state_comment'])

check_info['mysql.galerasync'] = {
    "inventory_function"      : inventory_mysql_galerasync,
    "check_function"          : check_mysql_galerasync,
    "service_description"     : "MySQL Galera Sync %s",
    "has_perfdata"            : False,
}
  • continue to do so with the other checks (mysql.galeradonor, .startup, etc)

Do not forget the shebang and encoding line!


Hi,

thanks for the response. I will try this (after the weekend) but I do not think this is the cause.

For one thing, the inventory (WATO discovery as well as “cmk -I”) shows all 5 checks, and “cmk -v” also shows correct results for all 5 checks.

Besides, putting multiple checks in one file works correctly in all other cases. I have multiple custom checks of my own in a single file, and the checks shipped with check_mk are built the same way (e.g. /omd/versions/default/share/check_mk/checks/mysql contains multiple checks).

There is a difference in the logic running in the background while doing inventories, cmk commands and the commands the core is running. Just give it a try.

I know there are plugins with multiple check declarations, but cmk handles files with .(dot) differently.

In most cases the “UNKNOWN - check not implemented” error is related to the check registration. What you can do additionally is run # cmk -L | grep galera. What does this command show?

Edit:

Another approach to this error would be to rm-copy the mysql check to your local checks directory and add all the python code from mysql.galera to the mysql check.

TL;DR: Get rid of the strange name of the plugin file since no check actually is called “mysql.galera”. That is basically the idea I got behind this issue.

Thanks for the heads up.

Split the check file into 5 distinct files, unfortunately no difference in the behaviour.

Oh, the file never had a dot in the filename on my system. I renamed it to mysql_galera directly after download.

OMD[site]:~$ cmk -L | grep galera
mysql.galeradonor tcp (no man page present)
mysql.galerasize tcp (no man page present)
mysql.galerastartup tcp (no man page present)
mysql.galerastatus tcp (no man page present)
mysql.galerasync tcp (no man page present)

The result is the same, regardless of all checks in same file or distinct files.

Also did that, created a copy of “mysql” from the /omd/versions/default directory to my local site directory and added the 5 galera checks to the end of the file. Still the same, command line cmk returns OK and updates Nagios with that state, running check_mk from Nagios returns UNKNOWN.

Is this maybe an issue with the fact that the check functions use “parsed” input, which relies on the parsing function of the parent “mysql” check without explicitly defining it? But all the other predefined mysql subchecks also use parsed input without explicitly referencing a parse function.

I think I will try with a single galera check next using a custom parse function by copying the original one.

I assume you ran # cmk -R as well after changing the files?

The parse function needs to be called just once (which is done by the mysql version check) IF the other checks are declared as subchecks correctly (e.g. mysql.sessions). CMK runs all related checks in one instance, so variables are available for all check and inventory functions within that instance. I don’t see the problem there since the inventory and direct call works.
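
To illustrate the point about the shared parse function, here is a minimal sketch in plain Python. It is not Check_MK's actual code: section_name, parse_mysql, run_checks, the two toy check functions and the "key value" agent line format are all made up for illustration. The only thing taken from this thread is the convention that a subcheck such as mysql.galerasync reads the data of its parent check mysql.

```python
# Simplified model, NOT Check_MK's real code: every name here is hypothetical.

def section_name(check_name):
    """Return the agent section a check reads: 'mysql.galerasync' -> 'mysql'."""
    return check_name.split(".")[0]


def parse_mysql(lines):
    """Toy parser: turn 'key value' lines into {instance: {key: value}}."""
    values = {}
    for line in lines:
        key, _, value = line.partition(" ")
        values[key] = value
    return {"mysql": values}  # a single instance called 'mysql' is assumed


def check_galerasync(item, params, parsed):
    state = parsed.get(item, {}).get("wsrep_local_state_comment")
    return (0 if state == "Synced" else 2,
            "wsrep_local_state_comment is %s" % state)


def check_galerastatus(item, params, parsed):
    # Illustrative rule only; the real galerastatus check may work differently.
    status = parsed.get(item, {}).get("wsrep_cluster_status")
    return (0 if status == "Primary" else 2,
            "wsrep_cluster_status is %s" % status)


PARSE_FUNCTIONS = {"mysql": parse_mysql}
CHECK_FUNCTIONS = {
    "mysql.galerasync": check_galerasync,
    "mysql.galerastatus": check_galerastatus,
}


def run_checks(agent_sections, check_table):
    """Run every (check_name, item, params) entry, parsing each section once."""
    parsed_cache = {}
    results = {}
    for check_name, item, params in check_table:
        section = section_name(check_name)
        if section not in parsed_cache:
            # First subcheck of this section triggers the parse; later ones reuse it.
            parse = PARSE_FUNCTIONS[section]
            parsed_cache[section] = parse(agent_sections[section])
        check = CHECK_FUNCTIONS[check_name]
        results[check_name] = check(item, params, parsed_cache[section])
    return results
```

Calling run_checks with one mysql section and a table containing both subchecks parses the section a single time and hands the same parsed dictionary to both check functions, which is why a missing parse function in the subcheck itself is unlikely to be the cause here.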

Can you check if the commands got written correctly to the nagios config please?:

# cat /omd/sites/yoursite/var/nagios/retention.dat | grep -A5 '\.galera'
# cat /omd/sites/yoursite/etc/nagios/conf.d/check_mk_objects.cfg | grep -A5 '\.galera'

Looks good to me

OMD[site]:~$ grep -A5 '\.galera' var/nagios/retention.dat
check_command=check_mk-mysql.galeradonor
check_period=24X7
notification_period=24X7
event_handler=
has_been_checked=1
check_execution_time=0.000
--
check_command=check_mk-mysql.galerasize
check_period=24X7
notification_period=24X7
event_handler=
has_been_checked=1
check_execution_time=0.000
--
....

OMD[site]:~$ grep -B 1 -A 2 'command_name.*\.galera' etc/nagios/conf.d/check_mk_objects.cfg
define command {
  command_name                  check_mk-mysql.galerastatus
  command_line                  echo "ERROR - you did an active check on this service - please disable active checks" && exit 1
}
--
define command {
  command_name                  check_mk-mysql.galerasize
  command_line                  echo "ERROR - you did an active check on this service - please disable active checks" && exit 1
}
--
...

OMD[site]:~$ grep -B 10 -A 1 'check_command.*\.galera' etc/nagios/conf.d/check_mk_objects.cfg
define service {
  use                           check_mk_passive
  host_name                     database.host
  service_description           MySQL Galera Donor mysql
  check_interval                5
  contact_groups                group-users-nagios-monitor
  max_check_attempts            3
  first_notification_delay      10.0
  notification_options          w,u,c,r,f
  check_interval                5.0
  check_command                 check_mk-mysql.galeradonor
}
--
define service {
  use                           check_mk_passive_perf
  host_name                     database.host
  service_description           MySQL Galera Size mysql
  check_interval                5
  contact_groups                group-users-nagios-monitor
  max_check_attempts            3
  first_notification_delay      10.0
  notification_options          w,u,c,r,f
  check_interval                5.0
  check_command                 check_mk-mysql.galerasize
}
--
...

It does, indeed.

Did you install the mkp package from the GitHub site, or did you just use the file? If you didn’t use the mkp package, please do so as a last try. Remove the other files, run an inventory and do a restart before installing the package.

Could you provide me a full agent output of one of the affected hosts please? I would like to try debugging the error on different cmk systems but I do not have hosts providing such galera information. That is the next step I can suggest you after you tried the package thingy.

Indeed I just took the check file standalone, not the MKP. As CRE user I was not interested in the agent bakery parts. But you are right, I should try a full MKP installation. Will try this on Wednesday or Thursday.

Would the <<<mysql>>> section of the check_mk agent output be sufficient or should I grab a full agent output?

Yeah sure, the mysql section alone will work as well. :slight_smile:

I installed the check today using the MKP file (https://github.com/HeinleinSupport/check_mk/blob/master/mysql/mysql_galera-1.7.mkp). I checked the MKP file and it only contains the check file itself (local/share/check_mk/checks/mysql.galera).

After the installation I performed a “cmk -R” to recreate the precompiled checks. Unfortunately, the behaviour has not changed.

I went further and checked the precompiled python check (~/var/check_mk/precompiled/database.host.py).

In the function “get_precompiled_check_table” it contains the Galera checks:

...('mysql_capacity', u'mysql:monitor', None, u'MySQL DB Size mysql:monitor', ''), ('mysql.galeradonor', u'mysql', {}, u'MySQL Galera Donor mysql', ''), ('mysql.galerasize', u'mysql', {'invsize': 3}, u'MySQL Galera Size mysql', ''), ('mysql.galerastartup', u'mysql', {}, u'MySQL Galera Startup mysql', ''), ('mysql.galerastatus', u'mysql', {}, u'MySQL Galera Status mysql', ''), ('mysql.galerasync', u'mysql', {}, u'MySQL Galera Sync mysql', ''), ('mysql.innodb_io', u'mysql', {}, u'MySQL InnoDB IO mysql', ''),...

But it does not contain the contents of the “mysql_galera”/“mysql.galera” file located in the local check definitions of the site. A different custom check located in the same directory is present in the precompiled python file.

Do you know the logic by which the precompiled checks are created?

I dug a little into the creation of the precompiled checks for Nagios. In the Nagios module (~/share/check_mk/modules/nagios.py) the function “find_check_plugins” is responsible for adding the local check definitions to the precompiled check code.

The “find_check_plugins” function expects to find the code for a local check named “mysql.galerasync” in one of the following files:

~/local/share/check_mk/checks/mysql
~/local/share/check_mk/checks/mysql.galerasync
~/share/check_mk/checks/mysql
~/share/check_mk/checks/mysql.galerasync

That’s why the checks were never included in the precompiled files as I named the file “mysql.galera” or “mysql_galera”. Renaming the file to “mysql.galerasync” resulted in working checks. Renaming the file to “mysql” led to issues with the original predefined MySQL checks.
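
The lookup described above can be sketched roughly like this. This is my own reconstruction of the behaviour I observed, not the code from nagios.py: candidate_files, existing_plugin_files and the exists parameter are hypothetical names, and I am only assuming that the real function tries the four paths in the order listed.

```python
import os


def candidate_files(check_name, local_dir, share_dir):
    """List the files in which the code for check_name is searched.

    For a subcheck such as 'mysql.galerasync' these are the parent name
    ('mysql') and the full check name, first in the local directory and
    then in the shipped one.
    """
    base = check_name.split(".")[0]
    names = [base] if base == check_name else [base, check_name]
    return [os.path.join(directory, name)
            for directory in (local_dir, share_dir)
            for name in names]


def existing_plugin_files(check_name, local_dir, share_dir, exists=os.path.exists):
    """Return those candidates that actually exist (exists is injectable for testing)."""
    return [path for path in candidate_files(check_name, local_dir, share_dir)
            if exists(path)]
```

With a local file called mysql_galera or mysql.galera, none of the candidates for mysql.galerasync point at it, so only the shipped mysql file is picked up and the Galera code never reaches the precompiled check.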

The only thing I do not understand is why it did not work when I copied the “mysql” file from the predefined checks to the local checks and added the Galera checks to that file. I think I will try this once more.

Nonetheless the initial problem can be declared as solved:
Filenames of local checks need to match the check name or at least (in the case of subchecks) the parent check name. Otherwise they will not be added to the precompiled checks for the Nagios core. Issues arise here if multiple subchecks for a predefined parent check are all placed inside a single file.
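
As a quick sanity check for this naming rule, something like the following could be run against the check names from “cmk -L” and the file names in the local checks directory. It is only a sketch with hypothetical helper names (acceptable_file_names, checks_without_file), and it looks at names only; it says nothing about whether the file contents load correctly.

```python
def acceptable_file_names(check_name):
    """File names under which a check is looked up: parent name and full name."""
    base = check_name.split(".")[0]
    return {base, check_name}


def checks_without_file(check_names, file_names):
    """Return the checks (sorted) for which no acceptably named file is present."""
    present = set(file_names)
    return sorted(name for name in check_names
                  if not (acceptable_file_names(name) & present))
```

For the five Galera checks and a single local file named mysql_galera, this reports all five checks as having no matching file, which matches what I saw in the precompiled check.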


But isn’t that basically what I pointed out in my first post? :thinking:

Anyway, glad you figured it out and got it working for you.

When I look back now, yes it is… :flushed:

Unfortunately I did not test this right away but only a few days later, and I named my split files not in the form “mysql.galerasync” but rather “mysql_galerasync”, as I did not understand that filename and check name are required to be the same. I assumed the splitting was the important thing, not the naming :cry:
Now this is all clear to me, but it was not back then.

Thanks again for your help, your responses helped a lot in understanding what’s happening behind the scenes! :+1:
