Services running on Client, not Satellite

satellite
distributed
icinga2
client

(Joe Valerio) #1

Opening a new thread because the previous one was no longer getting responses, so I apologize ahead of time for that.

In my other post I discussed that I’m having an issue where my services that I’m defining on my master host are being executed directly on the client nodes, and not the satellite nodes. I’ve followed the wiki documentation to configure this distributed setup and I’m not sure where I’ve gone wrong.

Here is the error I’m receiving when I attempt to start icinga on the master:

Nov 27 09:07:12 <master> icinga2[72987]: [2018-11-27 09:07:12 -0500] critical/config: Error: Validation failed for object 'client01!OpenVPN' of type 'Service'; Attribute 'check_command': Object 'check_openvpn' of type 'CheckCommand' does not exist.

This CheckCommand “check_openvpn” does exist on the satellite, but not the client. If I add this command to the client node, I no longer receive this error. This is indicating to me that it’s attempting to run the command on the client node itself.

Here is my service entry:

apply Service "OpenVPN" {
  check_command = "check_openvpn"
  assign where host.zone == "prod1-na1"
}

Here is my API configuration on my client node:

/**
 * The API listener is used for distributed monitoring setups.
 */
object ApiListener "api" {
  accept_commands = false
  accept_config = false
}

If anyone can provide any help, that would be much appreciated!!


(Michael Friedrich) #2

You’ll get that error on the master, so I’d check if this CheckCommand object is defined there e.g. in a global zone. Also, you’ve mentioned that it exists on the satellite, where exactly?


(Joe Valerio) #3

Hi Michael, yes, this error is happening on the master, and this check does exist on the satellite. What I did was import our old nagios plugins directory in the constants.conf:

const PluginDir = "/usr/local/nagios/libexec/"

Am I incorrect in the assumption that the commands in that directory are automatically loaded? As I stated, I want to execute this command from the satellite against the client, but it seems the client is attempting the execution, unless this error just means that it was unable to execute against said client node?

Edit: Also as I have mentioned, if I add this “check_openvpn” command into that directory on the client node the error will no longer appear.


(Michael Friedrich) #4

In order to execute a check plugin, you’ll need to integrate it with a CheckCommand. It doesn’t matter where in your cluster the actual execution happens; all instances, top down, need to be made aware of this CheckCommand object. It is referenced in the host/service object config on the master, the satellite and the client. Therefore I’d suggest putting the CheckCommand definition into e.g. a global zone and letting it be synced from the master to all instances.
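
A minimal sketch of that suggestion, under stated assumptions: the zone name "global-templates" and the file locations are only illustrative, and the plugin's arguments are not shown anywhere in this thread, so none are defined here. Nodes that should receive the synced definition also need accept_config = true in their ApiListener, which is not the case in the client configuration posted above.

// zones.conf on the master, the satellite and the client
// (zone name is an assumption, pick your own)
object Zone "global-templates" {
  global = true
}

// zones.d/global-templates/commands.conf, on the master only;
// the config sync distributes it to every node that knows the zone
object CheckCommand "check_openvpn" {
  // the plugin file itself only has to exist on the node that executes the check
  command = [ PluginDir + "/check_openvpn" ]
}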


(Joe Valerio) #5

Ahhh, I feel silly now. Thank you for the clarification on that and the patience, the error confused me for sure!

Fixed!


(Joe Valerio) #6

Hey Michael – while that solved that error I think this is still an issue. I have two Check Commands defined here:

object CheckCommand "check_customer_portal" {
  command = [ PluginDir + "check_customer_portal.php $address$" ]
}

object CheckCommand "check_http" {
  command = [ PluginDir + "check_http -I $address$ $ARG1$" ]
}

And my service defined like so:

apply Service "Apache" {
  check_command = "check_http"
  assign where host.zone == "prod1-na1" && host.vars.type == "vpn"
}

apply Service "Customer Portal" {
  check_command = "check_customer_portal"
  assign where host.zone == "prod1-na1" && host.vars.type == "web"
}

Yet the response is giving me the following output:

execvpe(/usr/local/nagios/libexec/check_customer_portal.php 10.201.10.92) failed: No such file or directory
execvpe(/usr/local/nagios/libexec/check_http -I 10.201.10.22 ) failed: No such file or directory

I can prove the scripts exist on the satellite here with the correct path:

root@<satellite>:/usr/local/nagios/libexec# ll /usr/local/nagios/libexec/check_http /usr/local/nagios/libexec/check_customer_portal.php
-rwxr-xr-x 1 nagios nagios   6379 Dec  4 15:47 /usr/local/nagios/libexec/check_customer_portal.php*
-rwxr-xr-x 1 nagios nagios 362612 Sep 11 17:59 /usr/local/nagios/libexec/check_http*

So it still seems to me that these are being executed directly on the client node. Thoughts?


(Michael Friedrich) #7

Look into CheckCommand objects, and their definition. You cannot use the command line as you would with 1.x or Nagios. Instead, split this up into command and the arguments dictionary.

Also, you don’t need to reinvent the wheel with http checks, that command already exists in the ITL.
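
As a rough sketch of what using the ITL command could look like for the "Apache" service from the previous post: this assumes icinga2.conf contains include <plugins> (it does in the default configuration) and that check_http is present in PluginDir on the node that runs the check. The http_uri value is only an illustration.

apply Service "Apache" {
  // "http" is the ITL CheckCommand wrapping check_http
  check_command = "http"

  // optional; the ITL already defaults http_address to the host's address
  vars.http_uri = "/"

  assign where host.zone == "prod1-na1" && host.vars.type == "vpn"
}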


(Joe Valerio) #8

Okay, so if I’m understanding correctly, it’s saying that it’s unable to find the path because the arguments were included in that lookup?

Seems to have solved my problems! Last question (forgive me if this IS in the documentation): I’ve been scouring the CheckCommands page and haven’t been able to get this answered. The documentation shows the arguments dictionary with the key being the flag that is passed along with the value. What about something without a flag? For my check_customer_portal I’ve just separated the args in the command array, which seems to have fixed it, but I want to ensure I’m following best practices:

object CheckCommand "check_customer_portal" {
  command = [ PluginDir + "check_customer_portal.php", "$address$" ]
}

(Alex) #9

Hi,

take a look here. You can use skip_key to pass only the value to the command. Something like this:

object CheckCommand "check_customer_portal" {
    import "plugin-check-command"
    command = [ PluginDir + "/check_customer_portal.php" ]

    arguments = {
        "--address" = {
            value = "$customer_portal_address$"
            skip_key = true
        }
    }

    vars.customer_portal_address = "$address$"
}

Greetz


(Joe Valerio) #10

Thanks for the assistance, y’all! This simple issue shall plague me no longer!!