Check_hpasm timeout setting not working


(Alex) #1

I’m using the check_hpasm plugin to monitor the hardware on my HP servers. The plugin is working great on most of my servers, but I’m getting a timeout error on a few. The default timeout for check_hpasm is 60 seconds. When I run the check_hpasm plugin from the command line and set the timeout to 180 seconds, the plugin results come back correctly on the servers I’m having problems with.

I have created my own custom check command named “my-hpasm” and added the timeout value of 180. See below.

/*****  Custom HP Plugin *****/
object CheckCommand "my-hpasm" {
  command = [ PluginDir + "/check_hpasm" ]

  arguments = {
    "--community" = "Icinga"
    "--timeout" = "180"
  }
  vars.hpasm_hostname = "$address$"
}

I believe my new custom command is correct, but it is not solving my timeout errors. I reviewed the Icinga debug logs and I’m still getting a timeout error after 60 seconds.

Has anyone else had success with changing the timeout value on the check_hpasm plugin?

Thanks in advance for your help.

BTW - I’m running Icinga 2 version r2.8.4-1 on Red Hat Enterprise Linux 7.4.


#2

As shown in the documentation you have to set “timeout = ...” in the object definition. Setting it as a plugin parameter doesn’t override the default value of 1m.


(Alex) #3

Hello Wolfgang,
Thanks for your reply. I have edited my object definition as you suggested but the timeout is still only set for 1 minute. Below are my conf files and debug log.

/etc/icinga2/zones.d/global-templates/commands.conf

/***** Checks HP Hardware ( Added timeout switch )  *****/
object CheckCommand "my-hpasm" {
  command = [ PluginDir + "/check_hpasm" ]

  arguments = {
    "--community" = "$hpasm_community$"
    "--timeout" = "$hpasm_timeout$"
    "--hostname" = "$address$"
  }
}

/etc/icinga2/zones.d/global-templates/services.conf

/*********** HP Hardware Check ***********/
apply Service "HP_Hardware" {
  import "generic-service"

  display_name = "HP Hardware"
  name = "HP Hardware"

  groups = [ "Hardware" ]

  vars.hpasm_community = "Icinga"
  vars.hpasm_timeout = 180

  check_command = "my-hpasm"

  max_check_attempts = 3
  check_interval = 1h
  retry_interval = 10m

// command_endpoint = host.vars.client_endpoint    //specify where the check is executed

 assign where host.vars.hardware_manufacturer == "HP"
}

/var/log/icinga2/debug.log

[2018-10-29 09:57:10 -0400] notice/Process: Running command '/usr/lib64/nagios/plugins/check_hpasm' '--community' 'Icinga' '--hostname' 'myproblemserver' '--timeout' '180': PID 18894
[2018-10-29 09:57:10 -0400] debug/CheckerComponent: Check finished for object 'myproblemserver!HP Hardware'


[2018-10-29 09:58:10 -0400] warning/Process: Killing process group 18894 ('/usr/lib64/nagios/plugins/check_hpasm' '--community' 'Icinga' '--hostname' 'myproblemserver' '--timeout' '180') after timeout of 60 seconds
[2018-10-29 09:58:10 -0400] warning/Process: PID 18894 was terminated by signal 9 (Killed)
[2018-10-29 09:58:10 -0400] warning/PluginCheckTask: Check command for object 'myproblemserver!HP Hardware' (PID: 18894, arguments: '/usr/lib64/nagios/plugins/check_hpasm' '--community' 'Icinga' '--hostname' 'myproblemserver' '--timeout' '180') terminated with exit code 128, output: <Timeout exceeded.><Terminated by signal 9 (Killed).>

Thanks in advance for your help.


#4

Looking more closely at the documentation you’ll see that “timeout” is at the same level as “command” and “arguments”, so it’s:

/***** Checks HP Hardware ( Added timeout switch ) *****/
object CheckCommand "my-hpasm" {
   command = [ PluginDir + "/check_hpasm" ]
   arguments = {
      "--community" = "$hpasm_community$"
      "--hostname" = "$address$"
   }
   timeout = 180
}

(Michael Friedrich) #5

Or, a more readable version, timeout = 3m.


(Alex) #6

Hello Wolfgang,
Thanks for your reply. I have changed my command as you suggested, but it’s still posting a “timeout after 60 seconds” error. The logs no longer show the timeout switch being passed on the command line.

/etc/icinga2/zones.d/global-templates/commands.conf

/***** Checks HP Hardware ( Added timeout switch )  *****/
object CheckCommand "my-hpasm" {
  command = [ PluginDir + "/check_hpasm" ]
  arguments = {
    "--community" = "$hpasm_community$"
    "--hostname" = "$address$"
  }
  timeout = 180
}

/var/log/icinga2/debug.log

[2018-10-29 15:22:22 -0400] notice/Process: Running command '/usr/lib64/nagios/plugins/check_hpasm' '--community' 'Icinga' '--hostname' 'myproblemserver': PID 30494

[2018-10-29 15:23:22 -0400] notice/Process: PID 30494 ('/usr/lib64/nagios/plugins/check_hpasm' '--community' 'Icinga' '--hostname' 'myproblemserver') terminated with exit code 3

Could this be a bug in how this plugin is processed? Do you know if anyone is using this plugin to monitor HP hardware?

Thanks in advance for your help.


(Alex) #7

I am using seconds because that’s what is recommended in the plugin help file.


#8

After a look at the source code: the plugin has its own default timeout, too. Please add "--timeout" = "180" to the arguments as it was in your initial posting. Sorry.


(Michael Friedrich) #9

Icinga 2’s DSL allows for more readable duration literals. In the end, it is just 180 seconds after config compilation. Depends what you prefer.
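As a small illustration, assuming the my-hpasm command from this thread: the two assignments below should be equivalent, because the duration literal is turned into a plain number of seconds when the config is compiled. As far as I know the DSL accepts the suffixes ms, s, m, h and d.

```
object CheckCommand "my-hpasm" {
  command = [ PluginDir + "/check_hpasm" ]

  // Duration literal: 3m is compiled to the number 180 (seconds).
  timeout = 3m

  // Same value written as plain seconds:
  // timeout = 180
}
```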


(Alex) #10

Is there another timeout setting that I’m missing somewhere in the Icinga 2 configuration? From the debug log it looks like the setup of the custom command is correct. Is there a universal timeout setting for all commands in the Icinga 2 configuration? If there is, it could be timing out before my custom command’s timeout setting does. Any other ideas?

Thanks in advance for your help.


#11

/***** Checks HP Hardware ( Added timeout switch ) *****/
object CheckCommand "my-hpasm" {
   command = [ PluginDir + "/check_hpasm" ]
   arguments = {
      "--community" = "$hpasm_community$"
      "--timeout" = "$hpasm_timeout$"
      "--hostname" = "$address$"
   }
   timeout = 180
}

So this configuration isn’t working?


(Alex) #12

The latest configuration is working now. Thanks Wolfgang and Michael for your help !!!

Can you help me understand why the timeout value needs to be entered outside the arguments brackets? I’m just trying to understand the code better. Thanks


#13

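“timeout = 180” isn’t an argument passed to the plugin; it is an attribute of the check command object and applies to the plugin call in general: it tells Icinga 2 how long to wait before killing the plugin process. "--timeout" = "$hpasm_timeout$" is an argument specific to the plugin, which has its own default timeout (see #8).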
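To make the difference concrete, here is a commented sketch of the working command from this thread. The values are the ones used above; the comments are my reading of the debug log and of posts #8 and #13, so treat them as an explanation rather than official documentation.

```
object CheckCommand "my-hpasm" {
  command = [ PluginDir + "/check_hpasm" ]

  // Everything inside "arguments" ends up on the plugin's command line.
  arguments = {
    "--community" = "$hpasm_community$"
    "--hostname" = "$address$"
    // The plugin's own timeout; without it check_hpasm gives up after
    // its default of 60 seconds and exits with code 3 (UNKNOWN).
    "--timeout" = "$hpasm_timeout$"
  }

  // Attribute of the CheckCommand object, never passed to the plugin.
  // Icinga 2 kills the plugin process after this time (default 1m).
  timeout = 3m
}
```

The service still has to set vars.hpasm_timeout = 180 so that the "$hpasm_timeout$" macro resolves. Both limits need raising: with only "--timeout" set, Icinga 2 kills the process after 60 seconds (the log in #3), and with only timeout set, the plugin itself stops after 60 seconds (the log in #6). If I read the object type documentation correctly, hosts and services also accept a check_timeout attribute that overrides the command's timeout, but please verify that for your version.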