Service Check Timed Out

  • Hi all,

    I have run into a strange problem with my two monitoring systems:
    System A: Nagios, installed from source from http://www.nagios.org/download. It contains Nagios Core and Nagios Plugins. I have built some new services and plugins based on check_http.
    System B: OMD, which was installed from a package (not from source) from http://files.omdistro.org/releases/suse/
    I tried to migrate my own services from System A to System B, so I copied all the configuration into /omd/sites/test/etc/nagios/conf.d and the plugins into /omd/sites/test/local/lib/nagios/plugins ("test" is my site name).
    So far all the services work except one:
    This service is heavy and usually needs more than 70 s to return a result, so I call the plugin as "check_http -H host -p Port ... -t 120" to give it a long timeout. This has worked in System A for over a year, but not in System B. After about 60 s, the OMD web page shows the following error for the service:
    Current Status: CRITICAL
    Status Information: Service Check Time Out
    Check Latency/ Duration: 0.119/60.001 seconds



    Can somebody help me keep using this heavy service in OMD? Can I change the Check Latency/Duration limit?


    Best Regards




  • I guess you need to set the global service_check_timeout variable in

    Code
    /etc/nagios/nagios.d/tuning.cfg

    (or something like this) to 120. I assume its default value is 60 seconds, so you get the timeout after about 60 s on your 'heavy' check.

  • Thanks a lot. Your suggestion works. I just added this setting to nagios.cfg instead of the files in nagios.d because of this comment in nagios.cfg:
    # Nagios main configuration file


    # This file will be read in after the files in nagios.d.
    # Variables you set here will override settings in those
    # files. Better do not edit the files in nagios.d but rather
    # copy variables from there to here. That will save you
    # trouble when updating your sites to new versions.


    You saved my day! :D
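
For reference, the override described in this post could look like the fragment below. It is only a sketch: the path assumes the site named "test" from the first post, the value 120 is the one suggested in the reply above, and the comment lines are illustrative rather than part of the shipped file.

```cfg
# /omd/sites/test/etc/nagios/nagios.cfg  (path assumed from the site name "test")
# This file is read after the files in nagios.d, so the value here takes precedence.
service_check_timeout=120
```

The core has to re-read its configuration before the new limit applies; on an OMD site this is typically done as the site user with `omd restart nagios`. The plugin's own `-t 120` still has to fit inside this limit, which is why the check was cut off at 60 s before the change.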

  • I'm using Shinken from OMD:

    Code
    $ omd version
    OMD - Open Monitoring Distribution Version 1.10

    Here is my service timeout:

    Code
    cd ~/etc
    grep -r service_check_timeout .
    ./icinga/icinga.d/tuning.cfg:service_check_timeout=60
    ./nagios/nagios.d/tuning.cfg:service_check_timeout=1800
    ./nagios/nagios.cfg:service_check_timeout=420


    But a service that takes less than 3 minutes times out with this message:

    Code
    (Service Check Timed Out)


    I think OMD doesn't use my setting.
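
When the same variable shows up in several files, as in the grep output above, it can help to work out which value the Nagios core would actually end up with. The sketch below assumes the rule stated in the nagios.cfg comment quoted earlier in this thread (files in nagios.d are read first, nagios.cfg last, later values override earlier ones). `effective_timeout` is a hypothetical helper written for this illustration, not an OMD or Nagios command.

```shell
#!/bin/sh
# Print the service_check_timeout that wins under "last definition overrides":
# files in nagios.d are read first, nagios.cfg last (per the comment in nagios.cfg).
# effective_timeout is a hypothetical helper, not part of OMD or Nagios.
effective_timeout() {
    nagios_etc=$1    # e.g. ~/etc/nagios inside an OMD site
    # Concatenate the files in read order, keep only the assigned values,
    # and report the last one seen. Commented-out lines are ignored.
    cat "$nagios_etc"/nagios.d/*.cfg "$nagios_etc"/nagios.cfg 2>/dev/null |
        sed -n 's/^[[:space:]]*service_check_timeout[[:space:]]*=[[:space:]]*//p' |
        tail -n 1
}

effective_timeout "$HOME/etc/nagios"
```

With the values from the listing above (1800 in nagios.d/tuning.cfg, 420 in nagios.cfg) this prints 420. It only reflects the Nagios core's files; whether a site whose core is Shinken reads these files at all is not established in this thread.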

  • Thanks for your answer.

    Even if tuning.cfg isn't included, I have configured a high timeout value in nagios.cfg, yet the timeout actually applied seems to be very low (maybe 60 seconds).