Inconsistent retry intervals

icinga2
retry_interval
inconsistant

(BWCA) #1

I'm having an issue where checks seem to have random retry intervals. For example, this service should retry every 5 minutes, but instead retries every 1-4 minutes.

I'm at a loss… I hope it's something simple. The max_check_attempts, retry_interval, and check_interval values are all defined, and the only other import is ‘plugin-check-command’. Both the master and the client are running icinga2 r2.8.1-1.

The inconsistencies, as shown in the Icinga Web 2 GUI (newest entry first):

NOTIFICATION - 22:14:47 - [Operations-PD] check https://node01.test.com:8443
CRITICAL - 22:13:33 - [slack-notifications] check https://node01.test.com:8443
CRITICAL - 22:13:33 - [ 4/5 ] check https://node01.test.com:8443
CRITICAL - 22:09:33 - [ 3/5 ] check https://node01.test.com:8443
CRITICAL - 22:08:13 - [ 2/5 ] check https://node01.test.com:8443
CRITICAL - 22:04:34 - [ 1/5 ] check https://node01.test.com:8443
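
The gaps between attempts can be quantified with a short Python sketch. This is only an illustration, not part of the Icinga setup: the timestamps are copied from the history above, the helper name `gaps_between` is hypothetical, and it assumes all entries fall on the same day.

```python
from datetime import datetime

# Timestamps of attempts 1/5 through 4/5, copied from the history above.
ATTEMPTS = ["22:04:34", "22:08:13", "22:09:33", "22:13:33"]
EXPECTED_RETRY_SECONDS = 300  # retry_interval = 5m


def gaps_between(timestamps):
    """Return the gap in seconds between each pair of consecutive timestamps.

    Assumes the timestamps are in ascending order and on the same day.
    """
    parsed = [datetime.strptime(t, "%H:%M:%S") for t in timestamps]
    return [int((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]


if __name__ == "__main__":
    for gap in gaps_between(ATTEMPTS):
        deviation = gap - EXPECTED_RETRY_SECONDS
        print(f"gap: {gap:3d}s, deviation from retry_interval: {deviation:+d}s")
```

For these four attempts the gaps come out as 219 s, 80 s, and 240 s, all shorter than the configured 300 s.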

The check:

template Service "generic-icinga-openshift-service" {      
  if (host) { 
    command_endpoint = host.name
  } else {
    command_endpoint = host_name
  }

  max_check_attempts = 5
  check_interval = 5m
  retry_interval = 5m
}

object CheckCommand "oshift_sdn-nodecheck" {
  import "plugin-check-command"
  command = [ "/usr/bin/sudo", PluginDir + "/oshift_sdn-nodecheck.sh" ]

  arguments = {
    "fqdn" = {
      skip_key = true
      value = "$host.name$"
      order = -1
    }
  }
}

From the master node (the satellites are very similar, with the same values):

Object 'node01.test.com!oshift_sdn-nodecheck' of type 'Service':
  % declared in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 103:1-103:36
  * __name = "node01.test.com!oshift_sdn-nodecheck"
  * action_url = ""
  * check_command = "oshift_sdn-nodecheck"
    % = modified in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 105:3-105:40
  * check_interval = 300
    % = modified in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 9:3-9:21
  * check_period = ""
  * check_timeout = null
  * command_endpoint = "node01.test.com"
    % = modified in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 3:5-3:32
  * display_name = "oshift_sdn-nodecheck"
  * enable_active_checks = true
  * enable_event_handler = true
  * enable_flapping = false
  * enable_notifications = true
  * enable_passive_checks = true
  * enable_perfdata = true
  * event_command = ""
  * flapping_threshold = 0
  * flapping_threshold_high = 30
  * flapping_threshold_low = 25
  * groups = [ ]
  * host_name = "node01.test.com"
    % = modified in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 103:1-103:36
  * icon_image = ""
  * icon_image_alt = ""
  * max_check_attempts = 5
    % = modified in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 8:3-8:24
  * name = "oshift_sdn-nodecheck"
    % = modified in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 103:1-103:36
  * notes = ""
  * notes_url = ""
  * package = "_etc"
    % = modified in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 103:1-103:36
  * retry_interval = 300
    % = modified in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 10:3-10:21
  * source_location
    * first_column = 1
    * first_line = 103
    * last_column = 36
    * last_line = 103
    * path = "/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf"
  * templates = [ "oshift_sdn-nodecheck", "generic-icinga-openshift-service" ]
    % = modified in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 103:1-103:36
    % = modified in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 1:0-1:50
  * type = "Service"
  * vars
    * pagerduty = [ "Operations-PD" ]
      % = modified in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 106:3-106:38
  * volatile = false
  * zone = "master01-satellites05-06"
    % = modified in '/etc/icinga2/zones.d/global-configs/services/openshift-custom-checks.conf', lines 103:1-103:36

From the client node:

Object 'oshift_sdn-nodecheck' of type 'CheckCommand':
  % declared in '/var/lib/icinga2/api/zones/global-configs/_etc/services/openshift-custom-checks.conf', lines 13:1-13:42
  * __name = "oshift_sdn-nodecheck"
  * arguments
    % = modified in '/var/lib/icinga2/api/zones/global-configs/_etc/services/openshift-custom-checks.conf', lines 17:1-23:3
    * fqdn
      * order = -1
      * skip_key = true
      * value = "$host.name$"
  * command = [ "/usr/bin/sudo", "/usr/local/nagios/libexec//oshift_sdn-nodecheck.sh" ]
    % = modified in '/var/lib/icinga2/api/zones/global-configs/_etc/services/openshift-custom-checks.conf', lines 15:3-15:71
  * env = null
  * execute
    % = modified in 'methods-itl.conf', lines 36:3-36:33
    % = modified in 'methods-itl.conf', lines 36:3-36:33
    * arguments = [ "checkable", "cr", "resolvedMacros", "useResolvedMacros" ]
    * deprecated = false
    * name = "Internal#PluginCheck"
    * side_effect_free = false
    * type = "Function"
  * name = "oshift_sdn-nodecheck"
  * package = "_cluster"
  * source_location
    * first_column = 1
    * first_line = 13
    * last_column = 42
    * last_line = 13
    * path = "/var/lib/icinga2/api/zones/global-configs/_etc/services/openshift-custom-checks.conf"
  * templates = [ "oshift_sdn-nodecheck", "plugin-check-command", "plugin-check-command" ]
    % = modified in '/var/lib/icinga2/api/zones/global-configs/_etc/services/openshift-custom-checks.conf', lines 13:1-13:42
    % = modified in 'methods-itl.conf', lines 35:2-35:69
    % = modified in 'methods-itl.conf', lines 35:2-35:69
  * timeout = 60
  * type = "CheckCommand"
  * vars = null
  * zone = "global-configs"

Any help would be greatly appreciated. Thanks!


(Carsten Köbke) #2

Hello,

It looks like you have run into this bug. Please update to 2.10 and test whether the problem is still there.

Regards,
Carsten