Clustered Services and cold-standby - how to configure

We have a system which uses primary and cold stand-by Windows services: A Windows service normally runs on the primary host. If this fails (stops), it starts on the secondary host. It cannot run on both simultaneously.

Clearly, if I have Icinga2 check both services (primary and stand-by), one of them will always show a “critical” status on the Icingaweb2 dashboard and cause needless worry.

If I only monitor the primary service and it fails over to the secondary, I can see the first failure. However, if the secondary then fails, I have no visibility of the second failure.

So the challenge is how best to monitor both services. I am aware of check_cluster, which lets me define a new service whose status is derived from the status of a number of other services. But can the “not normally running” service be suppressed in the Icingaweb2 dashboard? Or is there a more elegant way to achieve what I need? Thoughts are appreciated.

Strange idea, but it might just work. What if you set a dependency between the two (stand-by is dependent on primary) and set the attribute

states = [ Critical, Unknown ]

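If I understand the logic correctly, Icinga should then treat the stand-by as unreachable for as long as the primary is OK or Warning, and you should not get notifications for it. Once the primary state changes to Critical or Unknown, the dependency is satisfied and Icinga will handle the stand-by normally again, while still showing the primary as Critical / Unknown. Note that, by default, a failed dependency only suppresses notifications; if you also want the stand-by checks to pause while the primary is healthy, you would need to set disable_checks = true on the dependency.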
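A minimal sketch of what such a dependency could look like, in case it helps. The host names ("primary-host", "standby-host") and the service name ("my-windows-service") are placeholders I made up, so substitute your own objects. Note that the Dependency attribute is spelled `states` (plural). I have not tested this against a real fail-over pair.

```
// Sketch only: host and service names below are hypothetical placeholders.
object Dependency "standby-depends-on-primary" {
  // Parent: the service on the primary host.
  parent_host_name = "primary-host"
  parent_service_name = "my-windows-service"

  // Child: the same service on the cold stand-by host.
  child_host_name = "standby-host"
  child_service_name = "my-windows-service"

  // The dependency counts as OK (child reachable) only while the
  // parent is Critical or Unknown, i.e. after a fail-over.
  states = [ Critical, Unknown ]

  // While the primary is healthy, suppress stand-by notifications
  // (this is the default) and also stop checking the stand-by.
  disable_notifications = true
  disable_checks = true
}
```

With `disable_checks = true` the stand-by service keeps its last known state while the primary is healthy, so you may still see a stale Critical for it in Icingaweb2 until you filter on reachability or acknowledge it.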

Hope I got that right; I'm still working on wrapping my head around dependencies fully.