Collect Error at Host Level for all Failing Servers and send Only one Email

  • I have one HostGroup and one Host that holds several Services.

    All notifications occur between 08:00 and 08:30 and are sent once per day for each failing service, so if 10 services are failing, 10 emails are sent.



    I could not figure out how to collect all the failing services and send just one notification (from the host the services belong to) listing all of them.

    Also, if a certain number of services are failing (say 10 of 30), how can the host be marked as "Down"?


    This is my very first post and I only started with Icinga a few days ago, so sorry if this is a dumb question.

    Many thanks :)

  • Hi,

    I could not figure out how to collect all the failing services and send just one notification (from the host the services belong to) listing all of them.

    AFAIK this is not possible with the built-in notification scripts. Icinga 2 executes the service notification script for each service that changes its (hard) state. Maybe it could be done with another notification script that collects the services that failed within a certain time window.

    Also, if a certain number of services are failing (say 10 of 30), how can the host be marked as "Down"?

    The host is marked as Down when the configured host check, e.g. the hostalive check, returns a critical state. What you want can be done with the Business Process module. The module collects the states of services (and hosts) and calculates the state of the process node from a given rule set.


    https://github.com/Icinga/icingaweb2-module-businessprocess
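
To make the "10 of 30" idea concrete, here is a minimal sketch of the kind of threshold rule described above. It is plain Python, not the Business Process module's own configuration format. The function name `aggregate_state`, the parameter `down_threshold`, and the choice to count every non-OK state (WARNING, CRITICAL, UNKNOWN) as failing are assumptions made for illustration.

```python
from typing import Iterable

# Icinga service state codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
OK = 0


def aggregate_state(service_states: Iterable[int], down_threshold: int) -> str:
    """Return "DOWN" when at least `down_threshold` services are not OK.

    Assumption: any non-OK state counts as failing. Adjust the comparison
    if only CRITICAL should count.
    """
    failing = sum(1 for state in service_states if state != OK)
    return "DOWN" if failing >= down_threshold else "UP"


# 10 of 30 services are CRITICAL, the rest are OK.
states = [2] * 10 + [0] * 20
print(aggregate_state(states, down_threshold=10))
```

With nine failing services instead of ten, the same call returns "UP", which is the behaviour asked for in the question.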

  • Maybe it could be done with another notification script that collects the services that failed within a certain time window.

    Well, changing states during that time probably wouldn't be taken into account, so the actual number of failing objects might be wrong.


    A script run from crontab once a day that selects the failing objects from the database (or Livestatus?) might be an option.
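
A rough sketch of the once-a-day script idea might look like the following. It only covers the grouping step: it takes rows of (host, service, state) and builds one message body per host. How the rows are fetched (database, Livestatus, or something else) and how the mail is sent are deliberately left out, and the names `build_digest`, `STATE_NAMES`, and the sample host `web01` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Icinga service state codes mapped to readable names.
STATE_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def build_digest(rows: Iterable[Tuple[str, str, int]]) -> Dict[str, str]:
    """Group non-OK services by host and return one mail body per host.

    `rows` holds (host, service, state) tuples taken from whatever source
    the cron job queries; hosts without failing services are omitted.
    """
    failing: Dict[str, List[str]] = defaultdict(list)
    for host, service, state in rows:
        if state != 0:
            failing[host].append(f"{service}: {STATE_NAMES.get(state, 'UNKNOWN')}")
    return {
        host: f"{len(lines)} failing service(s) on {host}:\n" + "\n".join(sorted(lines))
        for host, lines in failing.items()
    }


# Sample data standing in for the real query result.
rows = [("web01", "http", 2), ("web01", "disk", 0), ("web01", "ssh", 1)]
for host, body in build_digest(rows).items():
    print(body)
```

Because the script reads the current state at the moment it runs, it sidesteps the problem of state changes being missed during a collection window, at the cost of reporting only what is failing at that instant.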