Flood of alerts

  • Lately we have been seeing Icinga throw a flood of alerts every now and then. We are running a single instance for now.

    Checking the error logs from around the time the alerts were thrown, it seems there are error messages similar to "exited with error code 128". At the same time we received a check_load critical on that Icinga server.

    Load : Additional Info: CRITICAL - load average: 18.85, 12.01, 5.95

    Version and Platform details:

    icinga2 - The Icinga 2 network monitoring daemon (version: r2.6.0-1)

    System information:

    Platform: Ubuntu

    Platform version: 14.04.5 LTS, Trusty Tahr

    Kernel: Linux

    Kernel version: 3.13.0-53-generic

    Architecture: x86_64

    Build information:

    Compiler: GNU 4.8.4

    Build host: lgw01-16

  • Also seeing these errors:

    ERROR: Executed command exits with return code '7'

    NPCD[1300]: ERROR: Command line was '/usr/lib/pnp4nagios/libexec/process_perfdata.pl -n -b /var/spool/icinga2/perfdata/service-perfdata

  • it seems there are error messages similar to "exited with error code 128"

    Are these errors limited to the call of a single plugin, a type of plugins (e.g. SNMP), or "general"?

    NPCD[1300]: ERROR: Command line was '/usr/lib/pnp4nagios/libexec/process_perfdata.pl -n -b /var/spool/icinga2/perfdata/service-perfdata

    Have you tried to execute this call on the command line (as Icinga2 user)?

  • General. We don't use any SNMP plugins.

    My main question is why the host keeps getting overloaded after a few days. It has enough processing power, memory and disk, and syslog doesn't show any errors.

  • I'd suggest upgrading to 2.6.3 which fixed a couple of stability bugs. Other than that, I'd analyse the load in performance graphs, htop and so on.

    More hints here: https://docs.icinga.com/icinga…oting-analyze-environment

  • I vaguely recall seeing this. Do all of your PNP graphs look OK? I had big holes appear in mine, and then it seemed to work again.

    It was an ownership issue. I now have this line in my installer:

    chown -R icinga:icinga /var/lib/pnp4nagios
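
Before (or after) running that chown, it may help to see which files are actually affected. The following is a minimal sketch, not something taken from the thread: the function name `list_not_owned_by` is made up for illustration, and the `icinga` account is simply the one used in the chown line above, so substitute whichever account your Icinga 2 and NPCD processes really run as.

```shell
#!/bin/sh
# Hypothetical helper (not part of Icinga 2 or PNP4Nagios):
# print every file or directory under DIR that is NOT owned by OWNER.
# OWNER may be a user name or a numeric user ID.
list_not_owned_by() {
    dir=$1
    owner=$2
    # "! -user" negates find's ownership test, so only foreign-owned
    # entries are printed.
    find "$dir" ! -user "$owner" -print
}

# Example call (path and user are the ones from the post above):
#   list_not_owned_by /var/lib/pnp4nagios icinga
```

If the example call prints nothing, everything under the directory already belongs to that user; any paths it does print are candidates for the ownership problem described above.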