I have an Icinga2 server that ran on Ubuntu 16.04 without any problems. I upgraded to 18.04 a few weeks ago, and since then my Perl check scripts often time out.
Icinga2 version: 2.11.1
My problem: several times per minute, the Perl SNMP checks (check_snmp_storage / int / process …) time out. The timeouts seem random.
An example of a debug command:
'/usr/lib/nagios/plugins/check_snmp_load.pl' '-C' 'xxxxxxx' '-H' '18.104.22.168' '-T' 'netsl' '-c' '20,19,18' '-f' '-t' '60' '-w' '10,9,8') terminated with exit code 128, output: <Terminated by signal 9 (Killed).>
The observation: when I run this script manually, I get the result in less than a second. If I ‘spam’ this command, it hangs about once in 50 runs, and after 60 seconds I get my timeout.
It’s extremely random, and it happens on different hosts.
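To put a number on the "once in 50" hang rate, the manual spamming could be scripted. The following is only a minimal sketch, not something taken from Icinga2 or the plugin: the function name `run_repeatedly` and the thresholds (`runs`, `timeout`, `slow_after`) are hypothetical choices for illustration. It runs a given command repeatedly, times each run, and marks runs that are slow or that hit the timeout.

```python
import subprocess
import sys
import time


def run_repeatedly(cmd, runs=50, timeout=60.0, slow_after=5.0):
    """Run cmd `runs` times and return (index, seconds, status) per run.

    status is "ok", "slow" (finished, but took longer than slow_after
    seconds) or "timeout" (killed after `timeout` seconds).
    All names and default values here are illustrative assumptions.
    """
    results = []
    for i in range(runs):
        start = time.monotonic()
        try:
            # subprocess.run kills the child and raises TimeoutExpired
            # if it does not finish within `timeout` seconds.
            subprocess.run(cmd, stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL, timeout=timeout)
            elapsed = time.monotonic() - start
            status = "slow" if elapsed > slow_after else "ok"
        except subprocess.TimeoutExpired:
            elapsed = time.monotonic() - start
            status = "timeout"
        results.append((i, elapsed, status))
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # The command to test is everything after the script name.
    outcome = run_repeatedly(sys.argv[1:])
    bad = [r for r in outcome if r[2] != "ok"]
    print(f"{len(bad)} of {len(outcome)} runs were slow or timed out")
    for index, seconds, status in bad:
        print(f"run {index}: {status} after {seconds:.1f}s")
```

Saved under a name of your choice, it would be called with the plugin command line as arguments, ideally as the same user that Icinga2 runs the checks as, so that the environment matches.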
I tried disabling some Icinga2 features (the API in particular, although I need it), without success.
I tried changing the SNMP version from 2c to 1, without success.
I tried increasing the interval between checks, without success.
I tried a lot of things…
It’s been 2 weeks and I have run out of ideas.
It seems to happen only with Perl scripts; all the other services are OK.
I have read and reread the logs of my upgrade from 16.04 to 18.04 to see what was modified or deleted, but I haven’t found anything convincing.
Any ideas?