Check_vmware_api slow when run multiple times

vmware
icinga2
(Finn Meinen) #1

Hi,

We are currently in the process of migrating to a new server with a fresh Icinga2 version, jumping from Ubuntu 14.04 to 16.04.
We used to monitor our fairly large ESX environment with check_vmware_api.

On the new server I installed vmware-vcli version 6.5 and all required Perl modules from CPAN.

While the check is working properly when run only once, it gets absolutely stuck when run for all our ESX Hosts.
I narrowed this down to some Perl module now using /dev/random instead of /dev/urandom. /dev/random needs entropy to generate random data, and there is not enough of it for 80 checks.

Which module is responsible for this change, and what version of it do I need to install manually to mitigate the issue?

Many thanks in advance,
Finn

(Michael Friedrich) #2

Which Perl version are you using? Can you also list the installed Perl modules fetched via vmware-vcli?

There are known issues with libwww-perl in newer versions where HTTP calls against the VMware API are tremendously slow. This is described here and might also be the case in your setup.

(Finn Meinen) #3

The new server is running Perl 5.22.1.
The problem definitely comes from some module reading /dev/random instead of /dev/urandom.
When you run the check under strace, you can see it get stuck at "/dev/random" on the new server until enough entropy is available, which you can monitor in "/proc/sys/kernel/random/entropy_avail".
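A minimal sketch for watching that counter while the checks run, assuming a Linux host. The function names and the sampling defaults are made up for illustration; only the /proc path comes from the post above.

```python
import time
from pathlib import Path
from typing import List

# Kernel counter mentioned above; this default path only exists on Linux.
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")


def read_entropy(path: Path = ENTROPY_PATH) -> int:
    """Return the kernel's current entropy estimate in bits."""
    return int(path.read_text().strip())


def watch_entropy(samples: int = 5, interval: float = 1.0,
                  path: Path = ENTROPY_PATH) -> List[int]:
    """Collect a few readings so a drop during parallel checks becomes visible."""
    readings = []
    for _ in range(samples):
        readings.append(read_entropy(path))
        time.sleep(interval)
    return readings
```

Calling watch_entropy() in one terminal while several checks run in another should show the value dropping if the pool really is being drained.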

To replicate this, you have to run the check several times in parallel. Three instances are already enough to see a clear delay because of missing entropy. The more often the check is run, the bigger the delay becomes.
When the check is run only once, it finishes within a second. So I do not think that this is an issue with the HTTP call.
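One way to reproduce this in a controlled manner is to start several copies of the same command at once and compare the durations with a single run. This is only a sketch: timed_run and run_in_parallel are hypothetical helper names, and the command to pass in is whatever check_vmware_api command line you normally use (it is not shown in this thread).

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def timed_run(command: Sequence[str]) -> float:
    """Run one command and return its wall-clock duration in seconds."""
    start = time.monotonic()
    # Output is discarded; only the elapsed time matters here.
    subprocess.run(command, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    return time.monotonic() - start


def run_in_parallel(command: Sequence[str], instances: int) -> List[float]:
    """Start `instances` copies of the command at once and return each duration."""
    with ThreadPoolExecutor(max_workers=instances) as pool:
        return list(pool.map(timed_run, [command] * instances))
```

If the durations for three instances are much larger than for one instance, that matches the behaviour described above.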

On the old server, the exact same script is used and the only difference is the Perl libraries. There, strace shows the script opening "/dev/urandom" instead of "/dev/random", and /dev/urandom does not require a minimum amount of entropy.

strace of old server:
open("/dev/urandom", O_RDONLY) = 4
read(4, "\n\224\344\36", 4) = 4
open("/dev/urandom", O_RDONLY) = 4
read(4, "\232\202\222]x\302\355\325\326'\367\2152L\205\211H\264\255\256\24\f\327u:\365]\235;F7\212"..., 64) = 64

strace on new server:
open("/dev/random", O_RDONLY) = 3
read(3, "\234\305\346 e\350\356(\n|uvFg3\317R\24\307\342\350\357\263\367\324\r\220D\230\213\24\373", 32) = 32
open("/dev/urandom", O_RDONLY|O_NOCTTY|O_NONBLOCK) = 3
read(3, "$\200\226\276~\225\331\33a\317\232\300\202\21p\201\341\331\275M\23\235\205\262?\247\222\7\243\2\177\346", 32) = 32
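To compare longer traces from both servers, the open() calls on the two devices can be counted automatically. A rough sketch, assuming the strace output has been saved as text; count_random_opens is a made-up helper name and the regular expression only covers open() and openat() lines like the ones above.

```python
import re
from collections import Counter
from typing import Iterable

# Matches open()/openat() calls on the two random devices in strace output.
RANDOM_OPEN = re.compile(r'open(?:at)?\(.*?"(/dev/u?random)"')


def count_random_opens(lines: Iterable[str]) -> Counter:
    """Count how often each random device is opened in the given strace lines."""
    counts = Counter()
    for line in lines:
        match = RANDOM_OPEN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Any non-zero count for /dev/random marks a trace in which the process can block on missing entropy.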

I think the module reading /dev/random is Crypt::SSLeay, but I am not sure. Do you happen to know if a specific version of it is required or recommended?
Currently version 0.73 is installed on the new server, while version 0.58 is installed on the old server.

On cpan.org I can only get version 0.64, which I will try to install manually now.
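For comparing module versions between the two servers, something like the following could be run on each host. It is only a sketch: the helper names are invented, it assumes perl is on the PATH, and Crypt::SSLeay is just the module suspected in this thread, not a confirmed culprit.

```python
import subprocess
from typing import List, Optional


def perl_version_command(module: str) -> List[str]:
    """Build a perl one-liner that prints the installed version of `module`."""
    return ["perl", f"-M{module}", "-e", f"print {module}->VERSION"]


def installed_version(module: str) -> Optional[str]:
    """Return the module's version string, or None if perl cannot load it."""
    try:
        result = subprocess.run(perl_version_command(module),
                                capture_output=True, text=True, check=False)
    except FileNotFoundError:
        # perl itself is not installed or not on the PATH.
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None
```

Calling installed_version("Crypt::SSLeay") on both servers makes it easy to confirm the version difference mentioned above.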

(Rafael Voss) #4

You can solve this by

apt-get install haveged

http://issihosts.com/haveged/

(Finn Meinen) #5

Thank you!
That did the trick!

I had already installed rngd yesterday, but it only helped a bit.

(Michael Friedrich) #6

Thanks, took a note here.
