Icinga nodes stayed in pending status

(Didentifier) #1

Hello, I added 3 new nodes to my Icinga master, but they stayed in Pending mode.

I enabled debug mode and I can see that there is a heartbeat:

[2019-09-26 09:53:39 +0200] notice/JsonRpcConnection: Received 'event::Heartbeat' message from 'clearwax.co.uk'
[2019-09-26 09:53:39 +0200] notice/JsonRpcConnection: Received 'event::Heartbeat' message from 'gzone.com.cy'
I uploaded my zones.conf and zones.d/ directory; maybe it can help.

Also, I am new to the forum and fairly new to Icinga, so if you need more information to help me, please let me know exactly what I should check.

The zones.conf and zones.d are provisioned with Ansible, and the nodes were connected manually with the CLI command (this also worked), so the configuration is the same for every client; however, the new clients stayed in Pending.

One additional point: I think this happens when my manual configuration fails. Even once I manage to connect the nodes, they stay in Pending mode. Maybe it's something with the nodes' config? Is there a way to refresh the configs?

zones.zip (17.5 KB)

(Aflatto) #2

Hello and Welcome

It seems you have some misunderstanding of how zones work. Once you define a zone, any host you place in that zone's directory under zones.d is automatically assigned to that zone, so defining the zone again in your host is redundant.
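To illustrate (with hypothetical names and addresses, not taken from your zip), the master only needs the endpoint and zone declared once in zones.conf:

```
// zones.conf on the master -- hypothetical names/IPs for illustration
object Endpoint "agent1.example.com" {
  host = "192.0.2.10"
}

object Zone "agent1.example.com" {
  endpoints = [ "agent1.example.com" ]
  parent = "master"
}
```

Anything you then drop into zones.d/agent1.example.com/ on the master is synced to that zone automatically; you don't need to set the zone again inside the host object.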

The log output you pasted is just 2 heartbeat entries from 2 zone endpoints. It has no bearing on the issue you are asking about, other than confirming that the endpoints are communicating with the master.

Now, you seem to have defined your host (the satellite endpoint) as a monitored host within the same zone, and that creates a "loop" logic. Let me explain.

The endpoint is the satellite that is meant to execute checks in the zone and send the results to the master. But in the same config you told Icinga that the same machine is a host to be monitored from within that very zone, and that causes a problem: the satellite is asked to check itself.

If you want to check the host as a host, move the host definition (host.conf) to the master zone, and that way the master will perform the check on the satellite.
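In sketch form (the path, host name, and address here are assumptions for illustration, not from your zip): the host object lives under the master zone's directory, so the master schedules the check against the satellite.

```
// zones.d/master/agent1.conf on the master -- hypothetical example
object Host "agent1.example.com" {
  address       = "192.0.2.10"
  check_command = "hostalive"
}
```

Because this file sits in zones.d/master/, the check is executed by the master, not by the agent's own zone.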

Furthermore, the best forum to get help is the official Icinga one: https://community.icinga.com/

Good luck

(Didentifier) #3

Hello and thank you for the answer,

Let me explain the logic behind the setup and please correct me if I am wrong.

Firstly, I decided not to perform checks from the master. The reason is that every machine I am monitoring is a separate physical client (web hosting), so if at some point a client needs more than one machine and a private network, I will need to create satellites and zones. So I decided to start with zones from the beginning; since I am using Ansible, it's not much effort to create a separate zone for each client, and it doesn't affect performance (again, everything I say is what I have understood so far). As I understand it, a satellite can monitor itself and the nodes in its zone; however, having a satellite for a single machine doesn't really bring any benefit, it just costs more.

Now, in my setup, none of our clients really needs a zone yet, because each is just one machine on its own network (though this will change soon with a new customer). So for me it made sense that a host would be both a zone and a satellite monitoring itself, and I haven't really seen any issues so far. To be honest, I have been monitoring each and every one of them without much trouble for about 3 months.

So here is what happened today: I ran the exact same Ansible playbook that was supposed to install Icinga 2, connect the node to the master, put the scripts in the right places, and work out of the box. It didn't work, which was weird. After spending a lot of time looking for an answer, I read a comment pointing out that "icinga2 daemon -X" made obvious an issue that the commenter couldn't find in the logs. That way I saw that the Icinga master was doing something deprecated (I don't remember the exact warning message), so I said, OK, let's update Icinga.
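For anyone following along, the usual way to surface config problems like this is the built-in validation run (assuming a standard package install; exact paths may differ on your distro):

```
# Validate the configuration without starting the daemon;
# deprecation warnings and parse errors are printed to the console.
icinga2 daemon -C

# Check the installed version on master and agents to spot mismatches.
icinga2 --version
```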

Once the new version was installed, the services came back one by one.

It could be a coincidence, but does anyone know for sure whether running different versions affects the system?
Also, if someone thinks my setup is still not best practice, let me know. I just doubt that, since everything is working well and I don't see any performance or strategic issues.