So this is a weird one.
I’m in the process of automating satellite deployment with Ansible. So far my playbook does everything right, but I’ve run into a weird issue and I’m at a loss.
I provisioned a satellite with the playbook. Icinga2 reloads successfully on both the satellite and the master, the connection direction is set to master > satellite only, and both nodes run the same versions of Icinga2 and Debian. I can see a successful TLS handshake and zone sync in the logs, and I can even verify with netstat that the satellite is 100% connected to the master. I’ve even gone as far as logging into mysql and making sure data for it exists in IDO.
But, for some reason, the host disappears from icingaweb2. If I restart Icinga2 on either node, it usually comes back for 5-15 minutes, then disappears again. While it’s present I can view its services, inspect it, everything you would with a normal host. Then it suddenly disappears.
Occasionally it comes back on its own. Then disappears again.
There doesn’t seem to be anything in icinga2.log indicating concurrent disconnections, db errors, or anything else. Icingaweb2 was using the syslog user facility and set to Warning and didn’t reveal any errors, either. I’ve since set it to log Information instead of Warning in the hopes that will reveal something…
I spent about 1.5 hours googling and haven’t found anything similar so far. I’m hoping someone here has run into this before (and fixed it)?
After a little more digging, I found that in the icinga_objects table of the IDO, is_active is set to 0 for this host during the times it’s not displaying in the web ui. So the question becomes, what’s setting it to 0?
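To pin down exactly when the flag flips, something like the rough sketch below could be run on a timer and its output lined up against icinga2.log. It is only an illustration under a few assumptions: the column names (object_id, instance_id, objecttype_id, name1) follow the standard IDO schema, objecttype_id 1 means "host", and the helper names host_rows and transitions are made up for this post. It uses Python's built-in sqlite3 placeholder style ("?") so it can be tried against a stand-in table; a MySQL driver such as PyMySQL expects "%s" instead.

```python
import sqlite3

# Look up the host's row in the IDO table mentioned above.
# Assumption: objecttype_id = 1 is the host object type and name1 holds the
# host name, as in the standard IDO schema.
QUERY = (
    "SELECT object_id, instance_id, name1, is_active "
    "FROM icinga_objects "
    "WHERE objecttype_id = 1 AND name1 = ?"
)


def host_rows(conn, host_name):
    """Return (object_id, instance_id, name1, is_active) rows for one host."""
    cur = conn.cursor()
    cur.execute(QUERY, (host_name,))
    return cur.fetchall()


def transitions(samples):
    """Given [(timestamp, is_active), ...] in time order, return every point
    where the flag changed, as (timestamp, old_value, new_value)."""
    changes = []
    previous = None
    for timestamp, active in samples:
        if previous is not None and active != previous:
            changes.append((timestamp, previous, active))
        previous = active
    return changes
```

Sampling host_rows once a minute and feeding the collected (timestamp, is_active) pairs to transitions gives a list of flip times to compare with reloads, config syncs, or reconnects in the logs. If more than one row comes back for the host, or the instance_id differs from what the master uses, that would be worth noting too.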