Hi, we currently have Icinga2 with Icingaweb2 / Director running on a set of Docker containers:
- icinga node (includes webserver)
- postgres node (IDO)
- graphite node (graphs)
We recently stood up a dev instance to prepare for an HA deployment. We considered using our in-house VMware VPLEX HA, but that mainly provides host-level HA, not application-level HA: if the application itself goes down, it doesn't help. After reviewing https://www.icinga.com/docs/icinga2/latest/doc/06-distributed-monitoring/#high-availability-master-with-clients, I have a few architectural questions before I devote time to attacking HA.
- Icinga2 HA running with Director and Icingaweb2
- Postgres IDO backend
- No clients are necessary, unless that is good practice
- Should each icinga master be on a different cluster for safer HA failover?
- Should each postgres node (if HA desired) be on a different server?
- Do we need, or should we have, a postgres HA failover solution like http://clusterlabs.github.io/PAF ?
- If master-1 is the "config master," is this really true HA? My understanding is that Director pushes config to the config master, which then replicates it to master-2
- Will we need a proxy in front of the masters like NGINX to load-balance/failover to the right master in the event of failure? How is this typically done?
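For context, here is a rough sketch of how I imagine the two-master zone configuration would look, based on the linked doc. The hostnames are placeholders, and I may well have details wrong:

```
// zones.conf, identical on both masters (hostnames are placeholders)
object Endpoint "master-1.example.com" {
  host = "master-1.example.com"
}

object Endpoint "master-2.example.com" {
  host = "master-2.example.com"
}

// Both masters share one zone; Icinga 2 then distributes checks
// between them and fails over if one endpoint goes down.
object Zone "master" {
  endpoints = [ "master-1.example.com", "master-2.example.com" ]
}
```

If that is roughly right, my question above is whether master-1 being the Director deployment target undermines the "both masters are equal" picture.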
I am thinking I should stand up Icinga on two different hosts and "tie them together." I admit I need to study the linked doc more, but I wanted to conceptually wrap my head around the high-level requirements first. Our other thought was to leverage VMware VIC (vendor Kubernetes) for some kind of HA, but we are not at that stage yet (testing first with a Jenkins node).
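On the proxy question above, the simplest thing I can picture is an nginx instance in front of Icingaweb2 on both masters, with the second master as a passive backup. Again, hostnames and ports are placeholders and this is only a sketch of the idea, not a tested config:

```
# Hypothetical nginx config fronting Icingaweb2 on both masters.
# master-2 only receives traffic if master-1 stops responding.
upstream icingaweb {
    server master-1.example.com:443;
    server master-2.example.com:443 backup;
}

server {
    listen 443 ssl;
    # ssl_certificate / ssl_certificate_key omitted here

    location / {
        proxy_pass https://icingaweb;
    }
}
```

Is this the usual approach, or do people typically use keepalived/a floating IP instead of a full reverse proxy?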