Execution command in pending state - Icinga2 HA

  • Hi all,
    I have a problem with some checks in my high-availability environment (Icinga version 2.6.0).
    I have 2 hosts in a master zone and 3 satellite zones (sat1: 2 hosts, sat2: 6 hosts, sat3: 2 hosts).
    Currently I have 72 hosts and 129 services, and everything is OK. The zone configurations are synced and all commands are executed correctly.
    My problems are in the master zone. For security reasons I have disabled accept_config and accept_commands on the master that holds the configuration in zones.d. So, for the local checks (load, users, etc.), I created a services.conf file in conf.d and put the two services there to run locally:

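    object Service "load" {
      host_name = NodeName
      // check_command and other attributes not shown in the post
    }

    apply Service "users" {
      // check_command and other attributes not shown in the post
      assign where host.name == NodeName
    }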

    But they are never executed (they stay either pending or late). There is nothing relevant in icinga.log. With the debug log enabled I can see that master1 relays the forced execution to master2 (when I force a check via the web interface), but on master2 I see nothing. The features are correctly enabled and everything works fine except for these two services.
    For testing purposes I added 2 other services in conf.d (disk and icinga), and these services work as expected.

    Two strange things:
    1) if I stop the second master, everything works fine, but when I restart the service the checks remain in pending state (late execution);
    2) if I rename load to loads (or any other name) and users to user, everything works fine again.

    This is very strange behaviour. It seems that the cluster protocol assigns the two services to the second master, which cannot execute the checks because the service definitions are private to master1 (in conf.d). Indeed, if I move services.conf into the master directory under zones.d, the check source is master2, and I need to enable accept_commands on master1 and set command_endpoint to force execution on master1.

    I have also repeatedly deleted all files in /var/cache/icinga, /var/lib/icinga/api/repository, /var/lib/icinga/api/zone.d and /var/lib/icinga/icinga.state, and dropped and recreated the IDO database, but I can't resolve this.
    Is there a way to purge all cached cluster information?
    Has anyone else seen this strange behaviour?
    Thanks for your attention.
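
    The zones.d variant described in this post could look roughly like the sketch below. It is only an illustration: the file path, the use of the ITL "load" check command and the pinning to an endpoint named master1 are assumptions, not taken from the poster's actual files. Note that command_endpoint only works if the target endpoint has accept_commands enabled.

    ```
    // Assumed location: /etc/icinga2/zones.d/master/services.conf
    // Assumes a Host object named "master1" is defined in the same zone.
    apply Service "load" {
      check_command = "load"          // ITL check command (assumed)
      command_endpoint = "master1"    // run the check on master1 itself
      assign where host.name == "master1"
    }
    ```

    A literal host name is used here instead of NodeName because NodeName evaluates to a different value on each node that loads the synced file.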

  • Putting the config into "conf.d" won't replicate it to the secondary node. Why don't you use zones.d/<masterzone> for those host/service objects?

  • Hi, sorry for the late reply.
    I know that the config in "conf.d" isn't replicated, but if I put these host/service objects in zones.d/master, I have to enable "accept_commands" in the API configuration on all masters, and (if possible) I want to avoid this.
    Do you know a different solution?
    Thanks very much for your reply.

  • Why would you want to avoid that? Without that setting enabled you won't get comments and downtimes replicated amongst the nodes either. Enable it and move the config objects file into zones.d/master/
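
    A minimal sketch of what enabling that setting might look like, assuming the default feature file location and leaving every other ApiListener attribute as it already is on the node:

    ```
    // Assumed location: /etc/icinga2/features-available/api.conf (on each master)
    object ApiListener "api" {
      // ... existing attributes (certificate paths, accept_config, etc.) stay unchanged ...
      accept_commands = true   // allow this node to execute commands sent by cluster peers
    }
    ```

    After the change, the config objects file would go under zones.d/master/ on the master that holds the configuration, and Icinga 2 needs a restart for the setting to take effect.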

  • OK, thanks for your reply. In your opinion, is there no security issue in enabling "accept_commands" in the API configuration on the masters?

  • No, since all nodes in the same zone should trust each other. That setting is merely in place for clients, who might just deny that by default.
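
    For illustration, "nodes in the same zone" refers to endpoints listed together in one Zone object. A sketch of such a definition, using the master1/master2 names from the thread and assuming the zone is called "master" (connection attributes such as host are omitted):

    ```
    // Assumed location: /etc/icinga2/zones.conf (identical on both masters)
    object Endpoint "master1" {
    }

    object Endpoint "master2" {
    }

    object Zone "master" {
      endpoints = [ "master1", "master2" ]   // both masters share one zone
    }
    ```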