Icinga2 setup - InfluxDB missing client metrics

  • Hey,


    I've just set up an Icinga2 monitoring cluster for testing purposes, with 3 VMs:


    1) Master Node (Endpoint) -> Master Zone

    2) Client1 Node (Endpoint) -> Client1 Zone

    3) Client2 Node (Endpoint) -> Client2 Zone


    I ran icinga2 node wizard on all nodes (1 master & 2 satellites) and also set up the Icinga2 web interface on the master node, using 2 MySQL databases.

    All nodes appear in the web interface looking good, zones are connected and hosts are running the default services (12 in number).


    I need to export the metrics from all 3 nodes to InfluxDB, so I enabled the corresponding feature on the master node only, with the needed configuration.

    In the InfluxDB web panel I can query the DB and get back metrics using InfluxQL.
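
    For reference, a minimal sketch of what such a feature configuration can look like on the master. The host, port and database values are assumed placeholders, not the actual settings of this setup:

    ```
    // /etc/icinga2/features-available/influxdb.conf
    // All values below are assumed placeholders.
    object InfluxdbWriter "influxdb" {
      host = "127.0.0.1"      // where InfluxDB listens
      port = 8086
      database = "icinga2"

      // One measurement per check command, tagged with host and service name.
      host_template = {
        measurement = "$host.check_command$"
        tags = {
          hostname = "$host.name$"
        }
      }
      service_template = {
        measurement = "$service.check_command$"
        tags = {
          hostname = "$host.name$"
          service = "$service.name$"
        }
      }
    }
    ```

    The feature is switched on with icinga2 feature enable influxdb, followed by a restart of Icinga2.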


    ISSUE 1


    The problem is that only metrics from the master node are written to InfluxDB.

    As I understand it, each client node sends its metrics to the master node, since my 2 client zones have the master zone as parent. So the master node should write all the metrics it receives.
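
    A minimal sketch of the zone hierarchy described here, as it might appear in zones.conf on the master. The endpoint names, zone names and addresses are assumed placeholders:

    ```
    // /etc/icinga2/zones.conf on the master (all names and addresses are placeholders)
    object Endpoint "master-node" {
    }

    object Zone "master" {
      endpoints = [ "master-node" ]
    }

    object Endpoint "client1-node" {
      host = "192.0.2.11"     // set only if the master should connect to the client
    }

    object Zone "client1" {
      endpoints = [ "client1-node" ]
      parent = "master"       // client1 is a child of the master zone
    }

    object Endpoint "client2-node" {
      host = "192.0.2.12"
    }

    object Zone "client2" {
      endpoints = [ "client2-node" ]
      parent = "master"
    }
    ```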

    What am I missing?

    I can't figure out whether it's an Icinga-related problem or an InfluxDB-related problem.


    ISSUE 2


    When querying InfluxDB, I get metrics with the wrong time value compared to my local time zone (e.g. the correct time is 20:00 and the metric shows 18:00 in the time column).

    Can I change that setting somehow?


    All 3 nodes have Ubuntu Server 16.04 freshly installed. If any config files are needed, I can share them for better insight into the setup.


    Thanks a lot for any help!


    Cheers,

    gozek




  • Can you please share which client mode you are using (top down or bottom up)? In addition, please share a sample service check definition, including the performance data output. That can be fetched via the REST API from the /v1/objects/services endpoint; last_check_result contains the performance data. Details on that are in the troubleshooting docs.


    In terms of the timezone, I honestly don't know. A quick Google led me to this thread, maybe that is the issue? http://stackoverflow.com/quest…grafana-timezone-mismatch

  • Thanks for the quick reply.


    I am using the default config, which is top down. I hope I have properly configured that.


    Here are sample service check definitions (the default ones) from client2-node, /etc/icinga2/conf.d/services.conf:



    Here is the output of the curl command against the endpoint you mentioned (the API call was made from the master node, in case this is important):


    (You don't have to review it all, it's huge. I can filter for specific services and re-post if you need something specific from the output.)

    curl.txt


    How can I verify that the top-down config is correct?

  • Hm, when you're defining the configuration on the client itself in conf.d/, this means that you're using the bottom up approach. Let me guess, you're calling "node update-config" on your master?

  • Well yes, I followed this tutorial for the initial setup:


    http://linoxide.com/ubuntu-how…-manage-services-icinga2/


    Does this mean I'm following the bottom-up config and that metrics from the clients are not aggregated on the master node? In that case, shall I enable the influxdb feature on the clients so they write to InfluxDB on their own?


    Ideally I want to switch to the top-down config with 1 master zone and 2 client zones, so that only the master interacts with InfluxDB.

    This means I need to configure everything on the master node.

    I haven't fully understood how the config sync is done in that case.


    (I also haven't figured out when I need to make changes in the /etc/icinga2/repository.d directory.)


    thanks for your time

  • I would recommend removing repository.d, as it belongs to the bottom up config method, which is deprecated.

    As you do not yet seem to fully understand how Icinga2 works with all its configs, I would advise you to use a test system, install the Director and work with it.

    The Director will help you build a solid top down config, and you will not have to write the config yourself, which seems to be something you struggle with.

    Linux is dead, long live Linux


    Remember to NEVER EVER use git repositories in a productive environment if you CAN NOT control them

  • Cool, I will check the Director for Icinga2.

    The repository.d directory was created by default after running the wizard, and that's confusing if it is deprecated.

    So I only need /conf.d and /zones.d for my setup?

  • Yes, you will only need those two, but I would suggest that you remove any hosts, services and commands from conf.d, as those may give you problems with the Director along the way.


    I would also advise installing every nagios-plugin package you can find; they will come in handy.


  • I guess you mean removing hosts.conf, services.conf and commands.conf inside /conf.d on all 3 nodes, right?

    So the Director can 'overwrite' those.

    And also delete /repository.d on all 3 nodes?


    thanks

  • Basically you can remove everything from conf.d. But you then need to recreate it completely via the Director, which is a one-time thing.


    I would advise to remove everything except the api-users.conf


  • I wrote some migration docs with tips for the deprecation in v2.6. A "deprecation" does not allow for removal though, which is why the repository.d config include is still there.


    In terms of that external howto - that is unfortunately outdated, and proposes the bottom up mode.


    I'd suggest to have a read on the official documentation instead.

    https://docs.icinga.com/icinga…er/distributed-monitoring

  • I followed the docs for setting up the top-down configuration and everything seems good, without using the Director module.

    All nodes appear in Icinga Web 2 and the master now writes all metrics of the defined checks/services (master-node, client1-node, client2-node) to InfluxDB.


    No /repository.d, no node update-config.


    One last thing I wish to clarify is the difference between the 2 proposed methods: Top-Down Config Sync and Top-Down Command Endpoint.

    In which case should each one be used?


    cheers :)

  • Command endpoint was invented for quick checks invoked by the master, which sends a check message to the client. The client executes the check and sends the result back to the master as a cluster message. There is no local host/service object config involved on the client; only the CheckCommand definitions are required there (e.g. via a global config sync zone).
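
    A minimal sketch of the command endpoint mode, with both objects defined on the master. The host name, address and the custom variable client_endpoint are assumed placeholders, and the client's endpoint is assumed to carry the same name as the host object:

    ```
    // Defined on the master, e.g. in /etc/icinga2/zones.d/<master zone>/
    // Host name, address and vars.client_endpoint are placeholders.
    object Host "client1-node" {
      check_command = "hostalive"
      address = "192.0.2.11"
      vars.client_endpoint = name   // endpoint name equals the host name here
    }

    apply Service "disk" {
      check_command = "disk"
      // The master schedules the check; the named endpoint executes it.
      command_endpoint = host.vars.client_endpoint
      assign where host.vars.client_endpoint
    }
    ```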


    On the other hand, it can be required that your client runs the checks independently of the master's scheduler. This is especially the case if the connection between them drops. The client (which probably is more of a satellite here) will continue to run, execute checks locally and store the results in the replay log. Once the connection is re-established, the client sends the check results back to the master. The master is then responsible for processing them, which allows for proper history (e.g. database backend or Graphite metrics).

    The config sync bit here means that you define the configuration objects on the master inside the client's zones.d directory, and the master takes care of syncing those configs to the client which happily receives them and does a reload without user interaction.
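
    A minimal sketch of the config sync mode. The zone name client2, the host name and the address are assumed placeholders; the files live on the master and are synced to the client:

    ```
    // On the master: /etc/icinga2/zones.d/client2/hosts.conf
    // Zone name, host name and address are placeholders.
    object Host "client2-node" {
      check_command = "hostalive"
      address = "192.0.2.12"
    }

    // On the master: /etc/icinga2/zones.d/client2/services.conf
    object Service "disk" {
      host_name = "client2-node"
      check_command = "disk"
    }

    // On the client: /etc/icinga2/features-enabled/api.conf
    // The client must be allowed to receive synced configuration.
    object ApiListener "api" {
      accept_config = true
    }
    ```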


    There are certain advantages and disadvantages between these two modes, those are described in their respective documentation chapters.

  • Hey, another question came up while getting my hands dirty with the previous config:


    I want to execute ping4 and ssh checks from the master node towards the 2 client nodes.

    Since we follow a top-down configuration, the definitions of these two services are on the master node.


    When defined like this, the checks didn't seem to get executed on the clients, and "Check now" in the web interface didn't work either:


    master-node: /etc/icinga2/conf.d/services.conf


    However, once I added the following line to that file

    Code
    zone = "icinga2-master1"

    everything worked like a charm (Icinga Web 2 updates as expected and metrics are written to InfluxDB).


    Exactly the same happened for ping4, and I guess it would for any other "remote check".


    What exactly does this line do and why is it needed?

    Is there another way to define the service, or maybe even omit the line?


    Thanks!

  • Could you show us your zone files? It sounds like they are not correct.


  • Can you show the files inside zones.d/ - e.g. with ls -lR /etc/icinga2/zones.d


    (Pinning certain checks to a specific zone is an advanced setting, but before explaining that I'd like to see your current structure).

  • Sure,



    Should I maybe create a folder for the master zone (icinga2-master1) under /zones.d/?

  • For consistency I would put the apply rule into zones.d/<master> but still pin it to the master zone itself. The thing is, once the apply rule gets evaluated, the service inherits the zone membership from the host. Here that is the client's zone, which is not what you want: the master doesn't feel responsible for a service in a child zone and doesn't trigger its checks. When you explicitly mark the service for the master zone, the master instance executes the checks itself, in its own zone.
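
    A minimal sketch of such a pinned apply rule, using the zone name icinga2-master1 from the post above. The file path and the assign expression are assumptions for illustration:

    ```
    // On the master: /etc/icinga2/zones.d/icinga2-master1/services.conf (assumed path)
    apply Service "ssh" {
      check_command = "ssh"

      // Without this line the service inherits the zone of its host,
      // i.e. the client's zone, and the master does not run the check.
      zone = "icinga2-master1"

      assign where host.address   // assumed filter: every host with an address
    }
    ```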


    "6.9.3. Top Down Config Sync" partially mentions that for the host object check. I haven't got an idea how to visually explain that for experienced users. Maybe you've got a documentation patch and want to help out by sending in a PR on Github.