Posts by branvan2k

This forum was archived to /woltlab and is now in read-only mode.

    Hi all,


    We just upgraded one of our IcingaWeb2 sites (icingaweb2-2.4.1-1.el6) and now we don't see the Notes in the Problem handling section like we used to...

    The old version of IcingaWeb2 is:

    Git commit ID: 9b14fffc338b27c66fc480458686841861328b20

    Git commit date: 2015-06-18


    We use Notes to point to an xml file, with the Instructions for that particular service / host.


    Until now it was defined like this:

    Code
    notes = "<a title='SERVICE ["+name+"] NODE ["+host.name+"] ENVIRONMENT ["+vars.service_environment+"] SUPPORT ["+vars.service_supportgrp+"]' target='_blank' href='../../../instructions/php/services.php?servicename="+name+"&nodename="+host.name+"&environmentstatus="+vars.service_environment+"&supportgrp="+vars.service_supportgrp+"'>INSTRUCTIONS</a>"


    inside /etc/icinga2/zones.d/global in the generic-host definition.


    But now notes is no longer interpreted as a URL...

    Any advice?
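    For the record, this is roughly what a notes_url based definition could look like (just a sketch; I don't know yet whether it restores the old behaviour):

    Code
    notes_url = "../../../instructions/php/services.php?servicename=" + name + "&nodename=" + host.name + "&environmentstatus=" + vars.service_environment + "&supportgrp=" + vars.service_supportgrp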


    I attach a couple of pics.


    thanks a lot!

    Hi dnsmichi,


    In my case, I still haven't created the CheckCommand; I was just playing with the script.

    And with some help I have been able to figure out what was going on :D


    The problem was that when using the --config option, even though I was passing --type=virtual on the command line, the script only takes into account what's inside the cfg file. So I had to put type => 'virtual' INSIDE the cfg file.


    It goes like this:

    Code
    @searches = ( { tag => 'test', type => 'virtual', logfile => '/var/log/messages', criticalpatterns => ['whateverpattern'] } );
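    For completeness, this is a sketch of what the whole cfg file can look like (the $seekfilesdir value is just an example):

    Code
    # /opt/nagios/conf/error_kernel_patterns.cfg
    $seekfilesdir = '/tmp';    # where check_logfiles keeps its seek/offset files
    @searches = (
        {
            tag              => 'test',
            type             => 'virtual',    # rescan the whole file on every run
            logfile          => '/var/log/messages',
            criticalpatterns => [ 'whateverpattern' ],
        },
    );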

    @friesoft give it a try...


    Regards.

    I'm stuck with the same thing :(


    I've created a very simple config file with one word that I know is in /var/log/messages.


    If I do:

    Code
    ./check_logfiles -d --logfile /var/log/messages --config /opt/nagios/conf/error_kernel_patterns.cfg --type=virtual
    OK - no errors or warnings|test_lines=0 test_warnings=0 test_criticals=0 test_unknowns=0

    but if I do:

    Code
    ./check_logfiles -d --logfile /var/log/messages --criticalpattern='EXIT' --type=virtual
    CRITICAL - (6 errors in check_logfiles.protocol-2017-04-04-16-03-12) - Apr 4 02:44:45 amdes01 xinetd[4678]: EXIT: vnetd status=0 pid=19013 duration=0(sec) ...|default_lines=31 default_warnings=0 default_criticals=6 default_unknowns=0


    I'll keep on searching...


    Regards,


    Alba.

    I've got a question related to this issue...

    So, all the comments that were waiting for the system to process them were queued for execution at:

    /var/lib/icinga2/api/packages/_api/nodename-xxxx/conf.d/comments/


    Why is Icinga2 putting all of them into a single query instead of several small queries?

    Don't you think that would work better?

    Are you planning on changing this behaviour?


    thanks in advance!

    is that a central mysql database shared via VIP to those 2 masters?

    Yes it is.


    mgmt01 - master of configurations

    mgmt04 - master of checks


    mgmtbd01 - database


    mgmt01 and mgmt04 have the same config at /etc/icinga2/features-enabled/ido-mysql.conf
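    Something along these lines (a sketch with example values, not our real credentials):

    Code
    object IdoMysqlConnection "ido-mysql" {
        host = "mgmtbd01"    // the shared database node behind the VIP
        user = "icinga"
        password = "xxxxx"
        database = "icinga"
    }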



    Hi there,


    We are having a problem in our icinga2 platform.

    We have 2 masters (ha-zone) and 4 satellites (satellites, parent:ha-zone).

    For the last 4 days, whenever we do a reload, the master that is normally the active endpoint talking to the database can't synchronize correctly, and we have to stop the whole platform, restart... a little bit of everything, and after a while it would synchronize again. Until today.


    Right now we can't have both masters running: the one chosen to be the active endpoint no longer connects to the database, and the other one never connects because it is not the active endpoint, so the platform is not synchronized. If we stop the icinga2 service on the master we think has the problem, then the other master takes control and everything goes fine... except we don't have HA!!!


    Whatever...


    In /var/log/icinga2/icinga2.log we have seen this critical message:


    [2017-02-16 12:37:57 +0100] critical/IdoMysqlConnection: Error "Commands out of sync; you can't run this command now" when executing query " INCREDIBLE BIG QUERY JOIN ";


    That message is not in the other master (the one that works), so I guess that's the problem.


    It has this huge query stored somewhere, and every time the service starts it tries to execute it, but it is so big that it ends up with "Commands out of sync; you can't run this command now".

    Where does this query come from?

    Is there any way to get rid of it?


    I appreciate your help....

    Really?


    I thought that if you have this CheckCommand definition:

    You can set up the Service passing snmp_arguments...

    Nsup,


    In your first error, icinga2 tells you where the problem is:

    Location: in /etc/icinga2/conf.d/hosts.conf: 64:25-64:28

    So line 64 of /etc/icinga2/conf.d/hosts.conf

    You have put two lines in one:

    Code
    check_command = "snmp" vars.snmp_arguments = [ vars.snmp_community, vars.snmp_oid ]

    It should be:

    Code
    check_command = "snmp"
    vars.snmp_arguments = [ vars.snmp_community, vars.snmp_oid ]


    BUT I think you still need to pass the address in snmp_arguments:

    snmp_address - Optional. The host's address. Defaults to "$address$" if the host's address attribute is set, "$address6$" otherwise.

    vars.snmp_arguments = [ address, vars.snmp_community, vars.snmp_oid ]

    Is this right @dnsmichi?




    In the second error:

    critical/config: Error: Validation failed for object '3-ext_800!fan' of type 'Service'; Attribute 'check_command': Object 'check_snmp' of type 'CheckCommand' does not exist.


    It tells you there is no CheckCommand with the name "check_snmp".


    What's the name of the CheckCommand for snmp in /usr/share/icinga2/include/command-plugins.conf?

    You say:

    I have an object CheckCommand "snmp" {…} in

    So the real name of the CheckCommand is snmp, right?


    I think this should work:
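    Something like this (a sketch based on the lines above; adjust the variable names to yours):

    Code
    check_command = "snmp"
    vars.snmp_arguments = [ address, vars.snmp_community, vars.snmp_oid ]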



    OK, I have erased all downtimes and notifications; not a big deal on our platform.


    Then, after an icinga2 reload on the master, two of the satellites are still Not running.

    When I do checkconfig on these satellites, I still see downtimes...


    information/ConfigItem: Instantiated 1296 Downtimes.



    How can I tell them not to use Downtimes?

    The files from master have already been passed to these satellites:


    Code
    ls /var/lib/icinga2/api/zones/global/_etc/
    drwxr-xr-x 3 icinga icinga 4096 Jan 17 11:59 .
    drwxrwxr-x 5 icinga icinga 4096 Jan 16 14:25 ..
    -rw-r--r-- 1 icinga icinga  589 Oct 18 12:56 ALL_NODES_COMMANDS.conf
    -rw-r--r-- 1 icinga icinga 1913 Jan 17 11:59 ALL_NODES_DOWNTIMES.confBK
    -rw-r--r-- 1 icinga icinga 2171 Oct 18 12:56 ALL_NODES_HOSTGROUPS.conf
    -rw-r--r-- 1 icinga icinga 1585 Oct 18 12:56 ALL_NODES_HOSTS.conf
    -rw-r--r-- 1 icinga icinga 2193 Jan 17 11:59 ALL_NODES_NOTIFICATIONS.confBK
    -rw-r--r-- 1 icinga icinga 4393 Oct 18 12:56 ALL_NODES_SERVICES.conf
    -rw-r--r-- 1 icinga icinga  672 Oct 18 12:56 ALL_NODES_TIMEPERIODS.conf
    -rw-r--r-- 1 icinga icinga  330 Oct 18 12:56 ALL_SERVICES_SERVICEGROUPS.conf
    -rw-r--r-- 1 icinga icinga 2189 Oct 25 09:35 SUPPORT_GROUPS.conf





    Hi Nsup,


    Take a look at whether you already have the snmp CheckCommand defined in:

    /usr/share/icinga2/include/command-plugins.conf (maybe other path...)

    I guess you do...


    Following the CheckCommand definition, just put this inside the Service in the host config file:


    Code
    apply Service "FAN" {
        import "generic-service"

        vars.snmp_address = "x.x.10.10"
        vars.snmp_oid = ".1.3.6.1.2.1.1.2.0"
        vars.snmp_community = "monitor"

        check_command = "snmp"
        vars.snmp_arguments = [ vars.snmp_address, vars.snmp_community, vars.snmp_oid ]

        import "service-generic-instructions"

        assign where host.name == "1-ext_800"
    }


    Let me know if it works.



    Thanks dnsmichi.

    Pfff, I don't know what the problem is :(


    If we copy /etc/icinga2/zones.d/satellites/NRPE/NODES/DEV/host_t05app02.conf

    to sat1:/var/lib/icinga2/api/zones/satellites/_etc/NRPE/NODES/DEV/host_t05app02.conf


    and then service icinga2 reload (on sat1)

    all works fine!!!


    but if I do service icinga2 reload (on master) all satellites end up - Not running


    We also have a huge amount of segfaults lately :(


    icinga2[1723]: segfault at 28 ip 0000003167b4de42 sp 00007faf45b53c20 error 4 in libbase.so[3167a00000+1df000]

    icinga2[2765]: segfault at 28 ip 0000003167b4de42 sp 00007fc037ae3c20 error 4 in libbase.so[3167a00000+1df000]

    icinga2[3550]: segfault at 28 ip 0000003167b4de42 sp 00007f161c0f0c20 error 4 in libbase.so[3167a00000+1df000]

    icinga2[5522]: segfault at 28 ip 0000003167b4de42 sp 00007f2219bf4c20 error 4 in libbase.so[3167a00000+1df000]

    icinga2[5561]: segfault at 28 ip 0000003167b4de42 sp 00007fdf44501c20 error 4 in libbase.so[3167a00000+1df000]

    icinga2[6712]: segfault at 28 ip 0000003167b4de42 sp 00007fd99dc0ec20 error 4 in libbase.so[3167a00000+1df000]

    icinga2[8515]: segfault at 28 ip 0000003167b4de42 sp 00007f291995bc20 error 4 in libbase.so[3167a00000+1df000]

    icinga2[12349]: segfault at 28 ip 0000003167b4de42 sp 00002b2ab01fec20 error 4 in libbase.so[3167a00000+1df000]

    All the satellites stay - Not Running - after adding new hosts and reloading.


    I found these lines when running checkconfig on a satellite:


    [2017-01-16 14:51:04 +0100] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/global//.timestamp

    [2017-01-16 14:51:04 +0100] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/global//_etc/.ALL_NODES_DOWNTIMES.conf.swp

    [2017-01-16 14:51:04 +0100] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/satellites//.timestamp

    [2017-01-16 14:51:04 +0100] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/satellites//_etc/NRPE/NODES/DEV/host_t05app01.conf

    [2017-01-16 14:51:04 +0100] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/satellites//_etc/NRPE/NODES/DEV/host_t05app02.conf

    [2017-01-16 14:51:04 +0100] information/ApiListener: Restarting after configuration change.

    [2017-01-16 14:51:04 +0100] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/satellites//.timestamp

    [2017-01-16 14:51:04 +0100] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/satellites//_etc/NRPE/NODES/DEV/host_t05app01.conf

    [2017-01-16 14:51:04 +0100] information/ApiListener: Updating configuration file: /var/lib/icinga2/api/zones/satellites//_etc/NRPE/NODES/DEV/host_t05app02.conf


    I see that the new hosts are passed via the ApiListener to /var/lib/icinga2/api/zones/satellites//_etc/NRPE/blablabla

    when they have always been stored at /var/lib/icinga2/api/zones/satellites/_etc/NRPE/blablabla


    Has anyone experienced something similar???


    Why do we have a double slash???

    :(

    Hey, just a short note to tell you that today we have successfully upgraded our icinga2 platform!!!
    It has gone from 2.3.8 to 2.5.4.
    I could not be happier!


    Thanks a lot for all your hints and comments, guys!

    Hi all,


    I just finished translating chapter 6 of the Icinga2 documentation, Distributed Monitoring with Master, Satellites, and Clients, from English to Spanish.
    If anyone is interested I can send them a copy ;)


    Even though it reads easily, translating it into my native language has been great for a better understanding :thumbup: .


    Regards,


    Alba.

    Thank you guys!


    So your vote goes for shutting down the whole platform, doing the upgrades on every node, and then starting them again, right?
    We will be without monitoring alerts during the upgrade.
    Is that what the rest of you do?

    Hi all,


    Our Icinga2 monitoring platform is running v2.3.8.
    We want to do the upgrade to 2.5 as soon as possible, so I need to plan it with care.


    It's my first time doing this, so any advice is very welcome.


    Our setup consists of:
    2 masters belonging to the 'config-ha-master' zone
    4 satellites belonging to the 'satellites' zone, whose parent is 'config-ha-master'
    1 database that belongs to the 'mgmtbd01' zone


    I would like to keep the platform operational during the upgrade.
    Do you think this is possible?


    Steps:


    Previous - snapshot of every node.


    1. Stop Icinga2 on checks master node
    2. Upgrade Icinga2
    3. Start Icinga2 service on checks master node


    At this point, the checks master node will complain about the database schema so...
    http://docs.icinga.org/icinga2…hapter/upgrading-icinga-2


    4. Restart the Icinga2 service on the checks master node and hope all works fine!


    5. Stop Icinga2 on config master node
    6. Upgrade Icinga2
    7. Start Icinga2 service on config master node


    Do the same for the satellites, one by one.
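    On each node, the steps would boil down to something like this (a sketch assuming RPM-based installs and the standard IDO schema path; the exact .sql upgrade files between your two versions are listed in the upgrade docs):

    Code
    service icinga2 stop
    yum upgrade icinga2 icinga2-ido-mysql
    # only on the node that talks to the database, apply the schema upgrades in order:
    mysql -u icinga -p icinga < /usr/share/icinga2-ido-mysql/schema/upgrade/2.4.0.sql
    mysql -u icinga -p icinga < /usr/share/icinga2-ido-mysql/schema/upgrade/2.5.0.sql
    service icinga2 start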


    What do you think?


    Thanks a lot.