Posts by watermelon

    This is strange because it was previously working after my upgrade, it has just recently shown this problem.


    I performed the same schema migration process, up to 2.6.0.sql, on two Icinga2 instances, following this guide and this forum thread. On one it seems to work fine, but on the other (the one I'm having problems with) it seems like it didn't quite work.


    I ran the script that sru posted in that thread and have this as output:


    From my understanding, the above shows the differences from a fresh 2.6.0.sql schema migration and my current schema migration. Is there any way of remediating this? I assume that the large amount of output above means something is wrong.


    Thanks!

    Hey all,


    I'm currently using Director v1.3.1 (recently updated) along with Icinga2 v2.6.2 (also recently updated) and Icingaweb2 v2.4.1.


    I'm having this weird error with Director in which whenever I click on one of my Host, Service, etc. definitions, I am met with the following:


    I can create and delete these definitions without hitches but it seems like whenever I try to view the definition, this error occurs. Does anybody know how to remedy this issue?


    Another smaller but possibly related problem is that the Director tab shows that I have 7 pending changes to deploy, yet there is actually nothing there to deploy. It looks like this:




    :/


    Thanks to anybody that can help me out!

    Fiiti


    From my understanding of your question, all you want to do is pass arguments into a command in Director?


    To pass in arguments for your check_mem command (or any command) in Director, all you have to do is edit the command under Icinga Director > Commands.


    I assume you already have the definition in place. If not, then it should be relatively easy to create one. If so, see the following picture:




    This picture shows the process that I mentioned above (going to Icinga Director > Commands). You can see that I have created a custom check command for checking drive size using the by_ssh command, which I have named check_drive_size-by_ssh, and that I have passed the arguments -C and -E into it under the Arguments tab on the right-hand side of the screen.


    Then, I can create a service to run under a host under Director using this command definition.
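
    For anyone who prefers to see this as plain Icinga2 configuration, here is a rough sketch of what such a command and service could look like. This is not what Director actually generated for me. The custom variable names, the remote command line, the plugin path and the assign rule are all made up for illustration, and I am assuming the standard check_by_ssh plugin, where -C is the remote command and -E skips STDERR output.

```
// Sketch only: variable names and values below are hypothetical.
object CheckCommand "check_drive_size-by_ssh" {
  import "plugin-check-command"

  command = [ PluginDir + "/check_by_ssh" ]

  arguments = {
    "-H" = "$address$"
    // -C: command to run on the remote host
    "-C" = "$drive_size_remote_command$"
    // -E: ignore STDERR; passed as a bare flag when the variable is true
    "-E" = {
      set_if = "$drive_size_skip_stderr$"
    }
  }
}

// A service that uses the command definition above.
apply Service "drive-size" {
  check_command = "check_drive_size-by_ssh"

  // Assumed plugin path and thresholds; adjust for your hosts.
  vars.drive_size_remote_command = "/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /"
  vars.drive_size_skip_stderr = true

  assign where host.vars.os == "Linux"
}
```

    In Director, the two entries under "arguments" correspond to the rows on the Arguments tab, and the vars correspond to the fields you fill in on the service.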


    If you have further questions or need more detail, please ask! This is a relatively high-level overview.

    mcktr


    I am not certain if putting up all my mail configurations would be useful because emails do send correctly, other than this problem. Did you want to see something specific?


    Currently, I have all types of notifications configured to be sent to me which includes the following:

    Problem, Acknowledgement, Recovery, Custom, FlappingStart, FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved


    I did add in the interval = 0 that you suggested, but it didn't seem to do anything. The notifications are still sending.
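
    To be clear about what I mean, this is roughly the shape of the notification I am describing. It is a minimal sketch, not my real configuration: the object name, user name, notification command and assign rule are placeholders.

```
// Sketch only: names and the assign rule are placeholders.
apply Notification "mail-all-types" to Service {
  command = "mail-service-notification"
  users = [ "watermelon" ]

  // Every notification type listed above.
  types = [ Problem, Acknowledgement, Recovery, Custom,
            FlappingStart, FlappingEnd,
            DowntimeStart, DowntimeEnd, DowntimeRemoved ]

  // 0 disables re-notifications; only the first Problem notification is sent.
  interval = 0

  assign where service.vars.notify_by_mail
}
```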


    --------------------


    dnsmichi


    Sorry for not being too specific. I thought that they might be related because they are both issues that are falsely sending emails out.


    Here are my issues:


    1. I am receiving email notifications for services that I have acknowledged using the "acknowledge" button through the Icinga2 web interface. These services have been acknowledged for about a month or so (with no notifications being sent out), but just recently it started to send out notifications again even though they are still "acknowledged".


    From the pictures, you can see that on the web interface it shows that the last notification was apparently sent on February 7th, but from my email you can see that I am still receiving email notifications about the same acknowledged service currently.


    2. An email notification reports that one of my Windows agents is not connected, but it actually is connected (as confirmed by the web interface).


    --


    Here's a list of things that I've done recently (in order):

    1. Updated Icinga2 to v2.6.1

    2. Installed NagVis and Livestatus (but have disabled Livestatus and am using IDO instead)

    3. Switched from perl script for sending mail to Postfix


    After updating Icinga2, I did not experience any issues with any database problems. Same with installing NagVis. I just wanted to use Postfix because it seems like what most people are using for sending out email.


    Update:


    So, all of this actually doesn't matter. It turns out my Icinga server had been cloned into another virtual machine as a CentOS 7 template. I thought I had turned off the Icinga2 service on the clone (service icinga2 stop), but I guess it was rebooted at some point (I was not working on this new VM). Since I had set up the original to start Icinga2 at boot (systemctl enable icinga2) because it was my main Icinga2 machine, the clone inherited that setting. After the reboot the Icinga2 service started again, so the clone was basically running as a second Icinga2 instance.


    Since the email notifications had been configured on this cloned machine as well, they were being sent from that machine. And since the notifications look identical (because the machine is a clone), I was extremely confused as to why I was still receiving them even though I had acknowledged the problems on the actual server. It turns out they were not actually acknowledged on the clone.


    This is an extremely rare case, but I guess it may happen to someone else at some point as well. Make sure you disable the icinga2 service if you happen to clone your Icinga2 server to function as something else!


    TL;DR:

    I cloned my Icinga2 server and the clone began to send me emails after being rebooted.


    Thanks for the help anyways mcktr and dnsmichi . I am ashamed.

    Hey all,


    I have a host node on my NagVis installation that is causing me grief.


    It reports the following values when I first add it in, which is correct (see the red underlined Host Name and Summary Output):



    Then, after a few moments, it changes into this (again, see the red underlined parts):





    This is very strange because not only does the host go into a critical state, it also magically jumps from 53 services to ~200 services.


    So, how is this possible? I am unsure how to go about this.


    Thanks for any help!

    pacofer


    Ah, I understand. Perhaps support for multiline output in the History tab will be a feature in the future. You should suggest it!


    MarcusCaepio


    I think pacofer's problem is not necessarily with the deployment of the definitions, but scaling the resources with it. 5000+ checks would probably require a decent amount more resources than he would like to use (which is why he mentioned the distributed monitoring with several satellites). I'm not sure that your solution would reduce the amount of the checks, it just aids in the deployment of the checks themselves (tell me if I'm wrong).

    Hey all,


    I'm running into a very peculiar problem right now. I am receiving email notifications when I'm not supposed to: services in the "unknown" state that I have acknowledged still send emails to me as if I had not acknowledged them. Additionally, I have an agent installed on one of my Windows machines that is reported as not connected, even though the web interface shows it's connected. See pictures below.


    Windows Agent showing not connected via email notification

    Sending notification for unknown port that I have already acknowledged

    Icinga2 Web Interface proving that I acknowledged this already a month ago and that the "last notification" was sent on that date (even though I just started receiving them again)


    Icinga2 web interface proving that the Windows agent is connected (yet, the email notification says otherwise)


    I've checked /var/log/icinga2/icinga2.log but didn't find anything of use. Where else should I check? It seems as if this problem started about 3/8/17 (today is 3/13/17). All I've done since then was install NagVis. Also, it seems to be that these emails send at 30 minute intervals (at xx:15 and xx:45) consistently. I have tried removing the acknowledgement and readding it, to no avail.


    Does anybody know what the problem could be?

    Adding hosts and zones is relatively trivial. Even if you have a large setup, you can use host templates to aid in the process of creating additional hosts. Creating zones is also pretty easy. Use an external text editor like Atom or Brackets to help with visualization.
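
    As a rough illustration of what I mean by templates and zones, here is a minimal sketch. All names and addresses are invented, and it assumes a parent zone called "master" is already defined elsewhere.

```
// Sketch only: names and addresses are invented.
template Host "generic-switch" {
  check_command = "hostalive"
  check_interval = 1m
  retry_interval = 30s
  max_check_attempts = 3
}

// Each additional host only needs its own name and address.
object Host "switch-01" {
  import "generic-switch"
  address = "192.0.2.11"
}

object Host "switch-02" {
  import "generic-switch"
  address = "192.0.2.12"
}

// A satellite zone with a single endpoint.
object Endpoint "satellite-01.example.com" {
  host = "192.0.2.20"
}

object Zone "satellite" {
  endpoints = [ "satellite-01.example.com" ]
  parent = "master" // assumes a "master" zone exists
}
```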

    I see, but I feel like in either case discussed above, the same amount of checks would have to be executed to check the same amount of ports. Or, perhaps there is a better way like you describe. Maybe dnsmichi might know?


    I'm currently on version 2.6.1 of Icinga2 and 2.4.1 of Icingaweb2.


    Here's a screenshot of the History tab:



    I think it also may depend on the plugin that you're using, but I am not sure.

    dnsmichi


    Ah, thank you for clarifying.


    Regarding the "multiple SNMP query requests" - I know that my plugin uses multiple OID queries, but I cannot even run a simple system description OID check (1.3.6.1.2.1.1.1) using snmpwalk.


    Regarding the "frequent SNMP polling" - what's strange to me is this: why would monitoring work for months and months, then all of a sudden stop and never work again? The SNMP checks I configured polled constantly (about every 60s) for MONTHS and never had any problems.


    Regarding the "OIDs not being properly implemented" - I analyzed the plugin and it seems like there are 15 OIDs being queried to gather the information needed. What could I do about this? Do I have to do anything about this? Should I switch to a different plugin? I am not sure what makes a MIB bad or not.


    So, what you're saying is that my hardware may not be good enough to handle SNMP requests and/or that the firmware may need to be upgraded? How can I tell whether or not it's the hardware/firmware that's acting up? The affected devices are Dell SonicWall NSA 2600, Dell PowerConnect, VRTX 1 GB Switch Module, Dell SonicWall NSA 4600, and Dell N4032.


    What would you do in this situation?


    P.S.

    It's funny that the Cisco Catalyst is not even affected by this ordeal.

    I think in either case you'll be monitoring the same amount of ports, so for you it's just a matter of presenting the information. If I were you, I would do it the way I have configured it in my environment, without any correlation logic, because there is an option to search through the configured services. For example, say I need to find out what's going on with port 0/48 on my Cisco Catalyst. I'll go to the "Services" tab and use the filter option to search for services named "Port 0/48" on hosts named "Cisco Catalyst".


    You can refine this search even further if you like, as I'm sure you might need to. An alternative would be to use the "Hosts" tab instead, search for "Cisco Catalyst", then go to the "Services" tab of the host, where I can find port 0/48 with relative ease, especially if it is flagged as critical.


    Furthermore (if you configure it like I have), you can present the information in a broader perspective with NagVis. It's a network mapping plugin that you can add into Icinga2 to visualize your network better than a text-based interface. Here's an example:




    You see here that the host is critical because one of its services is down. This could be the logic you're looking for with your switches. From there, you can create a link to your host in the Icinga2 web interface so that you can remediate easily. I just started using this plugin and so far it's very nice, apart from a few minor bugs.


    Let me know what you think.


    P.S.


    Regarding the multiline output -


    I'm not sure what plugin you're using to check your port interfaces, but I use "check_snmp_int.pl" and the output is usually only 1 line (interface name, up/down status, throughput %'s, and overall status of the port). Regardless, I have seen multiline output through the "History" tab with a different check that I use for check_wmi.

    I have already gone into those switch settings and verified that the SNMP settings that I configured a long time ago are still in place and should theoretically be working.

    The switch settings are the same, they have not been reset.


    The network devices are on the same subnet as the Icinga2 server, so there can be no firewall in the way of the SNMP checks.

    Hmm, I think you are going about this the wrong way.


    I guess I didn't realize you were using the zones.d directory for all of this. The zones.d directory is used for cluster zone configurations (definition from the README). I think that you're basically not supposed to touch it unless you're running cluster zones. I do not quite understand why you're not able to use this directory for your current configurations, but I assume that there are links to this directory in the API that make it this way. I was able to recreate your error in my environment, but I have a few workarounds for you.


    Option A would be to revert your icinga2.conf configuration to default and move your current folder structure from the zones.d directory into the conf.d directory. This way, you can still segregate the objects in your custom configuration. You can also remove all the .conf files in that conf.d directory if you don't need them.


    If you simply don't want to use the conf.d directory, Option B would be to create another directory in /etc/icinga2/ and keep the conf.d directory commented out, but add the new directory to the icinga2.conf file.


    For example, I created a test.d directory at /etc/icinga2/test.d/ and changed these lines:


    Code: /etc/icinga2/icinga2.conf
    //include_recursive "conf.d"
    include_recursive "test.d"

    Then, I created your hierarchy like you described in post #5 and I was able to see the hosts/services popping up on my web interface.


    Let me know if you have further questions.

    sru


    Cool, thanks!


    I was unaware that the command generated the socket file.


    Installation Notes (perhaps this will help somebody else):

    Options that I chose while running ./install.sh -s icinga2:


    - Specify the icinga2 base directory as /usr/share/icinga2

    - Set the NagVis base to /usr/share/nagvis

    - Confirmed use of backend mklivestatus

    - Set the NagVis web path to /nagvis

    - Set web-server user to apache

    - Set web-server group to apache

    - Confirmed creation of Apache config file


    I was then able to access the web interface at http://server-ip/nagvis


    Thanks to Wolfgang and sru for the help!


    Thread is now resolved.