Posts by watermelon

    I assume you're referring to email notifications?


    If so, there are a few guides out there for setting up Postfix or an email script that emails you when something triggers an alert, as well as countless threads on these forums about this topic that you could get some help from.


    If those don't suffice, let me know and I can show you how I have mine set up.

    Your while loop is not closed.


    Perhaps try this?


    Hmm.. I think it may be due to the fact that you did not specify the enable_active_checks and enable_passive_checks booleans on the Service itself. See my definition for reference:


    I think the same goes for your Host template definition. Try this out! I replicated your situation with the only exception being these additional variables being defined and when I ran the Service checks, they were able to execute.
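
    For reference, a minimal sketch of a Service with both booleans set explicitly could look like the following. The object name and the check command are placeholders (assumptions, not the definition mentioned above); only enable_active_checks, enable_passive_checks, and the assign rule come from this thread.

```icinga2
// Sketch only: "example-check" and the "dummy" check command are placeholders.
apply Service "example-check" {
  check_command = "dummy"

  // Set both booleans explicitly on the Service itself.
  enable_active_checks = true
  enable_passive_checks = true

  // Apply to every host that imports the "Test" template.
  assign where "Test" in host.templates
}
```

    The same two attributes can be set in a Host template in the same way.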


    By the way, I thought that knowing what your script does might have helped in finding the root cause of your Services staying in the pending state, but I'm pretty sure that the solution described above is the answer.

    How are you defining this line?


    Code
    ...
    assign where "Test" in host.templates
    ...

    I don't see the option to apply services to host.templates in the Director interface.


    Also, what does your check.pl script do?

    Hello all!


    It's been a while.


    I'm having some trouble with NagVis 1.9b15. This happened right after I upgraded from 1.8.5 (the most recent stable version), and it seems like I am unable to access the web interface at all now. As soon as I enter the address into my browser, I receive an HTTP Error 500. This perplexes me, because I have compared my config files with those of another NagVis instance (which is located on a different network) and they are virtually identical. If needed, I can post them here. Another strange thing is that I am still able to access the Icinga2 web interface from the same server, so I don't think there is a problem with Apache.


    I have gone through the regular troubleshooting (checking access_logs and error_logs for apache, checking config files for NagVis and apache, restarting/rebooting services) and have not found any errors with that. Here are their contents:



    Is there much that I can do about this or do I have to try to revert back to v1.8.5? Let me know if more information is required.


    Thanks for the help in advance!


    UPDATE #1 (23 hours later):

    I downgraded my NagVis instance from 1.9b15 to 1.9b14 and now it seems to be working fine. This suggests that there is some kind of bug, or perhaps new functionality around authentication, in 1.9b15. Although it doesn't seem to be mentioned in the documentation, I found the following excerpt in the changelog from 1.9b14 to 1.9b15:


    • New backend "pgsql" to connect with PostgreSQL databases of Icinga. (Thanks a lot to Peter Pentchev for doing the work!)
    • FIX: Security fix: Authenticated users could read contents of local files by opening a custom URL (#29). Only https/http URLs can now be used.
    • FIX: Security fix: Only configured URLs can be fetched using the Url module (#29)

    Perhaps someone can elaborate?


    Downloaders beware!

    This is strange because it was working after my upgrade; it has only recently shown this problem.


    I performed the same schema migration process up to 2.6.0.sql, following this guide and this forum thread, on two Icinga2 instances. On one it seems to work fine, but on the other (the one I'm having problems with) it seems like it didn't quite work.


    I ran the script that sru posted in that thread and have this as output:


    From my understanding, the above shows the differences between a fresh 2.6.0.sql schema migration and my current schema. Is there any way of remediating this? I assume that the large amount of output above means something is wrong.


    Thanks!

    Hey all,


    I'm currently using Director v1.3.1 (recently updated) along with Icinga2 v2.6.2 (also recently updated) and Icingaweb2 v2.4.1.


    I'm seeing a weird error in Director: whenever I click on one of my Host, Service, etc. definitions, I am met with the following:


    I can create and delete these definitions without hitches but it seems like whenever I try to view the definition, this error occurs. Does anybody know how to remedy this issue?


    Another smaller but possibly related problem is that the Director tab shows that I have 7 pending changes to deploy, yet there is actually nothing there to deploy. It looks like this:




    :/


    Thanks to anybody that can help me out!

    Fiiti


    From my understanding of your question, all you want to do is pass arguments into a command in Director?


    To pass in arguments for your check_mem command (or any command) in Director, all you have to do is edit the command under Icinga Director > Commands.


    I assume you already have the definition in place. If not, then it should be relatively easy to create one. If so, see the following picture:




    This picture shows the process that I mentioned above (going to Icinga Director > Commands). You will see here that I have created a custom check command for checking drive size using the by_ssh command, which I have named check_drive_size-by_ssh, and that I have passed the arguments -C and -E into it under the Arguments tab on the right-hand side of the screen.


    Then, I can create a service to run under a host under Director using this command definition.
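
    As a rough sketch, a command set up this way in Director ends up as a CheckCommand object along the following lines, and a service can then reference it. The custom variable names (drive_size_command, drive_size_skip_stderr), the -H argument, the service name, the remote plugin path, and the assign rule are all assumptions for illustration; only the command name and the -C and -E arguments come from the description above.

```icinga2
// Sketch only: variable names and values below are hypothetical.
object CheckCommand "check_drive_size-by_ssh" {
  import "plugin-check-command"
  command = [ PluginDir + "/check_by_ssh" ]

  arguments = {
    "-H" = "$address$"                      // assumed: target host address
    "-C" = {
      value = "$drive_size_command$"        // command to run on the remote host
    }
    "-E" = {
      set_if = "$drive_size_skip_stderr$"   // ignore STDERR output if true
    }
  }
}

apply Service "drive-size" {
  check_command = "check_drive_size-by_ssh"

  // Hypothetical remote command; adjust the path to the remote system.
  vars.drive_size_command = "/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /"
  vars.drive_size_skip_stderr = true

  assign where host.vars.os == "Linux"
}
```

    In Director itself, the same values are entered through the Arguments tab of the command and the custom variable fields of the service rather than written by hand.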


    If you have further questions or need more detail, please ask! This is a relatively high-level overview.

    mcktr


    I am not certain if putting up all my mail configurations would be useful because emails do send correctly, other than this problem. Did you want to see something specific?


    Currently, I have all types of notifications configured to be sent to me which includes the following:

    Problem, Acknowledgement, Recovery, Custom, FlappingStart, FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved


    I did add in the interval = 0 that you suggested, but it didn't seem to do anything. The notifications are still sending.
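
    For context, a notification rule carrying these settings might look roughly like the sketch below. The rule name, notification command, user, and assign condition are assumptions modelled on the sample configuration; the types list and interval = 0 are the values described above. Setting interval to 0 disables re-notifications, so a notification should only go out on a state change.

```icinga2
// Sketch only: names and the assign condition are assumptions.
apply Notification "mail-icingaadmin" to Service {
  command = "mail-service-notification"
  users = [ "icingaadmin" ]

  types = [ Problem, Acknowledgement, Recovery, Custom,
            FlappingStart, FlappingEnd,
            DowntimeStart, DowntimeEnd, DowntimeRemoved ]

  // 0 disables periodic re-notifications.
  interval = 0

  assign where host.vars.notification.mail
}
```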


    --------------------


    dnsmichi


    Sorry for not being too specific. I thought that they might be related because they are both issues that are falsely sending emails out.


    Here are my issues:


    1. I am receiving email notifications for services that I have acknowledged using the "acknowledge" button through the Icinga2 web interface. These services have been acknowledged for about a month or so (with no notifications being sent out), but just recently it started to send out notifications again even though they are still "acknowledged".


    From the pictures, you can see that on the web interface it shows that the last notification was apparently sent on February 7th, but from my email you can see that I am still receiving email notifications about the same acknowledged service currently.


    2. One of my Windows agents is reported as not connected in the email notifications, but it actually is connected (as confirmed by the web interface).


    --


    Here's a list of things that I've done recently (in order):

    1. Updated Icinga2 to v2.6.1

    2. Installed NagVis and Livestatus (but have disabled Livestatus and am using IDO instead)

    3. Switched from perl script for sending mail to Postfix


    After updating Icinga2, I did not experience any issues with any database problems. Same with installing NagVis. I just wanted to use Postfix because it seems like what most people are using for sending out email.


    Update:


    So, all of this actually doesn't matter. It turns out my Icinga server had been cloned into another virtual machine as a CentOS 7 template. I thought I had stopped the Icinga2 service on the clone (service icinga2 stop), but it must have been rebooted at some point (I was not working on this new VM). Because the original machine was my main Icinga2 server, the service was enabled at boot (systemctl enable icinga2), so after the reboot it started again on the clone and effectively ran as a second Icinga2 instance.


    Since the email notifications had been configured on this cloned machine as well, they were being sent from that machine. And since the notifications look identical (because the machine is a clone), I was extremely confused as to why they were being sent even though I had acknowledged the problems on the actual server. It turns out they were never acknowledged on the clone.


    This is an extremely rare case, but I guess it may happen to someone else at some point as well. Make sure you disable the icinga2 service if you happen to clone your Icinga2 server to function as something else!


    TL;DR:

    I cloned my Icinga2 server and the clone began to send me emails after being rebooted.


    Thanks for the help anyway, mcktr and dnsmichi. I am ashamed.

    Hey all,


    I have a host node on my NagVis installation that is causing me grief.


    It reports the following values when I first add it, which are correct (see the red underlined Host Name and Summary Output):



    Then, after a few moments, it changes into this (again, see the red underlined parts):





    This is very strange because not only does the host go into a critical state, it magically has ~200 services, up from 53.


    So, how is this possible? I am unsure how to go about this.


    Thanks for any help!

    pacofer


    Ah, I understand. Perhaps support for multiline output in the History tab will be a feature in the future. You should suggest it!


    MarcusCaepio


    I think pacofer's problem is not necessarily with the deployment of the definitions, but with scaling the resources to match. 5000+ checks would probably require considerably more resources than he would like to use (which is why he mentioned distributed monitoring with several satellites). I'm not sure that your solution would reduce the number of checks; it just aids in the deployment of the checks themselves (tell me if I'm wrong).

    Hey all,


    I'm running into a very peculiar problem right now. I am receiving email notifications when I'm not supposed to: services in the "unknown" state that I have acknowledged still send emails to me as if I had not acknowledged them. Additionally, I have an agent installed on one of my Windows machines that is reported as not connected, even though the web interface shows it's connected. See pictures below.


    Windows Agent showing not connected via email notification

    Sending notification for unknown port that I have already acknowledged

    Icinga2 Web Interface proving that I acknowledged this already a month ago and that the "last notification" was sent on that date (even though I just started receiving them again)


    Icinga2 web interface proving that the Windows agent is connected (yet, the email notification says otherwise)


    I've checked /var/log/icinga2/icinga2.log but didn't find anything of use. Where else should I check? The problem seems to have started around 3/8/17 (today is 3/13/17). All I've done since then is install NagVis. Also, these emails seem to be sent consistently at 30-minute intervals (at xx:15 and xx:45). I have tried removing the acknowledgement and re-adding it, to no avail.


    Does anybody know what the problem could be?

    Adding hosts and zones is relatively trivial. Even if you have a large setup, you can use host templates to aid in the process of creating additional hosts. Creating zones is also pretty easy. Use an external text editor like Atom or Brackets to help with visualization.
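
    As a small illustration of how little is needed, the sketch below defines a host template, one host that imports it, and a satellite zone with a single endpoint. All names and addresses are made-up examples (192.0.2.0/24 is a documentation range), and the parent zone "master" is assumed to exist already.

```icinga2
// Sketch only: every name and address here is a made-up example.
template Host "generic-linux-host" {
  check_command = "hostalive"
  max_check_attempts = 3
  check_interval = 1m
}

// Each additional host only needs to import the template and set its address.
object Host "web01" {
  import "generic-linux-host"
  address = "192.0.2.10"
}

// A satellite zone with one endpoint, attached to an existing "master" zone.
object Endpoint "satellite1.example.com" {
  host = "192.0.2.20"
}

object Zone "satellite" {
  endpoints = [ "satellite1.example.com" ]
  parent = "master"
}
```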

    I see, but I feel like in either case discussed above, the same number of checks would have to be executed to check the same number of ports. Or perhaps there is a better way, like you describe. Maybe dnsmichi might know?


    I'm currently on version 2.6.1 of Icinga2 and 2.4.1 of Icingaweb2.


    Here's a screenshot of the History tab:



    I think it also may depend on the plugin that you're using, but I am not sure.

    dnsmichi


    Ah, thank you for clarifying.


    Regarding the "multiple SNMP query requests" - I know that my plugin uses multiple OID queries, but I cannot even run a simple system description OID check (1.3.6.1.2.1.1.1) using snmpwalk.


    Regarding the "frequent snmp polling" - what's strange to me is that monitoring worked for months and months, but then all of a sudden stopped and never worked again. The SNMP checks I configured polled constantly (about every 60s) for MONTHS and never had any problems.


    Regarding the "OIDs not being properly implemented" - I analyzed the plugin and it seems like there are 15 OIDs being queried to gather the information needed. What could I do about this? Do I have to do anything about this? Should I switch to a different plugin? I am not sure what makes a MIB bad or not.


    So, what you're saying is that my hardware may not be good enough to handle SNMP requests and/or that the firmware may need to be upgraded? How can I tell whether or not it's the hardware/firmware that's acting up? The affected devices are Dell SonicWall NSA 2600, Dell PowerConnect, VRTX 1 GB Switch Module, Dell SonicWall NSA 4600, and Dell N4032.


    What would you do in this situation?


    P.S.

    It's funny that the Cisco Catalyst is not even affected by this ordeal.