Posts by watermelon

    Mordecaine


    What I like to do (not sure if it's recommended) is create custom check/service definitions on the remote Windows machine. Then, on my master server, I edit the /etc/icinga2/repository.d directory (which is where the "agent" configs go) to create a dummy service that uses those previously defined check/service definitions. This method does not require NRPE; however, it can get tedious to create/delete checks.


    Here's specifically how I do it:


    1. Install the "agent" on your remote Windows machine

    2. Make sure the "agent" and master Icinga server can communicate via the parent-child relationship

    3. Create your custom check/service definitions for your remote Windows machine


    Here's an example check command definition that I use for checking Windows service statuses:
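
    (The definition itself did not survive in this copy of the post. What follows is only a minimal sketch of what such a command could look like, assuming the check_service.exe plugin that ships with the Windows agent and its -s option. It is not the original definition.)

    ```
    // Hypothetical reconstruction, not the original poster's definition.
    // Assumes check_service.exe is present in PluginDir on the Windows agent
    // and that it accepts the service name through -s.
    object CheckCommand "check-active-service" {
      command = [ PluginDir + "/check_service.exe" ]

      arguments = {
        "-s" = {
          value = "$service_win_service$"   // set per service via vars.service_win_service
          required = true
          description = "Name of the Windows service to check"
        }
      }
    }
    ```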



    Here's an example service definition I use for checking the specific service:


    Code
    apply Service "SQLWriter-check" {
      import "generic-service"
      check_command = "check-active-service"
      vars.service_win_service = "SQLWriter"
      assign where host.name == NodeName
    }

    4. Put these definitions somewhere in your C:\ProgramData\icinga2\etc\icinga2\conf.d directory on your remote Windows machine. This is where the configurations go for the default checks. You will have to enable viewing of hidden files. Make sure to restart the Icinga2 service on the Windows machine after updating the configs!


    5. Log in to your master Icinga server and create the following definitions:


    Code: /etc/icinga2/repository.d/hosts/HOSTNAME.conf
    object Host "HOSTNAME" {
      import "satellite-host"
      check_command = "cluster-zone"
    }

    Code: /etc/icinga2/repository.d/hosts/HOSTNAME/services.conf
    object Service "HOSTNAME" {
      import "satellite-service"
      check_command = "dummy"
      host_name = "HOSTNAME"
      zone = "HOSTNAME"
    }

    Code: /etc/icinga2/repository.d/zones/HOSTNAME.conf
    object Zone "HOSTNAME" {
      endpoints = [ "HOSTNAME" ]
      parent = "master"
    }

    Code: /etc/icinga2/repository.d/endpoints/HOSTNAME.conf
    object Endpoint "HOSTNAME" {
    }


    Note: You don't necessarily have to separate the files like this, but these are the default configurations that are generated when an agent is added.


    6. Restart the Icinga2 service on your master server


    Now everything should be in place! Lemme know if you have additional problems or if you've found a different way to do this.

    I know how to do the check but I'd like to know if others have implemented this or something similar? I'd like to know the specifics of what the check would be looking for.


    Maybe a better way to reword this question would be: "which Windows services SHOULD I be monitoring for criticalities?"

    Hey all,


    This is not really an Icinga specific question but also not a plugin specific question.


    I am looking for a way to monitor the Windows services that are required for the OS to boot, so that the check shows when the machine is actually functioning and not just reachable by ping. One machine in particular takes a few minutes after a reboot to start all the services and processes the OS needs, but ping checks show it as alive as soon as it is turned on.
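
    As a rough illustration of the idea, the sketch below applies one service check per entry in a list of Windows service names. It reuses the "check-active-service" command and "generic-service" template from the agent setup post above. The custom variable boot_services and the service names in the comment are placeholders for illustration, not a vetted list of what should be monitored.

    ```
    // Sketch only. vars.boot_services is a hypothetical custom variable that
    // would be set on the Windows host object, for example:
    //   vars.boot_services = [ "LanmanServer", "Dnscache", "EventLog" ]
    // The names above are placeholders, not a recommendation.
    apply Service "winsvc-" for (svc in host.vars.boot_services) {
      import "generic-service"
      check_command = "check-active-service"

      // Each generated service checks exactly one Windows service.
      vars.service_win_service = svc

      assign where host.vars.boot_services
    }
    ```

    With a list like this, the host would only look fully healthy once every listed service reports as running, rather than as soon as it answers ping.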


    Has anybody else done this and if so, could you share your technique with me? If not, why?


    Thanks in advance!

    Hey Mikesch ,


    Here's what I have:



    Code
    icinga2 feature list
    Disabled features: compatlog debuglog graphite livestatus opentsdb statusdata syslog
    Enabled features: api checker command gelf ido-mysql influxdb mainlog notification perfdata




    Code
    systemctl stop icinga2 && rm -f /var/log/icinga2/icinga2.log && systemctl start icinga2 && sleep 10 && fgrep -i ido /var/log/icinga2/icinga2.log
    [2017-07-24 10:30:53 -1000] information/DbConnection: Resuming IDO connection: ido-mysql
    [2017-07-24 10:30:53 -1000] information/IdoMysqlConnection: MySQL IDO instance id: 1 (schema version: '1.14.2')
    [2017-07-24 10:30:53 -1000] information/IdoMysqlConnection: Finished reconnecting to MySQL IDO database in 0.316027 second(s).


    After the last command, everything seems to start up okay.... but that doesn't quite make sense to me. Could you explain? It might also be because I let the server sit for 5 days without any activity.


    Thanks!

    Hey all,


    I know this is an older thread but I didn't want to create a new one because there are so many like this one.


    I'm having the same problem with the web frontend showing "Backend icinga2 is not running" but in the backend, I see that the icinga2 service is running fine. When I try to reload the service, the error goes away for about 2-3 minutes and then comes back.


    The error stems from having to migrate my Icinga VM from one vCenter to another.


    At one point, when I ran service icinga2 status (I'm running CentOS 7), the output included this line:

    "Supervising process <number> which is not our child. We'll most likely not notice when it exits."

    I found some help on another thread that said to just kill <processnumber>. That seemed to get rid of the message, but it did not fix my overall issue.


    The error.log is empty and icinga.log did not show anything of use to my knowledge (low query rates like the ones posted above me in this thread).


    I ran mysql schema updates and tried updating icinga2, icinga2-ido-mysql, mysql anyway. That didn't help.


    The only thing I think might be going on is that the CPU is being overloaded by all the checks, since I added a bunch of new ones. However, I removed them just to see if that was the root cause and I STILL have the issue.


    I have tried analyzing this with htop (like top but with more information), and it does seem to show a huge spike after reloading the process. I've attached an image of the output.


    If I need to, I'll make my own thread, but I'd love some input on this ASAP.


    Thanks in advance!!


    EDIT 1:

    I think this is related to some kind of process that needs to be running in order for Icinga2 to work, because I had to shut down the Icinga server before migrating it to another vCenter. Since I had to shut it down, some process that was not configured to start automatically on boot is probably missing for Icinga. Any ideas?

    Hey all,


    This is just a quick question that I can't seem to find any answer to, directed to anybody who is familiar with the 'check_mysql_health' plugin. I was just wondering whether or not it's possible to execute this check locally. I can do it remotely just fine, but whenever I try on the Icinga server itself, I receive this error:


    CRITICAL - cannot connect to icinga2. Host 'my.icinga.server' is not allowed to connect to this MariaDB server


    I only want to do this because I'm curious as to what the health of my local database is.


    Thanks in advance!

    Hello,


    Apologies if this question is dumb or if it's been asked before.


    How do I pass a "0" into an Icinga Director service argument value? Currently when I try to do it, the preview shows "{}", indicating that putting 0 into the argument value creates an empty array.


    I tried putting double quotes, single quotes, and dollar signs around the 0, but none seem to work. I also tried changing the field value to "Integer", but there doesn't seem to be that option. Is there a reason for this? The "Icinga DSL" field type doesn't seem to work either. I've attached some pictures explaining the situation with more context.






    Any help is appreciated!

    I assume you're referring to email notifications?


    If so, there are a few guides out there for setting up Postfix or some type of email script to email you when something triggers an alert as well as countless threads on these forums about this topic that you could get some help from.


    If those don't suffice, let me know and I can show you how I have mine set up.

    Your while loop is not closed.


    Perhaps try this?


    Hmm.. I think it may be due to the fact that you did not specify the enable_active_checks and enable_passive_checks booleans on the Service itself. See my definition for reference:
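
    (The referenced definition is not included in this copy of the post. As a minimal sketch of the idea only, with a placeholder service name and the "dummy" check standing in for the real command, setting both booleans explicitly would look something like this.)

    ```
    // Illustrative sketch, not the original poster's definition.
    // "example-service" and the "dummy" check command are placeholder choices.
    apply Service "example-service" {
      check_command = "dummy"

      // Explicitly allow both scheduled (active) checks and
      // externally submitted (passive) check results.
      enable_active_checks = true
      enable_passive_checks = true

      assign where "Test" in host.templates
    }
    ```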


    I think the same goes for your Host template definition. Try this out! I replicated your situation with the only exception being these additional variables being defined and when I ran the Service checks, they were able to execute.


    By the way, I thought that knowing what your script does might have helped in finding the root cause of your Services staying in their pending state, but I'm pretty sure that the solution described above is the answer.

    How are you defining this line?


    Code
    ...
    assign where "Test" in host.templates
    ...

    I don't see the option to apply services to host.templates in the Director interface.


    Also, what does your check.pl script do?

    Hello all!


    It's been a while.


    I'm having some trouble with NagVis 1.9b15. This happened right after I upgraded from 1.8.5 (the most recent stable version), and it seems like I am unable to access the web interface at all now. As soon as I enter the address into my browser, I receive an HTTP Error 500. This perplexes me, because I have compared my config files with those of another NagVis instance (which is located on a different network) and they are virtually identical. If needed, I can post them here. Another strange thing is that I am still able to access the Icinga2 web interface from the same server, so I don't think there is a problem with Apache.


    I have gone through the regular troubleshooting (checking access_logs and error_logs for apache, checking config files for NagVis and apache, restarting/rebooting services) and have not found any errors with that. Here are their contents:



    Is there much that I can do about this, or do I have to try to revert to v1.8.5? Let me know if more information is required.


    Thanks for the help in advance!


    UPDATE #1 (23 hours later):

    I downgraded my NagVis instance from 1.9b15 to 1.9b14 and now it seems to be working fine. This suggests that there is some kind of bug, or perhaps new functionality with authentication, in 1.9b15. Although it doesn't seem to be mentioned in the documentation, I found the following excerpt in the changelog from 1.9b14 to 1.9b15:


    • New backend "pgsql" to connect with PostgreSQL databases of Icinga. (Thanks a lot to Peter Pentchev for doing the work!)
    • FIX: Security fix: Authenticated users could read contents of local files by opening a custom URL (#29). Only https/http URLs can now be used.
    • FIX: Security fix: Only configured URLs can be fetched using the Url module (#29)

    Perhaps someone can elaborate?


    Downloaders beware!

    This is strange because it was previously working after my upgrade, it has just recently shown this problem.


    I did perform the same schema migration process up to 2.6.0.sql, following this guide and this forum thread, on two Icinga2 instances. On one, it seems to work fine, but on the other (the one I'm having problems with) it seems like it didn't quite work.


    I ran the script that sru posted in that thread and have this as output:


    From my understanding, the above shows the differences from a fresh 2.6.0.sql schema migration and my current schema migration. Is there any way of remediating this? I assume that the large amount of output above means something is wrong.


    Thanks!

    Hey all,


    I'm currently using Director v1.3.1 (recently updated) along with Icinga2 v2.6.2 (also recently updated) and Icingaweb2 v2.4.1.


    I'm having this weird error with Director in which whenever I click on one of my Host, Service, etc. definitions, I am met with the following:


    I can create and delete these definitions without hitches but it seems like whenever I try to view the definition, this error occurs. Does anybody know how to remedy this issue?


    Another smaller but possibly related problem is that the Director tab shows that I have 7 pending changes to deploy, yet there is actually nothing there to deploy. It looks like this:




    :/


    Thanks to anybody that can help me out!

    Fiiti


    From my understanding of your question, all you want to do is pass arguments into a command in Director?


    To pass in arguments for your check_mem command (or any command) in Director, all you have to do is edit the command under Icinga Director > Commands.


    I assume you already have the definition in place. If not, then it should be relatively easy to create one. If so, see the following picture:




    This picture describes the process that I have previously mentioned (going to Icinga Director > Commands). You will see here that I have created a custom check command for checking drive size using the by_ssh command, which I have named check_drive_size-by_ssh, and that I have passed the arguments -C and -E into it under the Arguments tab on the right-hand side of the screen.


    Then, I can create a service to run under a host under Director using this command definition.
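
    Since the screenshot is not available here, this is a rough sketch of what the resulting configuration might look like if written by hand instead of through Director. Only the command name and the -C and -E arguments come from the description above. The -H argument, the custom variable names, the remote plugin path, and the assign rule are assumptions for illustration.

    ```
    // Sketch only; not the exact configuration Director renders.
    object CheckCommand "check_drive_size-by_ssh" {
      command = [ PluginDir + "/check_by_ssh" ]

      arguments = {
        "-H" = "$address$"                          // assumed: target host for the SSH connection
        "-C" = "$by_ssh_command$"                   // command to run on the remote machine
        "-E" = { set_if = "$by_ssh_skip_stderr$" }  // ignore output on stderr when set
      }
    }

    // A service that uses the command. The remote plugin path and thresholds
    // are hypothetical and depend on the remote machine.
    apply Service "drive-size" {
      check_command = "check_drive_size-by_ssh"
      vars.by_ssh_command = "/usr/lib64/nagios/plugins/check_disk -w 20% -c 10%"
      vars.by_ssh_skip_stderr = true

      assign where host.vars.os == "Linux"
    }
    ```

    In Director, each of the custom variables above would correspond to an argument value entered under the Arguments tab, or to a data field attached to the command.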


    If you have further questions or need more detail, please ask! This is a relatively high-level overview.