Icinga2 Service for Check_WMI_plus not working

  • Hi,


    I am trying to configure the check_wmi_plus Nagios plugin to work with my Icinga2 server to remotely monitor certain Windows VMs.
    As part of the setup, I have installed wmic, the check_wmi_plus plugin, and all the dependencies the plugin needs.


    When I test the setup, wmic itself runs fine, for example:
    wmic -U domain/user%password //host "select * from Win32_OperatingSystem"
    Output > The command runs successfully and returns data.


    When I run check_wmi_plus.pl from the Unix console, it runs successfully too.
    user console: /etc/icinga2/conf.d$ /usr/lib/nagios/plugins/check_wmi_plus.pl -m checkeachcpu -H '192.168.56.101' -A /etc/icinga2/wmi.auth --inidir /usr/lib/nagios/plugins/etc/check_wmi_plus/check_wmi_plus.d --inifile /usr/lib/nagios/plugins/etc/check_wmi_plus/check_wmi_plus.d/check_wmi_plus.ini


    Output > OK (Sample Period 1286 sec) - CPU0=3.8% CPU1=3.0% CPU2=4.6% CPU3=3.6% CPU_Total=3.8% |'Avg Utilisation CPU0'=3.8%; 'Avg Utilisation CPU1'=3.0%; 'Avg Utilisation CPU2'=4.6%; 'Avg Utilisation CPU3'=3.6%; 'Avg Utilisation CPU_Total'=3.8%;



    Now, as a last step, I am trying to configure the service in Icinga2.
    However, the check returns the following, with the status set to UNKNOWN:


    UNKNOWN - The WMI query had problems. The error text from wmic is: [librpc/rpc/dcerpc_connect.c:329:dcerpc_pipe_connect_ncacn_ip_tcp_recv()] failed NT status (c0000017) in dcerpc_pipe_connect_ncacn_ip_tcp_recv
    [librpc/rpc/dcerpc_connect.c:790:dcerpc_pipe_connect_b_recv()] failed NT status (c0000017) in dcerpc_pipe_connect_b_recv
    [wmi/wmic.c:196:main()] ERROR: Login to remote object.
    NTSTATUS: NT_STATUS_NO_MEMORY - Memory allocation error


    Has anybody seen this before? What could be the problem here?


    --
    Thanks & Regards,
    Chirag Khara

  • Have you tried running the check manually under the account the Icinga service runs as (Network Service)?
    (Google devxexec for how to do this on 64-bit Windows boxes; for 32-bit boxes, PsExec is fine.)
    Furthermore, have you checked how much memory the command consumes (e.g. with Process Explorer or perfmon)?

  • Hi,
    The connection to the Windows host is made remotely from an Ubuntu server.


    The icinga2 process (on the Ubuntu box) runs under its own account, and it uses the "wmic" process (also running on the Ubuntu box) to connect to the Windows host and fetch monitoring stats.


    If I run the wmic commands manually, they run fine, so I don't see how this could be a firewall issue.


    The problem appears only when the WMI query is run by the icinga2 service.


    On the Windows host, I tried disabling the firewall and also checked the perfmon stats; the host shows no paging or memory issues.

  • Thank you for the clarification; I was on the wrong track.
    From the initial post, we see two different hints:


    Error c0000017, which is documented as:
    0xC0000017 STATUS_NO_MEMORY
    {Not Enough Quota} Not enough virtual memory or paging file quota is available to complete the specified operation.


    and:
    ERROR: Login to remote object (followed by STATUS_NO_MEMORY), as above.


    I would assume a problem allocating memory on the Windows node, which would cause trouble for Check_WMI_Plus.


    However, you state that all is fine if run from the shell.


    It may be wise to enable the icinga2 debug log (icinga2 feature enable debuglog), grep it for the exact command line Icinga runs, and recheck that command from the shell.


    Don't forget to disable the debug log again; it is quite verbose.

  • Hi,


    Here are my configs:


    #hosts.conf
    object Host "openmedia-sandpit-windows-vm" {
      import "generic-host"
      address = "192.168.56.101"
      vars.os = "Windows"
    }


    # services.conf
    apply Service "CPU Utilization" {
      import "wmi-service"
      vars.check_mode = "checkeachcpu"
      assign where host.vars.os == "Windows"
      ignore where host.vars.disable_wmi
    }


    #check command
    /* WMI monitoring of OpenMedia Windows Environment */
    object CheckCommand "check_wmi" {
      import "plugin-check-command"
      command = [ PluginDir + "/check_wmi_plus.pl" ]
      arguments = {
        "--inidir" = "$wmi_inidir$"
        "--inifile" = "$wmi_ini_file$"
        "-H" = "$host.name$"
        "-A" = "$wmi_authfile_path$"
        "-m" = "$check_mode$"
        "-s" = "$wmi_submode$"
        "-a" = "$wmi_arg1$"
        "-o" = "$wmi_arg2$"
        "-3" = "$wmi_arg3$"
        "-4" = "$wmi_arg4$"
        "-y" = "$wmi_delay$"
        "-w" = "$wmi_warn$"
        "-c" = "$wmi_crit$"
        "--nodatamode" = {
          set_if = "$wmi_nodatamode$"
        }
      }
      vars.wmi_authfile_path = "/etc/icinga2/wmi.auth"
      vars.wmi_ini_file = "/usr/lib/nagios/plugins/etc/check_wmi_plus/check_wmi_plus.d/check_wmi_plus.ini"
      vars.wmi_nodatamode = false
    }


    # templates.conf


    template Service "wmi-service" {
      import "generic-service"
      check_command = "check_wmi"
      check_interval = 5m
      retry_interval = 5m
    }


    This applies to all hosts where vars.os == "Windows".


    If I run this manually, using wmic and check_wmi_plus, it runs absolutely fine. The output is shown at the start of this thread.
    I am really not sure where to start debugging this, and nothing on the Windows server indicates a memory or paging problem.


    I had already looked up the error in the Microsoft MSDN documentation, but none of the causes listed there looked plausible.
    My question is: if the problem is a memory resource, why can I still get the data I want from the CLI every time?


    There is definitely something odd in the way wmic is called through the icinga2 service, but I am running the icinga2 service as the nagios user, so it should have full access to run check_wmi_plus.pl.


    I did put the icinga2 service in debug mode, and the output is:


    [2016-04-29 09:59:38 +0100] debug/IdoMysqlConnection: Query: UPDATE icinga_servicestatus SET acknowledgement_type = '0', active_checks_enabled = '1', check_command = 'check_wmi', check_source = 10.52.69.11, check_type = '0', current_check_attempt = '1', current_notification_number = '0', current_state = '3', endpoint_object_id = 1, event_handler = '', event_handler_enabled = '1', execution_time = '1.0759358406066895', flap_detection_enabled = '0', has_been_checked = '1', instance_id = 1, is_flapping = '0', is_reachable = '1', last_check = FROM_UNIXTIME(1461920377), last_hard_state = '3', last_hard_state_change = FROM_UNIXTIME(1461850308), last_state_change = FROM_UNIXTIME(1461850250), last_time_unknown = FROM_UNIXTIME(1461920377), latency = '0', long_output = '[librpc/rpc/dcerpc_connect.c:790:dcerpc_pipe_connect_b_recv()] failed NT status (c0000017) in dcerpc_pipe_connect_b_recv\\n[wmi/wmic.c:196:main()] ERROR: Login to remote object.\\nNTSTATUS: NT_STATUS_NO_MEMORY - Memory allocation error', max_check_attempts = '5', next_check = FROM_UNIXTIME(1461920676), normal_check_interval = '5', notifications_enabled = '1', original_attributes = 'null', output = 'UNKNOWN - The WMI query had problems. The error text from wmic is: [librpc/rpc/dcerpc_connect.c:329:dcerpc_pipe_connect_ncacn_ip_tcp_recv()] failed NT status (c0000017) in dcerpc_pipe_connect_ncacn_ip_tcp_recv', passive_checks_enabled = '1', percent_state_change = '0', perfdata = '', problem_has_been_acknowledged = '0', process_performance_data = '1', retry_check_interval = '5', scheduled_downtime_depth = '0', service_object_id = 138, should_be_scheduled = '1', state_type = '1', status_update_time = FROM_UNIXTIME(1461920377) WHERE service_object_id = 138

  • I share your assumption that what you type at the CLI differs from the call Icinga2 generates. That is why I would like to see the line starting with "Running command" from the debug log.


    Here is an example from my host:

    [2016-05-01 07:11:26 +0200] notice/Process: Running command '/usr/lib64/nagios/plugins/check_mysql' '-H' '127.0.0.1' '-u' 'dbrdr': PID 8609
    [2016-05-01 07:11:26 +0200] debug/CheckerComponent: Check finished for object 'devsv1!MySQL'

    Next, copy the command line from the log and run it at the CLI to check whether the generated call works.
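
    Pulling the command line out of the log can be sketched in Python. This is only a minimal sketch: it assumes the debug log uses the same "Running command ... : PID n" layout as the example line above, and that arguments are single-quoted without embedded quotes. The name extract_commands is hypothetical and not part of Icinga2.

```python
import re
import shlex

# Assumed layout, modelled on the "Running command" example in this thread.
RUNNING_RE = re.compile(r"notice/Process: Running command (?P<argv>.+): PID \d+$")


def extract_commands(log_text, needle):
    """Return the argv list of every 'Running command' line that mentions needle."""
    commands = []
    for line in log_text.splitlines():
        match = RUNNING_RE.search(line.strip())
        if match and needle in match.group("argv"):
            # shlex.split undoes the single quoting used in the log line.
            commands.append(shlex.split(match.group("argv")))
    return commands


sample = (
    "[2016-05-01 07:11:26 +0200] notice/Process: Running command "
    "'/usr/lib64/nagios/plugins/check_mysql' '-H' '127.0.0.1' '-u' 'dbrdr': PID 8609\n"
    "[2016-05-01 07:11:26 +0200] debug/CheckerComponent: "
    "Check finished for object 'devsv1!MySQL'\n"
)

for argv in extract_commands(sample, "check_mysql"):
    # shlex.join (Python 3.8+) renders argv as a line that can be pasted into a shell.
    print(shlex.join(argv))
```

    For the WMI check, the same function would be fed the contents of the Icinga2 debug log with "check_wmi_plus" as the needle. The printed line can then be compared argument by argument with the command that works from the console.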


    By the way, from the check command definition above I see that you use, for example, "$wmi_inidir$", but I cannot see where it is set.
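
    A quick way to spot such gaps is to list every macro the arguments reference and compare it with the custom variables actually set. The following is a rough sketch, not an Icinga2 tool: the dictionary is hand-copied from the posted CheckCommand (with the set_if entry flattened to a plain string), the set of defined variables is taken from the posted CheckCommand and Service, and the helper name unset_macros is hypothetical. Runtime macros containing a dot, such as host.name, are skipped because Icinga2 resolves them from the object itself.

```python
import re

# Matches $macro$ references such as $wmi_inidir$ or $host.name$.
MACRO_RE = re.compile(r"\$([A-Za-z0-9_.]+)\$")


def unset_macros(arguments, custom_vars):
    """Return macro names used in arguments that have no matching custom variable."""
    missing = []
    for value in arguments.values():
        for name in MACRO_RE.findall(value):
            if "." in name:
                continue  # runtime macro (e.g. host.name), not a custom variable
            if name not in custom_vars and name not in missing:
                missing.append(name)
    return missing


# Hand-copied from the CheckCommand "check_wmi" posted in this thread.
arguments = {
    "--inidir": "$wmi_inidir$",
    "--inifile": "$wmi_ini_file$",
    "-H": "$host.name$",
    "-A": "$wmi_authfile_path$",
    "-m": "$check_mode$",
    "-s": "$wmi_submode$",
    "-a": "$wmi_arg1$",
    "-o": "$wmi_arg2$",
    "-3": "$wmi_arg3$",
    "-4": "$wmi_arg4$",
    "-y": "$wmi_delay$",
    "-w": "$wmi_warn$",
    "-c": "$wmi_crit$",
    "--nodatamode": "$wmi_nodatamode$",  # set_if in the original, flattened here
}

# Variables set in the posted CheckCommand and in the "CPU Utilization" Service.
custom_vars = {"wmi_authfile_path", "wmi_ini_file", "wmi_nodatamode", "check_mode"}

print(unset_macros(arguments, custom_vars))
```

    As far as I know, Icinga2 leaves out an argument whose macro has no value, so most of the reported names are harmless optional switches. wmi_inidir stands out because the manual test passed --inidir explicitly. It may also be worth comparing the -H value: "$host.name$" resolves to the host object's name ("openmedia-sandpit-windows-vm"), whereas the manual test used the IP address.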

    The post was edited 1 time, last by sru.