WATO: SNMP service discovery listing only vanished services

#1

Hi,
I am trying to migrate from an old OMD 1.20 to OMD 1.5.0p19.cre. I backed up my old site and copied it to a new Debian server. So far so good, but I have some weird issues with SNMP devices, e.g. a NetApp filer:
When I click “service discovery” for that host in WATO, I get a list of all services, but all of them are shown as “vanished services”:


The host's properties look like this:

I assumed it might have something to do with my host tags, as I had created a tag “operating system” that implied the monitoring agent SNMP for those hosts. So I set that tag to “none” on the host and also tried the agent type “Legacy: SNMP (Networking device, Appliance)”, but the result was the same.

A debug output of var/log/web.log shows the following:

2019-07-19 13:10:34,311 [20] [cmk.web.config.automations 22287] RUN: check_mk --automation try-inventory -- @noscan @raiseerrors NETAPP
2019-07-19 13:10:34,315 [20] [cmk.web.config.automations 22287] STDIN: ''
2019-07-19 13:10:35,079 [20] [cmk.web.config.automations 22287] FINISHED: 0
2019-07-19 13:10:35,080 [10] [cmk.web.config.automations 22287] OUTPUT: '[('vanished', 'netapp_volumes', None, u'vol0', 'None', None, u'NetApp Vol vol0', 0, 'Received no data', []), ('vanished', 'netapp_nvram', 'netapp_nvram', u'NVRAM', 'netapp_nvram_default_levels', None, u'NetApp NVRAM', 0, 'Received no data'
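
For anyone reading such a log line: the OUTPUT value looks like a Python list of tuples, so it can be tallied quickly. The following is only a rough sketch. The sample string is the first entry from the log above, typed with plain ASCII quotes, and the meaning of each position (state first, check output ninth) is my assumption from how the values look, not something taken from the Check_MK documentation. The function name `summarize` is made up for illustration.

```python
import ast
from collections import Counter

# First entry from the web.log OUTPUT line, retyped with plain quotes.
# Assumed positions: 0 = discovery state, 8 = text output of the check.
sample_output = (
    "[('vanished', 'netapp_volumes', None, u'vol0', 'None', None, "
    "u'NetApp Vol vol0', 0, 'Received no data', [])]"
)


def summarize(raw):
    """Count entries per discovery state and per check output message."""
    rows = ast.literal_eval(raw)  # safe parsing of a literal list of tuples
    by_state = Counter(row[0] for row in rows)
    by_message = Counter(row[8] for row in rows)
    return by_state, by_message


states, messages = summarize(sample_output)
print(states)
print(messages)
```

If every entry carries the message “Received no data”, that would suggest the discovery in WATO had no SNMP data to work with at all, rather than individual checks having disappeared.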

On the other hand, when I run an inventory on the CLI (I’m not very familiar with the CLI, so I hope I ran the correct commands), I get this output (shortened):

OMD[rap130]:~$ cmk -vv -I NETAPP
.
11 df_netapp
3 if
1 netapp_cpu
1 netapp_fans
1 netapp_fcpio
1 netapp_hdds
1 netapp_nvram
1 netapp_ps
1 netapp_vfiler
5 netapp_volumes
1 snmp_info
1 snmp_uptime
SUCCESS - Found 28 services
OMD[rap130]:~$ cmk -R
OMD[rap130]:~$ cmk -D NETAPP
NETAPP
Addresses: xxx.xxx.xxx.xxx
Tags: /wato/, LG, db_no, ip-v4, ip-v4-only, lan, mail_no, none, prod, site:rap130, snmp, snmp-only, snmp-v2, ts_no, vm_alias, vm_no, wato, web_no
Host groups: LG
Contact groups: admins
Agent mode: No agent
Type of agent:
SNMP (Community: 'community', Bulk walk: yes, Port: default, Inline: no)
Process piggyback data from /omd/sites/rap130/tmp/check_mk/piggyback/NETAPP
Services:
checktype item params description groups
netapp_fcpio None {'read': (None, None), 'write': (None, None)} FCP I/O
if 1 {'state': [u'1'], 'errors': (0.01, 0.1), 'speed': 1000000000} Interface 1
.
df_netapp aggr0 {'trend_range': 24, 'show_levels': 'onmagic', 'inodes_levels': (10.0, 5.0), 'magic_normsize': 20, 'show_inodes': 'onlow', 'levels': (88.0, 92.0), 'show_reserved': False, 'levels_low': (50.0, 60.0), 'trend_perfdata': True} fs_aggr0
OMD[rap130]:~$ cmk -vv -d NETAPP
[piggyback] No persisted sections loaded
[piggyback] Execute data source

I found out that there is a special agent for NetApp via WebAPI, but our NetApp software version is too old and will not get updated. Besides, I see the same issue on a UPS.

I also created an empty test site with just this one host. There I do not have this problem, and everything seems fine.

Does someone have an idea what the problem could be? Maybe some rules collide, but I don’t see which ones could be responsible.

(Philipp Näther) #2

The SNMP settings that cmk uses to contact your SNMP devices might not be correct.

Try to “remove” the vanished services of one of your devices and run a “full scan” again. If it does not find any services, the SNMP settings have to be adjusted. That would also explain why it shows your old services as vanished: the system compares the previously inventoried services with the result of the current scan and marks every service that is missing from the current data as vanished.

Since there is no real migration path from OMD 1.20 to Check_MK Raw 1.5.0, things can get pretty f’ed up. Especially with the SNMP settings, because the Kettner guys implemented a new “way” to tag hosts for SNMP somewhere in the past few releases.
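
To illustrate the comparison described above, here is a minimal sketch. It is not the actual Check_MK code; the function name `classify_services` and the representation of a service as a (check type, item) pair are assumptions made for this example.

```python
def classify_services(configured, discovered):
    """Compare previously configured services with a fresh scan result.

    Both arguments are iterables of (check_type, item) pairs.
    """
    configured, discovered = set(configured), set(discovered)
    return {
        # Known before, but not seen in the current scan
        "vanished": sorted(configured - discovered, key=repr),
        # Seen in the current scan, but not configured yet
        "new": sorted(discovered - configured, key=repr),
        # Present on both sides
        "unchanged": sorted(configured & discovered, key=repr),
    }


# Services the site already knows about (names taken from the thread).
old_services = [("netapp_volumes", "vol0"), ("netapp_nvram", "NVRAM")]

# If the scan returns no data at all, every known service ends up vanished.
result = classify_services(old_services, [])
print(result["vanished"])
print(result["new"])
```

With an empty scan result, nothing is “new” and everything is “vanished”, which matches the picture in WATO when the SNMP query returns no data.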

Worst case would be that you have to delete and recreate all of your snmp hosts.

#3

I tried deleting and recreating the host, but a “full scan” does not find any new services.
What I don’t understand is why the CLI does not show these services as vanished. Or does the CLI not follow the rules defined in WATO?
Since I assume that some rules are colliding: is there a command or debug mode that shows all rules matching a specific host in one list?

(Philipp Näther) #4

Sometimes the CLI commands act differently than the web GUI because of different rights and maybe sometimes it does not use the same rules as WATO. I still do not understand the “magic” that causes this behavior.

You can see all rules for a host on the WATO page of the host by clicking the “Parameters” button at the top. On the CLI there is nothing like that.

What does the diagnostic page of the host show you? It is also accessible via the WATO page of the host.
