Opinion Poll: Twenty Years of Plugin Return Codes

plugins
nagios-plugins
icinga
icinga2

(Brian LaVallee) #1

Icinga, which started out as a fork of Nagios, is approaching its tenth anniversary (2009-05-15). It has been almost twenty years since the original NetSaint (1999), which was renamed Nagios in 2002. Throughout that time we have been using the same Plugin Return Codes.

| Numeric Value | Service Status | Status Description |
|---|---|---|
| 0 | OK | The plugin was able to check the service and it appeared to be functioning properly |
| 1 | Warning | The plugin was able to check the service, but it appeared to be above some “warning” threshold or did not appear to be working properly |
| 2 | Critical | The plugin detected that either the service was not running or it was above some “critical” threshold |
| 3 | Unknown | Invalid command line arguments were supplied to the plugin or low-level failures internal to the plugin (such as unable to fork, or open a tcp socket) that prevent it from performing the specified operation. Higher-level errors (such as name resolution errors, socket timeouts, etc) are outside of the control of plugins and should generally NOT be reported as UNKNOWN states. |
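For anyone who has not written a plugin, here is a rough Python sketch of how those four codes are typically produced. The function names `evaluate` and `status_line` are made up for illustration, and the inclusive "higher is worse" thresholds are a simplifying assumption, not the full threshold syntax real plugins accept.

```python
# The four states every plugin uses today.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Return the plugin exit code for a measured value.

    Assumes "higher is worse" with inclusive thresholds (a simplification).
    """
    if warn > crit:
        return UNKNOWN  # contradictory arguments are treated as invalid
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def status_line(service, code, detail):
    """Format the single line of text a plugin prints before exiting."""
    return f"{service} {LABELS[code]} - {detail}"
```

A real plugin would print the result of `status_line(...)` and then call `sys.exit(code)`, so the monitoring core sees both the text and the return code.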

POSIX exit codes range from 0 to 255, and we use only four of them.


Do you think it’s time to support additional status codes?

I’m just interested in the opinion of the community.

  • What’s a Plugin?
  • It’s long overdue
  • I could use additional status codes
  • If it ain’t broke, don’t fix it
  • Who’s going to change all of the Plugins?
  • I don’t have an opinion
  • Other (reply to topic)


For example: I use check_users to see the number of users logged into a system. Warning and Critical could be considered too severe when only a single user is logged in. Information or Minor status codes would be nice to have.


(Matthias) #2

How would you incorporate additional exit codes into Icinga 2 though?
Or are you proposing to add additional states to OK/Warn/Crit/Unknown?


(Rafael Voss) #3

Maybe an option to define custom return code interpretation for yourself.
I would like an information color for myself, e.g. for maintenance things such as new updates or firmware updates being available, where it is just information. Logged-on users as info would be nice too, so you can see that someone is working on the server, but it is not a warning or a critical.

Existing plugins don’t need to be changed for this.
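Roughly what I have in mind, as a Python sketch. Everything here is hypothetical: the `SITE_OVERRIDES` table, the `interpret` function, and the "INFO" label are only illustrations of a user-defined mapping, not anything Icinga 2 offers today. The plugin still exits with 1; only the displayed interpretation changes.

```python
DEFAULT_LABELS = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

# Hypothetical per-site overrides: check command name -> {exit code -> label}.
SITE_OVERRIDES = {
    "users": {1: "INFO"},  # someone is logged in: worth seeing, not alarming
    "apt": {1: "INFO"},    # updates are available
}


def interpret(check_command, exit_code, overrides=SITE_OVERRIDES):
    """Return the label to display for an exit code from a given check."""
    per_check = overrides.get(check_command, {})
    if exit_code in per_check:
        return per_check[exit_code]
    # Anything the defaults do not know about is shown as UNKNOWN.
    return DEFAULT_LABELS.get(exit_code, "UNKNOWN")
```

With this table, `interpret("users", 1)` gives "INFO" while `interpret("ping4", 1)` stays "WARNING", so checks without an override behave exactly as before.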


#4

Having an additional code for objects in maintenance/downtime would sometimes clarify the “actual” state, as they are neither operational (UP/OK) nor faulty (DOWN/UNREACHABLE/CRITICAL). So at least for hosts there isn’t a “correct” state at all, and WARNING doesn’t seem to fit either.


(Brian LaVallee) #5

I’m not proposing anything at this point. Just broaching the subject.

While non-trivial, icinga2 is the easy part, assuming it reads the full byte (0x00 to 0xFF). But there’s a whole ecosystem of modules, plugins, and even other monitoring software that could be affected.

Development of a standard would be the first step. There are also a handful of POSIX exit codes with special meanings to avoid.
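To give an idea of which codes a standard would have to steer around, here is a small Python sketch. The function names are hypothetical. The shell rules I am relying on are that 126 and 127 mean "not executable" and "command not found", and that a status above 128 indicates termination by a signal. Treating 128 itself as off-limits is my own cautious assumption.

```python
def is_reserved(code):
    """True if a shell already gives this exit status a meaning."""
    if code in (126, 127):  # command not executable / command not found
        return True
    # Statuses above 128 report death by signal; 128 is excluded out of caution.
    return code >= 128


def usable_custom_codes(existing=(0, 1, 2, 3)):
    """Exit codes left over for a hypothetical extended standard."""
    return [c for c in range(256) if c not in existing and not is_reserved(c)]
```

Under these assumptions the usable space is 4 through 125, which is still far more than anyone is likely to need.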


I like this idea. It could look something like this:

object State 2 {
  display_name = "Godzilla" // Override the Default 'Critical'
}
object State 100 {
  display_name = "Gamera"
  CheckCommands = [ "ping", "ping4", "ping6" ] // Limit to specific check commands.
  color = "#8ACEDB" // colors are NOT handled by icinga2; icingaweb2 assigns the colors
}


(Matthias) #6

I like the proposal for custom return code interpretation.
Maybe there could be a range of return codes which are considered OK by monitoring tools like Icinga 2 but stored in the object as a variable.
That way the notification engine would not be affected but we could interpret custom status codes with special Icingaweb 2 themes and use them when retrieving objects via the API.
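A quick Python sketch of what I mean. The range 100 to 119, the variable name `raw_exit_code`, and the `process_result` function are all assumptions for illustration, not existing Icinga 2 behaviour.

```python
# Hypothetical range of "OK with extra information" codes; not part of any standard.
INFO_RANGE = range(100, 120)


def process_result(exit_code):
    """Split a raw exit code into the state used for alerting and stored variables."""
    if exit_code in INFO_RANGE:
        # Notifications see a plain OK; the raw code stays available to themes and the API.
        return {"state": 0, "vars": {"raw_exit_code": exit_code}}
    if exit_code in (0, 1, 2, 3):
        return {"state": exit_code, "vars": {}}
    # Anything else is treated as UNKNOWN, keeping the raw code for debugging.
    return {"state": 3, "vars": {"raw_exit_code": exit_code}}
```

So an exit code of 100 would be handled as OK for notifications, while a theme or API client could still read `raw_exit_code` and show it differently.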


(Bård Dahlmo Lerbæk) #7

Hosts can be UP or DOWN, but they could also use an UNKNOWN status.