
Adding monitors for script checks

The agent can send the execution results of check plugins to Mackerel, where they are monitored in much the same way as in Nagios. This requires mackerel-agent 0.16.0 or later.

An official check plugin pack is also available. For more information, please refer to Using the official check plugin pack for check monitoring.

To develop a plugin with github.com/mackerelio/checkers (the helper library used in our official plugins), please refer to Creating check plugins using checkers.

Register a command that outputs monitoring results in the Nagios plugin compatible format described below, and its output will be sent to Mackerel and shown in the host details screen and the alerts screen.

Each check item counts as one host metric. Limits for each plan can be viewed here

Configuration

In the agent settings file, add an item as shown here:

[plugin.checks.ssh]
command = "ruby /path/to/check-ssh.rb"
notification_interval = 60
max_check_attempts = 3
check_interval = 5
  • Item name: The key in the settings file must begin with "plugin.checks." and contain exactly two periods. Everything after the second period is used as the name of the monitor setting.
  • command: The agent runs this command periodically and uses its exit status and standard output as the monitoring result.
  • notification_interval: The interval, in minutes, at which notifications are re-sent. If omitted, notifications are not re-sent. The minimum is 10 minutes; if a smaller value is set, notifications are re-sent every 10 minutes. Available in mackerel-agent v0.27.0 and later.
  • max_check_attempts: An alert is raised only when the result has been something other than OK for the designated number of consecutive checks. For example, with a value of 3, a notification is sent once the latest three results are all non-OK. Available in mackerel-agent v0.28.0 and later.
  • check_interval: The interval, in minutes, at which the check is executed. The default is 1 minute and the configurable range is 1 to 60 minutes. Values below 1 are treated as 1 minute, and values above 60 as 60 minutes.
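
The interval and attempt rules in the list above can be easier to follow as code. The following Ruby sketch only mirrors the rules as described on this page; it is not the agent's implementation, and the method names and the `statuses` array are hypothetical.

```ruby
# Illustrative only: mirrors the documented rules, not mackerel-agent's code.

# check_interval: defaults to 1 minute; values outside 1..60 are pulled
# back into that range.
def effective_check_interval(minutes)
  (minutes || 1).clamp(1, 60)
end

# notification_interval: nil means "not set", so nothing is re-sent.
# Anything below 10 minutes is treated as 10.
def effective_notification_interval(minutes)
  return nil if minutes.nil?
  [minutes, 10].max
end

# max_check_attempts: alert only when the latest N results are all non-OK.
# `statuses` is a hypothetical array of exit statuses, oldest first.
def alert?(statuses, max_check_attempts)
  recent = statuses.last(max_check_attempts)
  recent.size == max_check_attempts && recent.none?(&:zero?)
end
```

With max_check_attempts set to 3, for example, a single OK result among the latest three checks is enough to hold the alert back.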

Check plugin specs

The specification is largely the same as that of Nagios plugins and Sensu check scripts. The exit status of the command assigned in the settings file is interpreted as shown below.

exit status             meaning
0                       OK
1                       WARNING
2                       CRITICAL
other than 0, 1, or 2   UNKNOWN

An auxiliary message can be written to standard output. Messages are limited to 1024 characters.
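
As a rough illustration of the exit status table, the Ruby sketch below maps a measured value to an exit status and a message. The helper names and the threshold parameters are hypothetical, and cutting the message at 1024 characters on the plugin side is a precaution assumed here, not documented behavior.

```ruby
# Hypothetical helpers for a threshold-style check; the names are illustrative.
STATUS_NAMES = { 0 => "OK", 1 => "WARNING", 2 => "CRITICAL" }.freeze
MAX_MESSAGE_LENGTH = 1024

# Any exit status other than 0, 1 or 2 is treated as UNKNOWN.
def status_name(exit_status)
  STATUS_NAMES.fetch(exit_status, "UNKNOWN")
end

# Compares a value against two thresholds and returns [exit_status, message].
def check_threshold(value, warning:, critical:)
  status = if value >= critical then 2
           elsif value >= warning then 1
           else 0
           end
  message = "#{status_name(status)}: value is #{value}"
  # Assumption: keep the message within the documented 1024-character limit.
  [status, message[0, MAX_MESSAGE_LENGTH]]
end
```

A plugin built on these helpers would print the message with `puts` and then call `exit` with the returned status.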

Check monitoring notifications

An alert notification is sent when an alert occurs and whenever the condition of an existing alert changes. The condition is considered changed in the following two cases.

  1. When the status has changed
    • ex. CRITICAL -> WARNING, WARNING -> CRITICAL, CRITICAL -> OK
    • This includes a change to “OK”
  2. When the content of the message sent by the check plugin has changed

When the condition of an alert changes, that information is also shown in the alert details screen.
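
The notification rules above can be summarized as a small decision function. This is a minimal sketch under stated assumptions: `CheckResult` and `notify?` are hypothetical names rather than Mackerel data structures, the sketch ignores max_check_attempts and notification_interval, and it assumes that nothing is sent while the result stays OK.

```ruby
# Hypothetical record of one check result; not a Mackerel data structure.
CheckResult = Struct.new(:status, :message)

# Returns true when a notification would be expected under the rules above.
def notify?(previous, current)
  # No earlier result: notify only if an alert has occurred (non-OK).
  return current.status != :ok if previous.nil?
  # Assumption: there is nothing to report while everything stays OK.
  return false if previous.status == :ok && current.status == :ok
  # Case 1: the status changed. Case 2: the message changed.
  previous.status != current.status || previous.message != current.message
end
```

For instance, a result that stays at WARNING triggers a notification only if its message differs from the previous one.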

An example in Ruby

This plugin rolls a six-sided die and posts the result to Mackerel with the value as the message. A roll of 4 or 5 is a WARNING, 6 is a CRITICAL, and anything else is OK.

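#!/usr/bin/env ruby
# Roll a six-sided die (1 to 6).
dice = rand(1..6)
# Standard output becomes the check message.
puts "value is #{dice}"
# Exit status: 2 = CRITICAL (6), 1 = WARNING (4 or 5), 0 = OK (1 to 3).
status = dice >= 6 ? 2 : (dice >= 4 ? 1 : 0)
exit status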

When an agent configured with a check plugin is running, an item showing that the monitor is active appears on the host details page.

If an alert is raised, it is displayed there as well and can be confirmed on the alert details page.