Adding monitors for script checks

Check monitoring is a feature that monitors the check plugin execution results similarly to Nagios. The agent periodically runs the check plugin and sends the results to Mackerel.

Check monitoring differs from metric monitoring, which compares metric values sent to Mackerel against thresholds, in the following ways.

  • With metric monitoring, the host posts metric values and Mackerel compares/judges those values against thresholds.
    • Metrics are displayed as graphs, and monitors can be configured from the web console or API
  • With check monitoring, plugins judge the result as OK or not OK (CRITICAL, WARNING, or UNKNOWN) within the host and post that result to Mackerel.
    • Graphs are not displayed because metrics are not posted. Monitors cannot be configured from the web console; instead, the configuration is added to the mackerel-agent installed on the host.


To use check monitoring with mackerel-agent, you need a program that performs the target monitoring process and returns an exit status according to the result. An official check plugin pack is available for this purpose. For more information, please refer to Using the official check plugin pack for check monitoring.

Check monitors will be counted as 1 host metric each. See here for the limits of each plan, and here for metric limits per host and specifications when limits are exceeded.


In the agent settings file, add an item as shown here:

[plugin.checks.ssh]
command = ["ruby", "/path/to/check-ssh.rb"]
custom_identifier = "SOME_IDENTIFIER" # optional
notification_interval = 60
max_check_attempts = 1
check_interval = 5
timeout_seconds = 45
prevent_alert_auto_close = true
env = { HOST = "hostname", PORT = "port" }
action = { command = "bash -c '[ \"$MACKEREL_STATUS\" != \"OK\" ]' && ruby /path/to/notify_something.rb", env = { NOTIFY_API_KEY = "API_KEY" }, user = "someone", timeout_seconds = 45 }
memo = "This check monitor is ..."
  • Item name: The key in the settings file must begin with "plugin.checks." and contain exactly two periods. Whatever follows the second period is used as the monitor name.
  • command: The agent runs this command periodically and uses its exit status and standard output as the monitoring result.
  • custom_identifier: Monitoring results are sent as a monitor of the host of the specified identifier, not the host on which the agent is running.
    • If the check result is not OK, it will be notified as an alert for the host that is specified here.
    • This can be useful for adding monitors to hosts integrated with AWS / Azure / Google Cloud Integration. For more details, refer to the AWS Integration Document.
  • notification_interval: Specify the interval for re-sending notifications in minutes. If the agent version is v0.67.0 or later, expressions such as "10m" or "1h" can also be written. If omitted, notifications will not be re-sent. An interval of less than 10 minutes cannot be set; if one is specified, notifications are re-sent every 10 minutes.
  • max_check_attempts: An alert is raised only when the result is something other than OK for the specified number of consecutive checks. For example, if set to 3, a notification is sent when the three most recent monitoring results are all not OK. When used with prevent_alert_auto_close, the value of max_check_attempts will be treated as 1 regardless of the specified value.
  • check_interval: Specify the check monitoring execution interval in minutes. If the agent version is v0.67.0 or later, expressions such as "10m" or "1h" can also be written. The default value is 1 minute. The configurable range is 1 to 60 minutes. If a value of less than 1 minute is designated, monitoring will be run at 1 minute intervals. If a value of more than 60 minutes is designated, monitoring will be run at 60 minute intervals.
  • timeout_seconds: Specify the plugin timeout in seconds. The default value is 30 seconds. Because the agent does not prevent a plugin from being started again while an earlier run is still in progress, we recommend a timeout that does not exceed the plugin's execution interval.
  • prevent_alert_auto_close: With this value set to true, alerts opened for this check plugin will not be automatically closed. When used with max_check_attempts, max_check_attempts will always be treated as 1.
  • env: Environment variables can be specified to pass to command. Specify with TOML Table or Inline Table.
  • action.command: An action executed following the execution of the command configured in command. This is used when there is a process to be performed depending on the command result. The result of the previous/current command etc. is passed as an environment variable. The execution result is ignored.
  • action.env: Environment variables can be specified to pass to action.command. Specify with TOML Table or Inline Table.
  • action.timeout_seconds: Specify the timeout for action.command in seconds. The default value is 30 seconds.
  • action.user: Execute action.command as the user specified for this option. Not yet supported for Windows environments.
  • memo: Configure notes for check monitoring. The character string specified here can be checked in alert notifications / the alert details screen / the host details page.
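
The limits described above for check_interval, notification_interval, and max_check_attempts can be summarized in a short Ruby sketch. The helper names below are hypothetical and exist only for illustration; they are not part of mackerel-agent, and the sketch assumes all intervals have already been converted to minutes.

```ruby
# Hypothetical helpers (not mackerel-agent code) restating the documented rules.

# check_interval: defaults to 1 minute; values outside 1..60 are clamped.
def effective_check_interval(minutes = nil)
  return 1 if minutes.nil?
  minutes.clamp(1, 60)
end

# notification_interval: nil means notifications are never re-sent;
# anything below 10 minutes is treated as 10 minutes.
def effective_notification_interval(minutes = nil)
  return nil if minutes.nil?
  [minutes, 10].max
end

# max_check_attempts: treated as 1 when prevent_alert_auto_close is true.
def effective_max_check_attempts(value, prevent_alert_auto_close: false)
  prevent_alert_auto_close ? 1 : value
end

# An alert is due when the most recent max_check_attempts results are all not OK.
def alert_due?(recent_statuses, max_check_attempts)
  latest = recent_statuses.last(max_check_attempts)
  latest.size == max_check_attempts && latest.none? { |status| status == "OK" }
end

puts effective_check_interval(90)        # 60
puts effective_notification_interval(5)  # 10
```

With the sample settings shown earlier (max_check_attempts = 1 and prevent_alert_auto_close = true), a single result other than OK is enough to raise an alert.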

Check plugin specs

The specs are mostly the same as those for Nagios plugins and Sensu check scripts. The exit status of the command assigned in the settings file is treated as shown below.

| exit status | meaning |
|---|---|
| 0 | OK |
| 1 | WARNING |
| 2 | CRITICAL |
| other than 0, 1, or 2 | UNKNOWN |
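
As an illustration of how an exit status translates into a monitoring result, the following Ruby sketch runs a command and maps its exit status to a status name. The functions status_for and run_check are hypothetical names chosen for this example and do not reflect how mackerel-agent is implemented; the mapping of 1 to WARNING and 2 to CRITICAL follows the Nagios convention that the Ruby example later on this page also uses.

```ruby
require "open3"

# Map a plugin's exit status to a status name; anything other than
# 0, 1, or 2 (including nil for a signalled process) is UNKNOWN.
def status_for(exit_code)
  case exit_code
  when 0 then "OK"
  when 1 then "WARNING"
  when 2 then "CRITICAL"
  else "UNKNOWN"
  end
end

# Hypothetical helper: run a check command given as an array (as in the
# `command` setting) and return [status, message], where message is the
# command's standard output.
def run_check(command)
  stdout, process_status = Open3.capture2(*command)
  [status_for(process_status.exitstatus), stdout.strip]
end
```

For example, run_check(["ruby", "/path/to/check-ssh.rb"]) would return the status name together with whatever the plugin printed to standard output.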

It's also possible to write an auxiliary message to the standard output. Messages are limited to a maximum of 1024 characters. This output is sent to Mackerel and displayed on the host details page and the Alerts page. For this reason, please be careful not to unintentionally send confidential information such as passwords.

To develop a plugin using checkers (a helper library that is used in our official plugins), please refer to Creating check plugins using checkers.

Check monitoring notifications

An alert notification will be sent when an alert has occurred and when the condition has changed after an alert has occurred. "When the condition has changed" refers to the following.

  • When the status has changed
    • Including when the status is “OK”

When the condition of an alert or the message content has changed, that information will also be available in the alert details screen. A notification will not occur if just the message content changes.
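
The notification rule can be expressed as a small Ruby predicate. This is a simplified sketch under stated assumptions: notify? is a hypothetical name, each result is a hash with :status and :message keys, a missing previous result is treated as OK, and the effects of max_check_attempts and notification_interval are ignored.

```ruby
# Hypothetical predicate: should a notification be sent for `current`,
# given the `previous` result (nil if there is none)?
def notify?(previous, current)
  # With no earlier result, behave as if the check had been OK.
  before = previous ? previous[:status] : "OK"
  # Only a status change triggers a notification; this covers a new alert,
  # a change of severity, and recovery to OK. Messages are not compared.
  before != current[:status]
end

puts notify?({ status: "CRITICAL", message: "a" },
             { status: "CRITICAL", message: "b" }) # false
```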

Environment variables available with action

| Environment variable | Description |
|---|---|
| MACKEREL_STATUS | The result of the previous command (max_check_attempts not taken into account). Either OK, WARNING, CRITICAL, or UNKNOWN. |
| MACKEREL_PREVIOUS_STATUS | The result of the command before the previous command (max_check_attempts not taken into account). Immediately after the agent starts, this is an empty string. Either an empty string, OK, WARNING, CRITICAL, or UNKNOWN. |
| MACKEREL_CHECK_MESSAGE | The result message of the previous command (the standard output of command). |
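
A script run through action.command could read these variables as in the following Ruby sketch. The function name action_message and the output format are assumptions made for illustration; only the three environment variable names come from the table above. The script prints a line only when the status differs from the previous one.

```ruby
#!/usr/bin/env ruby
# Hypothetical action script: report status transitions of a check.

# Build a one-line summary from the variables passed to the action,
# or return nil when the status has not changed.
def action_message(env)
  status   = env.fetch("MACKEREL_STATUS", "")
  previous = env.fetch("MACKEREL_PREVIOUS_STATUS", "")
  message  = env.fetch("MACKEREL_CHECK_MESSAGE", "")
  return nil if status == previous

  # The previous status is an empty string right after the agent starts.
  from = previous.empty? ? "(none)" : previous
  "#{from} -> #{status}: #{message}"
end

line = action_message(ENV)
puts line if line
```

Because the execution result of action.command is ignored, a script like this only has side effects such as logging or calling an external API; it does not influence the monitoring result.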

An example in Ruby

This plugin rolls a six-sided die, outputs the value as the message, and posts the result to Mackerel: 4 and 5 are treated as WARNING, 6 as CRITICAL, and anything else as OK.

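#!/usr/bin/env ruby
# Roll a six-sided die (1 to 6).
dice = rand(6) + 1
# Standard output becomes the message sent to Mackerel.
puts "value is #{dice}"
# Exit status: 2 (CRITICAL) for 6, 1 (WARNING) for 4 or 5, 0 (OK) otherwise.
exit(dice >= 6 ? 2 : dice >= 4 ? 1 : 0)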

When the agent runs with a check plugin configured, an item showing that monitoring is active is displayed on the host details page.

If an alert is raised, it is displayed there as well and can be confirmed on the alert details page.