- Register Monitor Configurations
- List Monitor Configurations
- Get Monitor Configurations
- Update Monitor Configurations
- Delete Monitor Configurations
Register Monitor Configurations
Registers a monitor with Mackerel. Monitors can target various types of metrics as well as external endpoints. The input fields vary depending on the monitoring target.
POST /api/v0/monitors
Required permissions for the API key
- Read
- Write
- Host metric monitoring
- Host connectivity monitoring
- Service metric monitoring
- External monitoring
- Expression monitoring
- Monitoring with Anomaly Detection for Roles
- Query monitoring
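As a rough sketch, a registration request might be assembled in Python as follows. The base URL `https://api.mackerelio.com` and the `X-Api-Key` header are assumptions that this section does not state, and `build_register_request` is a hypothetical helper. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumptions, not stated in this section: the base URL and the X-Api-Key header.
API_BASE = "https://api.mackerelio.com"


def build_register_request(api_key: str, monitor: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /api/v0/monitors request."""
    body = json.dumps(monitor).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/api/v0/monitors",
        data=body,
        method="POST",
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
    )


# The API key needs both Read and Write permissions for this endpoint.
request = build_register_request("<YOUR_API_KEY>", {"type": "connectivity"})
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```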
Host metric monitoring
Input (for host metric monitoring)
KEY | TYPE | DESCRIPTION |
---|---|---|
type | string | constant string "host" |
name | string | arbitrary name that can be referenced from the monitors list, etc. |
memo | string | [optional] notes for the monitoring configuration |
duration | number | the average value over the designated interval (in minutes) is monitored. valid range: 1 to 10 minutes |
metric | string | name of the host metric targeted by monitoring. designating one of the specific constant strings enables comparative monitoring *1 |
operator | string | the comparison applied to the observed value: greater than (>) or less than (<). the observed value is on the left of the operator and the designated threshold is on the right |
warning | number | [optional] the threshold that generates a warning alert. comparative monitoring has a valid range of 1-100 *1 |
critical | number | [optional] the threshold that generates a critical alert. comparative monitoring has a valid range of 1-100 *1 |
maxCheckAttempts | number | [optional] number of consecutive Warning/Critical results before an alert is generated. the default is 1. valid range: 1 to 10 |
notificationInterval | number | [optional] the time interval (in minutes) for re-sending notifications. If this field is omitted, notifications will not be re-sent. |
scopes | array[string] | [optional] service names or role details names of the monitoring targets *2 |
excludeScopes | array[string] | [optional] service names or role details names to exclude from monitoring *2 |
isMute | boolean | [optional] whether monitoring is muted or not *3 |
Example Input
{ "type": "host", "name": "disk.aa-00.writes.delta", "memo": "This monitor is for Hatena Blog.", "duration": 3, "metric": "disk.aa-00.writes.delta", "operator": ">", "warning": 20000.0, "critical": 400000.0, "maxCheckAttempts": 3, "notificationInterval": 60, "scopes": [ "Hatena-Blog" ], "excludeScopes": [ "Hatena-Bookmark:db-master" ] }
Response (for host metric monitoring)
Success
{ "id": "2cSZzK3XfmG", "type": "host", "name": "disk.aa-00.writes.delta", "memo": "This monitor is for Hatena Blog.", "duration": 3, "metric": "disk.aa-00.writes.delta", "operator": ">", "warning": 20000.0, "critical": 400000.0, "maxCheckAttempts": 3, "notificationInterval": 60, "scopes": [ "Hatena-Blog" ], "excludeScopes": [ "Hatena-Bookmark:db-master" ] }
An id is assigned to the monitor and included in the response.
Error
STATUS CODE | DESCRIPTION |
---|---|
400 | when the input is in a format that can’t be received |
400 | when the name is empty |
400 | when the memo exceeds 2048 characters |
400 | when the duration is outside the range of 1~10 |
400 | when warning or critical are outside the range of 0~100(%) in comparative monitoring settings *1 |
400 | when the maxCheckAttempts is outside the range of 1~10 |
400 | when a service name or role name assigned to scopes or excludeScopes hasn’t been registered yet |
400 | when notificationInterval is set to less than 10 minutes |
403 | when the API key doesn't have the required permissions / when accessing from outside the permitted IP address range |
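Several of the 400 conditions above can be checked before a request is sent. The following is a minimal sketch only: `host_monitor_errors` is a hypothetical helper, it covers just the conditions that can be evaluated locally, and the server remains the authority on what is accepted.

```python
def host_monitor_errors(monitor: dict) -> list[str]:
    """Return the documented 400 conditions a host metric monitor would hit."""
    errors = []
    if not monitor.get("name"):
        errors.append("name is empty")
    if len(monitor.get("memo", "")) > 2048:
        errors.append("memo exceeds 2048 characters")
    if not 1 <= monitor.get("duration", 0) <= 10:
        errors.append("duration is outside the range of 1-10")
    # maxCheckAttempts is optional; the documented default is 1.
    if not 1 <= monitor.get("maxCheckAttempts", 1) <= 10:
        errors.append("maxCheckAttempts is outside the range of 1-10")
    # notificationInterval is optional; when present it must be 10 or more.
    interval = monitor.get("notificationInterval")
    if interval is not None and interval < 10:
        errors.append("notificationInterval is less than 10 minutes")
    return errors


valid = {"type": "host", "name": "disk.aa-00.writes.delta", "duration": 3,
         "metric": "disk.aa-00.writes.delta", "operator": ">",
         "maxCheckAttempts": 3, "notificationInterval": 60}
invalid = {"type": "host", "name": "", "duration": 11, "notificationInterval": 5}
```

An empty list means none of the locally checkable conditions were found.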
*1 comparative monitoring
When monitoring host metrics, assigning one of the specific strings below to metric enables comparative monitoring for that metric. The metric values that can be assigned for comparative monitoring are as follows.
metric |
---|
"cpu%" |
"memory%" |
"disk%" |
"swap%" |
"container-cpu%" |
"container-memory%" |
*2 Service name and role details name
A service name is a character string in the format <service name>, and a role details name is a character string in the format <service name>:<role name>.
e.g. If the service name is Hatena-Bookmark, the db-master role in that service is Hatena-Bookmark:db-master.
Usable characters are /^[A-Za-z0-9][A-Za-z0-9_-]+$/.
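The format above can be checked with a short parser. This is a sketch under a strict reading of the stated pattern, applied separately to the service name and the role name; `parse_scope` is a hypothetical helper, and the API itself may be more lenient.

```python
import re
from typing import Optional, Tuple

# Pattern taken from the usable-characters rule above.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_-]+$")


def parse_scope(scope: str) -> Tuple[str, Optional[str]]:
    """Split "<service name>" or "<service name>:<role name>" into its parts."""
    service, separator, role = scope.partition(":")
    names = [service, role] if separator else [service]
    for name in names:
        if not NAME_PATTERN.match(name):
            raise ValueError(f"invalid name in scope: {name!r}")
    return (service, role if separator else None)
```

For example, `parse_scope("Hatena-Bookmark:db-master")` yields the service name and the role name as a pair, while a plain service name yields `None` for the role.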
*3 Muted monitoring
Muting disables notifications for the monitor. Alerts are still generated when monitoring thresholds are crossed, but notifications are not sent to notification channels.
Host connectivity monitoring
Input (host connectivity monitoring)
KEY | TYPE | DESCRIPTION |
---|---|---|
type | string | constant string "connectivity" |
name | string | [optional] arbitrary name that can be referenced from the monitors list, etc. The default value is connectivity. |
memo | string | [optional] notes for the monitoring configuration |
alertStatusOnGone | string | [optional] the status of an alert generated by this monitor. Either "CRITICAL" (default) or "WARNING". |
scopes | array[string] | [optional] service names or role details names of the monitoring targets *2 |
excludeScopes | array[string] | [optional] service names or role details names to exclude from monitoring *2 |
notificationInterval | number | [optional] the time interval (in minutes) for re-sending notifications. If this field is omitted, notifications will not be re-sent. |
isMute | boolean | [optional] whether monitoring is muted or not |
Example Input
{ "type": "connectivity", "name": "connectivity service1", "memo": "A monitor that checks connectivity.", "alertStatusOnGone": "WARNING", "scopes": [ "service1" ], "excludeScopes": [ "service1:role3" ] }
Response (Host connectivity monitoring)
Success
{ "id": "2cSZzK3XfmG", "type": "connectivity", "name": "connectivity service1", "memo": "A monitor that checks connectivity.", "alertStatusOnGone": "WARNING", "scopes": [ "service1" ], "excludeScopes": [ "service1:role3" ] }
An id is assigned to the monitor and included in the response.
Error
STATUS CODE | DESCRIPTION |
---|---|
400 | when the input is in a format that can’t be received |
400 | when the name is empty |
400 | when the memo exceeds 2048 characters |
400 | when the alertStatusOnGone is neither CRITICAL nor WARNING |
400 | when a service name or role details name specified in scopes or excludeScopes is not registered |
400 | when notificationInterval is set to less than 10 minutes |
403 | when the API key doesn't have the required permissions / when accessing from outside the permitted IP address range |
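Because most connectivity fields are optional, a payload can be built from only the values that are actually set. The sketch below assumes a hypothetical `connectivity_monitor` helper and rejects an invalid alertStatusOnGone locally rather than waiting for the 400 response.

```python
from typing import Optional, Sequence


def connectivity_monitor(
    name: Optional[str] = None,
    alert_status_on_gone: str = "CRITICAL",
    scopes: Sequence[str] = (),
    exclude_scopes: Sequence[str] = (),
) -> dict:
    """Build a host connectivity monitor payload, omitting unset optional fields."""
    if alert_status_on_gone not in ("CRITICAL", "WARNING"):
        raise ValueError("alertStatusOnGone must be CRITICAL or WARNING")
    monitor = {"type": "connectivity", "alertStatusOnGone": alert_status_on_gone}
    if name is not None:
        # When name is omitted, the documented default is "connectivity".
        monitor["name"] = name
    if scopes:
        monitor["scopes"] = list(scopes)
    if exclude_scopes:
        monitor["excludeScopes"] = list(exclude_scopes)
    return monitor


payload = connectivity_monitor(
    name="connectivity service1",
    alert_status_on_gone="WARNING",
    scopes=["service1"],
    exclude_scopes=["service1:role3"],
)
```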
Service metric monitoring
Input (when monitoring service metrics)
KEY | TYPE | DESCRIPTION |
---|---|---|
type | string | constant string "service" |
name | string | arbitrary name that can be referenced from the monitors list, etc. |
memo | string | [optional] notes for the monitoring configuration |
service | string | name of the service targeted by monitoring |
duration | number | the average value of the designated number of most recent points is monitored. valid range: 1 to 10 points |
metric | string | name of the service metric targeted by monitoring |
operator | string | the comparison applied to the observed value: greater than (>) or less than (<). the observed value is on the left of the operator and the designated threshold is on the right |
warning | number | [optional] the threshold that generates a warning alert |
critical | number | [optional] the threshold that generates a critical alert |
maxCheckAttempts | number | [optional] number of consecutive Warning/Critical results before an alert is generated. the default is 1. valid range: 1 to 10 |
missingDurationWarning | number | [optional] the threshold (in minutes) that generates a warning alert for interruption monitoring |
missingDurationCritical | number | [optional] the threshold (in minutes) that generates a critical alert for interruption monitoring |
notificationInterval | number | [optional] the time interval (in minutes) for re-sending notifications. If this field is omitted, notifications will not be re-sent. |
isMute | boolean | [optional] whether monitoring is muted or not *3 |
Example Input
{ "type": "service", "name": "Hatena-Blog - access_num.4xx_count", "memo": "A monitor that checks the number of 4xx for Hatena Blog", "service": "Hatena-Blog", "duration": 1, "metric": "access_num.4xx_count", "operator": ">", "warning": 50.0, "critical": 100.0, "maxCheckAttempts": 3, "missingDurationWarning": 360, "missingDurationCritical": 720, "notificationInterval": 60 }
Response (when monitoring service metrics)
Success
{ "id" : "2cSZzK3XfmG", "type": "service", "name": "Hatena-Blog - access_num.4xx_count", "memo": "A monitor that checks the number of 4xx for Hatena Blog", "service": "Hatena-Blog", "duration": 1, "metric": "access_num.4xx_count", "operator": ">", "warning": 50.0, "critical": 100.0, "maxCheckAttempts": 3, "missingDurationWarning": 360, "missingDurationCritical": 720, "notificationInterval": 60 }
An id is assigned to the monitor and included in the response.
Error
STATUS CODE | DESCRIPTION |
---|---|
400 | when the input is in a format that can’t be received |
400 | when the name is empty |
400 | when the memo exceeds 2048 characters |
400 | when the duration is not in the range of 1~10 |
400 | when the maxCheckAttempts is not in the range of 1~10 |
400 | when the missingDurationWarning or missingDurationCritical is not a multiple of 10 minutes, or is more than a week |
400 | when the service name assigned to service hasn’t been registered yet |
400 | when notificationInterval is set to less than 10 minutes |
403 | when the API key doesn't have the required permissions / when accessing from outside the permitted IP address range |
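The rule for missingDurationWarning and missingDurationCritical (a multiple of 10 minutes, and no more than a week) can be expressed as a small check. This is a sketch: `is_valid_missing_duration` is a hypothetical helper, and treating zero or negative values as invalid is an assumption that this section does not state.

```python
ONE_WEEK_MINUTES = 7 * 24 * 60  # 10080 minutes


def is_valid_missing_duration(minutes: int) -> bool:
    """Check the documented rule: a multiple of 10 minutes, at most one week."""
    # Assumption: values of zero or below are treated as invalid here.
    return 0 < minutes <= ONE_WEEK_MINUTES and minutes % 10 == 0
```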
External monitoring
Input (external monitoring)
KEY | TYPE | DESCRIPTION |
---|---|---|
type | string | constant string "external" |
name | string | arbitrary name that can be referenced from the monitors list, etc. |
memo | string | [optional] notes for the monitoring configuration |
url | string | monitoring target URL |
method | string | [optional] request method, one of GET, POST, PUT, DELETE. If omitted, GET is used. |
service | string | [optional] service name (when response time is monitored, it is graphed in the service metrics of the service specified here) |
notificationInterval | number | [optional] the time interval (in minutes) for re-sending notifications. If this field is omitted, notifications will not be re-sent. |
responseTimeWarning | number | [optional] the response time threshold for Warning alerts (in milliseconds). service must also be specified |
responseTimeCritical | number | [optional] the response time threshold for Critical alerts (in milliseconds). service must also be specified |
responseTimeDuration | number | [optional] the average response time over the designated interval (1 to 10 minutes) is monitored. service must also be specified |
containsString | string | [optional] string that must be contained in the response body |
maxCheckAttempts | number | [optional] number of consecutive Warning/Critical results before an alert is generated. the default is 1. valid range: 1 to 10 |
certificationExpirationWarning | number | [optional] the Warning threshold for certificate expiration monitoring, as the number of days remaining until expiration |
certificationExpirationCritical | number | [optional] the Critical threshold for certificate expiration monitoring, as the number of days remaining until expiration |
skipCertificateVerification | boolean | [optional] whether or not to skip verification of the certificate |
isMute | boolean | [optional] whether monitoring is muted or not *3 |
headers | array[object] | [optional] HTTP request headers, each specified as an object with name and value. If this field is omitted, the default header will be configured. If you do not want to configure headers, specify an empty array. |
requestBody | string | [optional] HTTP request body |
followRedirect | boolean | [optional] whether to follow redirects and evaluate the response of the redirect destination as the result. If this field is omitted, redirects are not followed. |
To monitor response time, specify responseTimeDuration and at least one of responseTimeWarning and responseTimeCritical.
To monitor the certificate expiration date, specify at least one of certificationExpirationWarning and certificationExpirationCritical.
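The field dependencies described above can be sketched as a local check. `external_monitor_problems` and `monitors_certificate` are hypothetical helpers; they reflect only the rules stated in this section (the response time fields go together, and response time monitoring requires service).

```python
def external_monitor_problems(monitor: dict) -> list[str]:
    """Report inconsistent response time settings in an external monitor payload."""
    problems = []
    has_threshold = "responseTimeWarning" in monitor or "responseTimeCritical" in monitor
    has_duration = "responseTimeDuration" in monitor
    if has_threshold != has_duration:
        problems.append(
            "responseTimeDuration and at least one of responseTimeWarning "
            "or responseTimeCritical must be specified together"
        )
    if (has_threshold or has_duration) and "service" not in monitor:
        problems.append("response time monitoring requires service")
    return problems


def monitors_certificate(monitor: dict) -> bool:
    """True when at least one certificate expiration threshold is specified."""
    return (
        "certificationExpirationWarning" in monitor
        or "certificationExpirationCritical" in monitor
    )
```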
Example Input
{ "type": "external", "name": "Example Domain", "memo": "Monitors example.com", "method": "GET", "url": "https://example.com", "service": "Hatena-Blog", "notificationInterval": 60, "responseTimeWarning": 5000, "responseTimeCritical": 10000, "responseTimeDuration": 3, "containsString": "Example", "maxCheckAttempts": 3, "certificationExpirationWarning": 90, "certificationExpirationCritical": 30, "isMute": false, "headers": [{"name": "Cache-Control", "value": "no-cache"}] }
Response (external monitoring)
Success
{ "id": "2cSZzK3XfmG", "type": "external", "name": "Example Domain", "memo": "Monitors example.com", "method": "GET", "url": "https://example.com", "service": "Hatena-Blog", "notificationInterval": 60, "responseTimeWarning": 5000, "responseTimeCritical": 10000, "responseTimeDuration": 3, "containsString": "Example", "maxCheckAttempts": 3, "certificationExpirationWarning": 90, "certificationExpirationCritical": 30, "isMute": false, "headers": [{"name": "Cache-Control", "value": "no-cache"}] }
An id is assigned to the monitor and included in the response.
Error
STATUS CODE | DESCRIPTION |
---|---|
400 | when the input is in a format that can’t be received |
400 | when the name is empty |
400 | when the memo exceeds 2048 characters |
400 | when the url scheme is not http or https |
400 | when notificationInterval is set to less than 10 minutes |
400 | when the maxCheckAttempts is not in the range of 1~10 |
403 | when the API key doesn't have the required permissions / when accessing from outside the permitted IP address range |
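The URL scheme condition above can be checked with the standard library before sending a request. `has_supported_scheme` is a hypothetical helper name used only for illustration.

```python
from urllib.parse import urlsplit


def has_supported_scheme(url: str) -> bool:
    """True when the URL scheme is http or https, as required for the url field."""
    return urlsplit(url).scheme in ("http", "https")
```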
Expression monitoring
Input (expression monitoring)
KEY | TYPE | DESCRIPTION |
---|---|---|
type | string | constant string "expression" |
name | string | arbitrary name that can be referenced from the monitors list, etc. |
memo | string | [optional] notes for the monitoring configuration |
expression | string | expression of the monitoring target. Only expressions whose graph result is a single series (one line) are valid. |
operator | string | the comparison applied to the observed value: greater than (>) or less than (<). the observed value is on the left of the operator and the designated threshold is on the right |
warning | number | [optional] the threshold that generates a warning alert |
critical | number | [optional] the threshold that generates a critical alert |
notificationInterval | number | [optional] the time interval (in minutes) for re-sending notifications. If this field is omitted, notifications will not be re-sent. |
isMute | boolean | [optional] whether monitoring is muted or not *3 |
Example Input
{ "type": "expression", "name": "role average", "memo": "Monitors the average of loadavg5", "expression": "avg(roleSlots(\"service:role\",\"loadavg5\"))", "operator": ">", "warning": 5.0, "critical": 10.0, "notificationInterval": 60 }
Response (expression monitoring)
Success
{ "id" : "2cSZzK3XfmG", "type": "expression", "name": "role average", "memo": "Monitors the average of loadavg5", "expression": "avg(roleSlots(\"service:role\",\"loadavg5\"))", "operator": ">", "warning": 5.0, "critical": 10.0, "notificationInterval": 60 }
An id is assigned to the monitor and included in the response.
Error
STATUS CODE | DESCRIPTION |
---|---|
400 | when the input is in a format that can’t be received |
400 | when the name is empty |
400 | when the memo exceeds 2048 characters |
400 | when notificationInterval is set to less than 10 minutes |
400 | when an invalid expression is designated |
403 | when the API key doesn't have the required permissions / when accessing from outside the permitted IP address range |
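The expression contains double quotes, which must be escaped inside the JSON body, as in the example input. Letting a JSON serializer do the escaping avoids mistakes. In the sketch below, `role_average_expression` is a hypothetical helper that reproduces the expression from the example input; it does not validate the expression itself.

```python
import json


def role_average_expression(service: str, role: str, metric: str) -> str:
    """Build the avg(roleSlots(...)) expression used in the example input."""
    return f'avg(roleSlots("{service}:{role}","{metric}"))'


monitor = {
    "type": "expression",
    "name": "role average",
    "expression": role_average_expression("service", "role", "loadavg5"),
    "operator": ">",
    "warning": 5.0,
    "critical": 10.0,
}
# json.dumps escapes the inner double quotes as \" in the request body.
body = json.dumps(monitor)
```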
Monitoring with Anomaly Detection for Roles
Input (when monitoring with Anomaly Detection for Roles)
KEY | TYPE | DESCRIPTION |
---|---|---|
type | string | constant string "anomalyDetection" |
name | string | arbitrary name that can be referenced from the monitors list, etc. |
memo | string | [optional] notes for the monitoring configuration |
scopes | array[string] | [optional] service names or role details names of the monitoring targets *2 |
warningSensitivity | string | [optional] the sensitivity (insensitive, normal, or sensitive) that generates warning alerts |
criticalSensitivity | string | [optional] the sensitivity (insensitive, normal, or sensitive) that generates critical alerts |
maxCheckAttempts | number | [optional] number of consecutive Warning/Critical results before an alert is generated. the default is 3. valid range: 1 to 10 |
trainingPeriodFrom | number | [optional] start of the training period. metric data from the specified time onward is used |
notificationInterval | number | [optional] the time interval (in minutes) for re-sending notifications. If this field is omitted, notifications will not be re-sent. |
isMute | boolean | [optional] whether monitoring is muted or not |
Example Input
{ "type": "anomalyDetection", "name": "anomaly detection", "memo": "my anomaly detection for roles", "scopes": [ "myService:myRole" ], "warningSensitivity": "insensitive", "maxCheckAttempts": 3 }
Response (Monitoring with Anomaly Detection for Roles)
Success
{ "id": "2cSZzK3XfmG", "type": "anomalyDetection", "name": "anomaly detection", "memo": "my anomaly detection for roles", "scopes": [ "myService:myRole" ], "warningSensitivity": "insensitive", "maxCheckAttempts": 3 }
An id is assigned to the monitor and included in the response.
Error
STATUS CODE | DESCRIPTION |
---|---|
400 | when the input is in a format that can’t be received |
400 | when the name is empty |
400 | when the memo exceeds 2048 characters |
400 | when a service name or role details name specified in scopes is not registered |
400 | when the specified warningSensitivity or criticalSensitivity is not insensitive / normal / sensitive |
400 | when both warningSensitivity and criticalSensitivity are unspecified |
400 | when notificationInterval is set to less than 10 minutes |
400 | when a future value is specified for trainingPeriodFrom |
403 | when the API key doesn't have the required permissions |
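The sensitivity rules above (valid values, and at least one of the two fields specified) can be sketched as a local check. `anomaly_monitor_errors` is a hypothetical helper and covers only these two conditions.

```python
SENSITIVITIES = ("insensitive", "normal", "sensitive")


def anomaly_monitor_errors(monitor: dict) -> list[str]:
    """Check the documented sensitivity rules for an anomaly detection monitor."""
    errors = []
    fields = ("warningSensitivity", "criticalSensitivity")
    if all(monitor.get(field) is None for field in fields):
        errors.append("both warningSensitivity and criticalSensitivity are unspecified")
    for field in fields:
        value = monitor.get(field)
        if value is not None and value not in SENSITIVITIES:
            errors.append(f"{field} is not insensitive / normal / sensitive")
    return errors
```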
Query monitoring
Input (query monitoring)
KEY | TYPE | DESCRIPTION |
---|---|---|
type | string | constant string "query" |
name | string | arbitrary name that can be referenced from the monitors list, etc. |
memo | string | [optional] notes for the monitoring configuration |
query | string | query of the monitoring target |
legend | string | graph legend for the alerts |
operator | string | the comparison applied to the observed value: greater than (>) or less than (<). the observed value is on the left of the operator and the designated threshold is on the right |
warning | number | the threshold that generates a warning alert |
critical | number | the threshold that generates a critical alert |
notificationInterval | number | [optional] the time interval (in minutes) for re-sending notifications. If this field is omitted, notifications will not be re-sent. |
isMute | boolean | [optional] whether monitoring is muted or not *3 |
Example Input
{ "type": "query", "name": "cpu utilization", "memo": "Monitors the cpu utilization of httpbin", "query": "container.cpu.utilization{k8s.deployment.name=\"httpbin\"}", "legend": "cpu.utilization {{k8s.node.name}}", "operator": ">", "warning": 70.0, "critical": 90.0, "notificationInterval": 60 }
Response (query monitoring)
Success
{ "id" : "2cSZzK3XfmG", "type": "query", "name": "cpu utilization", "memo": "Monitors the cpu utilization of httpbin", "query": "container.cpu.utilization{k8s.deployment.name=\"httpbin\"}", "legend": "cpu.utilization {{k8s.node.name}}", "operator": ">", "warning": 70.0, "critical": 90.0, "notificationInterval": 60 }
An id is assigned to the monitor and included in the response.
Error
STATUS CODE | DESCRIPTION |
---|---|
400 | when the input is in a format that can’t be received |
400 | when the name is empty |
400 | when the memo exceeds 2048 characters |
400 | when notificationInterval is set to less than 10 minutes |
400 | when an invalid query is designated |
403 | when the API key doesn't have the required permissions / when accessing from outside the permitted IP address range |
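To illustrate how operator, warning, and critical relate, the sketch below places the observed value on the left of the operator and the threshold on the right, as the operator description states. This is not Mackerel's implementation: `evaluate` is a hypothetical function and the "OK" label for the no-alert case is an assumption used only for illustration.

```python
def evaluate(observed: float, operator: str, warning: float, critical: float) -> str:
    """Compare an observed value with the thresholds; observed is the left operand."""
    if operator == ">":
        def breached(threshold: float) -> bool:
            return observed > threshold
    elif operator == "<":
        def breached(threshold: float) -> bool:
            return observed < threshold
    else:
        raise ValueError("operator must be '>' or '<'")
    # Check the critical threshold first so it takes precedence over warning.
    if breached(critical):
        return "CRITICAL"
    if breached(warning):
        return "WARNING"
    return "OK"  # illustrative label for "no threshold crossed"
```

With the thresholds from the example input (warning 70.0, critical 90.0, operator ">"), an observed value of exactly 70.0 does not cross the warning threshold because the comparison is strict.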
List Monitor Configurations
GET /api/v0/monitors
Required permissions for the API key
- Read
Response
{ "monitors": [ { "id": "2cSZzK3XfmB", "type": "host", "name": "disk.aa-00.writes.delta", "memo": "This monitor is for Hatena Blog.", "duration": 3, "metric": "disk.aa-00.writes.delta", "operator": ">", "warning": 20000.0, "critical": 400000.0, "maxCheckAttempts": 3, "scopes": [ "Hatena-Blog" ], "excludeScopes": [ "Hatena-Bookmark:db-master" ] }, { "id": "2cSZzK3XfmA", "type": "connectivity", "alertStatusOnGone": "CRITICAL", "scopes": [], "excludeScopes": [] }, { "id": "2cSZzK3XfmC", "type": "service", "name": "Hatena-Blog - access_num.4xx_count", "memo": "A monitor that checks the number of 4xx for Hatena Blog", "service": "Hatena-Blog", "duration": 1, "metric": "access_num.4xx_count", "operator": ">", "warning": 50.0, "critical": 100.0, "maxCheckAttempts": 1, "notificationInterval": 60 }, { "id": "2cSZzK3XfmD", "type": "external", "name": "example.com", "memo": "Monitors example.com", "url": "http://www.example.com", "service": "Hatena-Blog", "headers": [{"name":"Cache-Control", "value":"no-cache"}], "maxCheckAttempts": 1 }, { "id": "2cSZzK3XfmE", "type": "expression", "name": "role average", "memo": "Monitors the average of loadavg5", "expression": "avg(roleSlots(\"server:role\",\"loadavg5\"))", "operator": ">", "warning": 5.0, "critical": 10.0, "notificationInterval": 60 } ] }
- each field is the same as when the monitor was created
- list is ordered as monitor type -> name (same as the list of monitors on mackerel.io)
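Since the response mixes monitor types under a single monitors array, a client will often group entries by type. A minimal sketch follows, with `names_by_type` as a hypothetical helper and `sample` as a shortened stand-in for a real response. Some entries may lack a name, as the connectivity entry in the response above does.

```python
import json
from collections import defaultdict


def names_by_type(response_text: str) -> dict:
    """Group monitor names by monitor type, keeping the order of the response."""
    grouped = defaultdict(list)
    for monitor in json.loads(response_text)["monitors"]:
        # name may be absent, so fall back to None instead of raising KeyError.
        grouped[monitor["type"]].append(monitor.get("name"))
    return dict(grouped)


sample = (
    '{"monitors": ['
    '{"id": "2cSZzK3XfmB", "type": "host", "name": "disk.aa-00.writes.delta"},'
    '{"id": "2cSZzK3XfmA", "type": "connectivity"}'
    "]}"
)
```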
Get Monitor Configurations
GET /api/v0/monitors/<monitorId>
Required permissions for the API key
- Read
Response
{ "monitor": { "id": "2cSZzK3XfmB", "type": "host", "name": "disk.aa-00.writes.delta", "memo": "This monitor is for Hatena Blog.", "duration": 3, "metric": "disk.aa-00.writes.delta", "operator": ">", "warning": 20000.0, "critical": 400000.0, "maxCheckAttempts": 3, "scopes": [ "Hatena-Blog" ], "excludeScopes": [ "Hatena-Bookmark:db-master" ] } }
- each field is the same as when the monitor was created
Update Monitor Configurations
PUT /api/v0/monitors/<monitorId>
Requests and responses follow the same format as when creating a monitor, and every field must be specified. If a required field is missing, an error is returned.
When scopes and excludeScopes are updated, the saved values are completely overwritten by the JSON designated in the request. For example, omitting the scopes field when scopes has already been saved deletes the saved scopes.
Connectivity Monitoring
When changing the alertStatusOnGone field, alerts generated by that monitor prior to the change will be affected as follows:
- Notifications configured to be resent (notificationInterval): after alertStatusOnGone has been changed, the alert changes to the new alert status once the notification is resent.
- Notifications not configured to be resent: the alert status will not change.
Additionally, if the alertStatusOnGone field is not specified, its value will not be updated.
External Monitoring
If the headers field is not specified, its value will not be updated. If you would like to delete the header settings, specify an empty array.
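Because an update overwrites the stored configuration, one workable pattern is to start from the current configuration (for example, the result of Get Monitor Configurations) and apply only the intended changes. The sketch below assumes a hypothetical `build_update_body` helper. Dropping id from the body is an assumption, since the id is already part of the request path.

```python
def build_update_body(current: dict, changes: dict) -> dict:
    """Merge changes into the current monitor so every field is sent on update."""
    if "type" in changes and changes["type"] != current["type"]:
        # Changing the type is rejected with a 400 error.
        raise ValueError("the monitor type cannot be changed")
    # Assumption: the id is carried in the URL, so it is left out of the body.
    body = {key: value for key, value in current.items() if key != "id"}
    body.update(changes)
    return body


current = {"id": "2cSZzK3XfmG", "type": "host", "name": "old name",
           "duration": 3, "scopes": ["Hatena-Blog"]}
body = build_update_body(current, {"name": "new name"})
```

Here scopes is carried over unchanged, so the saved scopes are not deleted by the update.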
Required permissions for the API key
- Read
- Write
Response
Success
The updated monitor configuration is returned, in the same format as Register Monitor Configurations.
Error
The same errors as when creating a monitor apply, in addition to the following.
STATUS CODE | DESCRIPTION |
---|---|
400 | when trying to change the type |
400 | when the name is empty |
400 | when the memo exceeds 2048 characters |
404 | when no monitor configuration exists for the specified <monitorId> |
403 | when the API key doesn't have the required permissions / when accessing from outside the permitted IP address range |
Delete Monitor Configurations
DELETE /api/v0/monitors/<monitorId>
Required permissions for the API key
- Read
- Write
Response
Success
The status of the monitor configuration just before it is deleted will be returned. The format will be the same as when it was created.
Error
STATUS CODE | DESCRIPTION |
---|---|
404 | when no monitor configuration exists for the specified <monitorId> |
403 | when the API key doesn't have the required permissions / when accessing from outside the permitted IP address range |
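A delete request might be assembled as in the sketch below. As with any client code for this API, the base URL `https://api.mackerelio.com` and the `X-Api-Key` header are assumptions that this section does not state, and `build_delete_request` is a hypothetical helper. The request is only constructed, not sent.

```python
import urllib.request
from urllib.parse import quote

# Assumptions, not stated in this section: the base URL and the X-Api-Key header.
API_BASE = "https://api.mackerelio.com"


def build_delete_request(api_key: str, monitor_id: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE /api/v0/monitors/<monitorId> request."""
    # Percent-encode the id so it is always a single path segment.
    path = f"/api/v0/monitors/{quote(monitor_id, safe='')}"
    return urllib.request.Request(
        url=f"{API_BASE}{path}",
        method="DELETE",
        headers={"X-Api-Key": api_key},
    )


delete_request = build_delete_request("<YOUR_API_KEY>", "2cSZzK3XfmG")
```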