Check plugins - check-aws-cloudwatch-logs-insights

check-aws-cloudwatch-logs-insights is a plugin that monitors a CloudWatch Logs log group. To run this plugin, you must be authorized to call the CloudWatch Logs Insights API. For more information, see Authentication and required policies. For how it differs from check-aws-cloudwatch-logs, see Difference from check-aws-cloudwatch-logs.

How to Install

check-aws-cloudwatch-logs-insights is not included in the official check plugin pack and requires additional installation.

When installing using the mkr command

Set up mkr so that you can install plugins with the mkr command. Please refer to the following help page for installation instructions.

mackerel.io

Execute the following mkr command.

sudo mkr plugin install check-aws-cloudwatch-logs-insights

To download and use without installation

The plugin executable is available at GitHub Releases. Please download the appropriate file for your environment.

Monitoring Specifications

The first time check-aws-cloudwatch-logs-insights is run, it monitors a one-minute window that ends 5 minutes before the current time, that is, logs from 6 minutes before to 5 minutes before the current time.

  • ex. 1st run at 12:00
    • Target is 11:54 (startTime) to 11:55 (endTime)

The endTime is recorded in the State file on every run. From the second run onward, logs are monitored from that recorded time until 5 minutes before the current time.

  • ex. 1st run 12:00, 2nd run 12:05
    • Target is 11:55 (startTime) to 12:00 (endTime)

If more than 90 minutes have passed since the last run, the plugin behaves as it does on the first run and monitors only the logs from 6 minutes before to 5 minutes before the current time.

Configurable options

| Option | Short | Description | Default |
| --- | --- | --- | --- |
| --log-group-name | | Specify the target log group | |
| --filter | -f | Set search strings for monitoring in CloudWatch Logs Insights query syntax | |
| --warning-over | -w | A Warning alert is issued when the number of lines matching the detection pattern exceeds the specified value | 0 |
| --critical-over | -c | A Critical alert is issued when the number of lines matching the detection pattern exceeds the specified value | 0 |
| --state-dir | -s | Specify the directory path where the State file is saved | See About State file |
| --return | -r | Log lines matching the pattern are included in the alert notification (up to 1024 characters) | |
| --help | -h | Show help | |

How to write the --filter option

--filter is written in CloudWatch Logs Insights query syntax. Please refer to the AWS documentation for details on the syntax.

docs.aws.amazon.com

About State file

If the --state-dir option is not specified, the State file will be saved in the following directory in the format <hash string>.json.

  • When run via mackerel-agent
    • /var/tmp/mackerel-agent/check-aws-cloudwatch-logs-insights
  • When run manually
    • /tmp/check-aws-cloudwatch-logs-insights
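To keep the State file somewhere other than these defaults, --state-dir can be added to the command. The following is a minimal sketch: the check name and the directory /var/lib/mackerel-agent/state are hypothetical values chosen for illustration, and the directory is assumed to be writable by the user that runs the plugin.

```toml
# Hypothetical check name; any name unique in mackerel-agent.conf works.
[plugin.checks.aws-cloudwatch-logs-insights-state-dir]
command = [
  "check-aws-cloudwatch-logs-insights",
  "--log-group-name", "/aws/lambda/some-lambda-function",
  "--filter", "filter @message like /ERROR/",
  # Hypothetical directory; the State file (<hash string>.json) is saved here.
  "--state-dir", "/var/lib/mackerel-agent/state"
]
env = { AWS_REGION = "ap-northeast-1" }
```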

Example configurations

In the following configuration, a Warning alert is raised when more than 1 log line containing the word ERROR is output to the log group /aws/lambda/some-lambda-function in CloudWatch Logs, and a Critical alert is raised when more than 10 such lines are output.

[plugin.checks.aws-cloudwatch-logs-insights-sample]
command = ["check-aws-cloudwatch-logs-insights", "--log-group-name", "/aws/lambda/some-lambda-function", "--filter", "filter @message like /ERROR/", "--warning-over", "1", "--critical-over", "10"]
env = { AWS_REGION = "ap-northeast-1" }
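
As a variation on the configuration above, the following sketch adds --return so that matching log lines are included in the alert notification. It assumes --return is a flag that takes no value, which the option table suggests but does not state outright; the check name is made up for illustration.

```toml
# Hypothetical check name used only for this example.
[plugin.checks.aws-cloudwatch-logs-insights-return]
command = [
  "check-aws-cloudwatch-logs-insights",
  "--log-group-name", "/aws/lambda/some-lambda-function",
  "--filter", "filter @message like /ERROR/",
  "--warning-over", "1",
  "--critical-over", "10",
  # Assumed to be a value-less flag: include matching lines in the alert.
  "--return"
]
env = { AWS_REGION = "ap-northeast-1" }
```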

How to specify a region

The region is specified with an environment variable in the check monitoring configuration (env), not with a plugin option. Named profiles selected with the AWS_PROFILE environment variable are also supported.

env = { AWS_REGION = "ap-northeast-1" }

Authentication and required policies

check-aws-cloudwatch-logs-insights uses the CloudWatch Logs Insights API. Make sure that the IAM user or role whose credentials are used is allowed to perform the following actions on the monitored log groups.

  • logs:GetQueryResults
  • logs:StartQuery
  • logs:StopQuery

The following methods are supported for setting credentials.

  • Use of instance profiles (when monitoring from EC2 instances)
  • Use named profiles in the AWS_PROFILE environment variable
  • Specify the environment variable AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY directly with env = {} in mackerel-agent.conf.
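
For example, a named profile could be selected by adding AWS_PROFILE to the same env table that sets the region. This is a sketch only: the check name and the profile name monitoring-readonly are hypothetical, and the profile is assumed to be defined in the AWS configuration available to the user that runs mackerel-agent.

```toml
# Hypothetical check name used only for this example.
[plugin.checks.aws-cloudwatch-logs-insights-profile]
command = [
  "check-aws-cloudwatch-logs-insights",
  "--log-group-name", "/aws/lambda/some-lambda-function",
  "--filter", "filter @message like /ERROR/"
]
# "monitoring-readonly" is a placeholder profile name, not a required value.
env = { AWS_REGION = "ap-northeast-1", AWS_PROFILE = "monitoring-readonly" }
```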

Difference from check-aws-cloudwatch-logs

Another plugin, check-aws-cloudwatch-logs, also monitors CloudWatch Logs. It differs from check-aws-cloudwatch-logs-insights in the following ways.

Can monitor log groups with high volume

check-aws-cloudwatch-logs may time out if the log volume is too large. check-aws-cloudwatch-logs-insights uses the CloudWatch Logs Insights API and can monitor log groups with a large volume of logs.

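Uses a different syntax to filter the strings to monitor

check-aws-cloudwatch-logs-insights specifies the strings to monitor with the --filter option, written in CloudWatch Logs Insights query syntax.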

Monitor logs up to 5 minutes before the current time

As described in Monitoring Specifications, this plugin monitors logs up to 5 minutes before the current time, so the monitoring is less real-time than check-aws-cloudwatch-logs.

CloudWatch Logs Insights API usage fees are charged

The FilterLogEvents API used by check-aws-cloudwatch-logs incurs no charge for API requests. The CloudWatch Logs Insights API used by check-aws-cloudwatch-logs-insights, on the other hand, is billed based on the amount of log data scanned. For CloudWatch Logs Insights API pricing, please see the AWS pricing page.

aws.amazon.com

Troubleshooting

UNKNOWN: context canceled occurred

  • Possible Causes
    • This error occurs when plugin execution is interrupted. This plugin uses the CloudWatch Logs Insights API provided by AWS to retrieve log data from CloudWatch. The plugin may have timed out because sending the API request or receiving the response took a long time. (We are not able to investigate what was happening on the AWS side of the API.)
  • Impact
    • Logs for the period that was being monitored when the error occurred are not checked. See Monitoring Specifications for how the monitored period is determined.
  • How to deal with it
    • If the timeout described above is the actual cause of the error, you may be able to avoid it by extending the plugin execution timeout with timeout_seconds. For more information on timeout_seconds, see the Configuration items.
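
As a sketch of that workaround, timeout_seconds is set in the same check section as command. The value 60 and the check name are arbitrary illustrative choices, not recommendations; pick a value that suits how long the query takes in your environment.

```toml
# Hypothetical check name used only for this example.
[plugin.checks.aws-cloudwatch-logs-insights-timeout]
command = [
  "check-aws-cloudwatch-logs-insights",
  "--log-group-name", "/aws/lambda/some-lambda-function",
  "--filter", "filter @message like /ERROR/"
]
# Illustrative value: allow the plugin up to 60 seconds before it is interrupted.
timeout_seconds = 60
env = { AWS_REGION = "ap-northeast-1" }
```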

Repository

https://github.com/mackerelio/check-aws-cloudwatch-logs-insights