Mackerel blog #mackerelio

The Official Blog of Mackerel

Release of Anomaly Detection for roles and more

Mackerel team CRE Miura (id:missasan) here.

Thank you to everyone who came out to the Meetup last weekend. I hope everyone had a good time. The event report will be coming out soon, so be sure to keep a lookout for that.

Last week we finally released Anomaly Detection for roles, our new feature that uses machine learning. The feature is scheduled for an official release in May, but until then, you can try out the Anomaly Detection for roles feature as part of our free promotion. Be sure to give it a try and let us know what you think.

Now on to this week’s update information.

Release of Anomaly Detection for roles (with free promotional campaign)

Our new feature Anomaly Detection for roles (beta version), which uses machine learning, has been released. For more on how to use the feature, be sure to check out the page linked below.

For more details, refer to the help page linked below as well.

We are currently offering a free promotional campaign!

During the feature’s beta period, Anomaly Detection for roles can be used for free (with no additional charges). The feature is available on the ‘Standard’ and ‘Trial’ plans. Be sure to take advantage of this free period and try out the new feature in a variety of different environments. We’re looking forward to your feedback!

Please note that after the feature’s official release, scheduled for May, environments that have Anomaly Detection for roles enabled will automatically begin to incur charges.

Help and other Mackerel documents made open-source

Mackerel's Help pages and other documents are now open-source.

If any part of the Help or FAQ needs correction, we are accepting pull requests. Pull requests that cover only the Japanese pages, as well as pull requests written in Japanese, are also welcome.

We look forward to your pull requests!

check-log plugin now handles log read timeouts

With the release of go-check-plugins v0.28.0, the check-log plugin now handles timeouts that occur while reading logs. Up until now, a timeout that occurred while a log was being read would sometimes result in an error. With this release, timeouts during log reading are handled normally.

If the command setting in the mackerel-agent configuration file is specified as a character string rather than as an array, timeouts may not be handled normally depending on the environment. We therefore recommend specifying the command setting as an array.
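As a rough analogy for why the array form tends to behave more predictably, here is a minimal Python sketch. It assumes that an array command is executed directly, with no shell in between, so a time limit applies to the program itself. The function name and the stand-in commands are hypothetical, and this is not mackerel-agent's actual implementation.

```python
import subprocess
import sys


def run_check(argv, timeout_seconds):
    """Run a command given as a list of arguments, with a time limit.

    Returns the exit status, or None if the time limit was exceeded.
    """
    try:
        # A list is executed directly (no intermediate shell), so the
        # time limit applies to the program itself.
        completed = subprocess.run(argv, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


# Hypothetical slow command standing in for a long log read.
slow_command = [sys.executable, "-c", "import time; time.sleep(5)"]
timeout_result = run_check(slow_command, timeout_seconds=0.5)

# Hypothetical fast command that finishes within the limit.
fast_result = run_check([sys.executable, "-c", "pass"], timeout_seconds=5)
```

Here `timeout_result` is `None` because the limit was hit, while `fast_result` carries the ordinary exit status.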

Operation Monitoring Solution Seminar with Cloud portal x SIOS Coati x Mackerel on March 12th (Tues)!

Together, Hatena, SIOS TECHNOLOGY, Inc., and Sony Network Communications, Inc. will be holding a seminar in Tokyo.

We’ve heard from quite a few companies that have had operational issues with the introduction of AWS. This seminar will introduce application management tools to help you automate as much as possible and ensure that AWS is on track! We’ll go over the best practices for managing AWS with Hatena's Mackerel, SIOS Technology’s SIOS Coati, and Sony Network Communications' Managed Cloud Portal.

Event Details

  • Date and time: Tuesday, March 12, 2019 from 3:00 p.m. - 5:30 p.m. (Reception starts at 2:30 p.m.)
  • Venue: Akihabara UDX 4F Next-2 (2 min. walk from Akihabara station) [MAP]
  • Admission: Free
  • Sponsors: SIOS TECHNOLOGY, Inc., Sony Network Communications, Inc., and Hatena, Inc.

Apply here (Japanese only)

DevOps Hands-on ~Building a safe and secure DevOps environment with AWS and Mackerel~ on March 15th (Fri)!

Hatena and Classmethod, Inc. will hold a hands-on seminar at the Shibuya Hikarie on Friday March 15th.

At this event, Hatena (Mackerel) and Classmethod, both of which have earned DevOps competency certifications in the AWS partner system, will explain hands-on how to build CI/CD pipeline environments that combine Mackerel and the AWS Code series. This is a great opportunity to learn more about the latest DevOps environments that combine monitoring and CI/CD pipelines.

Event Details

  • Date and time: Friday, March 15, 2019 from 2:00 p.m. - 4:30 p.m. (Reception starts at 1:30 p.m.)
  • Venue: Shibuya Hikarie 11th floor Sky Lobby Hikarie Conference Room C [MAP]
  • Capacity: 20 people
  • Admission: Free
  • Sponsors: Classmethod, Inc. and Hatena, Inc.

Apply here (Japanese only)

New feature: How to use Anomaly Detection for roles

Hello. Mackerel Team Director id:daiksy here.

The beta version of Mackerel’s new feature ‘Anomaly Detection for roles’ which uses machine learning is now being offered. You might have heard about the development of this feature at Meetup and other past events.

Anomaly detection differs slightly from the way monitoring has been used up until now. In this article, we’ll take a look at what anomaly detection is and how it can be used.

What is Anomaly Detection for roles?

‘Anomaly Detection for roles’ is a function that uses machine learning to detect abnormalities in the server without having to set special monitoring items for hosts within a role in Mackerel.

Up until now, configuring monitors precisely required a substantial amount of server monitoring experience and know-how. Suppose you want an alert issued when the CPU load gets high. It is actually quite difficult to determine what percentage of CPU usage counts as high load, or which thresholds should be set on which items to detect application abnormalities. Making these kinds of decisions takes operational experience and technical knowledge. On top of this, the characteristics of applications change daily, and monitor configurations that are left alone can become obsolete, so regular maintenance is a must.

‘Anomaly Detection for roles’ can help with these types of monitoring complications.

With Mackerel, it’s recommended that you organize your servers into roles, where a role is the part that a server plays in a service. By appropriately setting roles, you can classify groups of servers with similar load trends such as "application servers" or "database servers". Mackerel's ‘Anomaly Detection for roles’ feature uses machine learning to learn a server's "normal state" from past trends of metrics over the entire role. Newly posted metrics are checked against the learned results, and anything that falls outside of the "normal state" is regarded as an anomaly and raises an alert. In other words, the ‘Anomaly Detection for roles’ feature detects server abnormalities without having to configure individual monitors.
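To make the idea of "learning a normal state over the entire role" concrete, here is a deliberately simplified Python sketch. It swaps in a plain mean and standard deviation band for the machine learning that Mackerel actually uses (which this article does not describe), and the host names, sample values, and `tolerance` parameter are all hypothetical.

```python
from statistics import mean, stdev


def learn_normal_range(role_history, tolerance=3.0):
    """Learn a 'normal' band from past values pooled over every host in a role."""
    pooled = [value for host_values in role_history.values() for value in host_values]
    center = mean(pooled)
    spread = stdev(pooled)
    return center - tolerance * spread, center + tolerance * spread


def is_anomaly(value, normal_range):
    """A newly posted value is anomalous when it falls outside the learned band."""
    low, high = normal_range
    return value < low or value > high


# Hypothetical CPU usage (%) history for three hosts in one "app" role.
history = {
    "app-1": [21.0, 23.5, 22.1, 24.0, 22.8],
    "app-2": [20.4, 22.9, 23.3, 21.7, 22.0],
    "app-3": [22.6, 21.1, 23.8, 22.4, 23.0],
}
normal = learn_normal_range(history)
```

With this data, a new value of 22.5 sits inside the learned band, while 95.0 falls far outside it and would be treated as an anomaly. No threshold was written by hand; the band came from the role's own history.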

Role configuration is vital to improving detection precision

With "Anomaly Detection for roles", roles are specified as the monitoring target. Mackerel uses past system metrics from hosts that are registered in the specified role to learn trends. As previously mentioned, it is recommended that roles be categorized by the part a server plays in a service, such as application servers and database servers. Because of this, we can assume that a role contains servers with similar metric trends, and trends can be learned from the entire role. Consequently, if a role contains servers with significantly different trends, or with extremely different specifications, accuracy will fall. For example, when "active" and "standby" servers coexist for a long period of time, servers with different trends get mixed together in the role.

Therefore, in order to increase the precision of Mackerel's Anomaly Detection, it is important to first properly categorize servers by roles.
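The effect of mixing dissimilar servers can be seen with a small Python sketch. It uses a simple mean and standard deviation band as a stand-in for the real learning (an assumption, not Mackerel's algorithm), and the CPU figures for the "active" and "standby" hosts are made up for illustration.

```python
from statistics import mean, stdev


def normal_band(values, tolerance=3.0):
    """Band of 'normal' values: mean plus or minus tolerance * standard deviation."""
    center, spread = mean(values), stdev(values)
    return center - tolerance * spread, center + tolerance * spread


def outside(value, band):
    low, high = band
    return value < low or value > high


active = [62.0, 65.5, 63.2, 64.8, 61.9, 66.1]  # hypothetical CPU % on active hosts
standby = [2.1, 1.8, 2.5, 2.0, 1.9, 2.3]       # hypothetical CPU % on standby hosts

homogeneous_band = normal_band(active)
mixed_band = normal_band(active + standby)

spike = 85.0  # a value that is clearly unusual for an active host
detected_in_homogeneous = outside(spike, homogeneous_band)
detected_in_mixed = outside(spike, mixed_band)
```

When only active hosts are pooled, the band is narrow and the spike is flagged. Once standby hosts are pooled in as well, the band becomes so wide that the same spike goes unnoticed.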

How to use Anomaly Detection for roles

‘Anomaly Detection for roles’ learns trends from past metrics. If newly posted metrics are determined to be outside of those trends, an alert will occur. This alert notification will also display the metrics that were determined to be abnormal. From this information, the user who receives the alert can estimate what kind of anomaly is occurring in the server. For example, the alert may show an increase in memory usage outside of the normal trend.

However, even if the alert shows that the detected anomaly is based on memory usage metrics, there is no guarantee that the cause of the issue lies in memory. This point requires careful attention.

‘Anomaly Detection for roles’ learns and judges the trends of the system metrics of hosts in a role in combination. When an issue occurs in a server, more often than not several metrics are affected by its primary cause. For example, when the amount of data written to a disk increases, the network transfer load needed to send that data increases at the same time, and this may in turn affect memory usage. Even if trends change in such a complex manner, only the one metric used as the basis of detection is recorded in an ‘Anomaly Detection for roles’ alert. So when an alert occurs, you need to look across all of the graphs for the role rather than only at the metric named in the alert.

Due to the nature of this feature, it’s somewhat difficult to define a typical troubleshooting response such as "restart this server when this monitoring alert occurs". We recommend using ‘Anomaly Detection for roles’ as auxiliary monitoring that quickly detects rare anomalies, while also configuring threshold-based monitors based on your past operational experience. You might also consider an operation cycle in which you add a threshold-based monitor each time you receive an anomaly detection alert.

Current limitations with Anomaly Detection for roles

At the moment, ‘Anomaly Detection for roles’ is only supported for Linux environments with mackerel-agent installed. Windows and Integration environments are not currently supported.

cron and other batch jobs can now be monitored with the mkr wrap command and more

Mackerel team CRE Miura (id:missasan) here.

As the end of the week approaches, we look forward to Mackerel Meetup #13 Tokyo on March 1st (Friday). The number of LT spots has been increased, and there’s still 1 open, so please apply! General participation is also still available. Let's have a drink together at Meetup this Friday! (Japanese only)

Now on to this week’s update information.

cron and other batch jobs can now be monitored with the mkr wrap command

With mkr v0.35.0, cron and other batch jobs can now be monitored with the mkr wrap command. When you execute a command such as % mkr wrap -- /path/to/your-batch … and the wrapped command exits with a non-zero status, an alert will occur in Mackerel.
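The general "wrap a batch job and report on its exit status" pattern can be sketched in Python as follows. This is only an illustration of the idea, not how mkr wrap is implemented: the `wrap` helper and the report fields are hypothetical, and actually sending the report to a monitoring service is left out.

```python
import subprocess
import sys


def wrap(argv):
    """Run a batch command and summarize the result for reporting.

    Hypothetical helper: it only builds a report dict.
    """
    completed = subprocess.run(argv, capture_output=True, text=True)
    return {
        "command": argv,
        "exit_code": completed.returncode,
        # A non-zero exit status is treated as a failure worth alerting on.
        "alert": completed.returncode != 0,
        # Keep only the tail of the output so the report stays small.
        "output_tail": (completed.stdout + completed.stderr)[-200:],
    }


# Hypothetical stand-ins for a succeeding and a failing batch job.
ok_report = wrap([sys.executable, "-c", "print('done')"])
failed_report = wrap([sys.executable, "-c", "import sys; sys.exit(3)"])
```

The job itself needs no changes. Only the way it is launched from cron changes, which is what makes the wrapper approach convenient for existing batch jobs.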

For more details, check out the help page linked below.

check plugin operation can now be checked with the mkr checks run command

With mkr v0.35.0, you can now use the mkr checks run command to check the configuration and operation of the check plugins specified in mackerel-agent.conf.

When you execute a command such as % mkr checks run, results similar to the following will be displayed.

ok 7 - load
  command: ['check-load -w 2,2,2 -c 5,5,5']
  status: OK
  stdout: 'LOAD OK: load average: 0.06, 0.03, 0.05'

If a check fails, the command exits with a non-zero status.
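If you want to post-process this output in a script, a small parser along the following lines may be enough. It is a sketch that assumes each result starts with a line of the form "ok N - name", as in the sample above. Treating a "not ok" prefix as a failure is an assumption borrowed from the TAP convention, not something confirmed here. In practice, checking the command's exit status is the simpler signal.

```python
import re

# "ok 7 - load" style header lines; "not ok" is an assumed failure prefix.
RESULT_LINE = re.compile(r"^(ok|not ok) (\d+) - (.+)$")


def summarize(output):
    """Map each check name to True (passed) or False (failed)."""
    results = {}
    for line in output.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            status, _number, name = match.groups()
            results[name] = status == "ok"
    return results


# The sample output shown above, copied as-is.
sample = """ok 7 - load
  command: ['check-load -w 2,2,2 -c 5,5,5']
  status: OK
  stdout: 'LOAD OK: load average: 0.06, 0.03, 0.05'
"""
summary = summarize(sample)
```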

source option added to mackerel-plugin-mongodb

With mackerel-agent-plugins v0.55.0, the -source option was added to mackerel-plugin-mongodb. By specifying -source=<authenticationDatabase> when executing the plugin, you can now specify the database used for user authentication.

socket option added to mackerel-plugin-php-fpm

With mackerel-agent-plugins v0.55.0, the -socket option was added to mackerel-plugin-php-fpm. This option allows you to retrieve metrics via UNIX domain sockets and TCP services.

Check out the README below for more details on how to use the option.

Metrics added to AWS Integration Redshift

New metrics such as QueriesCompletedPerSecond and more have been added to AWS Integration Redshift.

For more details on obtainable metrics, check out the help page linked below.

A big thank you to everyone who contributed!

The release of Mackerel container agent (public beta) and more

Mackerel team CRE Miura (id:missasan) here.

The long-awaited Mackerel container agent (public beta) has finally been released!

Also, Mackerel Meetup #13 Tokyo is scheduled to be held on Friday, March 1st at the Tokyo office of Cybozu, Inc. Many of the topics at the event will focus on the operation and monitoring of containers. This will be a great opportunity to hear the story behind the development and pick up some know-how on using Mackerel container agent. You don’t want to miss it!

Now on to this week’s update information.

The release of Mackerel container agent (public beta)

Mackerel container agent has been released. Use it to monitor containers on container orchestration platforms. Currently, the following platforms are supported.

Run as a sidecar in a task or Pod, the agent posts CPU, memory, and network interface metrics as system metrics for each container. You can also configure monitoring rules based on the obtained metrics in the Monitor Settings screen.

For details regarding specifications or the setup process for Mackerel container agent, check out the Help page linked below.

Mackerel container agent is still under development in preparation for its official release, so please give it a try and let us know what you think. Incompatible changes will not be made without an announcement. However, depending on demand, changes may be made with advance notice.

For Amazon ECS / AWS Fargate, 1 task will count as 1 host. For Kubernetes, 1 Pod will count as 1 host. The number of hosts will be calculated using a moving average of the previous month. For more details, please refer to FAQ · Calculating the number of hosts.
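As a rough illustration of what averaging over a month means for short-lived containers, here is a small Python sketch. The sampling interval (one count per day), the 30-day window, and the numbers are assumptions made for the example. Refer to the FAQ for the actual calculation and any rounding.

```python
from statistics import mean


def average_host_count(samples):
    """Average the sampled host counts over the whole window."""
    return mean(samples)


# Hypothetical daily counts, one host per task (ECS / Fargate) or Pod (Kubernetes):
# 4 containers for 20 days, then scaled out to 10 for the remaining 10 days.
daily_counts = [4] * 20 + [10] * 10

monthly_average = average_host_count(daily_counts)
peak = max(daily_counts)
```

In this example the average works out to 6 hosts even though the peak was 10, so a temporary scale-out is not counted as if it had lasted the whole month.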

※The billing system for registered hosts with container agent will change in the future.

Mackerel Meetup #13 Tokyo on March 1st (Fri)!

We’ll be borrowing Cybozu Inc.’s Tokyo office seminar room to hold Mackerel Meetup #13 Tokyo! Our very own Imai (id:hayajo_77), the developer behind Mackerel container agent, is scheduled to speak at the event. The presentation will go over the agent’s specifications and features, as well as design ideas and implementation methods.

Click on the link below to apply. (Japanese only)

Event details

  • Date and time: March 1, 2019 (Fri) at 4:30 p.m. ~ 9:00 p.m. (JST) (Reception starts at 4:00 p.m.)
  • Venue: Cybozu Inc. Tokyo Office
  • Address: 〒103-6028 Tokyo, Chuo-ku, Nihonbashi 2-7-1, Tokyo Nihonbashi Tower 27th floor (Reception 7F)
  • Access: Tokyo Office Access Map | Cybozu Inc.
  • Cost: Free

Microsoft Teams added to notification channels

Mackerel team CRE Miura (id:missasan) here.

This week’s release will come as welcome news for those using Microsoft Teams for team communication.

In the past, we’ve heard from many users requesting the ability to send alert notifications to Microsoft Teams channels. With this release, alert notifications to Microsoft Teams are now supported as a standard function. It’s easy to set up and the alert content is easy to read. Definitely give it a try!

Now on to this week’s update information.

Microsoft Teams added to notification channels

When attempting to integrate alert notifications in Microsoft Teams in the past, mail notifications had to be sent to the email address issued by Microsoft Teams. Although this method worked, we received feedback that the display was partially distorted and it was difficult to use. With this release, the ability to integrate notifications to Microsoft Teams is now available as a standard function. Problems regarding visibility and image loss have also been resolved.

Configurations can be made from the Channel Settings screen.

When an alert occurs, it will be displayed as shown below.

For more details, refer to the Help page.

We look forward to your thoughts and feedback.

The system shutdown for database maintenance scheduled for Feb. 7th (Thur) has been cancelled

Thank you for choosing Mackerel.

Regarding our previous announcement of the urgent maintenance scheduled for February 7th (Thur) at 2:30 pm, a resolution for the problem has been found and the scheduled maintenance has been cancelled. Mackerel will remain available as usual during that time period.

Reason for cancellation

We’ve confirmed the restoration of autovacuum in PostgreSQL (RDS), the data store used by Mackerel. And since the problem of the transaction ID being depleted has also been resolved, we have decided that emergency maintenance is unnecessary.

We will continue to monitor the situation and make every effort to ensure that the operation of Mackerel is stable in the future.

We apologize for any inconvenience this may have caused.

Thank you for your understanding and cooperation.

New information added that can be obtained with the user list API and more

Mackerel team CRE Miura (id:missasan) here.

As previously announced, several fascinating presentations are lined up for Mackerel UG Kansai Meetup #1. Itec Hankyu Hanshin will be presenting on "Introducing Mackerel in fully managed hosting" and Beyond Co., Ltd. will be talking about "Comparing server monitoring contents with Mackerel and Zabbix". Right now it seems like there are still some spots available, so please come and join us if you’re in the Kansai area! (Japanese only)

Now on to this week’s update information.

New information added that can be obtained with the user list API

The added information is as follows.

  • isInRegistrationProcess
  • isMFAEnabled
  • authenticationMethods
  • joinedAt

For more details, refer to the help page linked below.
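As an example of what the new fields make possible, the following Python sketch lists users who have not enabled MFA and formats their join dates. Only the four field names come from the list above. The "users" wrapper key, the "id" field, the value types (booleans, a list of strings, and epoch seconds for joinedAt), and the sample values are all assumptions made for illustration, so check the help page for the actual response format.

```python
import json
from datetime import datetime, timezone

# Hypothetical response body; the shape and values are assumed, not official.
sample_response = json.loads("""
{"users": [
  {"id": "user-1", "isInRegistrationProcess": false, "isMFAEnabled": true,
   "authenticationMethods": ["password"], "joinedAt": 1546300800},
  {"id": "user-2", "isInRegistrationProcess": false, "isMFAEnabled": false,
   "authenticationMethods": ["google"], "joinedAt": 1548979200}
]}
""")


def users_without_mfa(response):
    """IDs of users who have not enabled multi-factor authentication."""
    return [user["id"] for user in response["users"] if not user["isMFAEnabled"]]


def joined_date(user):
    """Join date in UTC, assuming joinedAt holds epoch seconds."""
    joined = datetime.fromtimestamp(user["joinedAt"], tz=timezone.utc)
    return joined.date().isoformat()


no_mfa = users_without_mfa(sample_response)
first_joined = joined_date(sample_response["users"][0])
```

A periodic script like this could, for instance, remind members of an organization to turn on MFA.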

Specification changes for the host update API

As was announced in the following entry, specifications of the host update API have changed. For more details, see the entry below.