Mackerel case study: GMO Pepabo

Aggregating information and raising
the infrastructure’s competitiveness!
In addition to transitioning to the private cloud
GMO Pepabo is incorporating Mackerel on a large scale.

GMO Pepabo http://pepabo.com/en/

  • Hatena Co., Ltd., Service Development Division, Director and Chief Engineer: Masayuki Matsuki (id:Songmu, @songmu)
  • Hatena Co., Ltd., Service Development Division, Producer: Hiromichi Sugiyama (id:sugiyama88)
  • GMO Pepabo, Inc., Executive CTO: Kentaro Kuribayashi (id:antipop, @kentaro)
  • GMO Pepabo, Inc., Engineering Department’s technology infrastructure team: Hiroshi Shibata (id:h-sbt, @hsbt)
  • GMO Pepabo, Inc., Technology Department’s infrastructure group: Yuki Takaya (id:buty4649, @buty4649)
  • Interviewer: Akio Hoshi (IT Journalist)

Date of publication: February 2, 2016 · All information contained herein is accurate as of when this interview took place.

Introducing Mackerel on approximately 500 hosts for infrastructure monitoring and more

Please introduce yourselves. First, from the Hatena side.

Sugiyama I’ve been Mackerel’s producer since September 2015. My previous job was in system integration for a manufacturing company. As an infrastructure engineer, I oversaw servers, networks, and such on the scale of thousands, from planning to investment. Over the past few years I’ve accumulated management experience by leading an ITIL-based service management project. In the past, useful structures and services were often those from overseas, and I wanted to work on something from Japan that could compete internationally. It is with the desire to achieve this goal that I am now working on Mackerel.

Matsuki I’m the director of Mackerel’s development team. And because I’m also an application engineer, it’s interesting for me to be involved with Mackerel, which is a service geared toward engineers.

Kuribayashi I’m the executive CTO of GMO Pepabo. But people might know me better as "Anchipokun". Being the CTO, I don’t have to do much hands-on work, but GMO Pepabo is currently in the midst of a large infrastructure renewal and I’m in charge of guiding the technical side. When considering which operation management tool to use for the new infrastructure, we decided to go with Mackerel.

Shibata I’m the chief engineer. I silently carry out the policies and strategies that are decided by CTO Kuribayashi. Not only has Mackerel been fun to use, but I also feel that it is worth writing code for. Writing programs and running services on our own is fun.

Kuribayashi Now I want to play around with it (LOL).

Shibata It’s that kind of feeling. We play around and explore, and sometimes we find a small "hole" that we didn’t know we could go through (LOL). It’s in this way that we’re trying to promote the installation and popularize the service.

Takaya I mainly oversee the infrastructure of the blog service「JUGEM」. We were previously using the server monitoring tools Munin and Nagios, but during the company’s infrastructure transition to the private cloud, we had some difficulties integrating these tools into the cloud’s architecture, so we started using Mackerel. It’s easy to use and I have a feeling that it has a variety of applications beyond monitoring and metrics collection. We are still in the process of exploring.

Around what time did you introduce Mackerel?

Shibata After Mackerel’s initial release (September 2014), we tested it out with 2-3 hosts, and from around January or February of 2015 we decided to introduce it at a large scale in the handmade market service「minne」. With a scale of 100 hosts, we thought, "We’re with Mackerel from now on!".

Kuribayashi Officially, as part of our infrastructure, from April of 2015.

Takaya I began to fiddle with it (in JUGEM) around August of 2015.

Kuribayashi Now, our host count is just under 500.

With the transition to the private cloud, a new tool is necessary

500 hosts is quite the scale, but what led you to introducing Mackerel?

Kentaro Kuribayashi

Kuribayashi Our transition to the cloud was the main reason. GMO Pepabo manages a variety of services, but broadly speaking they fall into three categories: hosting, EC support, and community businesses. Hosting consists of the server rental service「lollipop!」, the domain management service「muumuu domain」, etc. EC support consists of services such as「Color Me Shop」, which helps those who want their own online shop. In 2015, we particularly focused on the service "minne". Community businesses consist of JUGEM and other CGM-type services. The business can be divided into these three areas, but by nature the services fall into two large groups. In hosting, the challenge is how to get customers to use the servers. Because of this, hardware characteristics are important, and it’s better to do this on-premises. Conversely, since agility is what pays off in Web services geared toward consumers, we’re transitioning those to the private cloud. When that happens, we go from a world where servers are not replaced to a world where servers come and go constantly. In a world where the individual servers being monitored are always being replaced, static file-based configuration is a poor fit. Since the architecture changes (in the cloud), the tools should as well. For this reason, we are fully investing in Mackerel, specifically on the web service side of each of the three business areas: in hosting, the portal and management screens shown to the user (rather than the server side); in EC, some of the microservices that make up the payment feature of the shopping cart; and JUGEM. These parts have already been moved to the private cloud, and that is where we are introducing Mackerel.

So your transition to the private cloud was a big trigger.

Kuribayashi There was another reason for introducing Mackerel. During the complete replacement of our infrastructure, there was an aim to introduce a structure that could aggregate information. It isn’t necessarily "I want to see this server of this service" out of the many servers, sometimes it’s wanting to know what’s going on "as a whole". And that was difficult with the tools used up until now. For example, if one location is managed with CSV and a different part used YAML, I wanted them to be seen and technically controlled in a unified manner. The compulsory aggregation of information is one aspect of using Mackerel.

Engineers who specialize in constructing are striving for automation in Mackerel

Can you tell us more about adopting Mackerel from a developer’s perspective.

Hiroshi Shibata

Shibata I was originally an application engineer, but I started working with and developing all the necessary items for a service by myself, including infrastructure. For this reason, I’m not particularly fond of the existing tools. I would say that most anything is okay as long as connectivity monitoring and resource monitoring are capable. As for tools that are easy to use, I think there are some that seem easy to use because a person is accustomed to them, and others that are like that from the very beginning. For example, using Nagios could be really convenient to someone who is used to it, but might feel confusing to someone who isn’t. So rather than study Nagios, because we were motivated to quickly resolve the problems in front of us, we decided to fully introduce Mackerel in "minne". In the process of creating a private cloud, application engineers have begun to handle servers that, until now, were only handled by infrastructure engineers. And it isn’t limited to OpenStack (private cloud infrastructure). It’s the same with Google’s cloud service and AWS (Amazon Web Services), but more and more people who aren’t well versed in infrastructure are creating servers and it’s becoming an era of service development, yet the number of engineers who say things like "I don’t understand Nagios, but I need to monitor" or "I don’t understand Munin, but I can’t maintain my service if I don’t watch the CPU rates" has increased. In that case, we thought, "let’s try to solve the problem in front of us by using something that, to some extent, has been prepared from the beginning". We adopted Mackerel with this thought process.

Kuribayashi By using the private cloud, the conversation turns to "Let’s continue to automate on the server side". And since the development people specialize in writing code, it’s natural to lead to "Let’s automate monitoring on the same level".

Application engineers thinking about and visualizing infrastructure

Hiromichi Sugiyama

Sugiyama With the introduction of Mackerel, not just infrastructure engineers, but also application engineers have started to have a hand in infrastructure. How will things change?

Shibata Application engineers have also started to voluntarily take interest in the operation of infrastructure. They have started to take notice when the server looks like it will overload or when the memory has been used up and a swap might happen.

Sugiyama Do infrastructure engineers have a different perspective or expertise?

Shibata They can easily realize certain problems like if the server is heavy or the service’s response time is slow. Dividing into teams of development and operations, we tend to say, "Let’s leave this to the infrastructure engineers", but I think the staff that creates the service has an easier time dealing with these problems.

Sugiyama Is there somewhere the infrastructure engineers and application engineers can help each other?

Shibata For a monitoring plugin, an application engineer will do the writing, because that’s their specialty (LOL). If I said, "By counting the number in this column in the DB, you can understand which job is stagnating, and if you take that value and plot it in a graph, wouldn’t that be useful?", the response would be "Yes!" (LOL). So an application engineer wrote monitoring for background jobs running on Ruby on Rails. Even if you’re not an infrastructure engineer, you can still oversee your own service from the plotted values.
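
A plugin of the kind Shibata describes might look roughly like the sketch below. It is not Pepabo’s actual plugin: the `jobs` table, its columns, the ten-minute threshold, and the metric name are all hypothetical, and an in-memory SQLite database stands in for the real application DB. The one part taken from Mackerel itself is the output shape, which, as far as we understand the custom metric plugin convention, is one line per metric with the name, the value, and the epoch seconds separated by tabs.

```python
import sqlite3
import time


def count_stalled_jobs(conn, older_than_seconds, now):
    """Count queued jobs that have been waiting longer than the threshold."""
    cutoff = now - older_than_seconds
    row = conn.execute(
        "SELECT COUNT(*) FROM jobs WHERE status = 'queued' AND enqueued_at < ?",
        (cutoff,),
    ).fetchone()
    return row[0]


def format_metric(name, value, epoch):
    """Render one metric line: name, value, epoch seconds, tab-separated."""
    return f"{name}\t{value}\t{int(epoch)}"


def demo_connection(now):
    """Build a throwaway database with a hypothetical jobs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT, enqueued_at REAL)"
    )
    conn.executemany(
        "INSERT INTO jobs (status, enqueued_at) VALUES (?, ?)",
        [
            ("queued", now - 900),   # waiting 15 minutes: stalled
            ("queued", now - 30),    # waiting 30 seconds: fine
            ("done", now - 1200),    # finished: ignored
        ],
    )
    return conn


if __name__ == "__main__":
    now = time.time()
    conn = demo_connection(now)
    stalled = count_stalled_jobs(conn, older_than_seconds=600, now=now)
    print(format_metric("background_jobs.stalled", stalled, now))
```

In a real setup the script would connect to the service’s own database and be registered with the agent as a metrics plugin, after which the value can be graphed like any other metric.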

Matsuki It is difficult to notice issues when the infrastructure and application engineer teams are disconnected. By posting a graph in Slack, it becomes easy for application engineers as well to notice that something is wrong when the trends start to change. And this is likely to turn into a good cycle.

Sugiyama How are you sharing the situation with members on the business side?

Shibata That’s been the case since before the introduction of Mackerel, but when red letters flow in Slack, they ask us, "Is that okay?" (LOL).

Takaya When we started using Slack as a company, I tried to post trends of rising CPU usage as a graph, but Mackerel already had a feature for that, and it was astounding. I thought, "We need to use this!" Mackerel is an agent-type monitoring tool, so it even prevents monitoring gaps such as forgetting to add a new server to the configuration when servers increase. I thought that was really convenient. When we were managing Munin, an addition was made to the management file every time a server was added, but sometimes we forgot when the servers started to increase and were like "It’s not here!" (LOL). With Mackerel, that kind of oversight stopped happening. There are other agent-type tools such as Zabbix and Sensu, but they were too difficult to study and introduce as new tools.

Plugins are "fun to make".

Matsuki Did you have any trouble switching over from your original environment to Mackerel?

Takaya No, we didn’t. A feature was released in Mackerel that allowed us to continue to use the Nagios and Munin plugins so we didn’t really have any problems.

Sugiyama To add to that, Mackerel can be used in popular OSS monitoring tools such as Nagios (Check monitoring) and Sensu (Check, Metric monitoring) because the plugins basically share the same specifications. Conversely, it’s also possible to use the Nagios or Sensu plugins in Mackerel. And by using the plugin for data format conversion, you can use Munin plugins as well.

Examples of plugins available with mackerel-agent-plugins

Shibata I love plugins. They’re incredible for "the first step of programming" (everyone nods in agreement). For plugins, as long as the constraints are met even a shell script one-liner is fine, and with a little effort, they can be written in C, Ruby, and Golang. If you write a plugin that only takes care of the things that you want and then pop it in, it will run as though it was an original Mackerel feature. And from an engineer’s perspective, there’s a level of enjoyment that’s like, "Alright, I did it!".

Matsuki I know what you mean. For plugin systems, even if I make a minor change, I feel like I contributed and that makes me really happy. I want the community of people who create plugins to continue to grow, in Mackerel as well.

Sugiyama Mackerel hopes to be something that engineers are excited about, so we’re really glad to hear that you’ve enjoyed our plugin system. I think that if it’s not something that people voluntarily want to try out, it won’t reach its true value.

The most important thing is "to be used in the field"

Matsuki When adopting Mackerel, on what basis did you examine it? Did you compare it with other tools and such?

Kuribayashi We examined other tools as well. For example, we tried out Datadog. Other existing tools, such as New Relic for example, are usable if some money is spent on it (LOL), but with Mackerel the price by Hatena corresponds with the features offered (LOL).

Matsuki How do you use Mackerel and New Relic differently?

Shibata New Relic is an application tracer. It’s used to find out things such as how much time certain SQL statements take and how long it takes for a response to return after internal processing for each request that’s dispatched. We are using it for resource management of the top level of our services. Other than that, with Mackerel we can see the lower levels: what condition the servers were in from the start, when they spiked, etc.

Kuribayashi The biggest factor in adopting a new system is whether or not it can be used in the field. A reasonable price and good features are also important, but there is no point in introducing the system if nobody uses it. There’s a certain confidence that "Hatena will get the job done", whether in terms of technological strength or the community forming around the plugin structure.

More than the features and price, what matters most is the ecosystem and whether or not it’s practical for a wide range of users, right?

Kuribayashi That’s the most important thing.

Shibata And by ecosystem, we mean the infrastructure of people that make up Japan’s Web community. They share knowledge, to a certain extent, that surpasses the company’s and I think that’s special even in this industry. With Hatena tools, there is a sense of security with sharing this kind of background. Even when you give feedback, the explanation is easy to understand. It takes too much time to receive an explanation with overseas tools.

Matsuki Within Hatena, we were originally using Mackerel for server monitoring and for server managing. For this reason, there is a part that integrates the service management and server management. Even when revealing the deploy target from inside the server, information is obtained from Mackerel. Do you have an example of using Mackerel for server management?

Yuki Takaya

Takaya The biggest problem is figuring out how to manage the deploy targets. If you’re managing targets with a file and you forget to fill in a server, that server won’t be deployed to. If you use Mackerel, mistakes such as a server being left out of the deploy targets are less likely to occur. This is because the host list can be obtained per service and role using Mackerel’s mkr command. You can also pull the host name and the IP address of each interface and create a hosts file with Mackerel. This is useful when it isn’t worth setting up a DNS.
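
The hosts-file idea Takaya mentions can be sketched as follows. This is an illustration under assumptions rather than Pepabo’s tooling: it assumes the host list has already been fetched as JSON (for example with something like `mkr hosts --service <service> --role <role>`), and that each entry carries `name`, `isRetired`, and an `ipAddresses` map keyed by interface name. That shape is how we recall mkr’s output and should be checked against the version in use. The sample data, host names, and the `eth0` default are hypothetical.

```python
import json

# Hypothetical sample in the shape we assume `mkr hosts` prints.
SAMPLE = """
[
  {"name": "web002", "isRetired": false, "ipAddresses": {"eth0": "10.0.0.12"}},
  {"name": "web001", "isRetired": false, "ipAddresses": {"eth0": "10.0.0.11"}},
  {"name": "old003", "isRetired": true,  "ipAddresses": {"eth0": "10.0.0.13"}},
  {"name": "new004", "isRetired": false, "ipAddresses": null}
]
"""


def hosts_file_lines(mkr_hosts_json, interface="eth0"):
    """Turn a JSON host list into sorted '<ip>\\t<name>' hosts-file lines.

    Retired hosts and hosts without an address on the chosen
    interface are skipped.
    """
    lines = []
    for host in json.loads(mkr_hosts_json):
        if host.get("isRetired"):
            continue
        ip = (host.get("ipAddresses") or {}).get(interface)
        if ip is None:
            continue
        lines.append(f"{ip}\t{host['name']}")
    return sorted(lines)


if __name__ == "__main__":
    print("\n".join(hosts_file_lines(SAMPLE)))
```

Because the list comes from what the agents have registered rather than from a hand-edited file, a newly added server shows up in the generated file without anyone remembering to add it.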

Shibata If you want to hear an interesting way of using Mackerel, we are managing the number of Mackerel server licenses with Mackerel. Choose an active host and if you approach the upper limit of your contract, a warning will occur (LOL). Graphing is easy as well.
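
The interview does not say how this license check is implemented, so the following is only a sketch of the idea. It assumes the number of active hosts and the contract limit are already known, and it follows the Nagios-style check convention that check plugins share, as mentioned earlier in this interview (status 0 for OK, 1 for WARNING, 2 for CRITICAL). The 90% warning ratio and the function name are hypothetical.

```python
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def check_license_usage(active_hosts, contract_limit, warn_ratio=0.9):
    """Return (status, message) for host-license usage.

    CRITICAL once the limit is reached, WARNING once usage reaches
    warn_ratio of the limit, OK otherwise.
    """
    if contract_limit <= 0:
        raise ValueError("contract_limit must be positive")
    if active_hosts >= contract_limit:
        status = CRITICAL
    elif active_hosts / contract_limit >= warn_ratio:
        status = WARNING
    else:
        status = OK
    message = f"{LABELS[status]}: {active_hosts}/{contract_limit} hosts in use"
    return status, message
```

A wrapper script would print the message and exit with the returned status, so that approaching the contract limit raises an alert in the same way any other check does.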

Matsuki Getting used to graphing, whatever the subject, is exciting because its usefulness really expands, right?

Sugiyama We have customers who put units of advertising sales into a graph and have an alert occur if there’s a decrease. Another unique usage, you can install the agent in "Raspberry Pi", a single-board computer, and when the discomfort index surpasses a certain set value, send a message like "Please change the configured temperature of the air conditioning" to the General Affairs Department (LOL).

Matsuki Mackerel is fixated on API design as well and it’s become easy to hack. For this reason, it is easy for application engineers to build upon.

Are there any requests and such for extensions to that area?

Kuribayashi Server management, when viewed as a structure of inventory (asset management), it would be nice if there were more attributes connected to the host name.

Matsuki We see that kind of feedback on the Slack channel regularly and it’s encouraging and we appreciate it.

Raising the competitiveness of infrastructure by placing strict demands on Mackerel

Masayuki Matsuki

Kuribayashi There’s the saying, "eat your own dog food", but is Mackerel used within Hatena?

Matsuki Yeah. New services in particular are using it. However, there are also places that still use the old version of Mackerel because of the convenience of the remaining specialized features in Hatena’s infrastructure. Such as Mackerel’s monitoring of itself. I’d like to get to a place where we can rely on Mackerel alone, but the effect on the users would be too big if a malfunction occurred.

So monitoring tools are mission-critical as well.

Kuribayashi Because GMO Group is a company focused on hosting, we’re trying to be competitive by operating the servers ourselves. So we’ve made a large investment in OpenStack-related technology. And as we’ve talked about before, we’ve started to use Mackerel for our active services. For this reason, monitoring tools also have a role in making the OpenStack environment more efficient and increasing our competitiveness. There was also a time in which I thought about making a tool like this myself. Before, there was this company called Hatena (LOL), and all of a sudden there was this internal Mackerel-like tool and I thought it would be useful to have something like it. But we ended up introducing Mackerel (the successor of that tool) after all.

Shibata In the private cloud, host server monitoring is very important. You have to be sensitive to resource control in particular. As mentioned before, the number of servers monitored with Mackerel is just under 500, but out of that number, 200 are monitoring host servers that operate the private cloud.

Matsuki In making the foundation of the private cloud, I believe there are techniques such as not letting the same service co-exist inside the same node.

Shibata Such ideas have been incorporated. With OpenStack, you can create what is referred to in AWS as "availability zone" all by yourself. Each service, to a certain extent, is placed separately so as not to concentrate on a specific rack and PC zone. For example, if JUGEM has everything on one host and that host goes down, then the entire service would be dead. The problem is, OpenStack resource management features have nothing but "free space". It does not show to you that a CPU has 16 cores and which core is used at what percentage. And that is what we would like to expect from Mackerel.

Mackerel’s free OSS plan supporting Ruby’s development server

Mr. Shibata is a Ruby committer. It seems that Mackerel is also being used in Ruby’s development server. Could you tell us about that?

Shibata There is a website called ruby-lang.org. One part uses the cloud service, but the build server is a physical server and is built on each release. There are committers working full-time on the development side of Ruby as well, but the infrastructure is completely run by volunteers. Two others besides myself operate it as if it were a hobby, but if the server were to go down, we would receive tons of complaints from committers and users. So, we discussed with Mr. Shinji Tanaka, Hatena CTO (also the founding producer of Mackerel), and started to use Mackerel for free. We are using the feature that sends an alert if the disk is close to becoming full.

Matsuki In Mackerel, we offer a plan geared toward OSS which is being used in the Ruby project as well. With the OSS plan, you can use the standard features in full at no charge. In recent open source projects, demands for quality are higher than before and development costs such as server costs are getting expensive. Because Hatena is a company that is expanding services while making the most of many OSS, it would be nice to be able to give back to the community by supporting them with Mackerel (※ If you are interested in the OSS plan, please contact us).

Summary

Now that you’ve fully introduced OpenStack (a private cloud foundation) and Mackerel, what does the future look like from here?

Kuribayashi As I mentioned before, GMO Group is a company focused on hosting. We operate the servers ourselves and that’s what makes us competitive. We want to be proud of constructing the infrastructure ourselves for the services we offer. In this process, we’ve made a big investment in OpenStack and started actual operation. With the help of Hatena’s Mackerel, we would like to continue to strengthen this foundation.

Shibata We want to make good use of Mackerel and continue doing the things that we want to do. Mackerel’s agent is an OSS, so we’ll play around with it by ourselves and do things such as make it so that OpenStack information can be registered. Recently we’ve been thinking that contributing to Mackerel to an extent possible from outside of Hatena seems to be a good idea.

Thank you very much.

Interested in Mackerel?

First off, register and explore with the two week trial. Try out features limited to the Standard plan such as URL external monitoring and AWS integration.

Get a 1 week extension on your free trial!

If you read this whole interview, we'd like to say thanks by offering a 1 week extension on your free trial. Contact sales with the name of the organization you'd like the extension for.
