Infrastructure-as-Code (IaC) has revolutionized the management and provisioning of everything from local virtual machines to exotic AWS services. It is time for Monitoring-as-Code (MaC) to do the same in the application performance monitoring (APM) and synthetic monitoring fields — and the good news is that everyone stands to benefit.
Provisioning monitoring checks by hand is slow, and far too slow when the checks need to keep up with a quickly evolving application. This holds true no matter which monitoring service or platform you use.
Another problem is documentation. Who documents what the monitoring setup should look like, as a whole and in its parts? What is each check’s configuration? What about the alerting logic for when things catch fire? It’s up to you to put all of that down on paper to avoid the risks that come with having everything live in somebody’s head.
Possibly the worst and often unseen issue, though, is that manual monitoring workflows do not fit into the bigger picture. They do not really tie into how software is built. They lie by the wayside, hopefully getting enough of our attention to prevent costly outages while not diverting it too much from our real objective: shipping incredible applications.
The Monitoring-as-Code explosion
Nearly everything said above also applied to infrastructure provisioning and management until Infrastructure-as-Code became mainstream. Having noticed this, a few providers in the monitoring space have, over the past couple of years, started offering ways to replicate IaC workflows for APM and synthetics. It only makes sense that the current darling of the IaC world, HashiCorp’s Terraform, would be used first to test the waters. The most forward-looking platforms began offering Terraform providers that let users specify exactly what their monitoring setup should look like, as code.
This first wave of Terraform providers became a differentiator overnight. DevOps and developer-led organizations were already familiar with the challenges described at the beginning of this article, and quickly found that moving to a MaC flow enabled:
- The provisioning speed and scalability required by today’s faster delivery cycles.
- Better documentation and easier integration by checking monitoring configs into source control.
- The extension of established workflows, of which monitoring is now an organic part.
The last point in this list is likely the one that deserves the most attention. Technology organizations are quickly realizing that they can now take their favorite monitoring service’s provider, write a description of what their monitoring setup should look like from top to bottom, and have a tool such as Terraform automatically do the grunt work for them, all as a stage in their existing deployment pipeline.
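To make the "describe the setup, let the tool do the grunt work" idea concrete, here is a minimal Python sketch of the reconcile step such a tool performs. It is not how Terraform or any particular provider is implemented; the `Check` type, its fields, and the `plan` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """A hypothetical synthetic check; the fields are illustrative only."""
    name: str
    url: str
    frequency_minutes: int = 5


def plan(desired: list[Check], current: list[Check]) -> dict[str, list[str]]:
    """Compare the declared setup with what exists and list the changes needed."""
    want = {c.name: c for c in desired}
    have = {c.name: c for c in current}
    return {
        "create": sorted(want.keys() - have.keys()),
        "update": sorted(n for n in want.keys() & have.keys() if want[n] != have[n]),
        "delete": sorted(have.keys() - want.keys()),
    }


# Desired state, as it would live in source control (assumed example values).
desired = [
    Check("homepage", "https://example.com/"),
    Check("login", "https://example.com/login", frequency_minutes=1),
]
# Current state, as a monitoring platform might report it (assumed example values).
current = [
    Check("homepage", "https://example.com/", frequency_minutes=10),
    Check("legacy-status", "https://example.com/status"),
]

changes = plan(desired, current)
print(changes)
# {'create': ['login'], 'update': ['homepage'], 'delete': ['legacy-status']}
```

Because the plan is computed from the declared state, running it as a pipeline stage means nobody has to remember which checks were added, changed, or retired by hand.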
Today, this kind of solution is essentially plug-and-play for most DevOps teams. We are now at the beginning of the next phase, in which other players are entering the IaC field and, in turn, being used for MaC too. Pulumi is a notable example: it lets teams move away from DSLs such as HashiCorp’s HCL and instead use general-purpose programming languages they may already use day to day.
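As a rough illustration of what a general-purpose language brings, the sketch below generates check definitions with an ordinary function and a comprehension instead of repeating near-identical blocks. It deliberately does not use Pulumi's or any vendor's API; the `api_checks` helper and the dictionary schema are assumptions made up for this example.

```python
def api_checks(base_url: str, routes: list[str], regions: list[str]) -> list[dict]:
    """Build one check definition per route and region (hypothetical schema)."""
    return [
        {
            # "/" has no path segment to name it, so fall back to "root".
            "name": f"{route.strip('/').replace('/', '-') or 'root'}-{region}",
            "url": base_url.rstrip("/") + route,
            "region": region,
            # Assumed policy: probe the checkout flow more often than the rest.
            "frequency_minutes": 1 if route == "/checkout" else 5,
        }
        for route in routes
        for region in regions
    ]


checks = api_checks(
    "https://example.com",
    ["/", "/login", "/checkout"],
    ["eu-west-1", "us-east-1"],
)
for check in checks:
    print(check["name"], check["url"], check["frequency_minutes"])
```

Adding a route or a region becomes a one-line change, and the naming and frequency rules live in one place where they can be reviewed and tested like any other code.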
Users will not want to go back
The scenarios MaC opens up are highly desirable. For example, imagine being able to modify your application and, at the same time, the synthetic monitoring checks that will need to ensure that it is running as expected, all in one pull request to your usual repository. Your policies, processes and stack need no change in order to support that. All of a sudden, the world in which you’d have to configure checks by hand, one by one, seems terribly outdated.
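One way to picture this single-pull-request workflow is a small guard that runs in the existing pipeline and flags application routes that have no check declared alongside them. This is only a sketch: the route sets and the `unmonitored` helper are hypothetical, and a real project would derive them from its own router and monitoring definitions.

```python
# Hypothetical: routes the application serves after this pull request.
APP_ROUTES = {"/", "/login", "/checkout"}
# Hypothetical: routes covered by the checks declared in the same repository.
MONITORED_ROUTES = {"/", "/login"}


def unmonitored(app_routes: set[str], monitored: set[str]) -> list[str]:
    """Return the routes that exist in the application but have no check."""
    return sorted(app_routes - monitored)


missing = unmonitored(APP_ROUTES, MONITORED_ROUTES)
print(missing)
# ['/checkout']
```

A pipeline step could fail the build whenever `missing` is non-empty, so a new endpoint and its check are reviewed and merged together.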
The transition to MaC is in full swing. You can observe it by looking at which cloud monitoring platforms are on the rise, and which ones have one or more publicly available, quality providers for top IaC tools. The shift has achieved critical mass, and Monitoring-as-Code is showing up in more and more businesses around the world, small startups and large enterprises alike. Regardless of which platforms and tools come out on top, it’s clear that users are not going to want to go back to the old paradigm.