My Take on the Observability Maturity Model

A prelude to our upcoming six-part Observability Maturity Model Fundamentals blog series. Use it to identify where you are on the observability path, understand the road ahead and find your way.

By Lodewijk Bogaards

At StackState, we have spent eight years in the monitoring and observability spaces. During this time, we have spoken with countless DevOps engineers, architects, SREs, heads of IT operations and CTOs, and we have heard the same struggles over and over.

Today’s consumers are used to great technology that works all the time. They have little tolerance for outages or performance issues. These expectations push businesses to stay competitive through frequent releases, ever-faster responses and greater reliability. At the same time, the move towards cloud-based applications – with all their ever-changing functions, microservices and containers – makes IT environments more complex and harder than ever to operate and monitor.

As a result, we see the same monitoring challenges at organizations around the world, such as this vivid scenario described by a customer:

“When something big broke in the infrastructure, storage, networking equipment or something like that... every time we saw the same movie. The monitoring gets red, red, red, thousands of alarms, nobody knows what’s the root cause. Everybody is panicked – real total chaos.”

- Georg Höllebauer, Enterprise Metrics Architect at APA-Tech (listen to the podcast episode with Georg here)

Since we released the original Monitoring Maturity Model in 2017, it has become clear that traditional monitoring tools – which simply notify IT teams when something is broken – are no longer sufficient. Today’s engineers need to immediately understand the priorities and context surrounding a problem: what is the impact on customer experience and business results? Then, if the impact is high: why did it break and how do we fix it?

The concept of observability has evolved from monitoring to answer those questions. Observability is vital in maintaining the level of service reliability needed for business success. Unfortunately, navigating the monitoring and observability space is hard, especially as AIOps enters the picture. First, nobody seems to have a clear definition of what observability and AIOps mean. Second, many vendors are making a lot of noise in the market and new open source projects are popping up left and right. It’s hard to know who really does what, and even harder to know which capabilities really matter.

With the Observability Maturity Model, we hope to shine some light in the darkness. Our goal is not to present you with the perfect model of what your observability journey should look like. We know it doesn’t work like that. To quote the British statistician George Box, “All models are wrong, some are useful.” Rather, we developed this Observability Maturity Model based on real-world experience and our work over the years with many enterprises. It is meant to help you identify where you are on the observability path, understand the road ahead and find your way.

May this model be useful to you on your journey!

Lodewijk Bogaards

Co-Founder and Chief Technology Officer at StackState

Download the Observability Maturity Model white paper here.
