

Confessions Of A DevOps-aholic.

May 27, 2019 by Ivo Mägi Filed under: DevOps

I just realized that I belong to the DevOps world too.

Over the past 12 years, I’ve written software in several languages. I’ve contributed code written specifically for the frontend, backend, databases, and other parts of the server and underlying infrastructure. I’ve been part of teams that did releases, rollbacks, and maintenance.

The culture was demanding.

Back when I began, the response to everything was “Fix it!”. Customers yelling in frustration about the application taking too long to load? “Fix it!” Users sending a screenshot of an error? “Fix it!” Pointy-haired bosses asking for a demo? “Fix it! Fix it! Fix it!” Report from QA? “Fix it!”


Nobody ever asked why software in production broke. The question was always “Who broke it?”. There was zero empathy for human error. This became a major source of stress for all the engineers on the team. It was common for engineers to break into a cold sweat when they saw an email from a customer, or when they got a notification that the build was broken.

The blame game was common.

Every function was carried out by a different team, each specialized for that function. Each team had its own priorities. They were optimized in size, skill, and smarts to execute one kind of function. The interfaces between these teams were only as good as the individual engineers on them. Unsurprisingly, things kept falling through the cracks.

Today there is a paradigm shift.

Currently, I work on a team that shares responsibilities across several domains. We design, develop, release, and maintain our software. Our small team of engineers works closely with the product, customer support, and marketing teams. We share a common goal, we don’t have fixed roles, and we enjoy shared responsibilities. We all work together towards our integrated mission.

Today, transparency into how software works is unprecedented.

This was made possible by a change in the way software is written and operated. The intervening years saw many methods of telemetry introduced. Most languages now support some form of instrumentation. Interactions can be recorded, sessions rebuilt, and the experience of using the software faithfully reproduced, in both a technical and a non-technical sense.
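As a rough illustration of what instrumentation means in practice, here is a minimal Python sketch that records the name, duration, and outcome of each call to a function. The names `instrumented`, `RECORDED_CALLS`, and `load_profile` are hypothetical and invented for this example; a real setup would send the records to a telemetry backend instead of an in-memory list.

```python
import functools
import time

# Hypothetical in-memory sink; a real setup would ship records to a telemetry backend.
RECORDED_CALLS = []


def instrumented(func):
    """Record the name, duration, and outcome of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        error = None
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            # Remember what went wrong, then let the error propagate unchanged.
            error = type(exc).__name__
            raise
        finally:
            # Runs on both success and failure, so every call leaves a record.
            RECORDED_CALLS.append({
                "operation": func.__name__,
                "duration_s": time.perf_counter() - started,
                "error": error,
            })
    return wrapper


@instrumented
def load_profile(user_id):
    # Stand-in for real application logic (assumption for illustration).
    if user_id < 0:
        raise ValueError("unknown user")
    return {"id": user_id}
```

Every call to `load_profile` now leaves a record behind whether it succeeds or fails, which is the raw material that monitoring tools aggregate.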

Maturity: courtesy of data, information, knowledge, and wisdom.

The “Fix it!” and “Whodunnit” culture has given way to more mature responses. Today we can assess errors and performance problems in the context of usage, thanks to the patterns that monitoring tools reveal. This allows us to make decisions about when, where, and how to fix problems with our software. We’re no longer chasing utopian dreams of 100% error-free software. There are bound to be errors in software no matter what. Here’s a link to a study about errors in software.

Firefighting has been replaced with streamlined responses.

Errors can be caught before reaching production by good test coverage and well-designed CI/CD pipelines. Errors in production can be captured along with their context. If the application takes too long to respond, a distributed trace shows which spans of the transaction consumed the time. This lets on-call engineers make decisions based on evidence, which takes away much of the usual anxiety. If we abstract away the underlying causes, such as broken code and poor engineering, what remains visible is that releases affect software quality. This is a useful abstraction: it gives a convenient way to characterize the errors that appear after each release, and analyzing them per release lets us quantify how releases affect software quality.
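One simple way to put a number on this is to compare error rates across releases. The sketch below is only an illustration under stated assumptions: each request is reduced to a `(release, failed)` pair, and the release tags and sample data are made up. It is not a description of how any particular monitoring product computes this.

```python
from collections import defaultdict


def error_rate_by_release(requests):
    """Return {release: failed_requests / total_requests}."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for release, failed in requests:
        totals[release] += 1
        if failed:
            failures[release] += 1
    return {release: failures[release] / totals[release] for release in totals}


# Hypothetical sample: (release tag, whether the request failed).
sample = [
    ("v1.4", False), ("v1.4", False), ("v1.4", True), ("v1.4", False),
    ("v1.5", True), ("v1.5", False), ("v1.5", True), ("v1.5", False),
]

rates = error_rate_by_release(sample)
print(rates)  # {'v1.4': 0.25, 'v1.5': 0.5}
```

In this made-up sample the error rate doubles between the two releases, which is the kind of evidence that turns “who broke it?” into “what changed?”.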

The quintessential DevOps toolchain.

To carry out our work, we rely on several tools. Version control is a must. Ticketing systems are needed for planning our work. CI/CD tools and artifact management let us release on demand. Monitoring tools close the final stage of the feedback loop. The cherry on the cake is our real-user monitoring (RUM) solution. Combined with our APM, our RUM tools give us the most objective measure of how users and usage are affected by what we build and release.

No more between the hammer and the anvil.

Perhaps the biggest benefit of modern practices has been the end of the “blame game,” as the new processes allow responsibilities to be shared. The team can finally function as one. Finger-pointing has been replaced with handshaking, creating a truly benign environment in which engineers can flourish and work together.