Filipe Ximenes

Posted on • Originally published at open.substack.com

Fix the issue first, improve the process later

Fixing your process won't get you out of a crisis. During a crisis, your goal should be to get out of it, not to fix your operation. Crises are special situations that, for whatever reason, couldn't be prevented, meaning your process either failed or has no answer for the situation. In both cases, what gets you out of the crisis is not fixing the process.

While fixing the process can prevent the same issue from happening again, it's often not the best way to address the problem at hand. What gets you out of a crisis is analyzing the specific context, evaluating risks, and finding the optimal solution for that unique case. Sometimes that even means bending the rules and stepping outside of established processes.

During a crisis, you and your team are under pressure, stakeholders are waiting for a solution, and there's likely financial risk involved. Speed is the name of the game, so the best solution is whatever reduces impact and gets you out of the situation fastest.

It's also important to note that during a crisis, your ability to think clearly, analyze the situation, identify systemic problems, and propose solutions is severely impaired. This isn't the right environment for making long-term plans. The sooner you leave crisis mode, the sooner you'll have the space to properly think about necessary process changes.

The good news is that fixing the immediate issue is often much simpler and faster than changing processes. It doesn't require as much planning and alignment, and needs a much narrower solution. Keeping your full focus on solving that particular situation gives you flexibility to come up with one-off solutions that don't need to scale or meet long-term requirements of the product.

There's a story in the Risk Management chapter of my book that I think illustrates this concept really well:

"We had a situation in one of Vinta’s projects where an application used a secondary database to store metadata about users’ actions. This information was relevant but not critical to the operation of the system. An issue caused by our infrastructure provider during a planned maintenance window took down this secondary database, but it didn’t affect the main one. Unfortunately, the application code was designed in a way that let the error in the secondary database break some of the main flows of the product. We evaluated the situation and decided that it was more important to restore the application to users, even if that meant losing some of the metadata. So our first step was to comment out all calls to the secondary database and deploy a new version of the application so people could get back to using it. We then deployed a new secondary database, restored the data from a backup, and uncommented the calls. At this point the application was fully operational again and the team was out of crisis mode, so we could take our time to design and build a long-term solution that would prevent the issue from happening again."
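To make the failure mode in that story concrete, here is a minimal Python sketch of a main flow that keeps working when a non-critical metadata store is down. It is not the code from the project or the book: every name in it (`METADATA_ENABLED`, `record_metadata`, `place_order`, `BrokenStore`, `SecondaryDatabaseDown`) is hypothetical, and the flag stands in for the much blunter fix the team actually used, which was commenting out the calls and redeploying.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical kill switch. In the story the calls were simply commented out;
# a flag like this is one assumed way to get the same effect.
METADATA_ENABLED = True


class SecondaryDatabaseDown(Exception):
    """Stands in for whatever error an unavailable secondary database raises."""


class BrokenStore:
    """Simulates the secondary database being down."""

    def append(self, event):
        raise SecondaryDatabaseDown("connection refused")


def record_metadata(store, event):
    """Best-effort write of non-critical metadata; never raises to the caller."""
    if not METADATA_ENABLED:
        return False
    try:
        store.append(event)
        return True
    except SecondaryDatabaseDown:
        # Losing this event is the accepted trade-off for keeping users unblocked.
        logger.warning("metadata store unavailable, dropping event: %s", event)
        return False


def place_order(orders, metadata_store, order):
    """Main flow: the order is saved even if the metadata write fails."""
    orders.append(order)
    record_metadata(metadata_store, {"action": "order_placed", "order": order})
    return order


# The main flow succeeds even though the metadata store is down.
orders = []
place_order(orders, BrokenStore(), "order-1")
```

The point of the sketch is the trade-off rather than the pattern itself: during the incident, dropping some metadata was an acceptable price for restoring the main flows, and a more careful design could wait until the team was out of crisis mode.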

If you like my writing, please consider supporting me by subscribing to my newsletter and buying my book, Strategic Software Engineering, on Amazon.
