- Alejandro's Eclectic Newsletter
EN 42: We can't be heroes, not even for one day
The first newsletter of 2024 is here, and it’s written from Madrid. That’s a new one! Unfortunately, I’m not on holiday, on the beach, sunbathing with a mojito in one hand and the laptop in the other. It’s winter here at this time of year, cold and with some rainy days, but at least I’ve seen the sun for longer stretches. Still, swapping the working environment from London to Madrid for a few weeks is priceless. Let’s jump now to the topic of this week.
I remember several meetings held after an incident that had brought the servers down. Praise came my way every time I resolved the issues and brought the service up again. While it didn’t feel bad to see my efforts recognised, it didn’t feel good either. I knew that, most likely, something else was going to break soon, in a few days or the following week.
I was being praised as a hero, as if it were normal for the application to be the Hydra or the Nemean lion, a beast that had to be slain regularly in one of many labours. Performing heroic deeds day in and day out took a toll on me. I was stressed and tired, and I couldn’t relax or feel like I could take long holidays. At that time, I was the only developer. If I had been hit by a bus, the company would’ve had a big problem, and so would I, but of another kind.
When I was being praised for the heroics, all I could think about was that I wanted to hang up the hero’s hat. It can feel good to be “the guy”, the person everyone depends on. It might boost the ego or make you feel important, but under the veneer of “the guy” there’s just a bottleneck, a single point of failure. Don’t be “that guy”.
Never-ending firefighting and heroics lead to stress. It’s not sustainable for anybody, and eventually you just burn out. I’d rather not be indispensable; I’d rather teach a man to fish.
All in all, I managed to get out of the situation over time. The system was in a poor state, and being the sole and most recent developer meant that I was thrown in at the deep end and had to catch up quickly: adapt or die.
My first goal was to see the incidents as valuable training and a learning opportunity. In a normal situation, incidents don’t happen often, so people aren’t prepared to deal with them unless they practise regularly. If, for example, you never restore from backups and then have to do it once in a lifetime, you won’t be ready. Moreover, an incident is not the ideal time to discover that the procedure doesn’t work.
My second goal was to get the system to a better state gradually, focusing on the critical areas first: adding tests, refactoring, fixing faulty logic, improving the CI pipeline, etc.
The third objective was to shine a light on the system and not wait for incidents to happen to get feedback, and ideally, to get much faster feedback. Initially, the application was so opaque that understanding what was happening was an odyssey, and users or colleagues in other departments were the ones who notified us that the system was down or that something wasn’t working. It’s not the greatest feeling. Everything is out of your control and you feel like a fool. It can undermine the perception of engineering within the organisation and even erode your confidence in yourself and your skills. Focusing on observability was one of the key elements in the journey.
When we finally managed to hire developers, the overriding goal was to share my knowledge so that I could afford to be hit by a bus, a plane or whatever came my way.
Part of that last goal was nudging the organisation’s culture towards one that didn’t glorify heroics. Unless I’m saving the human race, I would rather not be praised for essentially being stressed, busy all the time and getting closer to burning out.
If it were up to me, we would have an environment where we value healthy ways of working, a sustainable pace and a humane on-call rota, and where we don’t encourage heroic behaviour. I might be wrong, but I get the sense that paying for extra hours and incidents is a good deterrent, since it puts a very tangible cost on constant incidents and on people needing to stay up late because of unsustainable work.
There are scenarios that might require unsustainable work for a period of time, because of the role you’re in, your involvement or the stage of the company. Even so, my personal experience tells me that most unsustainable work could be avoided or mitigated. Heroic efforts and constant busyness are often romanticised and tied to success, and in my view they are too often a cover for too many priorities, a lack of strategy, or overall dysfunction.
Sometimes extraordinary events need extraordinary measures, and that’s okay, but they shouldn’t be the norm. When they do happen, we should be ready to tackle them, with support, the right tools and the right skills, and we should learn from them.
Nowadays, every time I hear a manager or executive exalting somebody for their countless nights or early mornings spent on releasing a product, for their titanic efforts, or for being the person everyone depends on, my brain does a facepalm.