Building Resilience & Psychological Safety in Engineering Teams
💡
This post is adapted from a conference talk that I've given multiple times, originally at the 0111 CTO Conference in San Diego, California in 2021. It's been revised and updated many times - this is the latest version which I keep up to date.

Having been an engineer for the last 10 years and having grown Unibuddy's engineering organisation from just myself to ~50 engineers, I've recently realised I’ve been missing a really big trick for most of my career. While I’ve spent a lot of time working on technical aspects like architecture and developer tooling, or people aspects like attracting talent and getting the team topology right, what I’ve become really passionate about is the psychological side of building elite engineering teams. I've found that it's the psychological and emotional scaffolding that holds together great talent, the right team topology, excellent technical architecture, and brilliant tooling. I believe what I'm going to share in this post is critical to all of us, and it's what I call the “psychological architecture” of your team.

The Missing Piece of the Puzzle?

Psychological architecture is a framework that recognises, acknowledges and supports what makes engineers human and makes them "tick".

Without it, the impact of all the great technical and other work we might be doing is limited.

Here's why we need to focus on it:

  • We tend to focus on tools and processes as technology leaders, even though the Agile Manifesto values individuals and interactions over processes and tools.
  • If we're neglecting team psychology, it's probably because we think it's something less measurable or it’s the job of HR. I'm here to tell you why it's our job and how we can quantify it.
  • Lastly, psychological architecture offers best practices that make us much better engineers by focussing on how we behave and interact, instead of what technical skills we have.

I'm going to offer you an overview of this framework that you can use to build it into your team.

How I Created This Framework

When I first started building teams at Unibuddy around 2017, I did what every technology leader has done before me: I took things that were measurable, that were in all the recommended books, that were clear best practices - and I made sure we implemented them.

As we grew, however, I started to wonder why we weren't innovating like we used to five years ago, and I wondered if it was my fault for not empowering our product squads enough. Was it possible that we had focussed on process too much and it was now detrimental to getting the best out of engineers? Or is it inevitable that every organisation suffers from this feeling of dwindling speed and innovation as it becomes bigger? The engineer in me went out to find the answers.

I started looking around, and one of the first things that came across my desk was the State of DevOps report. Its biggest finding is that psychological safety is the critical foundation that elite teams are built on - and this is from one of the biggest engineering studies of all time! The report referenced Project Aristotle and that's where I went next. For those of you who haven't heard of it, it's a study by Google that examined all their teams, and it further backed up the emphasis on psychological safety. This made me realise that I needed to understand more about psychology.

So I wanted to go outside engineering and expand my horizons. I remembered that the best innovation often comes from the cognitive diversity produced by cross-disciplinary learning, so I took an online psychology course from the University of Pennsylvania that focussed on resilience and positive psychology.

The combination of the DevOps report, Project Aristotle and the psychology course made me realise that what I was missing from my toolset was a framework for building resilience and psychological safety in engineering teams. These are the key components of a sustainable psychological architecture (PA). Once you put this framework in place, it's a multiplier and you have it for life - therefore it's worth the investment. In addition, let me say that building this psychological architecture is our job as technology leaders, not HR’s, because it's a leadership issue. Psychological architecture is about building habits which start with you - both calling out bad behaviour and setting an example of good behaviour.

Now, let's dive into the framework.

Component 1: Building Resilience

The first component of psychological architecture (PA) is resilience, which is the most critical quality that you want to build. Psychologists define resilience as follows.

Resilience is the ability to bounce back from negative emotional experiences and flexibly adapt to the changing demands of stressful experiences. But just as important, it's the “ability to grow from challenges”.

The research is clear that teams and companies that are resilient will outlast the competition and perform better time and time again. As a team leader it's important to understand the scientific variables that either support or inhibit resilience in each individual in your team - there are eight of them in total.

Some of these variables you can't affect or change, such as the biology or genetics of individuals in your team, but many you can. The most impactful one is optimism, and that's where we're going to focus. There are three tools I'm going to share that you can use to build resilience in yourself and your team through optimism.

Tool 1: Be aware of optimistic behaviours

First, we have to be aware of the behaviours psychologists have identified in optimists. There are at least nine of them, ranging from being skilful at identifying problems to being approach-oriented, which means coming up with strategies instead of avoiding problems.

I won’t go through each behaviour – but I do want to highlight two behaviours that I’ve found to be really powerful.

a) Seeing problems as a challenge instead of a threat

One of the most useful optimistic behaviours is reframing problems as challenges instead of threats. To give you an example, a few months ago we were implementing a tool called Haystack to measure the engineering KPIs of our squads at Unibuddy. Some of our squads felt a bit intimidated by this and confided in me that they were worried the numbers might not be good. I realised they felt threatened, so I attempted to reframe it and suggested they look at it differently: “What if you used these KPIs as an opportunity to identify technical debt and challenging engineering work that will improve them?” Since then, they have been more curious than frightened to see the KPIs and figure out how to influence them in the right direction. The result was that I noticed squads taking more ownership of building their technical roadmap.

b) Using humour to deal with situations

The other optimistic behaviour I want to highlight is the ability to use humour to deal with situations – even really stressful ones. This is an optimistic behaviour that can instantly change the atmosphere and improve team morale. Let me give you an example. A couple of years ago, we had some downtime when a server stopped working. At the time we were hosting our apps on Heroku instead of AWS – and one of our senior engineers had two tabs open in the browser: one with our QA environment and another with our production environment. He was planning to turn off the QA server but confused the tab he was in and turned off the production server – happily confirming the “Are you sure?” dialog that came up, thinking he was turning off the QA server.

During our post-mortem we realised two things that lightened the situation. The first was that this mistake could happen to anyone, so no-one needed to be blamed. But more than that, we realised that no matter how seriously we take ourselves as engineers, there's no surefire way to eliminate risk. Even the best guardrails or warnings like "Are you sure you want to turn this off?" won't solve the problem if you don't know which environment you're in. And don't we all lose track of this at some point with so many tabs open? That turned a very tense situation into a humorous one – and we then focussed on action items to make the environment you’re in more obvious to engineers, to prevent such a mistake from happening in future.
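
To make that kind of action item concrete, here is a purely illustrative sketch in Python (not the actual fix from that post-mortem). It assumes one possible guardrail: instead of a generic "Are you sure?", the destructive action names the environment loudly and only proceeds when the engineer types that environment's name. The function names `shutdown_prompt` and `confirm_shutdown` and the environment names are hypothetical.

```python
def shutdown_prompt(environment: str) -> str:
    """Build a prompt that names the environment explicitly.

    Production gets an extra banner so it cannot be mistaken for QA.
    """
    banner = "!!! PRODUCTION !!! " if environment.lower() == "production" else ""
    return (
        f"{banner}You are about to turn off '{environment}'. "
        "Type the environment name to confirm: "
    )


def confirm_shutdown(environment: str, typed_confirmation: str) -> bool:
    """Return True only if the typed text matches the environment name.

    A reflexive "y" or "yes" is rejected, so the engineer has to read
    and repeat which environment they are acting on.
    """
    return typed_confirmation.strip().lower() == environment.strip().lower()
```

With this shape, an engineer who believes they are in QA would type "qa" into a production prompt and the action would be refused, which is the point of making the environment part of the confirmation itself.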

So those are just two behaviours of optimists that I’ve highlighted. Ultimately, you want to be aware of all such behaviours, show them to your team and praise team members when they exhibit these behaviours. You can even look at incentivising these behaviours in your career progression frameworks and company value system.

Tool 2: Use the "Optimistic Method"

The “Optimistic Method” is actually a simple formalisation of one of the optimistic behaviours in the previous list where optimists “focus on aspects of the problem they can control, and accept what they cannot control”.

We face a variety of challenges as technology leaders, from unexpected product delays to tough hiring markets to team members that might be difficult to work with. Whenever you’re faced with one of these, develop the habit of sitting down and coming up with three lists:

  • Things you can control
  • Things you have to accept
  • List of action items

It will usually only take you five minutes but you’ll instantly feel more optimistic and in control after doing it. Develop the habit yourself and train your team to do the same and it will make your whole company more resilient.

Tool 3: Communicating as an optimist

The third and final tool to build resilience through optimism is communicating from an optimistic perspective. To demonstrate this, we’re going to look at two narratives after a technical incident at a large company by two different tech leaders.

Let's say Facebook goes down for 5 hours after an unpredicted spike in traffic. The engineering team starts to look for the root cause internally and finds it was a messaging service that didn’t scale.

The VP Engineering now starts to feel unsure of the reliability of all systems: if this happens to one service, it could happen to any service, right? The VP tells all teams to improve the scalability of their services. Both the VP and the business as a whole start to lose confidence in the reliability of all products, which turns into a lack of trust in the entire engineering team. The lack of trust spreads amongst the engineering team and they feel they aren’t doing well in general and are stuck with poor, untrustworthy systems. So the situation goes from a single problematic incident to the team feeling like it's the end of the world as they know it.

Psychologists call this way of describing the situation INTERNAL, STABLE and GLOBAL.

  • Internal because it focussed on the internal causes or why we are to blame.
  • Stable because the team now think they have permanently unreliable systems even though it was a temporary event that caused it.  
  • Global, because one service going down has affected the entire company’s confidence in every system.

This kind of catastrophising and communication happens far more often than it should in organisations both big and small, and it can destroy a company’s culture.

Instead, let’s apply the optimistic perspective to the same incident. We want to encourage a perspective that describes the situation as largely caused by EXTERNAL, TEMPORARY and SPECIFIC events.

What really happened is that a unique circumstance came into play: an external spike in traffic. This is a temporary event that isn’t happening every day on a permanent basis. The issue we found is very specific to a single service, while we have hundreds of services that scaled exceptionally well - so in reality at most 1 in 100 services is problematic. When we describe it like this, we're saying we’re doing really well as an engineering team even if we had a setback - and the team can then focus on fixing the specific problem to prevent it from happening again.

You can apply this process to any event that could create negative feelings in your team. If you had to fire someone, if you had a product launch go badly, if you had someone resign - how did you describe the situation to your team? What was the impact? How could you have done it differently? Communicating from an optimistic perspective going forward will improve your team's morale and resilience.

To summarise "Component 1: Building Resilience": you've now learnt about three powerful tools to build resilience through optimism. Research shows there are many benefits to building optimism, ranging from coping better with stress to being less likely to suffer from depression, and even medical benefits such as having a more robust immune response or being less likely to have heart disease. Optimistic and resilient people also have a greater quality of life and achieve higher academic, sports and career performance.

Ultimately, by applying these tools in your team, you're making them happier at work - and happy teams become high performing teams, not the other way around.

Component 2: Measure & Improve Psychological Safety

The second and final component of our framework is psychological safety. It’s a term coined by Professor Amy Edmondson at Harvard about fifteen years ago and it has become a bit of a buzzword today – so let’s start by defining it clearly.

Psychological safety is a belief that one will not be punished or humiliated for speaking up with ideas, questions, concerns or mistakes.

So where do we even start with this and how can you measure and improve psychological safety in your team? When I set about doing this, I spent hours searching for resources and reading tons of blog posts. Most resources I found are vague and don’t provide you with concrete action items, but there’s one excellent resource which I’d recommend.

💡
This is the psychological safety toolkit that you can download at psychsafety.co.uk.

There are a bunch of amazing documents in it and it’s impossible to give you the detailed process in this post – so I’m going to highlight a few critical pieces of information to give you a taste of what you need to do to start measuring and building psychological safety in your team.

The Fundamentals of Psychological Safety

Let’s start with the three fundamentals of safety. These are the behaviours that you want to be aware of and incentivise:

  1. Creating the space to speak: how are you modelling curiosity and encouraging your team to question and challenge you and other team members?
  2. Everything is an experiment: we know how important framing is, so are you framing work as a learning problem or an execution problem?
  3. Admit your mistakes: How often do you acknowledge your own mistakes and fallibility? As a leader you have to set the example.

So now that we have an overview of what safety is about, the first thing you want to do is measure it! This involves sending out a quick survey consisting of the 10 questions that you’ll find in the toolkit. These questions are adapted from Amy Edmondson’s The Fearless Organization. It will take each team member about five minutes to complete anonymously and you’ll immediately identify the current state of safety in your team.
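
As a minimal sketch of how the anonymous responses could be summarised, the Python below averages the answers per question and across the team. It assumes each question is answered on a 1 to 5 agreement scale and that any negatively worded questions need reverse-scoring so that a higher number always means more safety. The scale, the reverse-scored positions and the function names are assumptions made for illustration; follow the toolkit's own scoring guidance where it differs.

```python
from statistics import mean

SCALE_MAX = 5  # assumption: answers are on a 1-5 agreement scale


def normalise(response, reverse_scored):
    """Flip negatively worded questions so higher always means safer.

    `response` is one person's list of scores; `reverse_scored` holds the
    zero-based positions of the questions to flip (hypothetical here).
    """
    return [
        (SCALE_MAX + 1 - score) if index in reverse_scored else score
        for index, score in enumerate(response)
    ]


def summarise(responses, reverse_scored=frozenset()):
    """Return (team average, per-question averages) for all responses."""
    normalised = [normalise(r, reverse_scored) for r in responses]
    # zip(*...) regroups the scores by question instead of by person.
    per_question = [mean(scores) for scores in zip(*normalised)]
    return mean(per_question), per_question
```

The per-question averages are usually more useful than the single team number, because the lowest-scoring question tells you which behaviour to work on first.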

Finally, in this crash-course on psychological safety, I want to point out that there is a misconception that psychological safety is about being “soft” or accepting lower standards. This is not the case. In fact, a culture of psychological safety enables higher standards of performance. The problem is you can have high levels of safety but low drive. The psychological safety quadrant illustrates this in the top left of the quadrant where the team is in a comfort zone.

So in addition to measuring safety, you also want to understand the drive in your team. You can measure whether there’s low drive in your team by doing the 'team performance workshop' described in the toolkit, which helps you complete this quadrant. For building drive in your team, see Daniel Pink's classic Drive, which breaks drive down into the three components of autonomy, mastery and purpose - you're likely missing one of those if drive is low.

We’ve now discussed the entire framework and you have a set of tools to build both resilience and safety in your teams. In essence this is all about how you make people feel. Engineers will decide whether to stay or leave your team, whether to give 200% or not, and make every other important decision ultimately based on how you make them feel and how they feel being a part of your team.

As Maya Angelou famously said:

“People will forget what you said, people will forget what you did, but people will never forget how you made them feel.”

