
The Field Guide to Understanding ‘Human Error’ by Sidney Dekker


This is a book about ‘human error’, so unsurprisingly it focuses on the airline and medical industries. Is safety about making sure those few things don’t go wrong, or about making sure as many things as possible go right, including the not necessarily severe ones? There is a balance to be struck. The old view of safety sees people as a problem to control, whereas the new view of safety sees people as a resource to harness.

Explains the hindsight bias. Finding out about an outcome increases the estimate we make of its likelihood. In other words, as a retrospective reviewer who knows the outcome of an event, you exaggerate your own ability to have predicted and prevented that outcome. Of note, most reviewers are not aware of this bias when analysing adverse events.

The outcome bias. Once you know the outcome, it changes your evaluation of decisions that led up to it. If the outcome is bad, then you are not only more willing to judge the decisions, but also more likely to judge them more harshly.

Divide an operational system into a sharp end and a blunt end:

At the sharp end (for example the train cab, the cockpit, the surgical operating table), people are in direct contact with the safety-critical process.

At the blunt end is the organization or set of organizations that both supports and constrains activities at the sharp end (for example, the airline or hospital; equipment vendors and regulators).

Consider starting an investigation at the blunt end rather than the sharp end.

Make an effort to understand ‘human error’, and avoid micro-matching or cherry-picking. You have to put yourself in their shoes at the time and imagine that you don’t know the outcome. Try to reconstruct which cues came when, and which indications may have contradicted them. Try to envisage what the unfolding trickle or flow of cues and indications could have meant to people, given their likely understanding of the situation at the time.

Try to understand that their understanding of the situation was not static or complete, as yours perhaps is in the review situation. Theirs was incomplete, unfolding and uncertain.

There are a few common names given to ‘human error’.

Ineffective crew resource management (CRM) is one name for why a plane crashed: the failure to invest in common ground, to coordinate operationally significant data among crew members.

“Loss of situation awareness” is the failure to notice things that in hindsight turned out to be critical.

Complacency is another name for ‘human error’: the failure to recognize the gravity of a situation or to follow procedures or standards of good practice. Complacency is an incorrect strategy that leads to sub-optimal monitoring. Important signals may be missed because of operator complacency, because operators place too much trust in their systems doing the right thing. In the battle against complacency, it is essential to help operators retain situation awareness; otherwise they keep missing those warning signals.

“Non-compliance with procedures is the single largest cause of ‘human error’ and failure”. This book clearly points out that labelling things isn’t really helpful. Commonly it is perceived that there is a need to establish the root cause; however, there is often not a single root cause, and in fact many factors interplay.

There is a concept known as plan continuation, in which early and strong cues suggest that sticking with the original plan is a good, and safe, idea. Only later, and weaker, cues suggest that abandoning the plan would be better. In hindsight, it is easy to forget to see the cues from the point of view of the people at the time, and to forget when and how strongly they appeared.

You must appreciate that an event can take only moments and a very small amount of time, while afterwards a large amount of time can be spent studying the adverse outcome, when time is not as crucial a factor.

Dynamic fault management is typical of event-driven domains. When a situation is unfolding, bear in mind that people have to commit cognitive resources to solving it while maintaining process integrity, i.e. other things don’t stop – people need to keep the aircraft flying (or the patient breathing) while figuring out what is going wrong.

Not trouble-shooting or correcting may challenge the integrity of the entire process.

“Investigation” suggests that the ultimate aim is to find out where things went wrong, to be able to offer the one official or definitive account of what happened. That misses the point, as there often isn’t one thing or person that is the cause. The ultimate aim is to learn and improve.

Checklist and procedure assumptions:

Assumption 1—The environment is linear.

Assumption 2—The environment is predictable, meaning tasks and events can all be exactly anticipated, both in nature and timing.

Assumption 3—The environment is controllable.

It is worth acknowledging that complacency may arise when an automated process is perceived as highly reliable: operators may not merely trust it, but trust it too much, so that they fail to monitor the variables often enough.

There are different models for evaluating errors:

Hazard Triangle

Swiss cheese

Chain of events

Barrier model

All have different advantages, but you need to think about which factors are present when you decide which model to use and what to show in it. In addition, there may not be a clear timeline.

Often trade-offs occur when one aspect of safety conflicts with another part of the business process. These trade-offs are negotiated and resolved in the form of thousands of little and larger daily decisions. In time these are no longer decisions and trade-offs made deliberately by the organization, but by individual operators or crews.

What then is accepted as risky or normal will shift over time:

as a result of pressures and expectations put on them by the organization;

as a result of continued success, even under those pressures.

This is known as drift into failure. Drift happens insidiously.

Murphy’s law is wrong. What can go wrong usually goes right, and then we draw the wrong conclusion: that it will go right again and again. With this, we borrow a little more from our safety margins.

A safety culture is a culture that allows the boss to hear bad news.

Ask what presents difficulty on a daily basis: the often-encountered workarounds and frustrations. Such things might indeed be better predictors of system safety and risk than your formally reported incidents. To apply this principle when you do your next safe work observations, do not walk around telling people how they are supposed to work. Try to understand why they work the way they do, and why it is, or seems, normal for them at the time.

If you are running a safety department, try to be the concerned outsider who understands the inside, independent of how you get your safety intelligence. Aim to establish constructive involvement in management activities and decisions that affect trade-offs between safety and efficiency. In terms of qualifications, just being a practitioner (or having once been one) does not in itself qualify a person to be a key member of a safety department. Safety staff members should want to be educated in safety management.

Safety has increasingly morphed from an operational value into bureaucratic accountability. Those concerned with safety are more and more removed organizationally, culturally and psychologically from those who do safety. Workers who perform critical work at the sharp end can come to see the safety processes developed or enforced bureaucratically by those at a distance from the operation as “fantasy documents”. Fantasy documents bear no relation to actual work or actual operational expertise.

Avoid disputes framed as one group versus another, such as one set of job roles versus another.

So can ‘human error’ go away? The answer isn’t as simple as the question. A ‘human error’ problem, after all, is an organizational problem. It is at least as complex as the organization that has helped create it. To create safety, you don’t need to rid your system of ‘human errors’. Instead, you need to realize how people at all levels in the organization contribute to the creation of safety and risk through goal trade-offs that are legitimate and desirable in their setting.

Rather than trying to reduce “violations,” aim to find out more about the gap between work-as-imagined and work-as-done—why it exists, what keeps it in place and how it relates to priorities among organizational goals (both stated and unstated).

Aim to learn about authority–responsibility mismatches, in which you expect responsibility of your people but the situation does not give them the authority to live up to that responsibility.

You know your organization is improving when it tries to learn about safety, including calibrating whether its strategies for managing safety and risk are up to date.

Every organization has room to improve its safety. What separates a strong safety culture from a weak one is not how large this room is. The most important thing is that the organization is willing to explore this space, to find leverage points to learn and improve.
