The processes that produce the output also generate a number of undesired
outcomes. These go by many names, including non-value-creating,
unproductive, ineffectual, deficient, defective, barriers, discrepancies,
waste, injuries, and losses. The operational system resides in a larger
system called the
organization. It, too, has systems and people. To differentiate between the
two, the people at the operational level are the producers, and the people
at the organizational level are the managers. The systems at the operational
level may be the plant and equipment as well as practices and processes
focused on producing the product or rendering the service. At the
organizational level, the systems are policies and procedures designed to
run the business and manage the workers in an efficient and productive
manner.
Figure 1: Organization and Its Operational System
Performance Management
Traditionally, performance has been managed by setting goals for
producers (employees) to achieve. These goals may be related to production,
quality, or injury. When the goals are not achieved, a series of actions
invariably follows. The worker is then trained, counseled, retrained,
admonished, possibly punished, demoted, or let go. Generally, the
interventions are directed at the worker, ignoring the fact that in the
operational model as described above there are two sources of failure risk:
people and processes. Since producers work within the system (interface with
the system), it influences them and may cause them to take actions or make
choices or decisions that result in errors or discrepancies and so lead
to underachievement. The organizational systems affect the producers as
well.
Generally, such human failings (producer errors) are attributed to
inattentiveness, poor judgment, lack of focus, lack of capability, or
negligence, to name a few. This attribution discourages digging into the
underlying reasons for such failures, which generally reside deep in the
systems, processes, procedures, and practices of the organization. Human
error is simply a difference between an actual state and a desired state. It
is important to note that not all human errors result in catastrophic
outcomes; in many
cases, the results are tolerable, inconsequential, or may even turn out to
have positive results. To understand failure, we must also understand our
reaction and response to failure. People do not operate in a vacuum, where
they can decide and act all-powerfully. To err or not to err is not a
choice. Instead, people's work is subject to, and constrained by, multiple
factors.
The Human Error Factor
The impact of human error on organizations is far-reaching in terms of
productivity, customer service, quality, teamwork, decision-making,
execution, injury, and loss. There is little in terms of statistics for most
of these categories except for accidents. In many of the most serious
accidents in the last 50 years, almost all initial findings attributed the
failures primarily to human error. As examples:
- In 1965, near Searcy, Arkansas, 53 contract workers were killed
during a fire at a Titan II missile silo.
- In 1978, in a construction site disaster at a power plant in West
Virginia, a cooling tower collapsed, killing 51 workers.
- In 1984, in Bhopal, India, a gas leak at a Union Carbide plant released
methyl isocyanate, killing thousands of people; some estimates put the
eventual toll as high as 20,000.
- In 1988, the Piper Alpha oil platform explosion killed 167 and
resulted in a major oil spill.
- The 1989 Phillips explosion in Pasadena, Texas, killed 23.
- In 1989, the Exxon Valdez oil spill in Alaska was a major environmental
disaster.
- In 1991, the Hamlet Chicken processing plant fire in North Carolina
killed 25 workers.
- In 2005, the Texas City BP refinery explosion killed 15 workers.
- In 2008, a sugar refinery explosion in Georgia killed 14 workers.
- In 2010, the explosion on the BP Deepwater Horizon rig in the Gulf of
Mexico killed 11 men, injured 17 others, and caused a major oil spill.
Going back to the organizational model within which the operational model
exists, we identified systems and people (management). Managers devise
the systems and, being human, are fallible, so they create systems with latent
defects. The producers at the operational level have to function within the
systems, and these latent defects, combined with operator errors, may lead
to failures. The progression of latent conditions may start with the
organization's hiring practices, followed by employee development; promotion
practices; management's actions; supervisor's goals; operational
constraints, requirements, communication, and information flow; task design;
and physical environment. All these latent conditions influence the
producer's (worker's) choices and decision making. When the worker makes the
wrong choice or makes an error, this may lead to an active failure, which
may or may not have adverse effects on whatever is being evaluated
(production, quality, or injury). Latent conditions are discrepancies in the
systems that facilitate error on the part of the producer.
Traditional Approaches to Combatting Human Error
The traditional approaches to managing human performance are not highly
effective. Most of these approaches are driven by performance goals, metrics, and
recognition/rewards. One of the underlying reasons is that the established
goals and metrics are set without a thorough understanding of the impact
these may have on other aspects of performance and results. There is a vast
array of reasons for underperformance. One of the more insidious is human
error. This is a "newer" area of study within human factors, and until
recently, its causal analysis and interventions have been more an art than a
science.
Performance has to be reliable, and the system has to be robust. That
means that the human-task interface has to be free of error, and the system has
to be tolerant of unexpected conditions should they arise. Another aspect of
performance is resilience. The system has to be able to recover and return
to a steady state without much difficulty or delay.
Human error is inevitable and occurs for many reasons. The reasons may
reside with the individual or the organization's systems. The mismatch
between the actual and desired state may stem from a misunderstanding of the
task, task demand, capability, knowledge, motivation, goals, information,
communication, politics, human dynamics, supervision, climate, culture, and
leadership, to name a few. It is also affected by the ability of humans to
perform a task in a myriad of different ways or to break (often
unintentionally) an "unbreakable" system. This is one of the reasons why
protective systems are sometimes breached.
Research has shown that humans do learn from their mistakes. So, from
this perspective, making mistakes is not all that bad. It would seem that
the way to address performance issues is to make the result (consequences)
of the mistakes as inconsequential as possible. Therefore, a performance
management strategy might include a number of elements, one of which might
be designing out the error-producing elements of the systems, or at least
reducing their frequency. The next step might be handling the consequences
of the error so as not to impact the goal/mission achievement by returning
the process to its former unimpaired state.
Human Error Prevention
There are two ways to prevent human error from affecting performance. The
first is to stop people from making mistakes (avoidance); the second is to
keep the mistake from impacting the system (interception). Both kinds of
preventive intervention require that the potential errors be known before
they occur. The techniques include design, automation, reduction of
exposure time, error proofing, training, etc. For training to be most
effective, it has to focus on concepts (education) and not just practice and
procedures. Stopping mistakes from occurring has proven difficult, as humans
invariably find different ways to go about performing their tasks, bypassing
interlocks or aids, and just plain making mistakes. That does not mean giving
up on prevention, as it does have benefits and reduces some of the
potential for making errors.
Another aspect of human error is that the error may be made by another
person upstream from the producer's activities. These are latent errors
(defects). The process itself may fail and cause the producers to fail.
Designing systems with an understanding of recovery time is also important.
Consider the example of the Soyuz 11 capsule. On its return to Earth, at the
capsule separation stage, a pressure equalization valve prematurely opened,
venting the internal atmosphere. This took about 45 seconds. To manually
close the valve took 60 seconds. There is evidence that the crew attempted
to close the valve, but events overtook them, and they perished. The design
should have taken this into account so that the manual operation could be
completed before total loss of breathable air occurred.
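The timing argument above can be written as a simple design check. The
following is a minimal sketch, not an engineering method: the function names
and the safety_factor parameter are assumptions made for illustration, and
the figures are the approximate times quoted in the text.

```python
def recovery_margin(time_to_failure_s: float, time_to_recover_s: float) -> float:
    """Seconds to spare once manual recovery is complete (negative = too late)."""
    return time_to_failure_s - time_to_recover_s


def is_recoverable(time_to_failure_s: float,
                   time_to_recover_s: float,
                   safety_factor: float = 1.0) -> bool:
    """True if recovery, padded by a hypothetical safety factor, fits in time."""
    return time_to_recover_s * safety_factor <= time_to_failure_s


# Figures quoted in the text: about 45 s to vent, about 60 s to close the valve.
margin = recovery_margin(45.0, 60.0)
recoverable = is_recoverable(45.0, 60.0)
print(f"margin: {margin} s, recoverable: {recoverable}")
```

A negative margin means the recovery action cannot be completed in time, so
the design has to either slow the failure or speed up the recovery.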
Developing Error Tolerance
But errors are going to be made, and error avoidance is not "foolproof,"
so the next step is critical in optimizing performance—minimize the
consequences of the errors. Error tolerance can be achieved in a couple of
ways.
- For systems where errors cannot be designed out or blocked,
there should be a way to detect errors early, and mechanisms developed to
recover from them without significant impairment of performance. An example
of this is a checklist utilized before engaging in an activity. Pilots
routinely go through a preflight checklist. This has helped to render flying
safer. Checklists can also be used after completion of an activity, such as
maintenance, to ensure that the equipment is operable and in good working
order.
- Deviations or errors that are not detected, or are detected "late,"
are going to have consequences. These unexpected and undesired outcomes
must be minimized so as not to adversely impact performance. Such a process
will keep an error from escalating into a major undesirable event. Examples
of this might include routine maintenance, redundant systems, seatbelts,
fall arrest, etc.
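The checklist idea in the first point can be sketched in a few lines. This
is an illustration only: the run_checklist function, the item names, and the
equipment state are all hypothetical and are not drawn from any real
procedure.

```python
from typing import Callable, Dict, List


def run_checklist(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check and return the names of the items that failed."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a check that cannot be run counts as a failure
        if not ok:
            failed.append(name)
    return failed


# Hypothetical equipment state after a maintenance job.
equipment = {"guards_in_place": True, "tools_removed": False, "power_restored": True}

post_maintenance = {
    "guards in place": lambda: equipment["guards_in_place"],
    "tools removed": lambda: equipment["tools_removed"],
    "power restored": lambda: equipment["power_restored"],
}

failed_items = run_checklist(post_maintenance)
ready = not failed_items  # release the equipment only if nothing failed
print(failed_items, ready)
```

The sketch runs every item rather than stopping at the first failure, so a
single pass surfaces all open discrepancies before the activity begins.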
Become Resilient
The next element in managing human error is making the organization and
its systems resilient. That means there is a built-in mechanism to deal
effectively with error and changing conditions while recovering from adverse
effects and quickly returning to "normal" operations. Agile
resilience has five elements: leadership, culture, people, systems, and the
work environment.
Figure 2: Illustration of Resilient Elements
Resilience begins with a vision set by the leadership. The organization
must select the right people and provide the resources to devise the systems
that foster resilience. Leadership must also establish the acceptable level
of risk and the "right" balance between risk taking and risk avoidance.
Leadership must create a climate where it is okay to make mistakes and, once
made, ensure that lessons are learned and disseminated throughout the
organization.
A resilient culture is built on four pillars. These are trust, purpose,
empowerment, and accountability. Such an organization has a strong sense of
purpose that flows vertically and horizontally to all the employees. It
encourages self-directed teams that innovate and communicate
cross-functionally. The four pillars bind the organization into a cohesive,
innovative, purposeful group with a sense of commitment to action, problem
resolution, and win-win thinking, and with a passion for excellence.
The core of any organization is its people. The organization must select
the "right" people, who are motivated, have the courage to challenge the
process, are willing to work toward a common goal, share a common vision and
purpose, and are willing to overcome obstacles and barriers. The
organization must provide the timely information and resources which will
facilitate effective decision making and problem solving.
Systems in a resilient organization have an open structure that allows
for the flow of information and resources. Such systems foster innovation
and agility. The systems and subsystems are integrated and aligned with the
organization's goals and objectives. They enhance risk assessment and
selection. They allow for effective planning and strategy implementation.
They support and reward innovation and cooperation, enhance flow, and create
value.
The work environment in a resilient organization is flexible and
conducive to learning (from one's mistakes). It is designed so as to
minimize latent defects in the systems. The strategy, objectives, goals, and
metrics are integrated so as to achieve excellence.
Conclusion
Performance management has taken on urgency in the realities of the 21st
century. The traditional business models and management approaches that
have worked well in the past cannot be used to solve the problems of today
(Einstein). It is these very tools and techniques that have gotten us to
where we find ourselves now. The more productive approach is to identify the
challenges, define the problems, face reality, stop treating the symptoms,
dispel the myths, assess the organizational system and people constraints,
foster integration, communicate a compelling vision, move away from command
and control, foster trust, empower the people, and lead, lead, and lead.