Guide To Root Cause Analysis – Steps, Techniques & Examples

This tutorial explains what Root Cause Analysis is and covers techniques such as Fishbone Analysis and the 5 Whys technique.

RCA (Root Cause Analysis) is a structured and effective process for finding the root cause of issues in a software project team. If performed systematically, it can improve the performance and quality of the deliverables and the processes, not only at the team level but also across the organization.

This tutorial will help you define and streamline the Root Cause Analysis process in your team or organization.

This tutorial is intended for Delivery Managers, Scrum Masters, Project Managers, Quality Managers, and members of the Development, Test, Information Management, Quality, and Support teams, among others. It covers the basics of Root Cause Analysis and provides templates and examples.

What Is Root Cause Analysis?

RCA (Root Cause Analysis) is a mechanism for analyzing defects to identify their cause. We brainstorm, read, and dig into the defect to identify whether it was due to a “testing miss”, a “development miss”, or a “requirement or design miss”.

When RCA is done accurately, it helps prevent defects in later releases or phases. If we find that a defect was due to a design miss, we can review the design documents and take appropriate measures. Similarly, if we find that a defect was due to a testing miss, we can review our test cases or metrics and update them accordingly.

RCA should not be limited to defects found during testing. We can do RCA on production defects as well. Based on the outcome of the RCA, we can enhance our test bed and include those production tickets as regression test cases. This helps ensure that the defect, or similar kinds of defects, is not repeated.

Root Cause Analysis Process

Introduction

RCA is used not only for defects reported from a customer site, but also for UAT defects, unit testing defects, business and operational process-level problems, day-to-day problems, etc. Hence it is used in multiple industries such as software, manufacturing, health, and banking.

Conducting Root Cause Analysis is similar to the work of a doctor who treats a patient. The doctor first understands the symptoms and then refers to laboratory tests to analyze the root cause of the disease.

If the root cause of the disease is still unknown, the doctor orders scans to understand further. The diagnosis and study continue until the doctor narrows down to the root cause of the patient's sickness. The same logic applies to Root Cause Analysis performed in any industry.

So, RCA is aimed at finding the root cause and not treating the symptom, by following a specific set of steps and associated tools. It is different from defect analysis, troubleshooting, and other problem-solving methods as these methods try to find the solution for the specific issue, but RCA tries to find the underlying cause.

Origin of the name Root Cause Analysis:

Leaves, trunk, and roots are the most important parts of a tree. The leaves [Symptom] and trunk [Problem] are above the ground and visible, but the roots [Cause] are under the ground and not visible, and they can grow deeper and spread further than we expect. Hence, the process of digging to the bottom of an issue is called Root Cause Analysis.

Advantages Of Root Cause Analysis

Listed below are some of the benefits you will get:

  • Prevents the recurrence of the same problem in the future.
  • Reduces the number of defects reported over time.
  • Reduces development costs and saves time.
  • Improves the software development process, aiding quick delivery to market.
  • Improves customer satisfaction.
  • Boosts productivity.
  • Finds hidden problems in the system.
  • Aids continuous improvement.

Types Of Root Causes

#1) Human Cause: Human-made error.

Examples:

  • Lack of the required skills.
  • Instructions not duly followed.
  • Performed an unnecessary operation.

#2) Organizational Cause: A process that people use to make decisions was flawed.

Examples:

  • Vague instructions given by the Team Lead to team members.
  • Picking the wrong person for a task.
  • Monitoring tools not in place to assess the quality.

#3) Physical Cause: Any physical item failed in some way.

Examples:

  • The computer keeps restarting.
  • The server is not booting up.
  • Strange or loud noises in the system.

Steps To Do Root Cause Analysis

A structured and logical approach is required for an effective root cause analysis. Hence, it’s necessary to follow a series of steps.

#1) Form RCA Team

Every team should have a dedicated Root Cause Analysis Manager [RCA Manager] who collects the details from the Support team and kicks off the RCA process. The RCA Manager coordinates the effort and decides who needs to attend the RCA meetings, depending on the stated problem.

The meeting should include the people from each team [Requirement, Design, Testing, Documentation, Quality, Support & Maintenance] who are most familiar with the problem. It should also include people who are directly linked to the defect, for example, the Support engineer who gave the customer an immediate fix.

Share the problem details with the team before the meeting so that they can do some initial analysis and come prepared. Team members also gather information related to the defect. Depending on the incident report, each team traces what went wrong for this scenario in its respective phase. Being prepared will increase the efficiency of the upcoming discussion.

#2) Define The Problem

Collect the details of the problem, such as incident reports and problem evidence (screenshots, logs, reports, etc.), then study and analyze the problem by asking the questions below:

  • What is the problem?
  • What is the sequence of events that led to the problem?
  • What systems were involved?
  • How long has the problem existed?
  • What is the impact of the problem?
  • Who was involved, and who should be interviewed?

Use ‘SMART’ rules to define your problem:

  • SPECIFIC
  • MEASURABLE
  • ACTION-ORIENTED
  • RELEVANT
  • TIME-BOUND

#3) Identify Root Cause

Conduct a brainstorming session within the RCA team to identify the causes. Use the Fishbone diagram, the 5 Whys analysis, or both to arrive at the root cause(s).

The RCA Manager should moderate the meeting and set the rules for the brainstorming session. For example, the rules can be:

  1. Criticizing/blaming others should not be allowed.
  2. Don’t judge others’ ideas. No idea is a bad idea; encourage wild ideas.
  3. Build on the ideas of others. Think about how you can build on others’ ideas and make them better.
  4. Give each participant due time to share their views.
  5. Encourage out-of-the-box thinking.
  6. Stay focused.

All ideas should be recorded. The RCA Manager should assign a member to record the minutes of the meeting and update the RCA templates.

#4) Implement Root Cause Corrective Action (RCCA)

Corrective action involves fixing the problem at the real root cause that was identified. To facilitate this, a delivery manager should be present who can decide which versions the fix has to be implemented in and what the delivery date should be.

RCCA should be implemented in such a way that this root cause will not occur again in the future. The fix given by the support team is a temporary one for the customer site where the issue was reported. When this fix is merged into an ongoing version, do a proper impact analysis to ensure that no existing feature is broken.

Provide the steps to validate the fix, and monitor the implemented solution to check whether it is effective.

#5) Implement Root Cause Preventive Action (RCPA)

The team needs to come up with a plan for how similar issues can be prevented in the future, for example, updating the instruction manual, improving skill sets, or updating the team assessment checklist. Document the preventive actions properly and monitor whether the team is adhering to them.

Please refer to this research paper on “Defect Analysis and Prevention for Software Process Quality Improvement” published in the International Journal of Software Engineering & Applications to get an idea of the types of defects reported in each software phase and suggested preventive actions for them.

The information gained from RCA can go as input into Failure Mode and Effect Analysis (FMEA) to identify points where the solution can fail.

Run a Pareto Analysis on the causes identified during RCA over a period, say quarterly or half-yearly. This helps identify the top causes contributing to the defects so that preventive action can focus on them.
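As a rough illustration of the Pareto step, the sketch below counts how often each root cause was recorded and keeps the top causes that together account for a chosen share of defects. The function name `pareto`, the 80% threshold, and the sample data are assumptions made for this example; they do not come from any RCA template.

```python
from collections import Counter


def pareto(causes, threshold=0.8):
    """Return (cause, count, cumulative_share) rows for the top causes.

    Causes are ranked by frequency; rows are collected until their
    cumulative share of all recorded causes reaches the threshold.
    """
    counts = Counter(causes)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for cause, count in counts.most_common():
        cumulative += count
        rows.append((cause, count, cumulative / total))
        if cumulative / total >= threshold:
            break
    return rows


# Hypothetical root causes logged over one quarter of RCA meetings.
quarter_causes = (
    ["requirement miss"] * 9
    + ["testing miss"] * 6
    + ["coding error"] * 3
    + ["environment issue"] * 2
)

for cause, count, share in pareto(quarter_causes):
    print(f"{cause:<18} {count:>3} {share:6.0%}")
```

With this sample data, three of the four causes cover 90% of the recorded defects, so preventive action would start with those three.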

Root Cause Analysis Techniques

#1) Fishbone Analysis

A Fishbone diagram is a visual root cause analysis tool for identifying the possible causes of an identified problem; hence it is also called a Cause and Effect diagram. It helps you get down to the real root cause of the issue rather than solving its symptom.

It is also called the Ishikawa Diagram, as it was created by Dr. Kaoru Ishikawa [a Japanese quality control statistician]. It is also known as the Herringbone or Fishikawa diagram.

Fishbone analysis is used in the Analyze phase of Six Sigma’s DMAIC approach to problem-solving. It is one of the 7 basic tools of quality control.

Steps to create a Fishbone Diagram:

A Fishbone diagram resembles the skeleton of a fish, with the problem forming the head and the causes forming the spine and bones.

Follow the steps below to create a fishbone diagram:

  1. Write the problem at the head of the fish.
  2. Identify the categories of causes and write one at the end of each bone [cause category 1, cause category 2 …… cause category N].
  3. Identify the primary causes under each category and mark them as primary cause 1, primary cause 2, and so on up to primary cause N.
  4. Extend the causes to secondary, tertiary, and more levels as applicable.
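The four steps above map naturally onto a nested structure: the problem at the head, categories as the main bones, and causes nested to whatever depth is needed. The sketch below is one possible way to capture a fishbone as text when no diagramming tool is at hand. The function names and the sample defect are hypothetical and are not taken from the diagrams in this tutorial.

```python
def render_causes(causes, depth):
    """Return indented lines for a dict mapping each cause to its deeper causes."""
    lines = []
    for cause, deeper in causes.items():
        lines.append("  " * depth + "- " + cause)
        lines.extend(render_causes(deeper, depth + 1))
    return lines


def render_fishbone(problem, categories):
    """Return a text outline: problem (head), categories (bones), then causes."""
    lines = ["PROBLEM: " + problem]
    for category, causes in categories.items():
        lines.append("  [" + category + "]")
        lines.extend(render_causes(causes, depth=2))
    return "\n".join(lines)


# Hypothetical defect and causes, for illustration only.
fishbone = {
    "Requirements": {
        "Reset flow for locked accounts not specified": {},
    },
    "Testing": {
        "No regression test for password reset": {
            "Test plan not updated after redesign": {},
        },
    },
}

outline = render_fishbone("Login fails after password reset", fishbone)
print(outline)
```

Each extra level of nesting corresponds to a secondary or tertiary cause in step 4.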

[Figure: Fishbone diagram template]

An example of a fishbone diagram applied to a software defect:

[Figure: Fishbone diagram for a software defect]

Many free and paid tools are available for creating a fishbone diagram. The Fishbone diagram in this tutorial was created using the ‘Creately’ online tool. More details about fishbone templates and tools will be explained in our next tutorial.

#2) The 5 Whys Technique

The 5 Whys technique was developed by Sakichi Toyoda and was used at Toyota in its manufacturing processes. The technique is a series of questions in which each answer is met with another “Why” question. It resembles the way a child asks questions of grown-ups: based on the answer the grown-up gives, the child asks “Why” again and again until satisfied.

The 5 Whys technique is used standalone or as part of fishbone analysis to drill down to the root cause of the problem. The number of steps is not limited to 5; it can be fewer or more, continuing until the problem is diagnosed. The 5 Whys is a relatively simple and fast way to arrive at root causes. It facilitates quick diagnosis by ruling out the symptoms and arriving at the root cause.

The success of the technique depends on the knowledge of the person. There can be different answers to the same Why question. So, selecting the right direction and focus in the meeting is important.

Steps to create a 5 Whys diagram

Start the brainstorming discussion by defining the problem. Then follow with successive “Why” questions and their answers.
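The chain of questions and answers can be recorded as a simple ordered list, where each new “Why” is asked about the previous answer. The sketch below shows one way to do that; the class name `FiveWhys`, its methods, and the sample defect are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class FiveWhys:
    """A problem statement plus an ordered chain of (question, answer) pairs."""
    problem: str
    chain: list = field(default_factory=list)

    def why(self, answer):
        """Ask "Why?" about the latest statement and record the answer."""
        latest = self.chain[-1][1] if self.chain else self.problem
        self.chain.append((f"Why? ({latest})", answer))
        return self  # returning self allows the calls to be chained

    @property
    def root_cause(self):
        """The last recorded answer, or None if nothing has been asked yet."""
        return self.chain[-1][1] if self.chain else None


# Hypothetical defect, for illustration only.
analysis = (
    FiveWhys("Report totals are wrong for December")
    .why("The year-end rollover case was not handled in code")
    .why("The requirement did not mention year-end behaviour")
    .why("No business analyst reviewed the reporting requirements")
)

for depth, (question, answer) in enumerate(analysis.chain, start=1):
    print(f"{depth}. {question}\n   -> {answer}")
print("Root cause:", analysis.root_cause)
```

The chain here stops after three questions, which is consistent with the point that the number of Whys need not be exactly five.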

[Figure: 5 Whys template]

An example of a 5 Whys diagram applied to a software defect:

[Figure: 5 Whys diagram for a software defect]

The 5 Whys template and images were drawn using the Creately online software.

Factors Causing Defects

Many factors can cause defects:

  • Unclear / Missing / Incorrect Requirements
  • Incorrect Design
  • Incorrect Coding
  • Insufficient Testing
  • Environment Issues (Hardware, Software or Configurations)

These factors should always be kept in mind while performing the RCA process.

RCA starts and proceeds with brainstorming on the defect. The only questions we ask ourselves while doing RCA are “WHY?” and “WHAT?” We can dig into each phase of the life cycle to track where the defect originated.

Let’s start with the “WHY?” questions (the list is not exhaustive). You can start from the outer phase and move towards the inner phase of the SDLC.

  • “WHY” was the defect not caught during the sanity test in production?
  • “WHY” was the defect not caught during testing?
  • “WHY” was the defect not caught during the test case review?
  • “WHY” was the defect not caught during the design review?
  • “WHY” was the defect not caught during the requirement phase?

The answers to these questions will give you the exact phase where the defect originated. Once you have identified the phase and the reason, the “WHAT” part follows.

“WHAT” will you do to avoid this in the future?

The answer to this “WHAT” question, if implemented and followed through, will prevent the same defect, or the same kind of defect, from arising again. Take proper measures to improve the identified process so that the defect, or the reason for the defect, is not repeated.
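The outer-to-inner walk through the “WHY” questions can be sketched as a small lookup. This version assumes that, for each phase, the team records whether the defect could realistically have been caught there, and it treats the innermost such phase as the one to improve. That interpretation, the phase names as written, and the function name are assumptions for illustration only.

```python
# Phases ordered from the outer phase to the inner phase of the SDLC.
PHASES_OUTER_TO_INNER = [
    "Sanity test in production",
    "Testing",
    "Test case review",
    "Design review",
    "Requirement phase",
]


def innermost_miss(could_have_caught):
    """Return the innermost phase that could have caught the defect, or None.

    could_have_caught maps a phase name to True when the team agrees the
    defect should have been caught in that phase. Missing phases count as False.
    """
    found = None
    for phase in PHASES_OUTER_TO_INNER:
        if could_have_caught.get(phase, False):
            found = phase  # keep moving inward; the last match wins
    return found


# Hypothetical answers from one RCA meeting.
answers = {
    "Sanity test in production": True,
    "Testing": True,
    "Test case review": True,
    "Design review": False,
    "Requirement phase": False,
}

print("Phase to improve:", innermost_miss(answers))
```

In this made-up case the requirements and design were sound, so the walk stops at the test case review, pointing to a gap in the test cases rather than in the design.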

Based on the results of RCA, you can determine which phases have problem areas.

For example, if you determine that most of the defects are due to requirement misses, you can improve the requirement gathering/understanding phase by introducing more reviews or walk-through sessions.

Similarly, if you find that most defects are due to testing misses, you need to improve the testing process. You can introduce metrics like Requirement Traceability Metrics and Test Coverage Metrics, or keep a check on the review process or any other step that you feel would improve the efficiency of testing.

Conclusion

It is the responsibility of the entire team to sit and analyze the defects and contribute to the product and process improvement.

In this tutorial, you have gained a basic understanding of RCA, the steps to follow for an efficient RCA, and the tools to use, such as Fishbone analysis and the 5 Whys technique. Upcoming tutorials will cover different RCA templates, examples, and use cases showing how to implement it.