Failure Mode and Effects Analysis (FMEA) – How to Analyze Risks for Better Software Quality & Satisfied Customers!


Failure mode and effects analysis (FMEA) is a risk management technique.

If implemented properly, it can be a valuable addition to your quality assurance process. This article introduces the technique and shows how it helps improve software quality.



FMEA is mostly used by upper management or stakeholders, and in practice testers get little insight into it. That trend is changing, and I feel that testers who understand the concept properly can take their test case writing one level up by using this technique to:

  • Understand the stakeholders' goals for testing the application.
  • Understand the business.
  • Derive the high-level test scenarios based on business and management interest.
  • Derive effective test cases which provide better coverage of the risk-prone areas.
  • Prioritize the test cases.
  • Decide what to test and what to defer at any phase.


Risk analysis is a crucial aspect of test management. This raises two questions: what is risk analysis, and why is it important? To answer them, we first need to understand what risk is.



Risk, in its literal meaning, is the possibility of a negative or undesirable outcome or event. Risks that are not handled or managed properly may lead to poor quality, dissatisfied customers, and sometimes loss of business.

Risk has two attributes: probability and impact.

Probability is the chance that a particular risk will occur, and impact is the extent of its effect if it does.

What is Risk Analysis?

Risk analysis is a mechanism by which identified potential risks are studied thoroughly to determine their probability and impact. It is advisable to measure both attributes; based on the result, we identify:

  • What to test first
  • What to test more
  • What not to test (this time)

There are many methods of risk analysis, and they are broadly classified into two types:

Informal Techniques – Which are based on experience, judgment, and intuition

Formal Techniques – Identifying and weighing the risk attributes.

Failure Mode and Effects Analysis (FMEA) is a formal method of risk analysis. In the following sections, I discuss FMEA in more detail and illustrate it with an example.

FMEA is a systematic, quantitative tool, usually kept in the form of a spreadsheet, which helps the team analyze what might go wrong. To do FMEA we need the right people at the table: representatives from every area involved, including customers.


FMEA starts and continues with brainstorming sessions. Participants need to identify all the components, modules, dependencies, and limitations that could fail in a production environment and eventually lead to poor quality, poor reliability, and possibly loss of business.

During FMEA we not only identify the extent of the loss but also try to identify the cause of those failures. To quantify each risk, we rate three attributes:

  1. The severity of the failure (S)
  2. Priority of the failure (P)
  3. Likelihood of the failure (L)

We rate each of these attributes on the scales shown below:

Severity scale:

Description | Category | Scale
--- | --- | ---
Loss of data, hardware or safety issues | Urgent | 1
Loss of functionality without a workaround | High | 2
Loss of functionality with a workaround | Medium | 3
Partial loss of functionality | Low | 4
Cosmetic or trivial | None | 5

Priority scale:

Description | Category | Scale
--- | --- | ---
Complete loss of system value | Urgent | 1
Unacceptable loss of system value | High | 2
Possible reduction in system value | Medium | 3
Acceptable reduction of system value | Low | 4
Negligible reduction in system value | None | 5

Likelihood scale:

Description | Category | Scale
--- | --- | ---
Certain to affect all users | Urgent | 1
Likely to impact some users | Very High | 2
Possible impact on some users | High | 3
Limited impact on a few users | Low | 4
Unimaginable in actual usage | None | 5

All three attributes (severity, priority, and likelihood) are rated individually on their scales and then multiplied to get a Risk Priority Number (RPN):

Risk Priority Number (RPN) = S * P * L

Based on this RPN value, we determine the extent of testing. Because 1 is the worst rating on each scale, the lower the RPN, the higher the risk.
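As a minimal sketch of the formula above, the following Python function multiplies the three ratings after checking that each one lies on the 1 to 5 scale. The function name `risk_priority_number` and the range check are illustrative assumptions, not part of any standard FMEA tool.

```python
def risk_priority_number(severity, priority, likelihood):
    """Return RPN = S * P * L for ratings on the 1 (worst) to 5 (mildest) scales."""
    ratings = (("severity", severity),
               ("priority", priority),
               ("likelihood", likelihood))
    for name, value in ratings:
        # Assumption: ratings are whole numbers from 1 to 5, as in the scales above.
        if not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"{name} must be an integer from 1 to 5, got {value!r}")
    return severity * priority * likelihood


# The worst rating on every scale gives the lowest possible RPN,
# and the mildest rating on every scale gives the highest.
print(risk_priority_number(1, 1, 1))  # 1
print(risk_priority_number(5, 5, 5))  # 125
```

With five-point scales, the RPN therefore always falls between 1 and 125.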

Let’s try to understand it with an example:

Failure Mode Effect Analysis Example:

(This is a hypothetical example for illustration only. Actual implementation and features may vary.)

Let’s consider a simple example of a banking application which has 4 features.

  1. Feature 1 – Withdraw
  2. Feature 2 – Deposit
  3. Feature 3 – Home Loan
  4. Feature 4 – Fixed Deposits.

A risk analysis team is formed, consisting of the bank manager, the UAT test manager (representing the end user), a technical architect, a test architect, a network administrator, a DBA, and a project manager.

After a series of brainstorming sessions the team came up with the following risks:

  1. Complex business logic in case of calculating interest rate of the home loan.
  2. The system fails at 200 concurrent users.
  3. The system fails to handle documents which are more than 6 MB.

Now let’s try to calculate the severity, priority and likelihood of these identified risks.


Severity:

Risk | Category | Scale
--- | --- | ---
Complex business logic in case of calculating interest rate of home loan | Very High | 2
System fails at 200 concurrent users | High | 3
System fails to handle documents which are more than 6 MB | Very High | 2

Priority:

Risk | Category | Scale
--- | --- | ---
Complex business logic in case of calculating interest rate of home loan | Very High | 2
System fails at 200 concurrent users | High | 3
System fails to handle documents which are more than 6 MB | High | 3

Likelihood:

Risk | Category | Scale
--- | --- | ---
Complex business logic in case of calculating interest rate of home loan | High | 3
System fails at 200 concurrent users | High | 3
System fails to handle documents which are more than 6 MB | Low | 4

Now let’s put all these attributes together:

Risk | Severity | Priority | Likelihood
--- | --- | --- | ---
Complex business logic in case of calculating interest rate of home loan | 2 | 2 | 3
System fails at 200 concurrent users | 3 | 3 | 3
System fails to handle documents which are more than 6 MB | 2 | 3 | 4

Now let's calculate the Risk Priority Number (RPN = Severity * Priority * Likelihood):

Risk | Severity | Priority | Likelihood | RPN
--- | --- | --- | --- | ---
Complex business logic in case of calculating interest rate of home loan | 2 | 2 | 3 | 12
System fails at 200 concurrent users | 3 | 3 | 3 | 27
System fails to handle documents which are more than 6 MB | 2 | 3 | 4 | 24

The key point: the lower the RPN, the higher the risk.

So in this example, risk 1 (complex business logic in calculating the interest rate of the home loan, RPN 12) is the highest risk, risk 2 (system fails at 200 concurrent users, RPN 27) is the lowest, and risk 3 (documents larger than 6 MB, RPN 24) falls in between.
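The ranking in this example can be reproduced with a short Python sketch. The ratings are copied from the example; the function name `rank_by_rpn`, the tuple layout, and the shortened risk descriptions are assumptions made for illustration.

```python
# Ratings from the example: (risk, severity, priority, likelihood).
risks = [
    ("Complex home loan interest calculation", 2, 2, 3),
    ("System fails at 200 concurrent users", 3, 3, 3),
    ("System fails on documents larger than 6 MB", 2, 3, 4),
]


def rank_by_rpn(entries):
    """Return (risk, rpn) pairs sorted so the lowest RPN (highest risk) comes first."""
    scored = [(name, s * p * l) for name, s, p, l in entries]
    return sorted(scored, key=lambda pair: pair[1])


for name, rpn in rank_by_rpn(risks):
    print(f"{rpn:>3}  {name}")
```

The sorted RPNs come out as 12, 24, and 27, so the home loan calculation is listed first and the concurrent-user failure last.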

How to use this to derive test cases?

Since risk 1 (the home loan interest calculation) is the riskiest item, its test cases should be rigorous and in-depth. Write test cases that cover the complete functionality and the modules the feature affects. Use all sorts of test case design techniques (equivalence partitioning and BVA, cause-effect graphs, state transition diagrams) to derive them.

The test cases should be not only functional but also non-functional (load, stress, and volume tests, etc.). Basically, this area needs exhaustive testing, so base your test cases accordingly. Also consider all the modules that depend on this important feature.

Risk 2 (the failure at 200 concurrent users) is the LEAST RISKY item, so base your test cases on the major functionality. High-level test cases that validate that the feature works as expected should be sufficient.

Risk 3 (documents larger than 6 MB) is a MODERATE RISK item, so base your test cases on all the major and dependent functionality. Write some BVA test cases to validate a few negative scenarios as well. The extent of testing should fall between that of the high-risk and low-risk items. If required, include a few non-functional test cases as well.
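To illustrate the boundary value analysis (BVA) mentioned above, here is a minimal Python sketch that generates test sizes around the 6 MB document limit from the example. The helper name `boundary_values`, the one-byte step, and the reading of 1 MB as 1024 * 1024 bytes are all hypothetical choices; the example does not specify them.

```python
MB = 1024 * 1024  # Assumption: binary megabytes; the example does not say which.


def boundary_values(limit, step=1):
    """Return the values just below, exactly at, and just above a limit."""
    return [limit - step, limit, limit + step]


# Document sizes (in bytes) to try around the 6 MB limit from the example.
limit = 6 * MB
for size in boundary_values(limit):
    label = "over the limit" if size > limit else "at or under the limit"
    print(size, label)
```

The first two sizes are positive cases and the third is the negative scenario, which is the kind of case BVA is meant to surface.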

FMEA and Degree of testing

Based on the RPN value, we determine the extent or degree of testing to be done.

Normally if:

  • RPN is 1 to 10: we do extensive testing (covering the feature/module inside and out)
  • RPN is 11 to 30: we do balanced testing (covering all the major functionality of the feature/module)
  • RPN is 31 to 70: we do opportunity testing (covering the basic functionality of the feature/module)
  • RPN is more than 70: no testing or, when time permits, only anomaly reporting.

These ranges are only indicative. They may vary with the nature of the project.
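The ranges listed above can be expressed as a small lookup, sketched here in Python. The function name `degree_of_testing` and the returned labels are assumptions for illustration, and the thresholds are the indicative ones from the list, not fixed rules.

```python
def degree_of_testing(rpn):
    """Map an RPN to one of the indicative testing bands listed above."""
    if rpn < 1:
        raise ValueError("RPN must be at least 1")
    if rpn <= 10:
        return "extensive"
    if rpn <= 30:
        return "balanced"
    if rpn <= 70:
        return "opportunity"
    return "none, or anomaly reporting only"


# RPN values from the banking example.
for rpn in (12, 24, 27):
    print(rpn, degree_of_testing(rpn))
```

With these particular thresholds all three RPNs from the banking example fall into the balanced band, which is one reason a team may want to tune the ranges to its own project.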



Risk analysis using FMEA requires time and experience. The desired results can be achieved only with equal participation from all the responsible team members. Though the technique is formal, it requires a series of brainstorming sessions, and it is equally important to document all the identified risks.

Since most applications are unique, the scales used to measure the FMEA parameters (i.e. priority, severity, and likelihood) also depend on the application. If done appropriately, the FMEA technique has many advantages. It can be used to identify potential risks, and based on these the team can plan an effective mitigation strategy.

About the Author: This is a guest article by Shilpa Chatterjee Roy. She has been working in the software testing field for the past 8.5 years in various domains.

If you have used this technique please feel free to comment on your experience below.


15 thoughts on “Failure Mode and Effects Analysis (FMEA) – How to Analyze Risks for Better Software Quality & Satisfied Customers!”

  1. You are introducing us to more and more new testing processes which we have never used. These are really helpful technical methods for testing. Thank you so much for sharing. Keep up the good work.

  2. Thought Provoking Article.

    Just one doubt:

    When should this FMEA meeting be held?

    1. If at the start of development, how do we know what is going to come up during development?
    In your case, how do you know before development that the application fails to handle documents which are more than 6 MB?

    2. If after development, won't it definitely delay the release of the application, since we have to fix the urgent issues?

    Please explain when FMEA should be done and why.
    It would be good if you could explain with the same example.


  3. Thank you all!


    Risk analysis is basically done during the planning stage. In fact this forms the basis of creating the dev plan and test plan.

    Understanding and identifying risk is a tricky job, but there is no rocket science involved. It requires experience. We do the risk analysis based on experience and judgement, and that's why I mentioned that it requires lots and lots of brainstorming sessions.

  4. Hi Shilpa,

    Really, it's very useful. I thought it was risky, but now I feel it's very simple if you have a little testing experience.

    Thanks a lot!

  5. Hi,

    It is indeed a great article, Shilpa.

    In addition, I can add a few thoughts on FMEA based on 15 years of hands-on implementation of these tools. FMEA is classified into two types:

    1) DFMEA (Design Failure Mode Effects Analysis), which is used during the product design stage. At this point the designer has to review the historical problems or complaints encountered and perform the necessary analysis before proceeding with a new or improved design. In this manner the VALUE of the product can be increased and the COST, on the other hand, can be tremendously reduced. By adding value, the company can realize a higher profit margin. Remember that the DESIGN stage is crucial, or else the failure will spill into processes, production, and services. This may further impact process cost and internal and external customer satisfaction. What is more scary is the shrinking of the profit margin. It is more costly to resolve a severe failure at the PROCESS stage than at the DESIGN stage. Companies are losing millions in cash because they are unaware of the importance of DFMEA.

    2) PFMEA (Process Failure Mode Effects Analysis), which is used as a tool for process improvement by identifying the potential failures and risks surfacing from processes. In most cases the failures identified are adopted into the process manual or SOP to ensure that operators comply with the day-to-day processes.

    Bear in mind that both the DFMEA and the PFMEA are to be developed prior to product design and processing. In most cases the DFMEA and PFMEA documentation is finalised during the prototype or trial stage. I hope this info is helpful! Be blessed!

  6. Very Good Article. I am not into software, however this tool is very much useful in my Pharmaceutical quality systems.
    It would be great help if you can give some examples according to my area.
