What is Longevity Testing? How to Catch the Bugs Before the Customer Finds Them
This article explains the meaning of “Longevity Testing” and how it helps to assess the stability of the System or the Product and to reduce the defects found by the customer, i.e. “Catch the bugs in-house before the customer finds them”.
By the end of this article, QA Managers, Leads and Testers will have a fair understanding of:
- What is Longevity Testing?
- Why is Longevity Testing required?
- Planning and Executing Longevity Tests
- What are the Pros and Cons of Longevity Testing?
What is Longevity Testing?
Longevity Testing is a Testing Activity:
- To validate system or product stability and serviceability features over an extended period, under appropriate load and stress conditions, with real-time traffic and applications
- To reduce the occurrence of defects surfacing at the Customer site
Flow diagram of handling Customer reported issues (Fig. 1)
Background to Longevity Testing
#1) Usually, in the first few weeks after Product deployment, or after an upgrade to the latest Software release at the customer site, everything runs well. However, over a period of a few weeks, the customer starts reporting issues.
#2) Many of the issues may involve simple features as reported by the customer, yet they are not easily reproducible in-house. They need a lot of time and careful analysis by expert teams across the spectrum. Hint: Time = $$$!!!
#3) One or more of the following happens when customer(s) find a defect (Fig. 1):
- The severity of the defect has a direct impact on the Customer’s business, i.e. $$$
- Any service request to the Technical Support Center costs $$$ to the Product Engineering Organization
- Issues raised by the customer are seldom resolved by the front-end Technical Support Team
- Such requests or tickets are escalated to the Escalation Support Team
- Customer Ticket Escalation costs the Organization more $$$
- If the Escalation Team is unable to resolve the issue, it has to involve the Engineering Team (Development and QA)
- By now, the cost ($$$) of resolving the issue has risen substantially
- The longer the defect takes to resolve, the higher the probability of dissatisfied customer(s) who will not place repeat orders. The worst scenario is when the customer decides to move to a competitor’s solution at an opportune time. In both cases it is a revenue loss to the Product Engineering Organization
#4) A high percentage of such customer-reported issues relate to System or Product stability in combination with the customer’s specific topology, infrastructure, traffic, and applications.
Why is Longevity Testing required?
1) Any ‘Defect’ that arises out of a Customer-reported issue is usually a Test Escape.
2) Any such defect costs bottom-line $$$ to the Customer as well as to the Engineering Organization that provides solutions and services to the customers.
3) In a normal scenario, the defect should have been noticed internally during various testing cycles including Regression Testing by one or more testers from the Testing Team depending upon the complexity of the issue.
4) Most importantly, such defects arising out of customer-reported issues also indicate that an appropriate test scenario or test case was missed at the point of Test Plan execution.
5) Many Testers will have seen a particular feature fail at the customer site while passing in-house in various Testbeds.
6) Key observations to be considered –
- During any software release cycle, the System Under Test (SUT) or Device Under Test (DUT) in all Testbeds is frequently soft or hard rebooted for reasons such as loading a new code drop, bug verification, etc.
- Even Automated Regression Test suites usually reboot or reset the SUT or DUT post execution of a particular test case script or a series of test case scripts
- So the SUT or DUT is not running long enough without a soft or hard reboot
- The situation is entirely different at the customer site. The customer cannot afford to reboot the System frequently, because every reboot disrupts productivity
- Customers follow a proven practice where they announce a proper maintenance window to the intended audience and then carry out Software upgrade or Hardware replacement etc.
- Such maintenance windows can be scheduled at intervals ranging from quarterly to yearly, depending on the internal guidelines and procedures of the Customer’s Organization
- In reality, the actual health picture of the System or the Product at the customer site is entirely different from that of the Testbeds during a given Software Release cycle in any Product Engineering Organization
- Many customers, especially in the Financial, Healthcare and Federal verticals, also look for an authorized quality document showing that the product has passed testing modeled on their vertical
Considering the Test gaps mentioned above:
- It is apparent that the System or the Product should undergo longer-duration tests, i.e. Longevity Tests, with end-to-end scenarios mimicking the customer site or verticals
- The longer duration can be 72-720 hrs. (3-30 days), or another appropriate duration based on EFD (Externally Found Defects) or CFD (Customer Found Defects) data and specific customer cases
- It is a recommended practice for QA Managers, Leads and Testers to carry out Longevity Testing as a separate activity in a given Software Release cycle
- Net-Net, Longevity Testing is very relevant to the stability of the System or the Product, as it has a direct relationship to the bottom-line $$$ of the Organization
Planning and Executing Longevity Tests
It is important that QA Managers, Leads, and Testers include Longevity Testing as part of their overall Test Strategy.
- Engineering Organizations carry out in-house Test Escape Analysis (TEA) exercises from time to time for many Products (Hardware and Software). Some even have an integrated and automated mechanism in place to mine Test Escape data, usually based on ‘Externally Found Defects (EFD)’ or ‘Customer Found Defects (CFD)’ logged by the Support Escalation Team
- EFDs or CFDs should be carefully analyzed in the context of the Customer’s live deployment from an end-to-end perspective: not just the infrastructure, but also the end-user devices, applications, and traffic patterns
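As a rough illustration of the Test Escape Analysis described above, the sketch below groups customer-found defects by vertical and component so that the most frequent escape areas can feed the Longevity Test Plan. The `Defect` fields, the source labels ("EFD", "CFD", "internal") and the sample records are all hypothetical; a real defect tracker will have its own schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Defect:
    defect_id: str
    source: str      # "EFD", "CFD", or "internal" (hypothetical labels)
    vertical: str    # customer vertical the defect was reported from
    component: str   # product area the defect belongs to


ESCAPE_SOURCES = ("EFD", "CFD")


def escape_hotspots(defects, top_n=3):
    """Count customer-found defects per (vertical, component) pair."""
    escapes = [d for d in defects if d.source in ESCAPE_SOURCES]
    counts = Counter((d.vertical, d.component) for d in escapes)
    return counts.most_common(top_n)


def escape_rate(defects):
    """Fraction of all logged defects that escaped to the customer."""
    if not defects:
        return 0.0
    escaped = sum(1 for d in defects if d.source in ESCAPE_SOURCES)
    return escaped / len(defects)


# Made-up sample data, for illustration only.
sample = [
    Defect("D-1", "CFD", "Federal", "routing"),
    Defect("D-2", "CFD", "Federal", "routing"),
    Defect("D-3", "EFD", "Financial", "auth"),
    Defect("D-4", "internal", "Federal", "routing"),
]

print(escape_hotspots(sample))  # most frequent escape areas first
print(escape_rate(sample))      # share of defects found by customers
```

The pairs at the top of the hotspot list are natural candidates for Longevity test cases on the Testbed that mimics that vertical.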
Understanding Customer Verticals:
Customers usually fall into one of the following broad verticals:
- Federal (Govt)
Planning Longevity Tests:
#1) Develop a separate Test Plan and Test Cases for Longevity Testing. This will also help to track test execution, bug logging, and verification
#2) Identify test cases based on Test Escape Analysis inputs – usually bug scrub of EFDs or CFDs
#3) It is very important that the QA team builds test beds that mimic one or more verticals, depending on how many verticals the organization’s line of business serves
#4) Dedicated Test Bed(s) should have
- Network Topology similar to that of an intended vertical or multiple verticals
- Infrastructure with similar switches, routers, back-end servers, firewalls, etc.
- The most frequently used application servers in the given vertical(s)
- The most frequently used end-user gadgets in the given vertical(s)
#5) Appropriate tools for generating Load, Stress and Real-time Traffic
#6) Identify Manual execution resources
#7) Identify Automation resources and strategy for faster and repeated execution
#8) Identify START and END of Longevity Testing for a given release
Two approaches for START and END of Longevity Testing:
I) Approach 1:
- Software code or Hardware should be in a stable condition
- START at the end of FEATURE Test Completion
- END before Code Freeze
II) Approach 2:
- Take a minor hit by allowing slightly unstable code
- START at 70% completion of the FEATURE test cycle
- END before Code Freeze
#9) Bug verification for resolved defects
#10) Move the Longevity Tests into the Regression suite for subsequent Regression Testing cycles
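The two START/END approaches in step #8 can be compared with a small date calculation. The sketch below is only an illustration: the function name, its parameters and the example dates are assumptions, and "70% completion" is approximated as 70% of the calendar days in the FEATURE test cycle.

```python
from datetime import date, timedelta


def longevity_window(feature_start, feature_end, code_freeze, approach=1):
    """Return (start, end, available_hours) for a Longevity run.

    approach 1: START when FEATURE testing completes.
    approach 2: START at 70% of the FEATURE test cycle (by calendar days).
    Both approaches END at Code Freeze.
    """
    if not feature_start <= feature_end <= code_freeze:
        raise ValueError("expected feature_start <= feature_end <= code_freeze")
    if approach == 1:
        start = feature_end
    elif approach == 2:
        feature_days = (feature_end - feature_start).days
        # Integer arithmetic avoids floating-point rounding surprises.
        start = feature_start + timedelta(days=feature_days * 70 // 100)
    else:
        raise ValueError("approach must be 1 or 2")
    available_hours = (code_freeze - start).days * 24
    return start, code_freeze, available_hours


# Hypothetical release calendar: 40-day feature cycle, freeze 10 days later.
for approach in (1, 2):
    start, end, hours = longevity_window(
        date(2024, 1, 1), date(2024, 2, 10), date(2024, 2, 20), approach
    )
    print(f"Approach {approach}: {start} to {end}, {hours} hrs available")
```

With these made-up dates, Approach 1 leaves 240 hrs and Approach 2 leaves 528 hrs before Code Freeze, which shows the trade-off: starting earlier on slightly unstable code buys a much longer non-stop run.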
Executing Longevity Tests:
- Set up the Testbed(s) to mimic one or more Customer Verticals
- Ensure that all back-end infrastructure, applications and databases, including their flavors, are similar to the customer’s
- Ensure that end-user devices similar to those the customer uses are available and used during Test Plan execution
- Ensure that appropriate tools are available to generate moderate Stress and Load on the System or Product
- Execute the entire Test suite from the Longevity Test Plan without a soft or hard reboot of the SUT or DUT, back-end servers, or other infrastructure-related devices
- Run the tests multiple times in the above fashion for a defined non-stop duration within the 72-720 hr. range
- Record the results
- Log all the bugs identified
- Verify all the bugs
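The execution steps above can be sketched as a simple loop that repeats the whole suite for a fixed duration without resetting anything between passes, and records every result. This is a minimal sketch under stated assumptions: `run_longevity`, `first_failures` and the `open_session` test are hypothetical names, and the injected fake clock exists only so the demo finishes instantly. A real run would use the default clock with something like `duration_s=72 * 3600`.

```python
import time


def run_longevity(test_cases, duration_s, clock=time.monotonic, pause_s=0.0):
    """Run test_cases repeatedly until duration_s elapses, with no reset
    between passes.

    test_cases maps a name to a zero-argument callable returning True on pass.
    """
    results = []  # one record per executed test case
    start = clock()
    iteration = 0
    while clock() - start < duration_s:
        iteration += 1
        for name, test in test_cases.items():
            try:
                passed, error = bool(test()), None
            except Exception as exc:  # a crash is a failure; the run continues
                passed, error = False, repr(exc)
            results.append({
                "iteration": iteration,
                "name": name,
                "elapsed_s": clock() - start,
                "passed": passed,
                "error": error,
            })
        if pause_s:
            time.sleep(pause_s)
    return results


def first_failures(results):
    """Map each failing test name to the first iteration at which it failed."""
    first = {}
    for record in results:
        if not record["passed"]:
            first.setdefault(record["name"], record["iteration"])
    return first


# Demo: a made-up test whose failure only shows up after state accumulates,
# the kind of defect that a reboot between test cases would hide.
state = {"sessions": 0}


def open_session():
    state["sessions"] += 1
    return state["sessions"] <= 3  # hypothetical limit reached on the 4th pass


ticks = iter(range(100))  # fake clock: each call advances one "second"
results = run_longevity({"open_session": open_session}, duration_s=10,
                        clock=lambda: next(ticks))
print(first_failures(results))
```

Recording the iteration and elapsed time of the first failure gives the bug report a concrete "fails after N passes" data point, which is usually what makes such defects reproducible in-house.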
What are the Pros and Cons of Longevity testing?
Pros:
- Helps identify critical bugs before the customer finds them
- Helps stabilize the System or Product for the serviceability features that are critical to the Customer’s productivity and business
- Helps increase Customer Satisfaction
- Saves the Organization a lot of cost ($$$): money saved is money earned!!!
- The Longevity Testing report can also be turned into Quality Certification proof catering to different verticals
Cons:
- There is an initial cost for including Longevity Testing and its related activities as part of a given release and Regression activities
- It is ideally suited to the Waterfall model
- Agile/Scrum models need the duration and coverage to be adjusted
Conclusion
Many of the ‘Defects’ that arise out of Customer-reported issues are primarily due to Test Escapes. This, in turn, raises many questions about Test Plan development, review, coverage and execution.
Externally Found Defects (EFD) or Customer Found Defects (CFD) have a business ($$$) impact for the Customer as well as for the Product Organization.
Longevity Testing should help any Product organization to improve Customer Satisfaction by identifying and resolving defects before the customer catches them. It also helps improve stability, resulting in a robust, high-quality System or Product.
About the author: This article is written by STH author Vinayak. He has 12 years of QA/testing experience in Fortune 500 companies.
Let us know if you have any questions or suggestions about this article.