A Guide to Defect Density:
Test metrics are tricky. They are the only way to measure testing, yet the variety is overwhelming.
You could be collecting something that isn’t giving you the analytics you want. The safest approach is to stick to the well-beaten path.
Many teams rely on Defect Density to understand defect trends.
Today’s article is an all-in-one guide to Defect Density (DD).
What is Defect Density?
Let’s look at what density literally means.
It is “the degree of compactness of a substance” (Source: Google).
So, Defect Density is the compactness of defects in the application. (OK, so it is just a refined version of defect distribution.)
An application’s size can be expressed in functional areas (modules) or, more technically, in KLOC (thousands of lines of code). Thus, bug density is the average number of defects per functional area or per KLOC of a software application.
How is Bug Density calculated?
It is simple math.
Step #1: Collect the raw material: You are going to need the total no. of defects (for a release/build/cycle) and the size of the application (no. of functional areas or KLOC).
Step #2: Calculate the average no. of defects per functional area or per KLOC.
Defect Density formula with calculation examples:
Example #1: For a particular test cycle there are 30 defects in 5 modules (or components). The density would be:
Total no. of defects/Total no. of modules = 30/5 = 6. DD per module is 6.
Example #2: A different perspective would be, say, there are 30 defects for 15 KLOC. It would then be:
Total no. of defects/KLOC = 30/15 = 2. DD is 2 defects per KLOC.
Example #2 is only for teams that know their KLOC and need a measurement against it. Most teams don’t work with that kind of statistic. But if you need to, you can find out how many KLOC your application is.
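The two examples can be sketched in a few lines of Python. This is a minimal illustration, not a standard tool: the function name `defect_density` is an assumption made for this sketch. The same division works whether the size is counted in modules or in KLOC, and 30 defects over 15 KLOC works out to 2 defects per KLOC.

```python
def defect_density(total_defects: int, size: float) -> float:
    """Return the average number of defects per unit of size.

    `size` can be a number of modules or a number of KLOC;
    the function name and signature are illustrative only.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return total_defects / size


# Example #1: 30 defects across 5 modules
dd_per_module = defect_density(30, 5)

# Example #2: 30 defects in 15 KLOC
dd_per_kloc = defect_density(30, 15)

print(f"DD per module: {dd_per_module}")  # 6.0
print(f"DD per KLOC:   {dd_per_kloc}")    # 2.0
```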
Why is Bug Density Important?
Every metric that the test team collects should convey one of the following:
- Progress
- Productivity
- Quality
If not, you are wasting your time.
DD is one of the most direct ways to understand Quality.
For example, all else being equal, an application with a DD of 5 per KLOC is of better quality than one with 15 per KLOC.
The higher the bug density, the poorer the Quality.
It serves two important purposes:
- Inform: Information is power, isn’t it? Knowing the weakest areas of your application helps decide if it is ‘fit-to-use’ or not.
- Call to Action: A module with higher DD needs mending. DD helps identify them.
Don’ts
#1) Don’t take into account duplicates/returned defects
Inaccurately computed Defect Density can mislead your team.
Do not include duplicates or returned defects (not a bug, working as intended, not reproducible, etc.). They inflate the total no. of defects, which means the DD increases proportionally. As a result, your defect metric will suggest poor quality, which would be a false alarm.
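As a rough sketch of this filtering step, the snippet below drops duplicates and returned defects before computing DD. The defect records, the `resolution` field, and its labels are all hypothetical; a real defect tracker will use its own field names and values.

```python
# Hypothetical resolution labels that should not count toward DD.
EXCLUDED_RESOLUTIONS = {
    "duplicate",
    "not a bug",
    "working as intended",
    "not reproducible",
}


def valid_defect_count(defects):
    """Count only defects whose resolution is not in the excluded set."""
    return sum(
        1 for d in defects
        if d["resolution"].lower() not in EXCLUDED_RESOLUTIONS
    )


# Made-up sample data for illustration only.
defects = [
    {"id": 1, "resolution": "fixed"},
    {"id": 2, "resolution": "duplicate"},
    {"id": 3, "resolution": "open"},
    {"id": 4, "resolution": "not reproducible"},
]
modules = 2

raw_dd = len(defects) / modules                      # counts everything
filtered_dd = valid_defect_count(defects) / modules  # counts valid defects only

print(f"Raw DD per module:      {raw_dd}")       # 2.0
print(f"Filtered DD per module: {filtered_dd}")  # 1.0
```

In this made-up data, counting the two returned defects doubles the DD, which is the false alarm described above.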
#2) Don’t do this based on one day’s data
Let’s look at this hypothetical situation:
On day 1, the DD is higher. This could send your team into panic mode immediately.
So, wait till you have better raw material. In other words, a few days’ worth of data.
Also, when computing DD, you want a cumulative defect count.
When DD is computed for each day in isolation, the figure from Day 2 on does not take into account the number of defects found so far. It looks at that day’s data alone.
That gives the impression that the defect density from Day 2 is going up and down with no trend. Also, how can defect density go down when nothing has been done about the defects reported the day before? Think about it.
A better way to do this is to base each figure on all defects found so far.
Once again, if doing this daily, take a cumulative defect count into account.
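To make the difference concrete, here is a small sketch comparing a per-day DD with a cumulative DD. The daily counts and the 10 KLOC size are made up for illustration and are not taken from the article’s data.

```python
from itertools import accumulate

# Made-up numbers: new valid defects logged on each of five test days.
daily_new_defects = [12, 3, 7, 2, 6]
kloc = 10  # assumed application size

# Per-day DD looks only at that day's defects, so it jumps around.
daily_dd = [n / kloc for n in daily_new_defects]

# Cumulative DD uses the running total of defects found so far.
cumulative_dd = [total / kloc for total in accumulate(daily_new_defects)]

for day, (d, c) in enumerate(zip(daily_dd, cumulative_dd), start=1):
    print(f"Day {day}: daily DD = {d}, cumulative DD = {c}")
```

With these numbers, the per-day values fall and rise with no trend, while the cumulative values never decrease, which matches the point that density cannot drop while earlier defects are still outstanding.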
Variations
Depending on the level of refinement your team needs, you can tweak this defect metric.
- For the DD of High/Critical severity issues, your formula can be:
Total no. of High/Critical defects per KLOC or per module
- You can do this for returning issues per module too. Here you will only collect the count of issues that keep coming back across builds/releases.
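The severity variation could look something like the sketch below. The `severity` field, its labels, and the sample data are assumptions made for this example; adapt them to whatever your defect tracker records.

```python
# Hypothetical severity labels treated as High/Critical.
HIGH_SEVERITIES = {"critical", "high"}


def high_critical_density(defects, size):
    """DD restricted to High/Critical defects, per module or per KLOC."""
    if size <= 0:
        raise ValueError("size must be positive")
    count = sum(1 for d in defects if d["severity"].lower() in HIGH_SEVERITIES)
    return count / size


# Made-up sample data for illustration only.
defects = [
    {"id": 1, "severity": "critical"},
    {"id": 2, "severity": "high"},
    {"id": 3, "severity": "medium"},
    {"id": 4, "severity": "low"},
    {"id": 5, "severity": "high"},
]

dd_high = high_critical_density(defects, size=6)  # 3 High/Critical defects over 6 KLOC
print(f"High/Critical DD per KLOC: {dd_high}")    # 0.5
```

The same pattern would apply to returning issues: filter on a flag that marks reopened defects instead of on severity.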
At what values of Bug Density does the Software become unacceptable?
Defect Density Industry Standard:
Well, this varies for every industry, application and every team. Manufacturing would have a specific threshold and it would be completely different for IT.
A high DD at face value may indicate poor quality. But it is the seriousness of the individual defects that decides if the product is fit for use or not.
High DD is your indicator to delve deeper and analyze your defects for their consequences.
Who would not like zero defect density, right? Therefore, even though there is no specific standard, the lower this value, the better.
Final Thoughts:
- It is not a predictive count. A DD value does not tell you the future quality of the product, which may turn out better or worse. Historical data won’t help with future predictions.
- During critical test stages/cycles (such as UAT), DD is calculated based on time. For example: DD for the first hour, DD per day, etc.
- When collating multiple release/cycle defect statistics, defect density can be per cycle or per release.
- A simple graphical representation of the tabular data can be as below:
In conclusion
Defect Density is a key quality indicator. You can’t go wrong with collecting and presenting this defect metric. What’s more, it is one of the easiest to compute.
I hope this article has given you enough exposure to start using Defect Density for deeper insights.
Author: STH team member Swati has written this detailed tutorial.
Do you calculate defect density in your teams? If yes, do you do it per cycle, per module or per KLOC? If not, what other metrics help you understand quality? Please share your comments and questions below.
Thanks for posting this. This is really wonderful. Please do post about other metrics as well.
Yes, I think defect density is important, but it should be combined with checking the complexity also.
1 critical defect carries more value than 10 minor ones.
good metric that testers should use cautiously
@All: Thank you for stopping by. Your comments make our content richer.
@Jafar: Any metric in particular?
@Gaurav: Totally agree. “DD at its face value shows poor quality. But it is, in turn, the seriousness of the individual defects that decide if the product is fit for use or not.” this part of the article talks exactly about that.
Great, really helpful. I will start using it from here on.
What is 80:20 Rule ? Please explain through Example.
Coming in a little late here. I started floating the idea of this metric around our dept and getting mixed results. The significant pushback is coming up with how to measure code complexity. Some platforms have lines of code, but others are done in Java or other platforms with function points. Any insights how to calculate defect density for platforms that do not have traditional lines of code?
defects/coding hours? I like measuring everything off time because it can be easily converted to money or cost.
Although “hours of code” is similar to “lines of code” in that they both assume more (lines/hours) equals higher complexity. Maybe depends on which assumption you’re most comfortable with? Or measure both?
Total no. of defects/KLOC = 30/15 = 0.5 = Density is 1 Defect for every 2 KLOC. — is this correct? I thought it would be, 2 defects for every 1 KLOC.
Nicely written. Defect Density is the key metric that should be given its due importance to see where a team/group stands in terms of quality of the product/application under test. I agree that while calculating DD one should also consider the complexity of the defect and the technology of the application under test, as we might have different acceptable levels of DD for applications based on different technologies.
@Christina: yes I think you found a defect in the article.
Good technique but would certainly be improved by considering severity as others have pointed out.
This would be easily achieved by using a multiplier to convert number of defects into an impact measure, eg x10 for Critical, x5 for medium and x1 for minor/cosmetic.
Thanks for the details. How do you calculate defect density in object-oriented languages like C# or JAVA?
Thanks for the information about defect density
I calculated the defect density with reference to testing
Test Defect Density – No of valid defects reported/No of TCs executed ( Instead of KLOC)
Great Article on defect density….just one point from my experience we can use it for future predictions as well assuming we don’t change drastically e.g. developers are same their coding style is same so they will keep producing the same amount of defect/kloc..testers are same using the same process so they will find similar no of defects. So in some sense DD can be predictable as well unless we make changes like improved unit testing/better code reviews to make code quality better( and eventually less DD)
Also lower the density doesn’t necessarily mean its better …it can also mean not enough testing is done.
What is the industry standard for delivered defect density (DDD)?
How Defect Density can be rated as Excellent, Good, Fair and Poor?
Depending upon the factor like if DD is ‘zero’ then rating is excellent, if DD is 0.1 to 0.2 then rating is good, etc.
Hi – thanks for the information. Quick question though: our product has been in production for the last year and we are making enhancements in 7 modules. We follow a sprint cycle of 15 days. How should I calculate DD? KLOC may not be the right measure for me!
Total no. of defects/KLOC = 30/15 = 0.5 = Density is 1 Defect for every 2 KLOC.
is incorrect
it comes out to be 2 and defect density is 2 defect per 1KLOC
Calculation error: 30/15 = 2 (2 defect per KLOC)
If there is any setup changes in system and no new code written then how to calculate the defect density?
Great way to improve your DD: reduce QA effort!
Why is a high DD considered “bad” (given the bugs are fixed) – it just shows that QA did a good job. In another software with a lower DD many bugs are yet to be discovered by the customer…
“An application with DD 5 per KLOC is of better quality vs. another one with 15 per KLOC.”
That does not mean it’s of better quality. It may, but it may not. If two applications do the exact same thing, and the second one has 6 times as much code but only twice the number of defects, I’d argue strongly that was not a reason for the second one being better quality.
My example above is obviously a ridiculous one, but judging the quality of what dev teams produce by defects per KLOC creates a perverse incentive for them to write more KLOCs to do the same thing!
Need acceptable defect density per story point?
I am wondering what to do about a large code base.
Say I have 100k LOC from 1 to 10 years old.
Now I add 10 kLOC of features and enhancements.
I find 1000 defects, some of which are against the older code base…
What is the numerator?
What is the denominator?
Is it the open defects at the point of measure that is tracked, or the open+resolved ones for a period of time? I keep getting confused and there is no clear statement on which one is the correct way of counting the defects.