Database Testing – Properties of Good Test Data and Test Data Preparation Techniques

A couple of months ago, I wrote about database testing strategies. That article dealt entirely with the execution of test cases, that is, black-box testing of a database. This article covers another important aspect of DB testing: preparing the test data.

Suppose that, as a tester, you have to test the ‘Examination Results’ module of a university website. Assume the whole application has been integrated and is in the ‘Ready for Testing’ state. The ‘Examination’ module is linked with the ‘Registration’, ‘Courses’ and ‘Finance’ modules. Assume also that you know the application well and have created a comprehensive list of test scenarios. Now you have to design, document and execute the test cases. In the ‘Actions/Steps’ section of each test case, you must specify the input data for the test, and this data must be selected carefully: the accuracy of the ‘Actual Results’ column of the TC document depends primarily on the test data. Preparing the input test data is therefore a significant step. So here is my rundown on “DB Testing – Test Data Preparation Strategies”.

Properties of Test Data:

The test data should be selected precisely and it must possess the following four qualities:

1. Realistic: The data should be accurate in a real-life context. For example, to test the ‘Age’ field, all values should be positive and 18 or above, since candidates for university admission are usually at least 18 years old (this might be defined in the requirements).

2. Practically valid: This is similar to realistic, but not the same. This property relates to the business logic of the application under test (AUT). For example, the value 60 is realistic in the age field but practically invalid for a candidate for a Graduation or even a Masters program. In this case, a valid range would be 18-25 years (this might be defined in the requirements).

3. Versatile to cover scenarios: A single scenario may contain several subsequent conditions, so choose the data shrewdly to cover the maximum number of aspects of a scenario with a minimum set of data. For example, while creating test data for the result module, do not consider only regular students who are smoothly completing their program. Pay attention to students who are repeating the same course and who belong to different semesters or even different programs. The data set may look like this:


Sr#  Student_ID                 Program_ID  Course_ID  Grade
1    BCS-Fall2011-Morning-01    BCS-F11     CS-401     A
2    BCS-Spring2011-Evening-14  BCS-S11     CS-401     B+
3    MIT-Fall2010-Afternoon-09  MIT-F10     CS-401     A-

There might be several other interesting and tricky sub-conditions, e.g. the limit on the number of years to complete a degree program, passing a prerequisite course before registering for a course, the maximum number of courses a student may enroll in during a single semester, etc. Make sure to cover all these scenarios wisely with a finite set of data.

4. Exceptional data (if applicable/required): There may be certain exceptional scenarios that occur less frequently but demand high importance when they do, e.g. issues related to disabled students.
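The difference between "realistic" and "practically valid" can be sketched in a few lines of Python. This is only an illustration: the bounds are taken from the examples above and, as noted, would really come from the requirements, and the function name `classify_age` is hypothetical.

```python
# Hypothetical bounds, taken from the examples in this article.
# In a real project they would come from the requirements document.
MIN_REALISTIC_AGE = 18            # candidates are usually 18 or older
VALID_AGE_RANGE = range(18, 26)   # 18-25 inclusive for Graduation/Masters


def classify_age(age: int) -> str:
    """Label an 'Age' test value according to the first two properties."""
    if age < MIN_REALISTIC_AGE:
        return "unrealistic"
    if age not in VALID_AGE_RANGE:
        return "realistic but practically invalid"
    return "practically valid"


if __name__ == "__main__":
    # A small, versatile set: negative, just below, on both edges, far above.
    for value in (-5, 17, 18, 25, 60):
        print(value, "->", classify_age(value))
```

Picking values on and around the edges of each range is one way to cover several conditions with a small data set.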

Test data preparation techniques:

We have briefly discussed the important properties of test data and seen why test data selection matters in database testing. Now let’s discuss the techniques to prepare test data.

There are two main ways to prepare test data:

Method 1. Insert New Data:

Get a clean DB and insert all the data as specified in your test cases. Once all your required and desired data has been entered, start executing your test cases and fill in the ‘Pass/Fail’ columns by comparing the ‘Actual Output’ with the ‘Expected Output’. Sounds simple, right? But wait, it’s not that simple.

A few essential and critical concerns are as follows:

  1. An empty instance of the database may not be available.
  2. The inserted test data may be insufficient for some kinds of testing, such as performance and load testing.
  3. Inserting the required test data into a blank DB is not an easy job because of database table dependencies. This unavoidable restriction can make data insertion a difficult task for the tester.
  4. Inserting limited test data (just enough for the test cases) may hide issues that could be found only with a large data set.
  5. Data insertion may require complex queries and/or procedures, and for these, sufficient help from the DB developer(s) would be necessary.
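The table-dependency concern in point 3 can be illustrated with a minimal sketch. It assumes an in-memory SQLite database and a hypothetical two-table schema (`program` and `student`); the real schema of the application under test will differ. Inserting a child row before its parent is rejected, so the test data has to be inserted in dependency order.

```python
import sqlite3


def build_clean_db() -> sqlite3.Connection:
    """Create an empty in-memory DB with two dependent (hypothetical) tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE program (program_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE student ("
        " student_id TEXT PRIMARY KEY,"
        " program_id TEXT NOT NULL REFERENCES program(program_id))"
    )
    return conn


def insert_child_first(conn: sqlite3.Connection) -> bool:
    """Try to insert a student before its program exists; True if rejected."""
    try:
        conn.execute(
            "INSERT INTO student VALUES (?, ?)",
            ("BCS-Fall2011-Morning-01", "BCS-F11"),
        )
    except sqlite3.IntegrityError:
        return True
    return False


def insert_in_dependency_order(conn: sqlite3.Connection) -> int:
    """Insert the parent row first, then the child; return the student count."""
    conn.execute("INSERT INTO program VALUES (?)", ("BCS-F11",))
    conn.execute(
        "INSERT INTO student VALUES (?, ?)",
        ("BCS-Fall2011-Morning-01", "BCS-F11"),
    )
    return conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

With only two tables the order is obvious; with dozens of interlinked tables, working out this order is where help from the DB developers tends to be needed.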

The five issues mentioned above are the most important and most obvious drawbacks of this technique for test data preparation. But there are some advantages as well:

  1. Execution of TCs becomes more efficient, as the DB contains only the required data.
  2. Bug isolation takes very little time, as only the data specified in the test cases is present in the DB.
  3. Less time is required for testing and results comparison.
  4. The test process is clutter-free.
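Method 1 as a whole (seed a clean DB, execute the test cases, compare actual with expected output) might be sketched as follows. The `result` table, the test cases and the lookup query are assumptions made for illustration; the rows reuse the sample data set shown earlier, and the query merely stands in for the module under test.

```python
import sqlite3

# Test data exactly as it would be listed in the test cases.
TEST_ROWS = [
    ("BCS-Fall2011-Morning-01", "BCS-F11", "CS-401", "A"),
    ("BCS-Spring2011-Evening-14", "BCS-S11", "CS-401", "B+"),
    ("MIT-Fall2010-Afternoon-09", "MIT-F10", "CS-401", "A-"),
]

# Hypothetical test cases: (student_id, course_id, expected grade).
TEST_CASES = [
    ("BCS-Fall2011-Morning-01", "CS-401", "A"),
    ("MIT-Fall2010-Afternoon-09", "CS-401", "A-"),
]


def seed_clean_db() -> sqlite3.Connection:
    """Create a clean in-memory DB that holds only the required test data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE result ("
        " student_id TEXT, program_id TEXT, course_id TEXT, grade TEXT)"
    )
    conn.executemany("INSERT INTO result VALUES (?, ?, ?, ?)", TEST_ROWS)
    return conn


def run_test_cases(conn: sqlite3.Connection) -> list:
    """Compare actual with expected output and return 'Pass'/'Fail' per case."""
    outcomes = []
    for student_id, course_id, expected in TEST_CASES:
        row = conn.execute(
            "SELECT grade FROM result WHERE student_id = ? AND course_id = ?",
            (student_id, course_id),
        ).fetchone()
        actual = row[0] if row else None
        outcomes.append("Pass" if actual == expected else "Fail")
    return outcomes
```

Because the DB holds nothing but these three rows, a ‘Fail’ can be traced back to a specific input row almost immediately.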

Method 2. Choose a sample data subset from actual DB data:

This is the more feasible and practical technique for test data preparation. However, it requires sound technical skills and detailed knowledge of the DB schema and SQL. In this method, you copy production data and use it after replacing some field values with dummy values. This is the best data subset for your testing because it represents the production data. However, it may not always be feasible because of data security and privacy issues.
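Replacing field values with dummy values could look like the sketch below. It is a minimal illustration, not a complete anonymization scheme: the row layout, the choice of `name` and `email` as the sensitive fields, and the helper names `mask_email` and `sample_and_mask` are all assumptions. Dummy addresses use `example.com`, a domain reserved for documentation and testing, so no real mailbox can receive mail from a test run.

```python
import hashlib


def mask_email(email: str) -> str:
    """Replace a real address with a stable dummy one on a reserved domain."""
    # Hashing keeps the mapping stable, so the same real address always maps
    # to the same dummy address. This alone is NOT strong anonymization.
    digest = hashlib.sha256(email.lower().encode("utf-8")).hexdigest()[:10]
    return f"user_{digest}@example.com"


def sample_and_mask(rows: list, every_nth: int = 2) -> list:
    """Take every n-th production row and overwrite its sensitive fields."""
    subset = rows[::every_nth]
    return [
        {**row, "name": f"Student {i + 1}", "email": mask_email(row["email"])}
        for i, row in enumerate(subset)
    ]
```

Non-sensitive fields such as grades are left untouched, so the subset still behaves like production data in the tests.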

This strategy deserves a separate post. We’ll discuss it in the next article, ‘Database gray-box testing’, along with the precautions to take while testing a database.

This is a guest article by Rizwan Jafri.
The author has more than 4 years of experience and currently works as a Sr. QA Engineer at Systems Limited, Lahore, Pakistan.

If you have any questions, please feel free to ask in the comment section below.


#1 Laxmi N.

nice one.
just want to share one tip – before using production data make sure you mask all the data values. I faced a big problem due to this. We used the same DB for testing without masking the user emails, and our testing resulted in actual emails being sent to customers. This is a big no-no.

#2 Pratap

Very nice info.. keep sharing Pradeep.. such informative articles.. Thank you :)

#3 Afshan Rauf

Nice. Keep it up!

#4 Khush

agree with Laxmi. same issue I faced in my career. Since then I am very reluctant while using the production DB :)

#5 Justin Hunter

Nice article.

You mentioned: “There may be several subsequent conditions in a single scenario, so choose the data shrewdly to cover maximum aspects of a single scenario with minimum set of data,”

I agree. An excellent way to accomplish this, which not enough testers know about, is through pairwise testing (and more thorough combinatorial testing). Entering the test inputs into a pairwise test case generating tool, like Hexawise, can be an excellent way to thoroughly test a system with a minimal number of tests.

In addition, such combinatorial tests will have a minimal amount of repetition from test to test.

– Justin
See, e.g.,:

#6 jayalakshmi

Really nice concept and useful one thank u for this great information

#7 Yamraaj

Comprehensive, Detailed, Helpful.
Very good and easy to understand example.
Important information given in well organized way.

#8 Mallesh

its very useful to develop the testing techniques of qtp scripts and functionality of the AUT

#9 Tomek

Thank you, interesting!

#10 Manoj

Thanks for the aritcle Are you suggesting back end verification?, if yes can you please provide some examples which will stand while convincing management ? Have you to some examples for bugs specific to database used ?

#11 Manoj

* have you got some examples for …

#12 Rizwan Jafri - The Author

Just wait for the next (and Last) article on this topic from me. All examples will be given.

@ Justin: Thank you so much for sharing hexawise URL, I have used it and found an excellent tool.

Thank you all for appreciation.


#13 Manmohan

Excellent usage, but one good piece of advice: as we know, we grasp things quickly when they are demonstrated with instances, so please introduce examples as well.


#14 mallaiah d

what is meant by data migration testing

#15 abrar tararr

Hi all…
i m doing manual testing of web applications in a pvt company for 2,3 months. Unfortunately i have no senior to guide me…
I need some practical test scenarios & Test cases (for guidance purpose.i.e. how to write test cases).
would any of you like to help me? plzz send me some test cases on the following id…i’ll be vry thankful…

#16 yogesh

nice this usful for me.

#17 Ritesh

Quite useful

#18 STC Technologies

Thank u for share the useful information

#19 Vishal

Nice and informative article.

Test data preparation and management has always been a challenge for testing teams. Use of the right test data during execution guarantees successful testing.

Test data generation and preparation vary from application to application.

The test data requirement for a multi-tier application that needs data inputs from several other applications to generate test data for the application under test will be different from that of a simpler application.

In such a case, a Test Data Management tool can be used to create and manage test data.

InfoSphere Optim Test Data Management Solution from IBM is one such Test Data Management tool. Very efficient and easy to learn.

#20 simran rai

can you pls suggest something for those who have no technical background means knows nothing about testing.
can u suggest something like which kind of testing field they can go in as there are many kinds of testing?
can u pls share some info about automation and manual testing and which one is better for people with no technical background?


#21 Daniel Islas

great article, I found it very usefull

#22 rashmi

Hi all…
i m doing manual testing of web applications in a pvt company for 2 months.
I need some practical test scenarios would any of you like to help me? please send me some test samples on the following id… will like to thank you so much…

#23 Inder

Also, testing tables and stored procedures dependencies will help to find relevant information. For instance, if a stored procedure is being called inside another procedure and some argument value is missing, then it might cause inconsistencies.

Great post…Thanks!!

#24 Umesh

It would be great if you share an example of Test cases for real database testing.


#25 Yamini

For test data maintenance which is better either
1. test data in excel or
2. test data in database because its a huge application having many tables and relations between them.

#26 Vishal

Nice explanation.Please share the test cases for database tesing
