ETL vs. DB Testing – A Closer Look at ETL Testing Need, Planning and ETL Tools

Software Testing has a variety of areas to be concentrated. Major varieties are functional and non-functional testing. Functional Testing is the procedural way to ensure so that the functionality developed works as expected. Non-functional testing is the approach by which the non-functional aspects like enhanced or performance at an acceptable level can be ensured.

There is another flavor of testing called DB testing. Data is organized in the database in the form of tables. For business, there can be flows where the data from the multiple tables can be merged or processed on to a single table and vice versa.

ETL vs. DB Testing

ETL Testing is one another kind of testing that is preferred in the business case where a kind of reporting need is sought by the clients. The reporting is sought in order to analyze the demands, needs and the supply so that clients, business and the end-users are very well served and benefited.

What will you learn in this tutorial?

In this tutorial, you will learn what is Database Testing, what is ETL Testing, a difference between DB Testing and ETL Testing, and more details about ETL testing need, process, and planning with real examples.

We have also covered ETL Testing in more detail on the below page. Also, have a look at it.

=> ETL Testing / Data Warehouse Testing Tips and Techniques

DB Testing vs. ETL Testing

Most of us are a little confused over considering that both database testing and the ETL testing are similar and the same. The fact is they are similar but not the same.

DB Testing 1

DB Testing:

DB Testing is usually used extensively in the business flows where there are multiple data flows occurring in the application from multiple data sources on to a single table. The data source can be a table, flat file, application or anything else that can yield some output data.

In turn, the output data obtained can still be used as input for the sequential business flow. Hence when we perform DB testing the most important thing that has to be captured is the way the data can get transformed from the source along with how it gets saved in the destination location.

Synchronization is one major and the essential thing that has to be considered when performing the DB Testing. Due to the positioning of the application in the architectural flow, there might be few issues with the data or DB synchronization. Hence while performing the testing, this has to be taken care of as this can overcome the potential invalid defects or bugs.

Example #1:

Project “A” has integrated architecture where the particular application makes use of data from several other heterogeneous data sources. Hence the integrity of these data with the destination location has to be done along with the validations for the following:

  • Primary foreign key validation
  • Column values integrity
  • Null values for any columns

What is ETL Testing?

etl testing 2

ETL Testing is a special type of testing that the client wants to have it done for their forecasting and analysis of their business. This is mostly used for reporting purposes. For instance, if the clients need to have reported on the customers who use or go for their product based on the day they purchase, they have to make use of the ETL reports.

Post analysis and reporting, this data is data warehoused to a data warehouse where the old historical business data has to be moved.

This is a multiple level testing as the data from the source is transformed into multiple environments before it reaches the final destined location.

Example #2:

We will consider a group “A” doing retail customer business through a shopping market where the customer can purchase any household items required for their day to day survival. Here all the customers visiting are provided with a unique membership id with which they can gain points every time they come to purchase things from the shopping market.

The regulations provided by the group say that the points gained expire every year. And depending upon their usage, the membership can be either upgraded to a higher grade member or downgraded to a lower grade member comparatively to the current grade.

After 5 years of shopping market establishment now management is looking for scaling up their business along with revenue.

Hence they required few business reports so that they can promote their customers.

In Database Testing we perform the following:

#1) Validations on the target tables which are created with columns with logical calculations as described in the logical mapping sheet and the data routing document.

#2) Manipulations like Inserting, Updating and Deletion of the customer data can be performed on any end-user POS application in an integrated system along with the back-end database so that the same changes are reflected in the end system.

#3) DB testing has to ensure that there is no customer data that has been misinterpreted or even truncated. This might lead to serious issues like incorrect mapping of customer data with their loyalty

In ETL Testing we check for the following:

#1) Assuming there are 100 customers in the source, you will check whether all these customers along with their data from the 100 rows have been moved from the source system to the target. This is known as verification of Data completeness check.

#2) Checking if the customer data has been properly manipulated and demonstrated in the 100 rows. This is simply called verification of Data Accuracy check.

#3) Reports for the customers who have gained points more than x values within a particular period.

Comparative Study Of ETL And DB Testing

ETL and DB testing have few of the aspects differing within themselves that is more essential to be understood before performing them. This helps us in understanding the values and significance of the testing and the way it helps the business.

Following is a tabular form that describes the basic behavior of both the testing formats.

 DB TestingETL Testing
Primary goalData integration BI Reporting
Applicable placeIn the functional system where the business flow occursExternal to the business flow environment. input is the historical business data
Automation toolQTP, Selenium Informatica, QuerySurge, COGNOS
Business impactSevere impacts can lead as it is the integrated architecture of the business flowsPotential impacts as in when the clients wants to have the forecasting and analysis to be done
Modelling usedEntity RelationshipDimensional
SystemOnline Transaction ProcessingOnline Analytical Processing
Data NatureNormalized data is being used hereDenormalized data is being used here

Why Should The Business Go For ETL?

Plenty of business needs are available for them to consider ETL testing. Every business has to have its unique mission and the line of business. All business has its product life cycle which takes the generic form:

product life stages

It is very clear that any new product enters the market with tremendous growth in sales and till a stage called maturity and thereafter it declines in sales. This gradual change witnesses a definite drop in business growth. Hence it is more important to analyze the customer needs for the business growth and other factors required to make the organization more profitable.

So in reality, the clients want to analyze the historical data and come up with some reports strategically.

etl testing 4

ETL Test Planning

One of the main steps in ETL testing is about planning the test that is going to be executed. It will be similar to the Test Plan for the System Testing that is usually performed except few attributes like requirements and test cases.

Here the requirements are nothing but a mapping sheet that will have kind of mapping between data within different databases. As we are aware that the ETL testing occurs on multiple levels, there are various mappings needed for validating this.

etl testing 5

Most of the time the data is captured from the source databases are not directly. All the source data will have the tables’ view from where the data can be used.

Example: Following is an example of how the mappings can be provided. The two columns VIEW_NAME and TABLE_NAME can be used to represent the views for reading data from the source and the table in the ETL environment respectively.

It is advisable to maintain the naming convention that can help us while planning for automation. Generic notation that can be used is just prefixing the name of the environment.

etl testing 6

The most significant thing in ETL is about identifying the essential data and the tables from the source. The next essential step is the mapping of tables from the source to the ETL environment.

Following is an example of how the mapping between the tables from the various environments can be related to the ETL purpose.

etl testing 7

The above mapping assumes the data from the source table to the staging table. And from then on to the tables in EDW and then to OLAP which is the final reporting environment. Hence at any point in time, data synchronization is very important for the ETL’s sake.

Critical ETL Needs

As we understand ETL is the need for forecasting, reporting and analyzing the business in order to capture the customer needs in a more successive manner. This will enable the business to have higher demands than in the past.

Here are few of the critical needs without which ETL testing cannot be achieved:

  1. Data and tables identification: This is important as there can be many other irrelevant and unnecessary data that can be of least importance when forecasting and analyze the customer needs. Hence the relevant data and the tables have to be selected before starting up the ETL works.
  2. Mapping sheet: This is one of the critical needs while doing ETL works. Mapping of the right table from the source to the destination is mandatory and any problems or incorrect data in this sheet might impact the whole ETL deliverable.
  3. Table designs and data, column type: This is the next major step when considering the mapping of source tables into the destined tables. The column type has to match with the tables at both the places etc.
  4. Database access: The main thing is access to the database where ETL goes on. Any restrictions on the access will have an equivalent impact.

etl testing 8

ETL Reporting and Testing

Reporting in ETL is more important as it explains and directs the clients the customer needs. By this, they can forecast and analyze the exact customer needs

Example #3:

A company which manufactures silk fabric wanted to analyze their annual sales. On review of their annual sales, they found during the month of August and September there was a tremendous fall in sales with the use of the report they generated.

Hence they decided to roll out the promotional offer like the exchange, discounts, etc., that enhanced their sales.

Basic Issues In ETL Testing

There can be a number of issues while performing ETL testing like the following:

  • Either the access to the source tables or the views will not be valid.
  • The column name and the data type from the source to the next layer might not match.
  • A number of records from the source table to the destined tabled might not match.

And there might be much more.

Following is a sample of mapping sheet where there are columns like VIEW_NAME, COLUMN_NAME, DATA_TYPE, TABLE_NAME, COLUMN_NAME, DATA_TYPE, and TRANSFORMATION LOGIC present.

The first 3 columns represent the details of the source database and the next 3 are the details for the immediate preceding database. The last column is very important. Transformation logic is the way the data from the source is read and stored in the destined database. This depends on the business and ETL needs.

etl testing 9

Points To Remember While ETL Test Planning And Execution

The most important thing in ETL testing is the loading of data based on the extraction criteria from the source DB. When this criterion is invalid or obsolete then there will be no data in the table to perform ETL testing that really brings in more issues.

Following are a few of the points to be taken care while ETL Test Planning and Execution:

#1) Data is being extracted from the heterogeneous data sources
#2) ETL process handling in the integrated environment that have different:

  • DBMS
  • OS
  • Hardware
  • Communication protocols

#3) Necessity in having a logical data mapping sheet before the physical data can be transformed
#4) Understanding and examining of the data sources
#5) Initial load and the incremental load
#6) Audit columns
#7) Loading the facts and the dimensions

ETL Tools And Their Significant Usage

ETL tools are basically used to build and convert the transformation logic by taking data from the source into another applying the transformation logic. You can also map the schemas from the source to the destination which occurs in unique ways, transform and clean up data before it can be moved to the destination, along with loading at the destination in an efficient manner.

This can significantly reduce the manual efforts as the mapping can be done that is used for almost all of the ETL validation and verification.

ETL tools:

  1. Informatica – PowerCenter – is one of the popular ETL tools that is introduced by the Informatica Corporation. This has a very good customer base covering wide areas. The major components of the tool are its tools for clients and the repository tools and the servers. To know more about the tool please click here
  2. IBM – Infosphere Information ServerIBM who is the market leader in terms of Computer technology has developed the Infosphere Information server that is used for Information Integration and Management in the year 2008. To know more about the tool please click here
  3. Oracle – Data Integrator – Oracle Corporation has developed its ETL tool in the name of Oracle – Data Integrator. Their increasing customer support has made them update their ETL tools in various versions. To know more about the tool please click here

More examples of the usage of ETL testing:

Considering some Airlines which want to roll out promotions and offers to attract the customers strategically. Firstly they will try to understand the demands and needs of the customer’s specifications. In order to achieve this, they will require the historical data preferably the previous 2 years’ data. Using the data they will analyze and prepare some reports that will be helpful in understanding the customers’ needs.

The reports can be of the following kind:

  1. Customers from region A who travels to region B on certain dates
  2. Customers with specific age criterion travel to city XX

And there can be many other reports.

Analyzing these reports will help the clients in identifying the kind of promotions and offers that will benefit the customers and at the same time can benefit businesses where this can become a Win-Win situation. This can be easily achieved by ETL testing and reports.

In parallel, the IT segment faces a serious DB issue that has been noticed that has stopped multiple services, in turn, has the potential to cause impacts in the business. On investigation, it was identified that some invalid data has corrupted a few databases that needed to be corrected manually.

In the former case, it is ETL reports and testing that will be required.

Whereas the latter case is where the DB testing has to be done properly to overcome issues with invalid data.


Hope the above tutorial has provided a simple and clear overview of what ETL testing is and why it has to be done along with the business impacts or benefits they yield. This does not stop here, but it can extend to set foresight in growth in business.

About the author: This tutorial is written by Nagarajan. He is a Test Lead with over 6 years of Software Testing experience in various functional areas like Banking, Airlines, and Telecom in terms of both manual and automation.

Please let us know your thoughts/questions in the comments below.

Recommended Reading

21 thoughts on “ETL vs. DB Testing – A Closer Look at ETL Testing Need, Planning and ETL Tools”

  1. @Surbhi, Sathish, Venkateswara Rao Kotta, Njuneki Jayne

    Thanks for your comments. You can always get back to us regarding any clarifications from Software Testing.


  2. @Surbhi,

    We are Planning to conduct training on ETL testing as well. Will keep posted on our training’s calendar.. Please keep checking!!


  3. Useful Article, Thank you for Sharing.

    I think your website have a problem with html link. i didn’t see any images on this page, fyi.

  4. Nice post!!!

    I am confused with my work that I am working for my company.
    My job is to validate the excel reports for correctness and completeness against database entries manualy…

    What I thought its not a testing.

    But after reading you article I thought its a kind of ETL testing.
    Can you explain a little….

  5. About the Automation tool for DB Testing & ETL testing as you suggested:
    – I completely agree: automation tools (for ETL testing) are Informatica, QuerySurge, COGNOS, …
    – But in the automation tools (for Database testing): I don’t think QTP, Selenium are the automation tool for Database testing. Because if you research well about the nature of QTP or Selenium, you’ll find out that the main function of these tools focus to automation GUI (rather than communicating directly to the database).
    **Example: We have built an Framework X to test a website with the data shown on website are getting from a specific Database (SQL Server). The Framework X use Selenium WebDriver API for interacting on GUI (such as: click, mouse move, assert text, …). Besides for the communicating directly to the database, the Framework X use the SqlClient API.
    ==> As you see, for the testing a value shown on website need the combination both SqlClient API & Selenium WebDriver API. The SqlClient API to help you get expected data from Database, the WebDriver API help you to get the current value on GUI for testing as an actual result. Base on the expected & actual result, you’ll make a comparison via two ways: 1-use assert function of Selenium WebDriver API for comparison; 2-use nature assert function of Framework X for comparison.
    ==> Let’s get to the core of the discussion: I think that for the automation test Database.
    – We can’t focus to 1 specific Automation GUI API for Database testing.
    – We should focus to many things, such as: Automation GUI API, Communication database API, etc … instead of only focusing on Automation GUI API.

    Many thanks,


Leave a Comment