ETL vs. DB Testing – A Closer Look at ETL Testing Need, Planning and ETL Tools

By Vijay

By Vijay

I'm Vijay, and I've been working on this blog for the past 20+ years! I’ve been in the IT industry for more than 20 years now. I completed my graduation in B.E. Computer Science from a reputed Pune university and then started my career in…

Learn about our editorial policies.
Updated July 27, 2024

Discover the concepts of Database Testing, ETL Testing, and the disparities between them, and delve into ETL testing needs, processes, and planning with real-life examples in this tutorial.

Software testing has a variety of areas to concentrate on. Major varieties are functional and non-functional testing. Functional Testing is the procedural way to ensure that the functionality developed works as expected. Non-functional testing is the approach by which the non-functional aspects, like enhanced or performance at an acceptable level, can be ensured.

There is another flavor of testing called DB testing. Data is organized in the database as tables. For business, there can be flows where the data from the multiple tables can be merged or processed onto a single table and vice versa.

Guide to ETL and Database Testing

ETL vs. DB Testing

ETL Testing is another kind of testing preferred in the business case where the clients seek a kind of reporting need. The reporting is sought to analyze the demands, needs, and supply so that clients, businesses, and end-users are very well served and benefited.

What will you learn in this tutorial?

In this tutorial, you will learn what is Database Testing, what is ETL Testing, the difference between DB Testing and ETL Testing, and more details about ETL testing needs, process, and planning with real examples.

We have also covered ETL Testing in more detail on the below page. Also, have a look at it.

Suggested Read => ETL Testing / Data Warehouse Testing Tips and Techniques

DB Testing vs. ETL Testing

Most of us are a little confused considering that both database testing and ETL testing are similar and the same. The fact is, they are similar, but not the same.

database testing and ETL testing

DB Testing

DB Testing is usually used extensively in business flows where multiple data flows are occurring in the application from multiple data sources on a single table. The data source can be a table, flat file, application, or anything else that can yield some output data.

The output data obtained can still be used as input for the sequential business flow. Hence, when we perform DB testing, the most important thing that has to be captured is the way the data can get transformed from the source along with how it gets saved in the destination location.

Synchronization is one major and essential thing that has to be considered when performing DB Testing. Due to the positioning of the application in the architectural flow, there might be a few issues with the data or DB synchronization. Hence, while performing the testing, this has to be taken care of as this can overcome the potential invalid defects or bugs.

Example #1:

Project “A” has an integrated architecture where the particular application makes use of data from several other heterogeneous data sources. Hence, the integrity of these data with the destination location has to be done along with the validations for the following:

  • Primary foreign key validation
  • Column values integrity
  • Null values for any columns

What is ETL Testing?

etl testing

ETL Testing is a special type of testing that the client wants to have it done for their forecasting and analysis of their business. This is mostly used for reporting purposes. For instance, if the clients need to have reported on the customers who use or go for their product based on the day they purchase, they have to make use of the ETL reports.

Post analysis and reporting, this data is data warehoused in a data warehouse where the old historical business data has to be moved.

This is a multiple-level testing as the data from the source is transformed into multiple environments before it reaches the final destination.

Example #2:

We will consider a group “A” doing retail customer business through a shopping market where the customer can purchase any household items required for their day-to-day survival. Here all the customers visiting are provided with a unique membership ID with which they can gain points every time they come to purchase things from the shopping market.

The regulations provided by the group say that the points gained expire every year. Depending upon their usage, the membership can be either upgraded to a higher grade member or downgraded to a lower grade member comparatively to the current grade.

After 5 years of shopping market establishment now management is looking to scale up their business along with revenue.

Hence they required a few business reports so that they could promote their customers.

In Database Testing, we perform the following:

#1) Validations on the target tables which are created with columns with logical calculations as described in the logical mapping sheet and the data routing document.

#2) Manipulations like Inserting, Updating and Deletion of the customer data can be performed on any end-user POS application in an integrated system along with the back-end database so that the same changes are reflected in the end system.

#3) DB testing has to ensure that there is no customer data that has been misinterpreted or even truncated. This might lead to serious issues like incorrect mapping of customer data with their loyalty

In ETL Testing we check for the following:

#1) Assuming there are 100 customers in the source, you will check whether all these customers along with their data from the 100 rows have been moved from the source system to the target. This is known as the verification of Data completeness check.

#2) Checking if the customer data has been properly manipulated and demonstrated in the 100 rows. This is simply called the verification of Data Accuracy check.

#3) Reports for the customers who have gained points more than x values within a particular period.

Comparative Study Of ETL And DB Testing

ETL and DB testing have a few of the aspects differing within themselves that are essential to be understood before performing them. This helps us understand the values and significance of the testing and the way it helps the business.

Following is a tabular form that describes the basic behavior of both testing formats.

DB TestingETL Testing
Primary goalData integration BI Reporting
Applicable placeIn the functional system where the business flow occursExternal to the business flow environment. input is the historical business data
Automation toolQTP, Selenium Informatica, QuerySurge, COGNOS
Business impactSevere impacts can lead as it is the integrated architecture of the business flowsPotential impacts as in when the clients wants to have the forecasting and analysis to be done
Modelling usedEntity RelationshipDimensional
SystemOnline Transaction ProcessingOnline Analytical Processing
Data NatureNormalized data is being used hereDenormalized data is being used here

Why Should The Business Go For ETL

Plenty of business needs are available for them to consider ETL testing. Every business has to have its unique mission and the line of business. All business has their product life cycle, which takes the generic form:

product life stages

It is very clear that any new product enters the market with tremendous growth in sales till a stage called maturity and thereafter it declines in sales. This gradual change witnesses a definite drop in business growth. Hence it is more important to analyze the customer needs for the business growth and other factors required to make the organization more profitable.

So, in reality, the clients want to analyze the historical data and come up with some reports strategically.

etl

ETL Test Planning

One of the main steps in ETL testing is about planning the test that is going to be executed. It will be similar to the Test Plan for the System Testing that is usually performed except for a few attributes like requirements and test cases.

Here the requirements are nothing but a mapping sheet that will have a kind of mapping between data within different databases. As we are aware, ETL testing occurs on multiple levels, and there are various mappings needed for validating this.

ETL Test Planning

Most of the time, the source databases do not capture the data directly. All the source data will have the tables’ view from where the data can be used.

Example: The following is an example of how the mappings can be provided. The two columns VIEW_NAME and TABLE_NAME can represent the views of reading data from the source and the table in the ETL environment, respectively.

It is advisable to maintain the naming convention that can help us while planning for automation. Generic notation that can be used is just prefixing the name of the environment.

table in the ETL environment

The most significant thing in ETL is about identifying the essential data and the tables from the source. The next essential step is the mapping of tables from the source to the ETL environment.

Following is an example of how the mapping between the tables from the various environments can be related to the ETL purpose.

tables from the various environments

The above mapping assumes the data from the source table to the staging table. From then on to the tables in EDW and then to OLAP, which is the final reporting environment. Hence, at any point in time, data synchronization is very important for the ETL’s sake.

Critical ETL Needs

As we understand, ETL is the need for forecasting, reporting, and analyzing the business to capture the customer needs more successively. This will enable the business to have higher demands than in the past.

Here are a few of the critical needs without which ETL testing cannot be achieved:

  1. Data and Tables Identification: This is important as there can be many other irrelevant and unnecessary data that can be of the least importance when forecasting and analyzing customer needs. Hence, the relevant data and the tables have to be selected before starting up the ETL works.
  2. Mapping Sheet: This is one of the critical needs while doing ETL work. Mapping of the right table from the source to the destination is mandatory and any problems or incorrect data in this sheet might impact the whole ETL deliverable.
  3. Table Designs and Data, Column type: This is the next major step when considering the mapping of source tables into the destined tables. The column type has to match the tables at both places etc.
  4. Database Access: The main thing is access to the database where ETL goes on. Any restrictions on the access will have an equivalent impact.
Critical ETL Needs

ETL Reporting and Testing

Reporting in ETL is more important as it explains and directs the client’s customer needs. By this, they can forecast and analyze the exact customer needs.

Example #3:

A company that manufactures silk fabric wanted to analyze its annual sales. On review of their annual sales, they found during the month of August and September there was a tremendous fall in sales with the use of the report they generated.

As a result, they chose to introduce promotional offers such as exchanges and discounts, which boosted their sales.

Basic Issues In ETL Testing

There can be several issues while performing ETL testing, like the following:

  • Either the access to the source tables or the views will not be valid.
  • The column name and the data type from the source to the next layer might not match.
  • Several records from the source table to the destined tabled might not match.

And there might be much more.

Following is a sample of the mapping sheet where there are columns like VIEW_NAME, COLUMN_NAME, DATA_TYPE, TABLE_NAME, COLUMN_NAME, DATA_TYPE, and TRANSFORMATION LOGIC present.

The first 3 columns represent the details of the source database and the next 3 are the details for the immediate preceding database. The last column is very important. Transformation logic is the way the data from the source is read and stored in the destined database. This depends on the business and ETL needs.

etl testing 9

Points To Remember While ETL Test Planning And Execution

The most important thing in ETL testing is the loading of data based on the extraction criteria from the source DB. When this criterion is invalid or obsolete, then there will be no data in the table to perform ETL testing which really brings in more issues.

Following are a few of the points to be taken care of while ETL Test Planning and Execution:

#1) Data is being extracted from the heterogeneous data sources.
#2) ETL process handling in the integrated environment that has different:

  • DBMS
  • OS
  • Hardware
  • Communication protocols

#3) Necessity in having a logical data mapping sheet before the physical data can be transformed
#4) Understanding and examining the data sources
#5) Initial load and the incremental load
#6) Audit columns
#7) Loading the facts and the dimensions

ETL Tools And Their Significant Usage

ETL tools are used to build and convert the transformation logic by taking data from one source into another and applying the transformation logic. You can also map the schemas from the source to the destination, which occurs in unique ways, transform and clean up data before it can be moved to the destination, along with loading at the destination in an efficient manner.

This can significantly reduce the manual efforts as the mapping can be done that is used for almost all the ETL validation and verification.

ETL tools:

  1. Informatica – PowerCenter is one of the popular ETL tools that is introduced by the Informatica Corporation. This has a very good customer base covering wide areas. The major components of the tool are its tools for clients the repository tools and the servers. To learn more about the tool, please click here.
  2. IBM – Infosphere Information Server – IBM which is the market leader in terms of computer technology has developed the Infosphere Information server that is used for Information Integration and Management in the year 2008. To learn more about the tool, please click here.
  3. Oracle – Data Integrator – Oracle Corporation has developed its ETL tool in the name of Oracle – Data Integrator. Their increasing customer support has made them update their ETL tools in various versions. To learn more about the tool, please click here.

More examples of the usage of ETL testing:

Considering some Airlines that want to roll out promotions and offers to attract customers strategically. Firstly, they will try to understand the demands and needs of the customer’s specifications. To achieve this, they will require historical data, preferably the previous 2 years’ data. Using the data, they will analyze and prepare some reports that will help understand the customers’ needs.

The reports can be of the following kinds:

  1. Customers from Region A who travel to Region B on certain dates
  2. Customers with specific age criteria travel to city XX

And there can be many other reports.

Analyzing these reports will help the clients in identifying the kind of promotions and offers that will benefit the customers and can benefit businesses where this can become a Win-Win situation. ETL testing and reports can easily achieve this.

In parallel, the IT segment faces a serious DB issue that has been noticed that has stopped multiple services, in turn, has the potential to cause impacts on the business. On investigation, it was identified that some invalid data had corrupted a few databases that needed to be corrected manually.

In the former case, it is ETL reports and testing that will be required.

Whereas the latter case is where the DB testing has to be done properly to overcome issues with invalid data.

Conclusion

Hope the above tutorial has provided a simple and clear overview of what ETL testing is and why it has to be done along with the business impacts or benefits they yield. This does not stop here, but it can extend to set foresight in growth in business.

About the author: This tutorial is written by Nagarajan. He is a Test Lead with over 6 years of Software Testing experience in various functional areas like Banking, Airlines, and Telecom in terms of both manual and automation.

Please let us know your thoughts/questions in the comments below.

Was this helpful?

Thanks for your feedback!

Recommended Reading

21 thoughts on “ETL vs. DB Testing – A Closer Look at ETL Testing Need, Planning and ETL Tools”

  1. @Surbhi,

    We are Planning to conduct training on ETL testing as well. Will keep posted on our training’s calendar.. Please keep checking!!

    -Nagarajan

    Reply
  2. About the Automation tool for DB Testing & ETL testing as you suggested:
    – I completely agree: automation tools (for ETL testing) are Informatica, QuerySurge, COGNOS, …
    – But in the automation tools (for Database testing): I don’t think QTP, Selenium are the automation tool for Database testing. Because if you research well about the nature of QTP or Selenium, you’ll find out that the main function of these tools focus to automation GUI (rather than communicating directly to the database).
    **Example: We have built an Framework X to test a website with the data shown on website are getting from a specific Database (SQL Server). The Framework X use Selenium WebDriver API for interacting on GUI (such as: click, mouse move, assert text, …). Besides for the communicating directly to the database, the Framework X use the SqlClient API.
    ==> As you see, for the testing a value shown on website need the combination both SqlClient API & Selenium WebDriver API. The SqlClient API to help you get expected data from Database, the WebDriver API help you to get the current value on GUI for testing as an actual result. Base on the expected & actual result, you’ll make a comparison via two ways: 1-use assert function of Selenium WebDriver API for comparison; 2-use nature assert function of Framework X for comparison.
    ==> Let’s get to the core of the discussion: I think that for the automation test Database.
    – We can’t focus to 1 specific Automation GUI API for Database testing.
    – We should focus to many things, such as: Automation GUI API, Communication database API, etc … instead of only focusing on Automation GUI API.

    Many thanks,

    Reply
  3. @Surbhi, Sathish, Venkateswara Rao Kotta, Njuneki Jayne

    Thanks for your comments. You can always get back to us regarding any clarifications from Software Testing.

    -Nagarajan

    Reply
  4. Nice post!!!

    I am confused with my work that I am working for my company.
    My job is to validate the excel reports for correctness and completeness against database entries manualy…

    What I thought its not a testing.

    But after reading you article I thought its a kind of ETL testing.
    Can you explain a little….

    Reply
  5. Useful Article, Thank you for Sharing.

    I think your website have a problem with html link. i didn’t see any images on this page, fyi.

    Reply

Leave a Comment