ETL Testing Data Warehouse Testing Tutorial (A Complete Guide)

ETL Testing / Data Warehouse Process and Challenges:

Today let me take a moment and explain my testing fraternity about one of the much in demand and upcoming skills for my tester friends i.e. ETL testing (Extract, Transform, and Load).

This tutorial will present you with a complete idea about ETL testing and what we do to test ETL process.

Complete List Tutorials in this series: 

It has been observed that Independent Verification and Validation is gaining huge market potential and many companies are now seeing this as prospective business gain.

ETL Testing _ Data Warehouse Testing - Tips, Techniques, Process and Challenges

Customers have been offered a different range of products in terms of service offerings, distributed in many areas based on technology, process, and solutions. ETL or data warehouse is one of the offerings which are developing rapidly and successfully.

ETL process

Through ETL process, data is fetched from the source systems, transformed as per business rules and finally loaded to the target system (data warehouse). A data warehouse is an enterprise-wide store which contains integrated data that aids in the business decision-making process. It is a part of business intelligence.

Why do organizations need Data Warehouse?

Organizations with organized IT practices are looking forward to creating the next level of technology transformation. They are now trying to make themselves much more operational with easy-to-interoperate data.

Having said that data is most important part of any organization, it may be everyday data or historical data. Data is the backbone of any report and reports are the baseline on which all the vital management decisions are taken.

Most of the companies are taking a step forward for constructing their data warehouse to store and monitor real-time data as well as historical data. Crafting an efficient data warehouse is not an easy job. Many organizations have distributed departments with different applications running on distributed technology.

ETL tool is employed in order to make a flawless integration between different data sources from different departments. ETL tool will work as an integrator, extracting data from different sources; transforming it into the preferred format based on the business transformation rules and loading it in cohesive DB known are Data Warehouse.

Well planned, well defined and effective testing scope guarantees smooth conversion of the project to the production. A business gains the real buoyancy once the ETL processes are verified and validated by an independent group of experts to make sure that data warehouse is concrete and robust.

ETL or Data warehouse testing is categorized into four different engagements irrespective of technology or ETL tools used:

  • New Data Warehouse Testing – New DW is built and verified from scratch. Data input is taken from customer requirements and different data sources and new data warehouse is built and verified with the help of ETL tools.
  • Migration Testing – In this type of project customer will have an existing DW and ETL performing the job but they are looking to bag new tool in order to improve efficiency.
  • Change Request – In this type of project new data is added from different sources to an existing DW. Also, there might be a condition where customer needs to change their existing business rule or they might integrate the new rule.
  • Report Testing – Report is the end result of any Data Warehouse and the basic propose for which DW builds. The report must be tested by validating layout, data in the report and calculation.

ETL process

(Note: Click on the image for enlarged view)

ETL testing

ETL Testing Techniques

1) Data transformation Testing: Verify that data is transformed correctly according to various business requirements and rules.

2) Source to Target count Testing: Make sure that the count of records loaded in the target is matching with the expected count.

3) Source to Target Data Testing: Make sure that all projected data is loaded into the data warehouse without any data loss and truncation.

4) Data Quality Testing: Make sure that ETL application appropriately rejects, replaces with default values and reports invalid data.

5) Performance Testing: Make sure that data is loaded in data warehouse within prescribed and expected time frames to confirm improved performance and scalability.

6) Production Validation Testing: Validate the data in production system & compare it against the source data.

7) Data Integration Testing: Make sure that the data from various sources has been loaded properly to the target system and all the threshold values are checked.

8) Application Migration Testing: In this testing, it is ensured that the ETL application is working fine on moving to a new box or platform.

9) Data & constraint Check: The datatype, length, index, constraints, etc. are tested in this case.

10) Duplicate Data Check: Test if there is any duplicate data present in the target systems. Duplicate data can lead to wrong analytical reports.

Apart from the above ETL testing methods other testing methods like system integration testing, user acceptance testing, incremental testing, regression testing, retesting and navigation testing is also carried out to make sure everything is smooth and reliable.

ETL/Data Warehouse Testing Process

Similar to any other testing that lies under Independent Verification and Validation, ETL also goes through the same phase.

  • Requirement understanding
  • Validating
  • Test Estimation based on a number of tables, the complexity of rules, data volume and performance of a job.
  • Test planning based on the inputs from test estimation and business requirement. We need to identify here that what is in scope and what is out of scope. We also look out for dependencies, risks and mitigation plans in this phase.
  • Designing test cases and test scenarios from all the available inputs. We also need to design mapping document and SQL scripts.
  • Once all the test cases are ready and are approved, testing team proceed to perform pre-execution check and test data preparation for testing
  • Lastly, execution is performed till exit criteria are met. So, execution phase includes running ETL jobs, monitoring job runs, SQL script execution, defect logging, defect retesting and regression testing.
  • Upon successful completion, a summary report is prepared and closure process is done. In this phase, sign off is given to promote the job or code to the next phase.

The first two phases i.e. requirement understanding and validation can be regarded as pre-steps of ETL test process.

So, the main process can be represented as below:

main process

It is necessary to define test strategy which should be mutually accepted by stakeholders before starting actual testing. A well-defined test strategy will make sure that correct approach has been followed meeting the testing aspiration.

ETL/Data Warehouse testing might require writing SQL statements extensively by testing team or maybe tailoring the SQL provided by the development team. In any case, a testing team must be aware of the results they are trying to get using those SQL statements.

Difference between Database and Data Warehouse Testing

There is a popular misunderstanding that database testing and data warehouse is similar while the fact is that both hold different direction in testing.

  • Database testing is done using a smaller scale of data normally with OLTP (Online transaction processing) type of databases while data warehouse testing is done with large volume with data involving OLAP (online analytical processing) databases.
  • In database testing normally data is consistently injected from uniform sources while in data warehouse testing most of the data comes from different kind of data sources which are sequentially inconsistent.
  • We generally perform the only CRUD (Create, read, update and delete) operation in database testing while in data warehouse testing we use read-only (Select) operation.
  • Normalized databases are used in DB testing while demoralized DB is used in data warehouse testing.

There is a number of universal verifications that have to be carried out for any kind of data warehouse testing.

Below is the list of objects that are treated as essential for validation in this testing:

  • Verify that data transformation from source to destination works as expected
  • Verify that expected data is added to the target system
  • Verify that all DB fields and field data is loaded without any truncation
  • Verify data checksum for record count match
  • Verify that for rejected data proper error logs are generated with all details
  • Verify NULL value fields
  • Verify that duplicate data is not loaded
  • Verify data integrity

=> Know the difference between ETL/Data warehouse testing & Database Testing.


ETL Testing Challenges

This testing is quite different from conventional testing. There are many challenges we faced while performing data warehouse testing.

Here are few challenges I experienced on my project:

  • Incompatible and duplicate data
  • Loss of data during ETL process
  • Unavailability of the inclusive testbed
  • Testers have no privileges to execute ETL jobs by their own
  • Volume and complexity of data are very huge
  •  Fault in business process and procedures
  • Trouble acquiring and building test data
  • Unstable testing environment
  • Missing business flow information

Data is important for businesses to make the critical business decisions. ETL testing plays a significant role validating and ensuring that the business information is exact, consistent and reliable. Also, it minimizes the hazard of data loss in production.

Hope these tips will help ensure your ETL process is accurate and the data warehouse build by this is a competitive advantage for your business.

Complete List of ETL Testing Tutorials:

This is a guest post by Vishal Chhaperia who is working in an MNC in a test management role. He is having extensive experience in managing multi-technology QA projects, Processes and teams.

Have you worked on ETL testing? Please share your ETL/DW testing tips and challenges below.

Recommended Reading


194 thoughts on “ETL Testing Data Warehouse Testing Tutorial (A Complete Guide)”

  1. further simplified —
    In simple words it is collection of data from different departments of a company or from different technologies to one warehouse.
    Extract from source, transform in a suitable format and then load into destination data warehouse.

    thanks for sharing this resource..

  2. Informative post.
    How to get into this profile? As no matter what exp or knowledge one has it requires exp of ETL testing.

  3. Hai frnds, I am new to testing.Did any one know best coaching center for learning ETL testing in chennai.PLZ reply me asap.

  4. Very informative post on ETL testing. Specially the section which talks about the difference in DB Testing and DW Testing. Thanks.

  5. Hi Bishal its very useful info shared by you, I was looking for some docs related to ETL testing got them here. I just checked by Anindita di in Lotus notes the link that u shared and goin thru same, would like to learn more about ETL testing.
    Thanks
    Sanjay

  6. Please let me know any coaching center in bangalore.
    I have around 6 months of exp in ETL testing .I want to learn more in this area.

  7. I think to Start with ETL Testing One Need a Very Hands On SQL concepts specially Conversion Funcions as like when you want to change date in different format or Numeric to string etc

  8. @Kishore: There is no prerequisite as such, may not be one need to have knowledge of any ETL tool still fair understanding of SQL concepts are much desirable.

    @Aarthi & Bhavya: I believe there is no institute which provides training on ETL testing, still you can check with institutes those are proving training in ETL tools.

    @Chaitu: Yes absolutely, one need to be very conversant in SQL as to map the SQL vs actual business transformation rule that applies. Clear Understanding of Conversion Functions are also much required.

  9. Hi

    Thanks for sharing such a useful information..
    i am very confused about the difference between the database testing and data warehouse testing..now it is clear..Thanks..

    Coming to challenges the major thing is to create a test data for the scenarios which we found in planning phase …….

  10. Thanks for sharing the more info of DWH TESTING
    and could please share the oracle queries in general asked ,
    anyway thanks for sharing the info..

    Thanks,
    Purna,

  11. Hi,
    S this is a good study on a warehouse testing. I feel it parallel o my approach on testing a warehouse

  12. this is really informative and appreciate Vijay’s crisp explanation about ETL testing and really liked the way he gave the difference between Database testing & Data Warehouse testing.

  13. The problems generally found in ETL testing comprising the graphics transformations is the unavailability of tools or language(symbol) gap. The article however gives a detail of ETL testing.

  14. @ Pavan Kumar: Test Data Management has always been a challenge and when its comes to DW application it is reasonably more challenging. Its very tough to get correct, clean and secure data considering the fact that data is coming from multiple sources.
    On contrary there are many good Test Data Management tools which are potentially capable to overcome these problems.
    @Purna: Talking about SQL that are asked in any QA interview are mostly from Joins, Referential Integrity concepts, some basic SQL like sorting, grouping and query related to where clause.
    @JD: Could you please make your question more precise? Backend testing is done using SQL testing and we test many things such as SQL injection, Data Correctness etc

  15. Data warehouse — information delivery system of an organization.

    Testing of data warehouse the process of making sure data acceptance of end business users(Report generators)

    • hello Jagadesh, very good day to you.
      I am working in a MNC company as a Manual tester , Chennai location.

      with 2 years of experience.
      Iam planning to jump to a new company as a ETL TESTING.

      please give me your inputs.

    • hello, very good day to you.
      I am working in a MNC company as a Manual tester , Chennai location.

      with 2 years of experience.
      Iam planning to jump to a new company as a ETL TESTING.

      please give me your inputs.
      my Mail id pandy05@Gmail.com

  16. Hi , I am working in ETL project since 4 years and having overall 9 years experience. Currently working in one of Top MNC in India and presently working from client place(London) since 2.5 years in banking project. If you have any questions on ETL or any banking domain then you can reach me at sreenivasulu.borra@gmail.com

    Best Regards
    Sreenivas Reddy

  17. Hi All,

    I provide online trainings. I cover Database Testing and ETL Testing.

    To get good amount of knowledge in both Database Testing and ETL Testing, one should have sufficient knowledge in SQL Queries.

    Please do contact me for more information.

    Thanks,
    Durga.
    9848805111

  18. I am currently in manual testing.I want to learn informatica and move to datawarehouse testing. how feasible is it. Will I have any opportunity to move.

  19. As you are currently working in Manual testing, its very easy for u to learn ETL testing and it would be easy for u to work on ETL testing projects. but going to informatica development may not be a good idea. this is just my opinion.

  20. Hi. I have 12 months of experince in manual testing.I know the basics of JAVA and SQL.
    Now i got opprtunity in 1.ETL Testing and 2.Automation testing ( SELENIUM ).
    Please advice which one to choose for better career growth.

  21. Very good information and detailed diff between DB and DW testing. Appriciate your efforts in doing this one and Thanks a lot for sharing such a nice post.

  22. i wanna to know the list of data warehouse products (like db2, ibm data stage2…) if u know the latest products for this means pls rply….

  23. i have been offered a part time trainer position in a company for DWH testing.. what is the pay they generally offer per session?

  24. Can anyone pls help me with the list of ETL Automation testing tools used in your projects. open source or commercial both will be fine. Thanks.

  25. I have 3 year of Experience in Java and Oracle and i want to work in ETL tools. Can anyone help me to know its future prospects and salary.and where i get the good knowledge of ETL tools.
    +91-9250110356

  26. Hi,

    I am working in ETL Testing. Please do not always go for tools like Informatica, Web Intelligence 11 etc. First and foremost thing that comes into action is “Writing the queries”. In case to know more please do contact :

    meet16ravi@gmail.com

  27. This article is nice one . I just want to know is it in scope of testing if different short of data loading method testing .

    as my project experience delta data loading and bulk data loading these two have generate two separate scenarios and corresponding issues mainly data loss and duplicate data and null value .

    Does ETL testing has any proper path to solve this case ?

Leave a Comment