ETL Testing Data Warehouse Testing Tutorial (A Complete Guide)

ETL Testing / Data Warehouse Process and Challenges:

Today let me take a moment and explain my testing fraternity about one of the much in demand and upcoming skills for my tester friends i.e. ETL testing (Extract, Transform, and Load).

This tutorial will present you with a complete idea about ETL testing and what we do to test ETL process.

Complete List Tutorials in this series: 

It has been observed that Independent Verification and Validation is gaining huge market potential and many companies are now seeing this as prospective business gain.

ETL Testing _ Data Warehouse Testing - Tips, Techniques, Process and Challenges

Customers have been offered a different range of products in terms of service offerings, distributed in many areas based on technology, process, and solutions. ETL or data warehouse is one of the offerings which are developing rapidly and successfully.

ETL process

Through ETL process, data is fetched from the source systems, transformed as per business rules and finally loaded to the target system (data warehouse). A data warehouse is an enterprise-wide store which contains integrated data that aids in the business decision-making process. It is a part of business intelligence.

Why do organizations need Data Warehouse?

Organizations with organized IT practices are looking forward to creating the next level of technology transformation. They are now trying to make themselves much more operational with easy-to-interoperate data.

Having said that data is most important part of any organization, it may be everyday data or historical data. Data is the backbone of any report and reports are the baseline on which all the vital management decisions are taken.

Most of the companies are taking a step forward for constructing their data warehouse to store and monitor real-time data as well as historical data. Crafting an efficient data warehouse is not an easy job. Many organizations have distributed departments with different applications running on distributed technology.

ETL tool is employed in order to make a flawless integration between different data sources from different departments. ETL tool will work as an integrator, extracting data from different sources; transforming it into the preferred format based on the business transformation rules and loading it in cohesive DB known are Data Warehouse.

Well planned, well defined and effective testing scope guarantees smooth conversion of the project to the production. A business gains the real buoyancy once the ETL processes are verified and validated by an independent group of experts to make sure that data warehouse is concrete and robust.

ETL or Data warehouse testing is categorized into four different engagements irrespective of technology or ETL tools used:

  • New Data Warehouse Testing – New DW is built and verified from scratch. Data input is taken from customer requirements and different data sources and new data warehouse is built and verified with the help of ETL tools.
  • Migration Testing – In this type of project customer will have an existing DW and ETL performing the job but they are looking to bag new tool in order to improve efficiency.
  • Change Request – In this type of project new data is added from different sources to an existing DW. Also, there might be a condition where customer needs to change their existing business rule or they might integrate the new rule.
  • Report Testing – Report is the end result of any Data Warehouse and the basic propose for which DW builds. The report must be tested by validating layout, data in the report and calculation.

ETL process

(Note: Click on the image for enlarged view)

ETL testing

ETL Testing Techniques

1) Data transformation Testing: Verify that data is transformed correctly according to various business requirements and rules.

2) Source to Target count Testing: Make sure that the count of records loaded in the target is matching with the expected count.

3) Source to Target Data Testing: Make sure that all projected data is loaded into the data warehouse without any data loss and truncation.

4) Data Quality Testing: Make sure that ETL application appropriately rejects, replaces with default values and reports invalid data.

5) Performance Testing: Make sure that data is loaded in data warehouse within prescribed and expected time frames to confirm improved performance and scalability.

6) Production Validation Testing: Validate the data in production system & compare it against the source data.

7) Data Integration Testing: Make sure that the data from various sources has been loaded properly to the target system and all the threshold values are checked.

8) Application Migration Testing: In this testing, it is ensured that the ETL application is working fine on moving to a new box or platform.

9) Data & constraint Check: The datatype, length, index, constraints, etc. are tested in this case.

10) Duplicate Data Check: Test if there is any duplicate data present in the target systems. Duplicate data can lead to wrong analytical reports.

Apart from the above ETL testing methods other testing methods like system integration testing, user acceptance testing, incremental testing, regression testing, retesting and navigation testing is also carried out to make sure everything is smooth and reliable.

ETL/Data Warehouse Testing Process

Similar to any other testing that lies under Independent Verification and Validation, ETL also goes through the same phase.

  • Requirement understanding
  • Validating
  • Test Estimation based on a number of tables, the complexity of rules, data volume and performance of a job.
  • Test planning based on the inputs from test estimation and business requirement. We need to identify here that what is in scope and what is out of scope. We also look out for dependencies, risks and mitigation plans in this phase.
  • Designing test cases and test scenarios from all the available inputs. We also need to design mapping document and SQL scripts.
  • Once all the test cases are ready and are approved, testing team proceed to perform pre-execution check and test data preparation for testing
  • Lastly, execution is performed till exit criteria are met. So, execution phase includes running ETL jobs, monitoring job runs, SQL script execution, defect logging, defect retesting and regression testing.
  • Upon successful completion, a summary report is prepared and closure process is done. In this phase, sign off is given to promote the job or code to the next phase.

The first two phases i.e. requirement understanding and validation can be regarded as pre-steps of ETL test process.

So, the main process can be represented as below:

main process

It is necessary to define test strategy which should be mutually accepted by stakeholders before starting actual testing. A well-defined test strategy will make sure that correct approach has been followed meeting the testing aspiration.

ETL/Data Warehouse testing might require writing SQL statements extensively by testing team or maybe tailoring the SQL provided by the development team. In any case, a testing team must be aware of the results they are trying to get using those SQL statements.

Difference between Database and Data Warehouse Testing

There is a popular misunderstanding that database testing and data warehouse is similar while the fact is that both hold different direction in testing.

  • Database testing is done using a smaller scale of data normally with OLTP (Online transaction processing) type of databases while data warehouse testing is done with large volume with data involving OLAP (online analytical processing) databases.
  • In database testing normally data is consistently injected from uniform sources while in data warehouse testing most of the data comes from different kind of data sources which are sequentially inconsistent.
  • We generally perform the only CRUD (Create, read, update and delete) operation in database testing while in data warehouse testing we use read-only (Select) operation.
  • Normalized databases are used in DB testing while demoralized DB is used in data warehouse testing.

There is a number of universal verifications that have to be carried out for any kind of data warehouse testing.

Below is the list of objects that are treated as essential for validation in this testing:

  • Verify that data transformation from source to destination works as expected
  • Verify that expected data is added to the target system
  • Verify that all DB fields and field data is loaded without any truncation
  • Verify data checksum for record count match
  • Verify that for rejected data proper error logs are generated with all details
  • Verify NULL value fields
  • Verify that duplicate data is not loaded
  • Verify data integrity

=> Know the difference between ETL/Data warehouse testing & Database Testing.


ETL Testing Challenges

This testing is quite different from conventional testing. There are many challenges we faced while performing data warehouse testing.

Here are few challenges I experienced on my project:

  • Incompatible and duplicate data
  • Loss of data during ETL process
  • Unavailability of the inclusive testbed
  • Testers have no privileges to execute ETL jobs by their own
  • Volume and complexity of data are very huge
  •  Fault in business process and procedures
  • Trouble acquiring and building test data
  • Unstable testing environment
  • Missing business flow information

Data is important for businesses to make the critical business decisions. ETL testing plays a significant role validating and ensuring that the business information is exact, consistent and reliable. Also, it minimizes the hazard of data loss in production.

Hope these tips will help ensure your ETL process is accurate and the data warehouse build by this is a competitive advantage for your business.

Complete List of ETL Testing Tutorials:

This is a guest post by Vishal Chhaperia who is working in an MNC in a test management role. He is having extensive experience in managing multi-technology QA projects, Processes and teams.

Have you worked on ETL testing? Please share your ETL/DW testing tips and challenges below.

Recommended Reading


193 thoughts on “ETL Testing Data Warehouse Testing Tutorial (A Complete Guide)”

  1. I’m Working in a software company in Chandigarh as a DotNet Developer. Someone suggested me to change the profession and go for ETL or WareHousing. As i don’t know anything about ETL and WH also.
    And i found lots of people here who having knowledge about both terms.So anyone can please guide me.
    aman_wadhwa88@yahoo.com

  2. Hi All,

    At present I am as a manual test engineer in one of the MNC, Bangalore. I want to change my domain from manual to ETL testing, so can anyone suggest me that what are the basic skills required to switch from manual to ETL.
    Also are there any centres or institutes that provide training on ETL. Incase there are no such centres how to prepare myself.
    Please suggest me.

    Thanks in advance.

    +91 9008078349

  3. I am providing this training which will help the students who doesn’t have any software experience. We will take scenarios and execute them and guide them when they get into job.
    I have experience of 12+ years in the industry. I worked as a Test manager, Business Analyst, project manager.
    I Worked with the companies like ITC Infotech, Honeywell, Accenture, TCS.
    I provide corporate training also.
    After this training the student will scale up to 3+ years experience level.
    Contact: 9886991978
    Syllabus:
    DWH/Datawarehouse Concepts.
    Manual Testing.(Web Applications Architecture)
    Why ETL Testing.(Datawarehouse Architecture)
    Difference between manual/automation and ETL Testing/BI testing.
    SQL required for ETL Testing – Easy and only required SQL s for interviews.
    ETL Transformations.
    ETL test case documentation
    TOAD.
    Quality Center.
    ETL Test Life Cycle.
    Realtime Scenarios of Interaction between Business Analyst, Developers and Testers.
    Test Planning(Useful for people trying for lead Level)
    Discussion of Interview Questions.
    Resume Preparation.
    How to Accept and handle the work after joining the company.

  4. Hello Friends ,
    IBM openings for ETL Testing & QTP
    If anyone willing to join kindly send me ur resume on
    preetideshmukh08@gmail.com
    Desired Skills & Experience

    Total & Relevant 3 to 4 Years of Experience for ETL Testing & QTP ( NO Contractual Employee )

    16 Years Education Mandatory ( BE, B.Tech, or 16 Year Education )

    Interview & Joining Location for ETL TESTING — Bangalore.

    Interview & Joining Location for QTP — Chennai.

    Interview Date : 8 Dec 2012

  5. Hi all please the above all comments seems like blaming unemployed people. i am not understanding why people coming to etl testing side. you should go development side

  6. Hi,

    I had seen that many persons wanted to learn ETL Testing. I am not aware about the institute but i can teach each and everything about ETL testing or you can mail me your questions if you stuck somewhere, i will reply to all the answers. This will help me also to gain knowledge. I am having 6 years exp in this technologies and working for No. 1 product based (database) organization.

    Thanks
    Siraj

  7. Hi if any body wanted to learn Practical ETL testing online please contact on my mobile +60-107767499 (Malaysia) .
    I have experience of 8+ years in the industry and worked in various domains as ETL test lead.

    Regards
    Krishna

  8. Hi i would like to say my self maniteja working as a Etl and Bi testing in wipro pune. if any body want to learn Etl and Bi testing contact me. classes available only weekends(pune) and Online.

  9. Could you plz support me to find ETL testing tool to :-

    – Verify that expected data is added in target system
    – Verify that all DB fields and field data is loaded without any truncation
    – Verify data checksum for record count match
    – Verify that for rejected data proper error logs are generated with all details
    – Verify NULL value fields
    – Verify that duplicate data is not loaded
    – Verify data integrity

  10. Hi,only by knowing etl testing doesn’t make sense because you need to know atleast one etl tool to understand and to work on etl testing.So make sure to learn etl tool as well.If any body wants to learn etl testing with etl tool through ONLINE we are giving ETL testing training and we give support untill you got the job.

    Base on batch schdules we are giving etl testing and etl tool traing on daily basis as well as weekend batches.

    We have successfull track record.call @ +91-8237320101

  11. Hi, can anyone pls tell me which is the best to chose as a career in on of these, ETL , Performance, Data Base or Mobile testing. I feel performance testing is the best because it has lot onsite opportunity, well paid and less resources in market. I would like to here from some experienced teters. Pls help me to choose the right career for my life, ur comments n replies are very impt to me, pls guide me properly…. thanks…..

  12. Hi,
    I Provide Online Training on ETL Testing.
    I am working in a MNC in Hyderabad.
    ETL TESTING
    ============================================

    what is Testing ?
    what is the need for Testing?
    Testing Methodologies
    Software development life cycle
    software Testing life cycle
    Types of Testing

    Data Ware Housing Concepts:
    • What is Data Ware House?
    • Difference between OLTP and Data Ware Housing
    • Data Extraction
    • Data Transformation
    • Data Loading

    Data Marts
    • Dependent Data Mart
    • Independent Data Mart

    Data ware housing Approaches
    Top Down Approach
    bottom up Approach

    Data Base Design
    • Star Schema
    • Snow Flake Schema

    SCD(slowly changing dimension)
    • Type-1 SCD
    • Type-2 SCD
    • Type-3 SCD

    Basic Concepts in SQL
    • SQL Concepts(A-Z) – execute the pl/sql procedures,functions….
    • Overview of ETL Tool Architecture
    • Testing Functionality on Different Transformation Rules

    Data Ware House Life Cycle
    Different Types of Testing Techniques in ETL
    ETL Testing Introduction
    • What is use of testing
    • What is quality & standards
    • Responsibilities of a ETL Tester

    Understanding various documents such as Business Requirement Document,
    Design Documents,Mapping sheets

    • ETL Testing Work Flow Process
    • How to Prepare the ETL Test Plan
    • How to design the Test cases in ETL Testing.
    • How to reporting the Bugs in ETL Testing ?
    • ETL Testing Responsibilities in DataStage, Informatica, Abinitio etc;
    • How to detect the bugs through database queries
    • ETL Performing Testing & Performing Tuning

    ETL Test Cases
    etl testing life cycle
    day to day process

    UNIX

    Thanks,
    Bunny(bunny.friendly@gmail.com,(HYD-+91-9550580658)

  13. eKnowledge Infotech, a Leading providers of ETL Testing Training in Pune, Qlikview Training Institute in Bangalore. People can have benefit of this.

    Thanks,
    Nimisha

  14. My questions is:

    The differences between manual and automated testing, inkl. some examples. Is there practical example of test strategies?.

    Thanks

  15. If you are using source and target SQL to do your manual testing then you are more then likely using the minus function to aggregate the results. If this is the case then there is a new tool on the market to automate the testing process for you. Google ‘QuerySurge’

    Thanks

  16. Hi all,am in production support from 3 years and I am desparate for a move..Know the concepts of manual testing and sql queries…will I be able to get into ETL testing,can anyone guide me.

  17. Friends,
    Let me know any ETL/DW testing opening in Singapore/Malaysia. Please reach me @ +91-9840219089 (india-mobile).

  18. Any good training institutes for learning etl testing in Bangalore.

    If any body is looking for group preparation or search then contact me.

  19. Hi all,

    I am working as a manual tester in a MNC.
    Which one is better among QTP and ETL testing.
    Career wise and job security wis.

    Thanks in advance

  20. Hi,

    Can any one advise me ETL is best or Selenium is best.
    Which tool have more openings in Market. Plz advise me ASAP…….

  21. Thanks a lot for the information
    I came to know a lot.
    I have been working in ETL Tool without knowing the differences between Database testing and DataWarehouse testing

  22. Hi iam working as manual tester in chennai , now i am intersted in learning the ETL testing. please help me how to go ahead with ETL testing. Now i am in chennai , trainer in chennai please contact to my mobile number.
    Ph:9962850556

  23. Hi, This is a very nice and Informative post. Currently I am working as a manual tester and testing a BI application. I am pretty much interested to learn DWH testing. Please lemme know if anyone is willing to teach this in Pune (Near Wakad).

    My mail ID is – chourasiasourabh@gmail.com

  24. Hi All,
    If anybody wants online training /offline training / corporate training on ETL testing,Informatica,Sql with realtime project and job assistance in bangalore,pune and hyderabad pls contact @ 8237320101

    Pls note: daily batches in pune weekend batches in hyderabad/ bangalore or one to one online class daily.

  25. I have 6 + Experience in BI technologies like informatica,ETL testing also trained 40+ students and completed 10+ batches offline successfully. 90% students already placed in top mnc’s.

    Contact: 8237320101

    Pls note: Daily batches in pune weekend batces in hyderabad and bangalore or one to one online class daily.

Leave a Comment