How to Perform ETL Testing Using Informatica PowerCenter Tool

It is a known fact that ETL testing is one of the crucial aspects of any Business Intelligence (BI) based application. In order to get the quality assurance and acceptance to go live in business, the BI application should be tested well beforehand.

The primary objective of ETL testing is to ensure that the Extract, Transform & Load functionality is working as per the business requirements and in sync with the performance standards. 

Before we dig into ETL Testing with Informatica, it is essential to know what ETL and Informatica are.

What you will learn in this ETL tutorial:

  • Basics of ETL, Informatica & ETL testing.
  • Understanding ETL testing specific to Informatica.
  • Classification of ETL testing in Informatica.
  • Sample test cases for Informatica ETL testing.
  • Benefits of using Informatica as an ETL tool.
  • Tips & Tricks to aid you in testing.

In computing, Extract, Transform, Load (ETL) refers to a process in database usage and especially in data warehousing that performs:

  • Data extraction – Extracts data from homogeneous or heterogeneous data sources.
  • Data Transformation – Formats the data into required type.
  • Data Load – Move and store the data to a permanent location for long term usage.
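The three steps above can be sketched in a few lines of Python (a minimal, tool-agnostic illustration — the sample rows, table name, and use of SQLite are invented for the example and are not part of any Informatica workflow):

```python
import sqlite3

# Extract: rows as they might arrive from a flat-file source (sample values)
source_rows = [
    ("ABC", "8/14/2017", "150"),
    ("DEF", "6/10/2018", "700"),
]

# Transform: format the data into the required type (price: text -> number)
transformed = [(name, expiry, float(price)) for name, expiry, price in source_rows]

# Load: move and store the data in a permanent target table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, expiry TEXT, price REAL)")
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", transformed)

loaded = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
```

In a real project, the extract step reads from files or databases and the load step targets a warehouse table, but the three-phase shape stays the same.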

Informatica PowerCenter ETL Testing Tool:

Informatica PowerCenter is a powerful ETL tool from Informatica Corporation. It is a single, unified enterprise data integration platform for accessing, discovering, and integrating data from virtually any business system, in any format, and delivering that data throughout the enterprise at any speed. Through Informatica PowerCenter, we create workflows that perform end-to-end ETL operations.

Download and Install Informatica PowerCenter:

To install and configure Informatica PowerCenter 9.x, use the link below, which has step-by-step instructions:
=> Informatica PowerCenter 9 Installation and Configuration Guide

Understanding ETL testing specific to Informatica:

ETL testers often have pertinent questions about what to test in Informatica and how much test coverage is needed.

Let me take you through a tour on how to perform ETL testing specific to Informatica.

The main aspects which should be essentially covered in Informatica ETL testing are:

  • Testing the functionality of the Informatica workflow and its components, including all the transformations used in the underlying mappings.
  • Checking data completeness (i.e. ensuring that the projected data is loaded to the target without any truncation or data loss).
  • Verifying that the data is loaded to the target within the estimated time limits (i.e. evaluating the performance of the workflow).
  • Ensuring that the workflow does not allow any invalid or unwanted data to be loaded into the target.
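The completeness check in the second bullet is easy to automate as a source-vs-target count comparison. A sketch under stated assumptions: `completeness_check` is a hypothetical helper, and SQLite stands in for whatever target database your project uses:

```python
import sqlite3

def completeness_check(source_rows, conn, table, expected_suppressed=0):
    """Data completeness: the target count must equal the source count
    minus the records the workflow is designed to suppress."""
    src = len(source_rows)
    tgt = conn.execute("SELECT COUNT(*) FROM " + table).fetchone()[0]
    return tgt == src - expected_suppressed

# Hypothetical run: 5 source records, 2 legitimately suppressed, 3 loaded
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_product (prod_id INTEGER)")
conn.executemany("INSERT INTO tbl_product VALUES (?)", [(1001,), (1002,), (1003,)])
ok = completeness_check(["r1", "r2", "r3", "r4", "r5"], conn, "tbl_product",
                        expected_suppressed=2)
```

Passing `expected_suppressed` explicitly keeps the check honest: a shortfall beyond the designed suppression is flagged as data loss rather than silently accepted.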

Classification of ETL Testing in Informatica:

For better understanding and ease of the tester, ETL testing in Informatica can be divided into two main parts –

#1) High-level testing
#2) Detailed testing

Firstly, in the high-level testing:

  • You can check whether the Informatica workflow and related objects are valid.
  • Verify that the workflow completes successfully when run.
  • Confirm that all the required sessions/tasks are executed in the workflow.
  • Validate that the data is loaded to the desired target directory and with the expected file name (in case the workflow creates a file), etc.

In a nutshell, you can say that the high-level testing includes all the basic sanity checks.

Coming to the next part i.e. detailed testing in Informatica, you will be going in depth to validate if the logic implemented in Informatica is working as expected in terms of its results and performance.

  • You need to do output data validation at the field level, which will confirm that each transformation is operating correctly.
  • Verify that the record count at each level of processing, and finally in the target, is as expected.
  • Thoroughly monitor elements like the source qualifier and target in the source/target statistics of the session.
  • Ensure that the run duration of the Informatica workflow is at par with the estimated run time.

To sum up, we can say that the detailed testing includes a rigorous end to end validation of Informatica workflow and the related flow of data.

Let us take an example here:

We have a flat file that contains data about different products. It stores details like the name of the product, its description, category, date of expiry, price, etc.

My requirement is to fetch each product record from the file, generate a unique product id corresponding to each record and load it into the target database table. I also need to suppress those products which either belong to the category ‘C’ or whose expiry date is less than the current date.

Say, my flat file (source) looks like this:

[Image: flat-file source data]

Based on my requirements stated above, my database table (Target) should look like this:

Table name: Tbl_Product

Prod_ID (Primary Key) | Product_name | Prod_description | Prod_category | Prod_expiry_date | Prod_price
1001 | ABC | This is product ABC. | M | 8/14/2017 | 150
1002 | DEF | This is product DEF. | S | 6/10/2018 | 700
1003 | PQRS | This is product PQRS. | M | 5/23/2019 | 1500

Now, say, we have developed an Informatica workflow to get the solution for my ETL requirements.

The underlying Informatica mapping will read data from the flat file and pass it through a router transformation that discards rows which either have product category 'C' or an expiry date earlier than the current date. Then a sequence generator will create the unique primary key values for the Prod_ID column in the Product table.

Finally, the records will be loaded to Product table which is the target for my Informatica mapping.
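Before writing test cases, it helps to pre-compute the expected target by mimicking the mapping logic in plain Python. This is a sketch, not the Informatica implementation itself; the two suppressed source rows (XYZ, LMN) and the pinned "current date" are invented here, since the source screenshot is not reproduced, while the three kept rows match the target table above:

```python
from datetime import date, datetime

def run_mapping(source_rows, today, start_value=1001):
    """Mimic the mapping: the router drops rows with category 'C' or an
    expired date; a sequence generator assigns Prod_ID from start_value."""
    target = []
    next_id = start_value
    for name, desc, category, expiry, price in source_rows:
        expires = datetime.strptime(expiry, "%m/%d/%Y").date()
        if category == "C" or expires < today:
            continue  # routed to the discard group
        target.append((next_id, name, desc, category, expiry, price))
        next_id += 1
    return target

# 5 sample source records (the last two are invented suppressed rows)
source = [
    ("ABC",  "This is product ABC.",  "M", "8/14/2017", 150),
    ("DEF",  "This is product DEF.",  "S", "6/10/2018", 700),
    ("PQRS", "This is product PQRS.", "M", "5/23/2019", 1500),
    ("XYZ",  "This is product XYZ.",  "C", "9/20/2018", 200),
    ("LMN",  "This is product LMN.",  "S", "1/01/2016", 300),
]
rows = run_mapping(source, today=date(2017, 6, 1))
```

Comparing such a pre-computed result against the actual target table gives you an independent oracle for the count, the key sequence, and the suppression rules all at once.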

Examples:

Below are the sample test cases for the scenario explained above.

You can use these test cases as a template in your Informatica testing project and add/remove similar test cases depending upon the functionality of your workflow.

#1) Test Case ID: T001

Test Case Purpose: Validate workflow – [workflow_name]

Test Procedure:

  • Go to workflow manager
  • Open workflow
  • Workflows menu-> click on validate

Input Value/Test Data: Sources and targets are available and connected
Sources: [all source instances name]
Mappings: [all mappings name]
Targets: [all target instances name]
Session: [all sessions name]

Expected Results: Message in the workflow manager status bar: “Workflow [workflow_name] is valid”

Actual Results: Message in the workflow manager status bar: “Workflow [workflow_name] is valid”

Remarks: Pass

Tester Comments:

------------

#2) Test Case ID: T002

Test Case Purpose: To ensure if the workflow is running successfully

Test Procedure:

  • Go to workflow manager
  • Open workflow
  • Right click in workflow designer and select Start workflow
  • Check status in Workflow Monitor

Input Value/Test Data: Same as test data for T001

Expected Results: Message in the output window in Workflow manager: Task Update: [workflow_name] (Succeeded)

Actual Results: Message in the output window in Workflow manager: Task Update: [workflow_name] (Succeeded)

Remarks: Pass

Tester Comments: Workflow succeeded

Note: You can easily see the workflow run status (failed/succeeded) in the Workflow Monitor, as shown in the example below. Once the workflow is completed, the status will be reflected automatically in the Workflow Monitor.

[Image: workflow run status in Workflow Monitor]

In the above screenshot, you can see the start time and end time of workflow as well as the status as succeeded.

#3) Test Case ID: T003

Test Case Purpose: To validate if the desired number of records are getting loaded to target

Test Procedure: Once the workflow has run successfully, go to the target table in database
Check the number of rows in target database table

Input Value/Test Data: 5 rows in the source file
Target: database table – [Tbl_Product]
Query to run in SQL server: Select count(1) from [Tbl_Product]

Expected Results: 3 rows selected

Actual Results: 3 rows selected

Remarks: Pass

Tester Comments:

#4) Test Case ID: T004

Test Case Purpose: To check if sequence generator in Informatica mapping is working fine for populating [primary_key_column_name e.g. Prod_ID] column

Test Procedure: Once the workflow has run successfully, go to the target table in database
Check the unique sequence generated in column Prod_ID

Input Value/Test Data: value for Prod_ID left blank for every row in source file
Sequence Generator mapped to Prod_ID column in the mapping
Sequence generator start value set as 1001
Target: database table- [Tbl_Product] opened in SQL Server

Expected Results: Value from 1001 to 1003 populated against every row for Prod_ID column

Actual Results: Value from 1001 to 1003 populated against every row for Prod_ID column

Remarks: Pass

Tester Comments:
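The T004 check can be scripted rather than eyeballed: read back the generated keys and assert they form a gapless run from the sequence generator's start value. A sketch using sqlite3 as a stand-in for the SQL Server target:

```python
import sqlite3

# Stand-in target table with the keys the sequence generator produced
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_product (prod_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO tbl_product VALUES (?)",
                 [(1001,), (1002,), (1003,)])

# Fetch the keys in order and verify a gapless run from the start value
START_VALUE = 1001  # matches the sequence generator configuration in T004
ids = [r[0] for r in conn.execute("SELECT prod_id FROM tbl_product ORDER BY prod_id")]
sequence_ok = ids == list(range(START_VALUE, START_VALUE + len(ids)))
```

This catches duplicates and gaps in one comparison, which a visual scan of a large target table easily misses.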

#5) Test Case ID: T005

Test Case Purpose: To validate if router transformation is working fine to suppress records in case the product category is ‘C’ or the product has got expired.

Test Procedure: Once the workflow has run successfully, go to the target table in database
Run the query on the target table to check if the desired records have got suppressed.

Input Value/Test Data: 5 rows in the source file
Target: database table – [Tbl_Product]
Query to run in SQL Server: Select * from Tbl_Product where Prod_category='C' or Prod_expiry_date < GETDATE();

Expected Results: no rows selected

Actual Results:  no rows selected

Remarks: Pass

Tester Comments: (if any)
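The same suppression check can be automated. In this sketch, sqlite3 stands in for SQL Server, the expiry dates are stored as ISO text so that string comparison works, and the "current date" is pinned so the check is repeatable (in T005 itself you would use GETDATE() against the live table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_product "
             "(prod_id INTEGER, prod_category TEXT, prod_expiry_date TEXT)")
# The three rows the workflow loaded (expiry as ISO yyyy-mm-dd)
conn.executemany("INSERT INTO tbl_product VALUES (?, ?, ?)",
                 [(1001, "M", "2017-08-14"),
                  (1002, "S", "2018-06-10"),
                  (1003, "M", "2019-05-23")])

# Any row matching the suppression rules is a leak; expect zero rows
today = "2017-06-01"  # pinned for a repeatable check
leaked = conn.execute(
    "SELECT * FROM tbl_product WHERE prod_category = 'C' "
    "OR prod_expiry_date < ?", (today,)).fetchall()
```

An empty result set here is the pass condition: the router correctly suppressed every category 'C' and expired record.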

#6) Test Case ID: T006

Test Case Purpose: To check the performance of the workflow by recording the workflow runtime.

Test Procedure:

  • Open the Workflow Monitor and go to the run that was done as part of T002.
  • Record the start time and end time of the workflow.
  • Calculate the total run time by subtracting the start time from the end time.

Input Value/Test Data: Workflow has run successfully
Start time of workflow in monitor
End time of workflow in monitor.

Expected Results: 2 min 30 secs

Actual Results:  2 min 15 secs

Remarks: Pass

Tester Comments: Considering the test as ‘Pass’ in case the actual run duration is +/- 10% of expected run duration.
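The +/- 10% pass criterion from the tester's comment is simple to encode, which keeps the performance verdict consistent across runs (`within_tolerance` is a hypothetical helper, not an Informatica API):

```python
def within_tolerance(actual_secs, expected_secs, tolerance=0.10):
    """Pass the performance test when the actual run time is within
    +/- tolerance (10% by default) of the expected run time."""
    return abs(actual_secs - expected_secs) <= tolerance * expected_secs

# T006 figures: expected 2 min 30 secs (150 s), actual 2 min 15 secs (135 s)
result = within_tolerance(actual_secs=135, expected_secs=150)
```

With these numbers the deviation is exactly 15 seconds, right at the 10% boundary, so the test passes.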

#7) Test Case ID: T007

Test Case Purpose: To validate data at target table column level in order to ensure that there is no data loss.

Test Procedure: Once the workflow has run successfully, go to the SQL Server.
Run the query on the target table to check there is no data loss.

Input Value/Test Data: Workflow has run successfully
One sample record from source flat file.
SQL Query: Select Top 1 * from Tbl_Product;

Expected Results:
1 row returned

Prod_ID (Primary Key) | Product_name | Prod_description | Prod_category | Prod_expiry_date | Prod_price
1001 | ABC | This is product ABC. | M | 8/14/2017 | 150

Actual Results: 
1 row returned.

Prod_ID (Primary Key) | Product_name | Prod_description | Prod_category | Prod_expiry_date | Prod_price
1001 | ABC | This is product ABC. | M | 8/14/2017 | 150

Remarks: Pass

Tester Comments:
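Field-level validation like T007 is also easy to script. A sketch: `compare_record` is a hypothetical helper, and the two dicts stand in for the source file record and the row fetched from the target:

```python
def compare_record(source_rec, target_rec, fields):
    """Return the list of fields whose target value differs from the source."""
    return [f for f in fields if source_rec.get(f) != target_rec.get(f)]

# Sample source record from the flat file (values per the article's table)
source_rec = {"Product_name": "ABC", "Prod_description": "This is product ABC.",
              "Prod_category": "M", "Prod_expiry_date": "8/14/2017",
              "Prod_price": 150}
# Corresponding target row: same fields plus the generated primary key
target_rec = {"Prod_ID": 1001, **source_rec}

mismatches = compare_record(source_rec, target_rec,
                            ["Product_name", "Prod_description", "Prod_category",
                             "Prod_expiry_date", "Prod_price"])
```

An empty mismatch list confirms there was no data loss or truncation for that record; any field name it returns points you straight at the transformation to inspect.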

Benefits of Using Informatica as an ETL tool:

Informatica is a popular and successful ETL tool because:

  • It has a high “go live” success rate (nearly 100%).
  • Informatica has the capability of enabling lean integration.
  • It is moderately priced compared to other ETL tools.
  • It comes with an internal job scheduler, so there is no need to use a separate third-party scheduler as some other ETL tools require.
  • Easy training and tool availability have made Informatica more popular.

Some useful Tips to assist you in Informatica ETL testing:

  • Generate the test data before executing the test scenarios.
  • The test data should be in sync with the test case it is used for.
  • Make sure that you have covered all three scenarios – no data submitted, invalid data submitted, and valid data submitted as input to the Informatica workflow.
  • Make sure to test that all the required data is getting loaded to target completely. For this, you can use test case – T003 described above as a sample.
  • It is very important to test that the workflow is doing all the data transformations correctly as per business rules.
  • I would suggest that for each transformation applied in your Informatica mapping, you should have a checklist to verify output data against it. That way, you can report bugs easily if any transformation is not working fine.

Conclusion:

So, we have seen in detail some of the sample test cases that can be used as a template to cover ETL testing in Informatica. As I mentioned earlier, you can add/remove/modify these test cases depending on the scenario you have in your project.

Informatica PowerCenter is a foundation for any data integration activity.

You can easily perform script-free automated testing of data copied to test, dev, or production environments, which is one of the reasons why PowerCenter is among the most popular ETL tools today.

Recommended reading => ETL vs. DB Testing – A Closer Look at ETL Testing Need

About the author: This is a guest article by Priya K. She has 4+ years of hands-on experience in developing and supporting Informatica ETL applications.

Feel free to post your queries/comments about this ETL tool.




4 comments ↓

#1 Devanath

Really useful and nice info. Informatica installation steps are very clear.

#2 Sumit

good tutorial. which are other best tools for etl?

#3 ganesh

..

#4 Priya Kaushal

Hi Sumit,

To answer your question, below is the list of all the ETL tools:

Informatica – Power Center
IBM – Websphere DataStage(Formerly known as Ascential DataStage)
SAP – BusinessObjects Data Integrator
IBM – Cognos Data Manager (Formerly known as Cognos DecisionStream)
Microsoft – SQL Server Integration Services
Oracle – Data Integrator (Formerly known as Sunopsis Data Conductor)
SAS – Data Integration Studio
Oracle – Warehouse Builder
AB Initio
Information Builders – Data Migrator
Pentaho – Pentaho Data Integration
Embarcadero Technologies – DT/Studio
IKAN – ETL4ALL
IBM – DB2 Warehouse Edition
Pervasive – Data Integrator
ETL Solutions Ltd. – Transformation Manager
Group 1 Software (Sagent) – DataFlow
Sybase – Data Integrated Suite ETL
Talend – Talend Open Studio
Expressor Software – Expressor Semantic Data Integration System
Elixir – Elixir Repertoire
OpenSys – CloverETL

If you are a software tester, I would like to add that Informatica Data Validation Option provides an ETL testing tool that can accelerate and automate ETL testing in both production environments and development & test. This means that you can deliver complete, repeatable and auditable test coverage in less time with no programming skills required.
