An Excellent Way of Data Testing Using XML Technologies (White Paper)

In the SDLC, if the application follows the waterfall model, testing activities are planned at the end. If the QA team then identifies defects, there is a risk of rework on requirements, design, code and test cases. It is better not to wait until the end to identify defects in an application.

Tests that are not based on functional execution of the application can find defects without mandating the release of all the components into the test environment. This can be accomplished by data testing.

XML and the related technologies used for communication between the tiers of an application provide an opportunity to carry out tests that need not wait for the entire application to be available for testing.


This document outlines one possible approach to data testing early in the life cycle of a product release.


This document assumes the reader is familiar with software testing concepts and fundamental usage of a database and XML Technologies.

Focus group:

QA team (QA), data team (DT), developer (DEV)


The sample data identified for testing a product defines the extent of testing performed, adds confidence in the test results and quality of the product. Identifying the data for a test depends on the requirements of the test to be performed.

This document focuses on validating the test data before seeing it on the user interface.

This process needs test data management to produce effective test results. Data can be saved in a database or a flat file, but the transfer of data to and from a database can be handled using XML. XML[1], XSD[2], XPath[3] and XSLT[4] are closely related (see the definitions below).

[1] XML – eXtensible Markup Language. It is a World Wide Web Consortium (W3C) recommendation for describing data. An XML document that follows the XML syntax rules is said to be “well formed”.

[2] XSD – XML Schema Definition. It describes the structure of an XML document. A “well formed” XML document that conforms to its XSD (XML Schema) is said to be “valid”.

[3] XPath – XML Path Language. It is used to navigate a “valid” and “well formed” XML document and pick the appropriate data from it. XPath expressions look like traditional file paths in a directory.

[4] XSLT – eXtensible Stylesheet Language Transformations. It transforms XML data into another format, such as HTML for display on a user interface (UI), where styling (font, color, size, etc.) can then be applied. XSLT uses XPath to locate information in the XML.

Data presented in the XML is validated against a schema (XSD file). The XML can be output into different formats with XSLT and XPATH.
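The difference between a "well formed" document and a broken one can be checked with a few lines of code. The following is a minimal sketch using Python's standard library; the helper name is_well_formed and the two sample strings are assumptions made for illustration. It checks well-formedness only. Validating against an XSD is a separate step that the standard library does not provide, so a schema-aware tool or third-party library would be needed for that.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Closing </Book> tag present: well formed.
good = "<Book><Title>A book on test data</Title></Book>"
# Closing </Book> tag missing: not well formed.
bad = "<Book><Title>A book on test data</Title>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

A check like this can run as soon as the XML extract exists, without any other tier of the application being deployed.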

For the purpose of this discussion we shall use the following example.

Example – A publishing house has a website displaying information about the books it has published. One of the web pages displays a summary of every chapter of a book. Testing should ensure that the content on this web page is appropriate. The publishing house has published millions of books so far.

All information related to the published books is saved in a database. However, the web page in question needs only a subset of the data (about a new book and its chapters), which is extracted from the database into an XML file.

The XML given below represents the metadata about the book.

XML file Book.xml

<?xml version="1.0"?>
<Book xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="Book.xsd">
    <Title>A book on test data</Title>
    <Chap_3>What is data</Chap_3>
    <!-- Like this there will be chapters up to 10 in this XML file-->
    <Reference>List of references</Reference>
</Book>

XML Schema Book.xsd

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Book">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Author" type="xs:string" minOccurs="1" maxOccurs="2"/>
      <xs:element name="Publication_Year" type="xs:positiveInteger"/>
      <xs:element name="Category" type="xs:string"/>
      <xs:element name="Language" type="xs:string" default="English"/>
      <xs:element name="Pages" type="xs:positiveInteger"/>
      <xs:element name="Number_of_Chapters" type="xs:positiveInteger"/>
      <xs:element name="Chap_1" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="Chap_2" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="Chap_3" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <!-- Similarly the number of chapters can be represented separately-->
      <xs:element name="Reference" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
</xs:schema>

Test data management life cycle

Like other processes, test data management has its own life cycle (LC) stages.


  1. Identify data requirements
  2. Plan data collection
  3. Build the data
  4. Test the data
  5. Data maintenance (not detailed in this document because it is not relevant)

#1. Identify data requirements

In the above example, the database stores millions of records. If the content of all the books is extracted into an XML file, that file requires detailed validation. Whenever new information has to be output to the web page, the XML and its schema might change.

Changes to the XML, XSD, XPath and XSLT require proper validation, but this testing need not wait for the release of the presentation, middleware and data tiers. The QA team can analyze the XSD to prepare a data requirement plan.

Life cycle stage: Identify test data requirements

Entry criteria: The following documents are available: database design, UI design, requirement specification, technical architecture, data flow diagram, use case diagrams

Activities / Responsibility:

  • Understand the data requirements by referring to the documents from the entry criteria (QA, DT, DEV)
  • Test data requirements (QA, DT, DEV) - Documents all data needs for every screen, showing a mapping between screen display names and the corresponding XML elements

Exit criteria: Review the test data requirements document (QA, DEV, DT)

The process of identifying all the data requirements for a product should address the following:

a) Coverage and completeness – Do the identified requirements cover all use cases?

Example – It is very important to test the data combinations for title, author, category and language in the above XML sample, since the schema mandates these fields.

This can be handled easily by looking at the XML schema, which describes which elements / attributes must be present and their order in the XML.

b) Quality – Is the collected data of the best possible quality? The test data used determines the quality of the testing performed on the application.

  • Positive and negative scenarios – Testing should check how the application behaves with valid and invalid input data.

The test data requirements document lists data needs across all tiers of the application. Data from the database can be used directly in UI and/or manipulated (calculations, concatenation, etc.). Hence it is required to capture all data needs.
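Because the schema states which elements must be present and in what order, the list of mandatory fields can be read from the XSD itself instead of being compiled by hand. The following is a minimal sketch using Python's standard library. The schema text is a shortened stand-in for Book.xsd, the Subtitle element is hypothetical and added only to show an optional field, and the helper name mandatory_elements is an assumption.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Shortened stand-in for Book.xsd. Subtitle is a hypothetical optional
# element, added only to show how minOccurs="0" is treated.
BOOK_XSD = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Title" type="xs:string"/>
        <xs:element name="Author" type="xs:string" minOccurs="1" maxOccurs="2"/>
        <xs:element name="Subtitle" type="xs:string" minOccurs="0"/>
        <xs:element name="Pages" type="xs:positiveInteger"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def mandatory_elements(xsd_text: str) -> list:
    """List child element names whose minOccurs is not 0, in schema order."""
    root = ET.fromstring(xsd_text)
    sequence = root.find(f"{XS}element/{XS}complexType/{XS}sequence")
    return [
        el.get("name")
        for el in sequence.findall(f"{XS}element")
        if el.get("minOccurs", "1") != "0"  # minOccurs defaults to 1 in XSD
    ]


print(mandatory_elements(BOOK_XSD))
```

The resulting list can seed the test data requirements document, with one or more negative test cases (missing or blank value) per mandatory field.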

The table below represents a sample data table:

| Field Name | Data Type | Test data | Remarks | Test result |
|------------|-----------|-----------|---------|-------------|
| Author | String | Blank field | Since it is a mandatory field. | The test should fail. |
| Author | String | Author+@ | Has special characters | This test should fail |
| Author | String | Author Name | Includes a space | This test should pass |
| Author | String | 123Author | Starts with a number | This test should fail |
| Author | String | @!Author | Starts with special characters | This test should fail |
| Author | String | " Author" | Prefixed with spaces | This test should fail |

In the above example, the plain string data type can be avoided for the Author field. Instead, a pattern can be enforced.

E.g. alphabets only, start with an uppercase letter, no special characters, etc. A pattern (a restriction on an element value, defined in the XSD) can be declared as <xs:pattern value="[a-zA-Z0-9]{12}"/>.

If this is set for the Author element in the above example, the value of the Author element must be exactly 12 characters long and may contain only uppercase letters, lowercase letters and digits.
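A candidate pattern can be tried against the sample test data before it goes into the schema. The sketch below is an assumption, not the pattern from the text: it encodes the rules "starts with an uppercase letter, letters only, single spaces allowed between words" so that it agrees with the expected results in the sample data table. The names AUTHOR_PATTERN, author_is_valid and CASES are hypothetical.

```python
import re

# Hypothetical rule for Author: starts with an uppercase letter, contains
# only letters, and allows single spaces between words.
AUTHOR_PATTERN = re.compile(r"[A-Z][a-zA-Z]*( [a-zA-Z]+)*")


def author_is_valid(value: str) -> bool:
    # fullmatch is used because an XSD pattern must match the whole value.
    return AUTHOR_PATTERN.fullmatch(value) is not None


# Test data from the sample data table, paired with the expected result.
CASES = [
    ("", False),             # blank field
    ("Author+@", False),     # has special characters
    ("Author Name", True),   # includes a space
    ("123Author", False),    # starts with a number
    ("@!Author", False),     # starts with special characters
    (" Author", False),      # prefixed with a space
]

for value, expected in CASES:
    actual = author_is_valid(value)
    status = "as expected" if actual == expected else "MISMATCH"
    print(f"{value!r}: valid={actual} ({status})")
```

Running the table through the pattern this way shows quickly whether a proposed restriction is too strict or too loose for the agreed test data.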

#2. Plan data collection

LC stage: Plan data collection

Entry criteria: Approved test data requirements document

Activities / Responsibility:

  • Identify frequency of data needs (DEV, QA)
  • List test data (QA)
  • Define XML Schema (DEV)

Exit criteria: Review the frequency of data needs and test data (DT)

#3. Build the data

LC stage: Build data

Entry criteria: Data request file

Activities / Responsibility:

  • Build the data in the DB (DT)
  • Extract the data from the DB into the XML (DT)
  • Validate the XML against Schema (DT)
  • Share the XML file with QA (DT)

Exit criteria: XML file is received by QA team

#4. Test the data

LC stage: Test the data

Entry criteria: Data request XML file

Activities / Responsibility:

  • Validate the XML against schema for completeness and correctness (QA)
  • Update the mapping document with test results (QA)

Exit criteria: Test results shared with DEV, DT team

As listed in the above tables, QA validates the XML against the schema to check whether the data is available as expected. Once the XML matches the schema, its content and structure can be considered sound. However, this does not confirm that the system picks up the data accurately.
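One way to close this gap is to compare the values in the XML extract with the values expected from the source record. The following is a minimal sketch using Python's standard library. The XML extract, the EXPECTED record and all values in them are hypothetical and chosen so that one field differs; find_mismatches is an assumed helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical extract and matching source record (illustrative values only).
BOOK_XML = """<Book>
  <Title>A book on test data</Title>
  <Author>Author Name</Author>
  <Pages>250</Pages>
</Book>"""

EXPECTED = {"Title": "A book on test data", "Author": "Author Name", "Pages": "320"}


def find_mismatches(xml_text: str, expected: dict) -> dict:
    """Map each differing field to an (expected, actual) pair.

    A field that is missing from the XML is reported with actual None.
    """
    book = ET.fromstring(xml_text)
    mismatches = {}
    for field, want in expected.items():
        got = book.findtext(field)
        if got is not None:
            got = got.strip()
        if got != want:
            mismatches[field] = (want, got)
    return mismatches


print(find_mismatches(BOOK_XML, EXPECTED))
```

The mismatches found this way can be recorded directly in the mapping document as test results.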

As we know, XML has a tree structure with parent, child, sibling, ancestor and descendant relationships between the nodes.

Look at the table below to understand the simplest XPath conventions:

[Table: simplest XPath conventions]
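A few common XPath conventions can be tried out with Python's standard library. This is a minimal sketch: the sample XML is a shortened, hypothetical version of Book.xml (the Author values are invented for illustration), and xml.etree.ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical version of Book.xml used only for illustration.
BOOK_XML = """<Book>
  <Title>A book on test data</Title>
  <Author>First Author</Author>
  <Author>Second Author</Author>
  <Chap_3>What is data</Chap_3>
</Book>"""

book = ET.fromstring(BOOK_XML)

# "Title": a child element of the current node, selected by name.
title = book.findtext("Title")

# ".//Author": every Author element at any depth below the current node.
authors = [a.text for a in book.findall(".//Author")]

# "Author[2]": the second Author child (XPath positions start at 1).
second_author = book.findtext("Author[2]")

# "*": all child elements of the current node, in document order.
child_names = [child.tag for child in book.findall("*")]

print(title, authors, second_author, child_names)
```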

To represent the fields from the XML on a screen (as HTML, for example), a combination of XSLT and XPath is used.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
<html>
<body>
<h2>Latest Book</h2>
<table border="6">
<tr bgcolor="#9cad32">
<th style="text-align:left">Title</th>
<th style="text-align:left">Author</th>
<th style="text-align:left">Publication_Year</th>
<th style="text-align:left">Category</th>
<th style="text-align:left">Language</th>
<th style="text-align:left">Pages</th>
</tr>
<!-- One table row per Book element -->
<xsl:for-each select="Book">
<tr>
<td><xsl:value-of select="Title"/></td>
<td><xsl:value-of select="Author"/></td>
<td><xsl:value-of select="Publication_Year"/></td>
<td><xsl:value-of select="Category"/></td>
<td><xsl:value-of select="Language"/></td>
<td><xsl:value-of select="Pages"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>

</xsl:stylesheet>

In a browser, the transformed output is finally rendered as shown below. Since the data has already been verified, testing can focus more on the look and feel of the screen.

[Figure: the Latest Book page as rendered in the browser]
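The XPath expressions used by a stylesheet can also be checked against the XML before anything is rendered. The following is a minimal sketch using Python's standard library. It does not run the transformation; it only collects the select expressions of the xsl:value-of elements and reports those that match nothing in the data. The stylesheet and XML below are shortened, hypothetical versions of the samples, and the approach assumes simple relative paths of the kind xml.etree.ElementTree can evaluate.

```python
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"

# Shortened, hypothetical stylesheet and data file for illustration.
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:for-each select="Book">
      <td><xsl:value-of select="Title"/></td>
      <td><xsl:value-of select="Author"/></td>
      <td><xsl:value-of select="Pages"/></td>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>"""

BOOK_XML = """<Book>
  <Title>A book on test data</Title>
  <Author>Author Name</Author>
</Book>"""


def unresolved_selects(stylesheet_text: str, xml_text: str) -> list:
    """Return value-of select expressions that match nothing under Book."""
    stylesheet = ET.fromstring(stylesheet_text)
    book = ET.fromstring(xml_text)
    selects = [v.get("select") for v in stylesheet.iter(f"{XSL}value-of")]
    return [s for s in selects if book.find(s) is None]


print(unresolved_selects(STYLESHEET, BOOK_XML))
```

In this illustration the Pages column would be reported, because the sample data has no Pages element. A finding like this points to either a gap in the extract or a stale expression in the stylesheet.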


  • Data testing performed early in the development-testing life cycle saves money, because fixing a bug during functional test execution costs much more than fixing it early in the life cycle.
  • The effort spent up front in validating the XML file, XPath and XSLT against the XSD documents helps avoid multiple iterations of the release.
  • The QA team can work closely with the development team and provide a value-added service.
  • The QA team can help mock up various combinations of data to ensure coverage and correctness.

I am sure you will find this technique useful. Feel free to comment if you have any queries.



#1 Rahul Suryavanshi

Excellent one. We often use this method for our agile process to start test cycles as early as possible.

#2 Naga Sekhar

Very useful and great article

#3 CJ

Interesting and informative article. One thing though, wouldn’t you still want to verify the data validation on the screen just to ensure that the validation still works? I am thinking that sometimes, even though you test at that level, the validation eventually passes, when all components of the system are integrated, there might still be a risk of parts of that system (including the validation) to be broken.

#4 Aruna Shankar

Hello CJ,

You are correct. We cannot stop testing on the UI. If the data is tested beforehand, we can be sure that a problem (if any) does not come from the data. It may instead come from how the receiving system that presents the data on the UI uses it.


#5 Dhaval

As a beginner, I had a lot of confusion about the purpose of XML testing, but your article cleared all my doubts. Thanks.
