An Excellent Way of Data Testing Using XML Technologies (White Paper)

By Vijay

Updated December 4, 2024
Edited by Swati


In this article, we will learn about a wonderful way of Data Testing Using XML Technologies. Let’s get started.

In the SDLC, if the application follows a waterfall model, testing activities are planned at the end. If the QA team then identifies defects, there is a risk of rework on the requirements, design, code and test cases. It is better not to wait until the end to identify the defects in an application.

Tests that are not based on functional execution of the application can find defects without requiring every component to be released into the test environment. This can be accomplished by data testing.

XML and the related technologies used for communication between the tiers of an application provide an opportunity to carry out tests that need not wait for the entire application to be available for testing.

This document outlines one possible way of looking at data testing options early in the life-cycle of a product release.

Assumption

This document assumes the reader is familiar with software testing concepts and fundamental usage of a database and XML Technologies.

Focus group

QA team (QA), data team (DT), developer (DEV)

Purpose

The sample data identified for testing a product defines the extent of testing performed, and adds confidence in the test results and quality of the product. Identifying the data for a test depends on the requirements of the test to be performed.

This document focuses on validating the test data before seeing it on the user interface.

This process requires test data management in order to produce effective test results. Data can be saved in a database or a flat file, and the transfer of data to and from a database can be handled using XML. XML[1], XSD[2], XPATH[3] and XSLT[4] are closely related (see the definitions below).

[1] XML – eXtensible Markup Language. It is a World Wide Web Consortium (W3C) recommendation for describing data. A document that follows the XML syntax rules is said to be “well formed”.

[2] XSD – XML Schema Definition. It describes the structure of an XML document. A “well formed” XML document that also conforms to its XSD (XML Schema) is said to be “valid”.

[3] XPATH – XML Path Language. It is used to navigate through a “valid” and “well formed” XML document and pick out the appropriate data. XPATH expressions look like traditional file paths in a directory.

[4] XSLT – eXtensible Stylesheet Language Transformations. It transforms an XML document into another format, such as HTML for display on a user interface (UI), where styling (font, color, size, etc.) can then be applied. XSLT uses XPATH to locate information in the XML.

Data presented in the XML is validated against a schema (XSD file). The XML can then be transformed into different output formats using XSLT and XPATH.

For the purpose of this discussion we shall use the following example.

Example – A publishing house has a website displaying information about the books it has published. One of the webpages displays a summary about every chapter of a book. Testing should ensure that the content is appropriate on this webpage. The publishing house has by now published millions of books.

All information related to the published books is saved in a database. However, the webpage in question needs only a subset of the data (about a new book and its chapters), which is extracted from the database into an XML file.

The XML given below represents the metadata about the book.

XML file Book.xml

<?xml version="1.0"?>
<Book xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="Book.xsd">
    <Title>A book on test data</Title>
    <Author>Jim</Author>
    <Publication_Year>2015</Publication_Year>
    <Category>Technical</Category>
    <Language>English</Language>
    <Pages>120</Pages>
    <Number_of_Chapters>10</Number_of_Chapters>
    <Chap_1>Acknowledgement</Chap_1>
    <Chap_2>Introduction</Chap_2>
    <Chap_3>What is data</Chap_3>
    <!-- Like this there will be chapters up to 10 in this XML file-->
    <Reference>List of references</Reference>
</Book>

XML Schema Book.xsd

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Book">
<xs:complexType>
<xs:sequence>
<xs:element name="Title" type="xs:string"/>
<xs:element name="Author" type="xs:string" minOccurs="1" maxOccurs="2"/>
<xs:element name="Publication_Year" type="xs:positiveInteger"/>
<xs:element name="Category" type="xs:string"/>
<xs:element name="Language" type="xs:string" default="English"/>
<xs:element name="Pages" type="xs:positiveInteger"/>
<xs:element name="Number_of_Chapters" type="xs:positiveInteger"/>
<xs:element name="Chap_1" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="Chap_2" type="xs:string" minOccurs="1" maxOccurs="1"/>
<xs:element name="Chap_3" type="xs:string" minOccurs="1" maxOccurs="1"/>
<!-- Similarly the number of chapters can be represented separately-->
<xs:element name="Reference" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
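
The rules in Book.xsd can be approximated in a few lines of Python for a quick check before a schema-aware validator is available. This is only a minimal sketch using the standard library: the function name `check_book` and the sample with three chapters are assumptions made for illustration, and the checks cover only well-formedness, element occurrence, element order and positive integers. It is not a substitute for validating against the XSD itself.

```python
import xml.etree.ElementTree as ET

# Child elements in the order given by the xs:sequence in Book.xsd.
SEQUENCE = ["Title", "Author", "Publication_Year", "Category", "Language",
            "Pages", "Number_of_Chapters", "Chap_1", "Chap_2", "Chap_3",
            "Reference"]
NUMERIC = ("Publication_Year", "Pages", "Number_of_Chapters")


def check_book(xml_text):
    """Return a list of problems; an empty list means every check passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well formed: {err}"]

    problems = []
    if root.tag != "Book":
        problems.append(f"unexpected root element: {root.tag}")
    tags = [child.tag for child in root]

    # Occurrence check: Author may appear once or twice, everything else once.
    for name in SEQUENCE:
        most = 2 if name == "Author" else 1
        if not 1 <= tags.count(name) <= most:
            problems.append(f"{name}: found {tags.count(name)} occurrence(s)")

    # Order check: known elements must appear in schema order.
    positions = [SEQUENCE.index(tag) for tag in tags if tag in SEQUENCE]
    if positions != sorted(positions):
        problems.append("elements are out of order")

    # Rough stand-in for xs:positiveInteger: ASCII digits, greater than zero.
    for name in NUMERIC:
        text = (root.findtext(name) or "").strip()
        if not (text.isascii() and text.isdigit() and int(text) > 0):
            problems.append(f"{name}: {text!r} is not a positive integer")
    return problems


# Hypothetical sample: the article's book, shortened to three chapters.
GOOD = """<Book>
    <Title>A book on test data</Title>
    <Author>Jim</Author>
    <Publication_Year>2015</Publication_Year>
    <Category>Technical</Category>
    <Language>English</Language>
    <Pages>120</Pages>
    <Number_of_Chapters>3</Number_of_Chapters>
    <Chap_1>Acknowledgement</Chap_1>
    <Chap_2>Introduction</Chap_2>
    <Chap_3>What is data</Chap_3>
    <Reference>List of references</Reference>
</Book>"""

print(check_book(GOOD))
print(check_book(GOOD.replace("<Pages>120</Pages>", "<Pages>-5</Pages>")))
```

The first call reports no problems, while the second flags the negative page count, which is the kind of defect a schema check is expected to catch before the data reaches the UI.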

Test Data Management Life Cycle

Similar to other processes, test data management has its own life cycle (LC) stages.

  1. Identify data requirements
  2. Plan the data collection
  3. Build the data
  4. Test the data
  5. Data maintenance (not detailed in this document because it is not relevant)

#1. Identify data requirements

In the above example, the database stores millions of records. If the content of all the books is extracted into an XML file, it requires detailed validation. Whenever new information has to be output on the webpage, the XML and the schema might undergo changes.

Changes to the XML, XSD, XPATH and XSLT require proper validation. However, this testing need not wait for the presentation, middleware and data tier releases. The QA team can analyze the XSD to prepare a data requirement plan.

Life Cycle stage: Identify test data requirements
Entry Criteria: The following documents are available: database design, UI design, requirement specification, technical architecture, data flow diagram, use case diagrams
Activities / Responsibility:
  • Understand the data requirements by referring to the documents listed in the entry criteria (QA, DT, DEV)
  • Test data requirements (QA, DT, DEV) – Documents all data needs for every screen, showing a mapping between screen display names and the corresponding XML elements
Exit Criteria: Review the test data requirements document (QA, DEV, DT)

The process of identifying all the data requirements for a product should address the following:

a) Coverage and completeness – Do the identified requirements cover all use cases?

Example – It is very important to test the data combinations for title, author, category and language in the above XML sample, since the schema mandates these fields.

This can be handled easily by looking at the XML schema, which describes which elements / attributes must be present and their order in the XML.

b) Quality – Is the collected data of the best possible quality? The test data used determines the quality of the testing performed on the application.

  • Positive and negative scenarios – Testing should check how the application behaves with valid / invalid input data.
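
As an illustration of the coverage and positive / negative points above, the combinations of the mandatory fields can be generated mechanically. The following is a minimal sketch: the candidate values (one valid value and one blank per field) and the helper name `build_cases` are assumptions made for this example, not part of the original data set.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical candidates: one valid value and one blank per mandatory field.
CANDIDATES = {
    "Title": ["A book on test data", ""],
    "Author": ["Jim", ""],
    "Category": ["Technical", ""],
    "Language": ["English", ""],
}


def build_cases(candidates):
    """Yield (xml_text, is_positive) for every combination of field values."""
    names = list(candidates)
    for values in itertools.product(*(candidates[name] for name in names)):
        book = ET.Element("Book")
        for name, value in zip(names, values):
            ET.SubElement(book, name).text = value
        # A case is a positive scenario only when no mandatory field is blank.
        yield ET.tostring(book, encoding="unicode"), all(values)


cases = list(build_cases(CANDIDATES))
positive = sum(1 for _, is_positive in cases if is_positive)
print(f"{len(cases)} cases: {positive} positive, {len(cases) - positive} negative")
```

Each generated XML fragment can then be fed to whatever validation step is in place, with the `is_positive` flag acting as the expected result.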

The test data requirements document lists the data needed across all tiers of the application. Data from the database can be used directly in UI and/or manipulated (calculations, concatenation, etc.). Hence, it is required to capture all the data requirements.

The table below represents a sample data table:

Field Name | Data Type | Test data | Remarks | Test result
Author | String | (blank field) | Author is a mandatory field | This test should fail
Author | String | "Author+@" | Has special characters | This test should fail
Author | String | "Author Name" | Includes a space | This test should pass
Author | String | "123Author" | Starts with a number | This test should fail
Author | String | "@!Author" | Starts with special characters | This test should fail
Author | String | " Author" | Prefixed with spaces | This test should fail

In the above example, the plain string data type for the Author field can be avoided. Instead, a pattern can be enforced.

E.g. alphabets only, start with an uppercase letter, no special characters, etc. A pattern (a restriction on an element's value, defined in the XSD) can be written as <xs:pattern value="[a-zA-Z0-9]{12}"/>.

If this pattern is set for the Author element in the above example, the value must be exactly 12 characters long, and each character must be an uppercase letter, a lowercase letter or a digit. The pattern itself would have to be adapted to express rules such as those in the data table, for example allowing a space or requiring an initial uppercase letter.
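
The expectations in the sample data table can be tried out against a candidate pattern before it goes into the XSD. The sketch below is illustrative only: the pattern is a hypothetical one chosen to match the table (an initial uppercase letter, letters only, single spaces between words) and is not the pattern quoted above. Python's `re.fullmatch` is used because an XSD pattern has to match the whole value.

```python
import re

# Hypothetical pattern (an assumption, not from the article): starts with an
# uppercase letter, letters only, single spaces allowed between words.
AUTHOR_PATTERN = re.compile(r"[A-Z][a-zA-Z]*( [a-zA-Z]+)*")

# (test data, expected to pass) rows taken from the sample data table.
ROWS = [
    ("", False),
    ("Author+@", False),
    ("Author Name", True),
    ("123Author", False),
    ("@!Author", False),
    (" Author", False),
]


def is_valid_author(value):
    # An XSD pattern must match the entire value, hence fullmatch.
    return AUTHOR_PATTERN.fullmatch(value) is not None


for value, expected in ROWS:
    actual = is_valid_author(value)
    status = "OK" if actual == expected else "MISMATCH"
    print(f"{value!r:15} expected={expected} actual={actual} {status}")
```

Any row reported as MISMATCH would mean the pattern and the data requirements disagree, which is worth resolving before the schema is finalized.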

#2. Plan data collection

LC stage: Plan data collection
Entry Criteria: Approved test data requirements document
Activities / Responsibility:
  • Identify frequency of data needs (DEV, QA)
  • List test data (QA)
  • Define XML Schema (DEV)
Exit Criteria: Review the frequency of data needs and test data (DT)

#3. Build the data

LC stage: Build data
Entry Criteria: Data request file
Activities / Responsibility:
  • Build the data in the DB (DT)
  • Extract the data from the DB into the XML (DT)
  • Validate the XML against Schema (DT)
  • Share the XML file with QA (DT)
Exit Criteria: XML file is received by QA team
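
The build and extract activities of this stage could look roughly like the sketch below. It is written under stated assumptions: the `books` table, its column names and the helper `extract_book` are hypothetical (the article does not describe the real database), an in-memory SQLite database stands in for the actual one, and the chapter elements are left out for brevity.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical table layout; an in-memory SQLite DB stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (title TEXT, author TEXT, publication_year INTEGER,"
    " category TEXT, language TEXT, pages INTEGER)"
)
conn.execute(
    "INSERT INTO books VALUES (?, ?, ?, ?, ?, ?)",
    ("A book on test data", "Jim", 2015, "Technical", "English", 120),
)

# Column name -> XML element name, in the order used by Book.xsd.
COLUMN_TO_ELEMENT = [
    ("title", "Title"),
    ("author", "Author"),
    ("publication_year", "Publication_Year"),
    ("category", "Category"),
    ("language", "Language"),
    ("pages", "Pages"),
]


def extract_book(connection, title):
    """Read one book row and return it as an XML string."""
    columns = ", ".join(column for column, _ in COLUMN_TO_ELEMENT)
    row = connection.execute(
        f"SELECT {columns} FROM books WHERE title = ?", (title,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no book titled {title!r}")
    book = ET.Element("Book")
    for (_, element), value in zip(COLUMN_TO_ELEMENT, row):
        ET.SubElement(book, element).text = str(value)
    return ET.tostring(book, encoding="unicode")


xml_text = extract_book(conn, "A book on test data")
print(xml_text)
```

Keeping the column-to-element mapping in one list makes it easy to review against the schema when either the database or the XSD changes.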

#4. Test the data

LC stage: Test the data
Entry Criteria: Data request XML file
Activities / Responsibility:
  • Validate the XML against schema for completeness and correctness (QA)
  • Update the mapping document with test results (QA)
Exit Criteria: Test results shared with DEV, DT team

As listed in the above tables, QA validates the XML against the schema to check whether the data is available as expected. Once the XML validates against the schema, its structure and data types can be confirmed to be fine. Yet this does not confirm that the data is picked up accurately by the system.
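
To illustrate the gap between schema validity and data accuracy, the values in the XML can be compared with the expected values recorded in the mapping document. This is a minimal sketch: the display names, the expected values and the function `run_mapping_checks` are hypothetical, and the sample XML deliberately contains a page count that is a valid positive integer but differs from the expected value.

```python
import xml.etree.ElementTree as ET

# Sample XML with a schema-valid but wrong page count (210 instead of 120).
BOOK_XML = """<Book>
    <Title>A book on test data</Title>
    <Author>Jim</Author>
    <Publication_Year>2015</Publication_Year>
    <Pages>210</Pages>
</Book>"""

# Hypothetical mapping document rows: display name, XML element, expected value.
MAPPING = [
    ("Book Title", "Title", "A book on test data"),
    ("Author Name", "Author", "Jim"),
    ("Year Published", "Publication_Year", "2015"),
    ("Total Pages", "Pages", "120"),
]


def run_mapping_checks(xml_text, mapping):
    """Return one (display name, element, expected, actual, result) row per entry."""
    root = ET.fromstring(xml_text)
    results = []
    for display_name, element, expected in mapping:
        actual = root.findtext(element)
        if actual is not None:
            actual = actual.strip()
        outcome = "pass" if actual == expected else "fail"
        results.append((display_name, element, expected, actual, outcome))
    return results


results = run_mapping_checks(BOOK_XML, MAPPING)
for row in results:
    print(" | ".join(str(cell) for cell in row))
```

The resulting rows are in a shape that can be copied back into the mapping document as test results.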

As we know, XML has a tree structure with parent, child, sibling, ancestor and descendant relationships between the nodes.

Look at the table below to understand the simplest XPATH conventions:

[Table image: simplest XPATH conventions]
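
A few of the simplest path expressions can be tried out with Python's standard library, which supports a limited subset of XPATH. The sketch below makes some assumptions: the sample XML is a shortened version of Book.xml, the second author "Ann" is invented to show positional selection, and the paths are written relative to the Book element because `xml.etree.ElementTree` evaluates them from the element they are called on.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical version of Book.xml with a second author added.
BOOK_XML = """<Book>
    <Title>A book on test data</Title>
    <Author>Jim</Author>
    <Author>Ann</Author>
    <Number_of_Chapters>3</Number_of_Chapters>
    <Chap_1>Acknowledgement</Chap_1>
    <Chap_2>Introduction</Chap_2>
    <Chap_3>What is data</Chap_3>
</Book>"""

root = ET.fromstring(BOOK_XML)  # root is the <Book> element

# "Title" selects a child of the current node (full XPATH: /Book/Title).
title = root.findtext("Title")

# "Author" selects every matching child; "Author[2]" selects the second one.
authors = [node.text for node in root.findall("Author")]
second_author = root.findtext("Author[2]")

# "*" selects all child elements of the current node.
child_count = len(root.findall("*"))

# Consistency check: declared chapter count versus Chap_N elements present.
declared = int(root.findtext("Number_of_Chapters"))
chapters = [node.text for node in root if node.tag.startswith("Chap_")]

print(title, authors, second_author, child_count)
print("chapter count matches:", declared == len(chapters))
```

The last check is an example of a data rule that the schema shown earlier does not express: the declared number of chapters should agree with the chapter elements actually present.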

In order to represent the fields from the XML on a screen (as HTML, for example), a combination of XSLT and XPATH is used.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 
<xsl:template match="/">
<html>
<body>
<h2>Latest Book</h2>
<table border="6">
<tr bgcolor="#9cad32">
<th style="text-align:left">Title</th>
<th style="text-align:left">Author</th>
<th style="text-align:left">Publication_Year</th>
<th style="text-align:left">Category</th>
<th style="text-align:left">Language</th>
<th style="text-align:left">Pages</th>
</tr>
<xsl:for-each select="Book">
<tr>
<td><xsl:value-of select="Title"/></td>
<td><xsl:value-of select="Author"/></td>
<td><xsl:value-of select="Publication_Year"/></td>
<td><xsl:value-of select="Category"/></td>
<td><xsl:value-of select="Language"/></td>
<td><xsl:value-of select="Pages"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

Finally, in a browser, the resulting HTML is displayed as shown below. Since the data has already been verified, testing can focus more on the look and feel of the screen.

[Image: the Latest Book table as rendered in a browser]

Conclusion

  • Data testing performed early in the development-testing life-cycle saves money, because fixing a bug during functional test execution costs much more than fixing it early in the life-cycle
  • The effort spent up front in validating the XML file against the XSD, and in checking the XPATH expressions and the XSLT, helps avoid multiple iterations of the release
  • The QA team can work closely with the development team and provide a value-added service
  • The QA team can help mock up various combinations of data to ensure coverage and correctness

I am sure you will find this technique useful. Feel free to write to us in the comments section if you have any doubts or queries. We would love to hear from you. 

6 thoughts on “An Excellent Way of Data Testing Using XML Technologies (White Paper)”

  1. Hello CJ,

    You are correct. We cannot stop testing on the UI. If data is tested before itself, we can be sure that the problem (if any) need not be resulting from data. It may be from how the data is used by receiving system that represent it on the UI.

    regards,
    Aruna

  2. Interesting and informative article. One thing though, wouldn’t you still want to verify the data validation on the screen just to ensure that the validation still works? I am thinking that sometimes, even though you test at that level, the validation eventually passes, when all components of the system are integrated, there might still be a risk of parts of that system (including the validation) to be broken.

  3. Can someone clarify below for me, I don’t know how this should work
    1. Provided table for simplest XPAT conventions stated that UI Display text for line # 2.” Author Name” and XML node column listed “Author”.
    2. Latest Book UI display showing column as “Author”
    Question: Isn’t UI display should have “Author Name”, like below

    Book Title | Author Name | Year Published | Category | Language | Total Pages

    Please help understand this.

