What is Scalability Testing? How to Test the Scalability of an Application

Introduction to Scalability Testing:

Scalability Testing is a non-functional test methodology in which an application’s performance is measured in terms of its ability to scale up or scale down the number of user requests or other such performance measure attributes.

Scalability testing can be performed at a hardware, software, or database level.

Parameters used for this testing differ from one application to another. For a web page, they could be the number of users, CPU usage, and network usage, while for a web server it would be the number of requests processed.

This tutorial gives a complete overview of Scalability Testing, its attributes, and the steps involved in performing the test, with practical examples to help you understand the concept.

Scalability Testing Vs Load Testing

Load Testing measures the application under test at the maximum load, that is, the load at which the system would crash. The main purpose of load testing is to identify the peak point after which users would no longer be able to use the system.

Both Load and Scalability come under the Performance Testing methodology.

Scalability Testing differs from Load Testing in that a scalability test measures the system at minimum and maximum loads at all levels, including the software, hardware, and database levels. Once the maximum load is found, developers need to respond appropriately to ensure that the system remains scalable beyond that load.

Example: If scalability testing determines the maximum load to be 10,000 users, then for the system to be scalable, developers need to take measures such as reducing the response time once the 10,000-user limit is reached or increasing the RAM size to accommodate the growing user data.

Load Testing involves placing the maximum load on the developed application in one go, while scalability testing involves increasing the load gradually over a period of time.

Load testing determines the point at which the application crashes, while scalability testing tries to identify the reason for the crash so that steps can be taken to resolve the issue.

In short, Load Testing helps to identify the performance problems while scalability testing helps to identify if the system can scale up to the growing number of users.
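
The gradual ramp-up that sets scalability testing apart can be sketched as a list of load stages. This is only a minimal illustration: the function name `ramp_schedule`, the step size, and the hold time are assumptions, and only the 10,000-user ceiling is taken from the example above.

```python
def ramp_schedule(start_users, max_users, step, hold_seconds):
    """Return (virtual_users, hold_seconds) stages for a stepped ramp-up."""
    if start_users <= 0 or step <= 0 or max_users < start_users:
        raise ValueError("expected 0 < start_users <= max_users and step > 0")
    stages = []
    users = start_users
    while users < max_users:
        stages.append((users, hold_seconds))
        users += step
    # Always finish exactly at the target maximum, even if the step overshoots.
    stages.append((max_users, hold_seconds))
    return stages


if __name__ == "__main__":
    # Hypothetical plan: start at 1,000 users and add 3,000 per stage.
    for users, hold in ramp_schedule(1000, 10000, 3000, 60):
        print(f"hold {users} virtual users for {hold} seconds")
```

A load test, by contrast, would correspond to a single stage at the maximum value.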

Scalability Testing Attributes

Scalability test attributes define the performance measures based on which this testing will be performed.

Following are some of the common attributes:

1) Response Time:

  • Response Time is the time between the user request and the application response. This testing is done to identify the response time of the server under minimum load, threshold load, and maximum load to identify the point at which the application would break.
  • Response time may increase or decrease based on varying user load on the application. Ideally, the response time of an application would stay roughly constant as the user load keeps increasing; in practice it tends to rise as the load grows.
  • An application can be deemed to be scalable if it can deliver the same response time for varying levels of user load.
  • In the case of clustered environments where the application load is distributed among multiple server components, scalability testing must measure the extent to which the load balancer is distributing the load among multiple servers. This will ensure that one server is not overloaded with requests while the other server is sitting idle waiting for a request to come in.
  • If the application is hosted in a clustered environment, the response time of each server component must be measured carefully, and scalability testing must verify that it stays the same regardless of the amount of load placed on each server.
  • Example: Response time can be measured as the time from when the user enters the URL in a web browser until the web page finishes loading its content. The lower the response time, the higher the performance of the application.
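
As a rough sketch of how response time could be sampled at several load levels, the snippet below times a stand-in request from a pool of worker threads. `handle_request` is a hypothetical placeholder for a real call to the application under test, and the user counts are arbitrary assumptions.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request():
    """Hypothetical stand-in for a request to the application under test."""
    time.sleep(0.01)


def timed_call(_request_number):
    """Return the elapsed time of one request, in seconds."""
    start = time.perf_counter()
    handle_request()
    return time.perf_counter() - start


def response_times(concurrent_users, requests_per_user=5):
    """Issue requests from a pool of worker threads and return all latencies."""
    total_requests = concurrent_users * requests_per_user
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        return list(pool.map(timed_call, range(total_requests)))


if __name__ == "__main__":
    for users in (1, 10, 50):  # assumed minimum, threshold, and maximum loads
        samples = response_times(users)
        p95 = statistics.quantiles(samples, n=20)[-1]  # 95th percentile
        mean_ms = statistics.mean(samples) * 1000
        print(f"{users:>3} users: mean={mean_ms:.1f} ms, p95={p95 * 1000:.1f} ms")
```

Comparing the mean and the 95th percentile across load levels gives a quick indication of whether response time stays flat or degrades as users are added.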

2) Throughput:

  • Throughput is the measure of the number of requests processed over a unit of time by the application.
  • The measure of throughput may differ from one application to another. For a web application, throughput is measured in terms of the number of user requests processed per unit time, and for a database, it is measured in terms of the number of queries processed per unit time.
  • An application is deemed to be scalable if it can deliver the same throughput for varying levels of load on the internal applications, hardware, and database.
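
The throughput calculation itself is simple arithmetic, sketched below. The helper names and the sample measurements are hypothetical and serve only to illustrate the comparison across load levels.

```python
def throughput(completed_requests, elapsed_seconds):
    """Requests (or queries) processed per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completed_requests / elapsed_seconds


def relative_spread(values):
    """(max - min) / mean; 0.0 means identical throughput at every load level."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean


if __name__ == "__main__":
    # Made-up measurements: load level -> (completed requests, elapsed seconds).
    measurements = {"low": (6000, 60.0), "medium": (5900, 60.0), "high": (4800, 60.0)}
    rates = {name: throughput(done, secs) for name, (done, secs) in measurements.items()}
    for name, rate in rates.items():
        print(f"{name}: {rate:.1f} requests/second")
    print(f"relative spread: {relative_spread(list(rates.values())):.2f}")
```

A spread close to zero suggests that throughput holds steady as the load varies, while a large spread points to degradation at some load level.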

3) CPU Usage:

  • CPU Usage is a measure of the CPU utilization of an application while it performs a task. CPU utilization is usually expressed as a percentage of the available processor capacity.
  • Ideally, the more optimized the application code is, the lower the CPU utilization observed.
  • In order to achieve this, many organizations use standard programming practices to minimize CPU utilization.
  • Example: Removing dead code from the application and minimizing the use of Thread.Sleep methods are among the programming practices used to minimize CPU utilization.
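
One way to observe the CPU cost of a task is to compare the processor time a process consumes with the wall-clock time that passes. The sketch below uses only the Python standard library; `measure_cpu` and the two sample tasks are hypothetical names chosen for illustration, and the percentage covers this process only, not the whole machine.

```python
import time

N = 1_000_000  # illustrative problem size


def measure_cpu(task):
    """Run task() and return (result, cpu_seconds, cpu_percent_of_wall_time)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = task()
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    percent = 100.0 * cpu_seconds / wall_seconds if wall_seconds > 0 else 0.0
    return result, cpu_seconds, percent


def sum_with_loop():
    """Unoptimized version: adds the numbers 0..N-1 one at a time."""
    total = 0
    for i in range(N):
        total += i
    return total


def sum_closed_form():
    """Optimized version: same result from the arithmetic-series formula."""
    return N * (N - 1) // 2


if __name__ == "__main__":
    for task in (sum_with_loop, sum_closed_form):
        result, cpu_seconds, percent = measure_cpu(task)
        print(f"{task.__name__}: result={result}, cpu={cpu_seconds:.4f} s ({percent:.0f}% of wall time)")
```

Both tasks return the same value, so any difference in CPU seconds reflects the cost of the implementation rather than the work requested.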

4) Memory Usage:

  • Memory usage is a measure of the memory consumed for performing a task by an application.
  • Memory is measured in bytes (megabytes, gigabytes, or terabytes) of Random Access Memory (RAM) that the developed application uses.
  • Memory usage of an application can be minimized by following the best programming practices.
  • Examples of best programming practices include avoiding redundant loops, reducing the hits to the database, using a cache, and optimizing SQL queries. An application is deemed to be scalable if it minimizes the usage of memory to the maximum extent possible.
  • Example: If the storage space allocated for a specified number of users runs out, the developer will be forced to add additional database storage to avoid loss of data.
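
The memory cost of two coding styles can be compared with Python's standard `tracemalloc` module, as in the sketch below. The helper `peak_memory_bytes` and the two sample tasks are hypothetical, and `tracemalloc` only tracks memory allocated by the Python interpreter, not the total memory of the process.

```python
import tracemalloc


def peak_memory_bytes(task):
    """Return the peak number of bytes allocated by Python while task() runs."""
    tracemalloc.start()
    try:
        task()
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def total_with_list(n=100_000):
    """Builds the full list in memory before summing it."""
    return sum([i * i for i in range(n)])


def total_with_generator(n=100_000):
    """Produces one value at a time, so no full list is held in memory."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    for task in (total_with_list, total_with_generator):
        print(f"{task.__name__}: peak {peak_memory_bytes(task) / 1024:.1f} KiB")
```

Both functions compute the same total, but the list version holds every intermediate value at once, which is the kind of avoidable memory use that scalability testing aims to expose.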

5) Network usage:

  • Network usage is the amount of bandwidth consumed by an application under test.
  • The goal of measuring network usage is to reduce network congestion. Network usage is measured in terms of bytes received per second, frames received per second, segments received and sent per second, etc.
  • Programming techniques such as compression can help to reduce congestion and minimize network usage. An application is deemed to be scalable if it can perform with minimum network congestion and deliver high application performance.
  • Example: Instead of following a queue mechanism for processing the user requests, a developer may write the code to process each user request as soon as it arrives in the database.
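
To illustrate how compression can reduce the number of bytes sent over the network, the sketch below compresses a sample payload with Python's standard `gzip` module. The payload contents and the helper names are made up for illustration; the actual savings depend on the data being transferred.

```python
import gzip
import json


def sample_payload(records=1000):
    """Build a hypothetical JSON response body with repetitive content."""
    rows = [{"id": i, "status": "active"} for i in range(records)]
    return json.dumps(rows).encode("utf-8")


def compressed_size(payload):
    """Return the size in bytes of the gzip-compressed payload."""
    return len(gzip.compress(payload))


if __name__ == "__main__":
    body = sample_payload()
    packed = compressed_size(body)
    print(f"uncompressed: {len(body)} bytes")
    print(f"compressed:   {packed} bytes ({100 * packed / len(body):.0f}% of original)")
```

Compression trades some CPU time for bandwidth, so its effect is best checked against the CPU usage attribute as well.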

Apart from these parameters, there are a few other less commonly used parameters, such as server request response time, task execution time, transaction time, web page loading time, time to fetch the response from the database, reboot time, printing time, session time, screen transition, transactions per second, hits per second, requests per second, etc.

The attributes for scalability testing may differ from one application to another as the performance measure for web applications may not be the same as that of a desktop or a client-server application.

Steps to Test the Scalability of an Application

The main advantage of performing this testing on an application is to understand how it behaves for its users when the maximum load is reached, and to find ways to resolve the resulting problems.

Also, this testing allows the testers to identify server-side degradation and response time with respect to the application user load. As a result, this testing is preferred by several organizations worldwide.

Given below is the list of steps to test the scalability of an application:

  • Create repeatable test scenarios for each of the scalability testing attributes.
  • Test the application for varying levels of load such as low, medium, and high loads, and verify the behavior of an application.
  • Create a test environment that is stable enough to withstand the entire scalability testing cycle.
  • Configure the hardware necessary to perform this testing.
  • Define a set of virtual users for verifying the behavior of an application under varying user loads.
  • Repeat the test scenarios for multiple users under varying conditions of internal applications, hardware, and database changes.
  • In the case of a clustered environment, validate if the load balancer is directing the user requests to multiple servers to ensure that no server is overloaded by a series of requests.
  • Execute the test scenarios in the test environment.
  • Analyze the reports generated and verify the areas of improvement, if any.
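
Several of the steps above can be combined into a small test harness, sketched here under stated assumptions: `scenario` is a hypothetical placeholder for one repeatable test scenario, each virtual user is modeled as a worker thread, and the user counts for the low, medium, and high loads are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Assumed virtual-user counts for each load level.
LOAD_LEVELS = {"low": 5, "medium": 20, "high": 50}


def scenario():
    """Hypothetical repeatable test scenario; replace with a real request."""
    time.sleep(0.005)


def run_level(virtual_users, iterations=3):
    """Run the scenario `iterations` times per virtual user and summarize it."""

    def user_session(_user_id):
        durations = []
        for _ in range(iterations):
            start = time.perf_counter()
            scenario()
            durations.append(time.perf_counter() - start)
        return durations

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        sessions = list(pool.map(user_session, range(virtual_users)))
    elapsed = time.perf_counter() - started

    samples = [d for session in sessions for d in session]
    return {
        "requests": len(samples),
        "mean_response_s": sum(samples) / len(samples),
        "throughput_rps": len(samples) / elapsed,
    }


def run_all(levels=LOAD_LEVELS):
    """Execute the scenario at every load level and return a report."""
    return {name: run_level(users) for name, users in levels.items()}


if __name__ == "__main__":
    for name, summary in run_all().items():
        print(f"{name:>6}: {summary['requests']} requests, "
              f"mean {summary['mean_response_s'] * 1000:.1f} ms, "
              f"{summary['throughput_rps']:.0f} requests/second")
```

In a real project a dedicated load testing tool would normally drive the virtual users; the point of the sketch is only the shape of the loop: a fixed scenario, several load levels, and one report to analyze at the end.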


In a nutshell,

=> Scalability testing is a non-functional testing methodology to verify whether an application can scale up or scale down with changes in attributes such as the user load. Attributes used for this testing will vary from one application to another.

=> The main objective of this testing is to determine when an application starts to degrade at a maximum load and take proper steps to ensure that the developed application is scalable enough to accommodate the changes in the internal applications, software, hardware, and also database changes in the future.

=> If this testing is done properly, major errors with respect to performance in the software, hardware, and database can be uncovered in the developed applications.

=> A major disadvantage of this testing would be its data storage limitation, with limits on the database size and the buffer space. Also, the network bandwidth limitations can be an impediment to scalability testing.

=> The process of scalability testing differs from one organization to another, as the scalability test attributes of one application will be different from those of other applications.