Monday, May 4, 2009

Stress Testing, Performance Testing and Load Testing

Stress Testing

Stress testing executes a system in a manner that demands resources in abnormal quantity, frequency, or volume. The following types of tests may be conducted during stress testing (a request-generation sketch in Python follows the list):

·         Special tests may be designed that generate ten interrupts per second when one or two per second is the average rate.

·         Input data rates may be increased by an order of magnitude to determine how input functions will respond.

·         Test Cases that require maximum memory or other resources.

·         Test Cases that may cause excessive hunting for disk-resident data.

·         Test Cases that may cause thrashing in a virtual operating system.
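To make the abnormal-load idea concrete, here is a minimal Python sketch that fires requests at a hypothetical endpoint far above its expected rate. The URL, target rate, and duration are all assumptions for illustration; a dedicated tool such as WAS or JMeter would pace and report much more accurately.

```python
# A hedged sketch: hammer an assumed endpoint at an abnormally high rate.
import concurrent.futures
import time
import urllib.request

TARGET_URL = "http://localhost:8080/api"  # hypothetical endpoint
REQUESTS_PER_SECOND = 100                 # well above the expected rate
DURATION_SECONDS = 30

def fire_request(_):
    try:
        with urllib.request.urlopen(TARGET_URL, timeout=5) as resp:
            return resp.status
    except Exception:
        return None  # failures are expected under abnormal load

with concurrent.futures.ThreadPoolExecutor(max_workers=50) as pool:
    total = failures = 0
    deadline = time.time() + DURATION_SECONDS
    while time.time() < deadline:
        batch = list(pool.map(fire_request, range(REQUESTS_PER_SECOND)))
        total += len(batch)
        failures += sum(1 for status in batch if status is None)
        time.sleep(1)  # crude pacing: one batch per second
    print(f"sent={total} failed={failures}")
```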

 

Performance Testing

Performance testing of a Web site is basically the process of understanding how the Web application and its operating environment respond at various user load levels. In general, we want to measure the Response Time, Throughput, and Utilization of the Web site while simulating attempts by virtual users to simultaneously access the site. One of the main objectives of performance testing is to maintain a Web site with low response time, high throughput, and low utilization.

 

Response Time

Response Time is the delay experienced when a request is made to the server and the server's response to the client is received. It is usually measured in units of time, such as seconds or milliseconds. Generally speaking, Response Time increases as the inverse of unutilized capacity. It increases slowly at low levels of user load, but increases rapidly as capacity is utilized. Figure 1 demonstrates such typical characteristics of Response Time versus user load.


Figure 1. Typical characteristics of latency versus user load

The sudden increase in response time is often caused by the maximum utilization of one or more system resources. For example, most Web servers can be configured to start up a fixed number of threads to handle concurrent user requests. If the number of concurrent requests is greater than the number of threads available, any incoming requests will be placed in a queue and will wait for their turn to be processed. Any time spent in a queue naturally adds extra wait time to the overall Response Time.
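This queuing effect is easy to demonstrate with a toy simulation: a fixed pool of worker threads services requests that all arrive nearly at once, so later requests accumulate queue wait on top of service time. The numbers (4 workers, 0.1-second service time, 40 requests) are assumptions chosen only for illustration.

```python
# Toy simulation: queue wait inflates Response Time once concurrent
# requests exceed the server's fixed number of worker threads.
import queue
import threading
import time

WORKER_THREADS = 4    # server's fixed thread pool (assumption)
SERVICE_TIME = 0.1    # seconds of work per request (assumption)

requests_q = queue.Queue()
response_times = []
lock = threading.Lock()

def worker():
    while True:
        arrival = requests_q.get()
        if arrival is None:            # sentinel: shut the worker down
            break
        time.sleep(SERVICE_TIME)       # simulated request processing
        with lock:
            response_times.append(time.perf_counter() - arrival)
        requests_q.task_done()

workers = [threading.Thread(target=worker) for _ in range(WORKER_THREADS)]
for w in workers:
    w.start()

for _ in range(40):                    # 40 near-simultaneous requests
    requests_q.put(time.perf_counter())
requests_q.join()

for _ in workers:
    requests_q.put(None)
for w in workers:
    w.join()

# The earliest requests finish in ~0.1 s; the last ones wait ~0.9 s in
# the queue before their 0.1 s of service even begins.
print(f"min={min(response_times):.2f}s max={max(response_times):.2f}s")
```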

To better understand what Response Time means in a typical Web farm, we can divide response time into many segments and categorize these segments into two major types: network response time and application response time. Network response time refers to the time it takes for data to travel from one server to another. Application response time is the time required for data to be processed within a server. Figure 2 shows the different response time segments in the entire process of a typical Web request.

 

Figure 2. Response time segments of a typical Web request

 

Total Response Time = (N1 + N2 + N3 + N4) + (A1 + A2 + A3)

where Nx represents the network Response Times and Ax represents the application Response Times.
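As a quick worked example, plugging assumed timings (in seconds) into the formula:

```python
# All segment timings below are assumptions, purely for illustration.
network = {"N1": 0.20, "N2": 0.01, "N3": 0.01, "N4": 0.20}
application = {"A1": 0.05, "A2": 0.15, "A3": 0.05}

total = sum(network.values()) + sum(application.values())
print(f"Total Response Time = {total:.2f}s")  # 0.67s, dominated by N1 + N4
```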

In general, the Response Time is mainly constrained by N1 and N4, which represent the method your clients use to access the Internet. In the most common scenario, e-commerce clients access the Internet using relatively slow dial-up connections. Once Internet access is achieved, a client's request will spend an indeterminate amount of time in the Internet cloud shown in Figure 2 as requests and responses are funneled from router to router across the Internet.

To reduce these network Response Times (N1 and N4), one common solution is to move the servers and/or Web content closer to the clients. This can be achieved by hosting your farm of servers or replicating your Web content with major Internet hosting providers who have redundant high-speed connections to major public and private Internet exchange points, thus reducing the number of network routing hops between the clients and the servers.

Network Response Times N2 and N3 usually depend on the performance of the switching equipment in the server farm. When traffic to the back-end database grows, consider upgrading the switches and network adapters to boost performance.

Reducing application Response Times (A1, A2, and A3) is an art form unto itself because the complexity of server applications can make analyzing performance data and performance tuning quite challenging. Typically, multiple software components interact on the server to service a given request, and delay can be introduced by any of them. That said, there are ways you can approach the problem:

·         First, your application design should minimize round trips wherever possible. Multiple round trips (client to server or application to database) multiply transmission and resource-acquisition Response Time; use a single round trip wherever possible (see the round-trip sketch after this list).

·         You can optimize many server components to improve performance for your configuration. Database tuning is one of the most important areas on which to focus. Optimize stored procedures and indexes.

·         Look for contention among threads or components competing for common resources. There are several methods you can use to identify contention bottlenecks. Depending on the specific problem, eliminating a resource contention bottleneck may involve restructuring your code, applying service packs, or upgrading components on your server. Not all resource contention problems can be completely eliminated, but you should strive to reduce them wherever possible. They can become bottlenecks for the entire system.

·         Finally, to increase capacity, you may want to upgrade the server hardware (scaling up) if system resources such as CPU or memory are stretched to their limits and have become the bottleneck. Using multiple servers as a cluster (scaling out) may help to lessen the load on an individual server, thus improving system performance and reducing application latencies.
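To put a number on the round-trip advice, here is a back-of-the-envelope sketch; the 5 ms trip latency and 2 ms per-query cost are assumptions:

```python
# Why round trips dominate: 10 sequential calls vs. 1 batched call.
ROUND_TRIP_LATENCY = 0.005  # 5 ms network latency per trip (assumption)
QUERY_TIME = 0.002          # 2 ms server-side work per query (assumption)

sequential = 10 * (ROUND_TRIP_LATENCY + QUERY_TIME)  # 10 separate trips
batched = ROUND_TRIP_LATENCY + 10 * QUERY_TIME       # 1 trip, same work

print(f"10 round trips: {sequential * 1000:.0f} ms")  # 70 ms
print(f"1 batched trip: {batched * 1000:.0f} ms")     # 25 ms
```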

 

Throughput

Throughput refers to the number of client requests processed within a certain unit of time. Typically, the unit of measurement is requests per second or pages per second. From a marketing perspective, throughput may also be measured in terms of visitors per day or page views per day, although smaller time units are more useful for performance testing because applications typically see peak loads of several times the average load in a day.

As one of the most useful metrics, the throughput of a Web site is often measured and analyzed at different stages of the design, development, and deployment cycle. For example, in the process of capacity planning, throughput is one of the key parameters for determining the hardware and system requirements of a Web site. Throughput also plays an important role in identifying performance bottlenecks and improving application and system performance. Whether a Web farm uses a single server or multiple servers, throughput statistics show similar characteristics in reaction to various user load levels. Figure 3 demonstrates such typical characteristics of throughput versus user load.

Figure 3. Typical characteristics of throughput versus user load

As Figure 3 illustrates, the throughput of a typical Web site increases proportionally at the initial stages of increasing load. However, due to limited system resources, throughput cannot be increased indefinitely. It will eventually reach a peak, and the overall performance of the site will start degrading with increased load. Maximum throughput, illustrated by the peak of the graph in Figure 3, is the maximum number of user requests that can be supported concurrently by the site in the given unit of time.

Note that it is sometimes confusing to compare the throughput metrics for your Web site to the published metrics of other sites. The value of maximum throughput varies from site to site. It mainly depends on the complexity of the application. For example, a Web site consisting largely of static HTML pages may be able to serve many more requests per second than a site serving dynamic pages. As with any statistic, throughput metrics can be manipulated by selectively ignoring some of the data. For example, in your measurements, you may have included separate data for all the supporting files on a page, such as graphic files. Another site's published measurements might consider the overall page as one unit. As a result, throughput values are most useful for comparisons within the same site, using a common measuring methodology and set of metrics.

In many ways, throughput and Response Time are related, as different approaches to thinking about the same problem. In general, sites with high latency will have low throughput. If you want to improve your throughput, you should analyze the same criteria as you would to reduce latency. Also, measuring throughput without considering latency is misleading, because latency often rises under load before throughput peaks. This means that peak throughput may occur at a latency that is unacceptable from an application usability standpoint. This suggests that performance reports should include a cut-off value for Response Time, for example: 250 requests/second @ 5 seconds maximum Response Time.
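A minimal sketch of producing such a combined report from load-test samples follows; the sample data is hypothetical:

```python
# Each tuple is (completion_time_s, response_time_s) from a test run.
samples = [(0.1, 0.8), (0.3, 1.1), (0.9, 2.4), (1.2, 0.9), (1.8, 4.2)]

duration = max(t for t, _ in samples) - min(t for t, _ in samples)
throughput = len(samples) / duration
worst = max(rt for _, rt in samples)

# Reports throughput together with its Response Time cut-off, e.g.
# "3 requests/second @ 4.2 s maximum Response Time" for this toy data.
print(f"{throughput:.0f} requests/second @ {worst:.1f} s maximum Response Time")
```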

 


Utilization

Utilization refers to the usage level of different system resources, such as the server's CPU(s), memory, network bandwidth, and so forth. It is usually measured as a percentage of the maximum available level of the specific resource. Utilization versus user load for a Web server typically produces a curve, as shown in Figure 4.

Figure 4. Typical characteristics of utilization versus user load

As Figure 4 illustrates, utilization usually increases proportionally with increasing user load. However, it will top off and remain roughly constant as the load continues to build.

If the specific system resource tops off at 100-percent utilization, it's very likely that this resource has become the performance bottleneck of the site. Upgrading the resource with higher capacity would allow greater throughput and lower latency—thus better performance. If the measured resource does not top off close to 100-percent utilization, it is probably because one or more of the other system resources have already reached their maximum usage levels. They have become the performance bottleneck of the site.
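One way to watch these counters during a test run is to sample them periodically. The sketch below assumes the third-party psutil package (pip install psutil) is available:

```python
# Sample CPU and memory utilization once per second during a test window.
import psutil

SAMPLES = 10  # length of the monitoring window, in seconds (assumption)

for _ in range(SAMPLES):
    cpu = psutil.cpu_percent(interval=1)   # percent over the last second
    mem = psutil.virtual_memory().percent  # percent of physical memory used
    print(f"cpu={cpu:5.1f}%  memory={mem:5.1f}%")
```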

To locate the bottleneck, you may need to go through a long and painstaking process of running performance tests against each of the suspected resources, and then verifying if performance is improved by increasing the capacity of the resource. In many cases, performance of the site will start deteriorating to an unacceptable level well before the major system resources, such as CPU and memory, are maximized. For example, Figure 5 illustrates a case where response time rises sharply to 45 seconds when CPU utilization has reached only 60 percent.

Figure 5. An example of Response Time versus utilization

As Figure 5 demonstrates, monitoring the CPU or memory utilization alone may not always indicate the true capacity level of the server farm with acceptable performance.

Applications

While most traditional applications are designed to respond to a single user at any time, most Web applications are expected to support a wide range of concurrent users, from a dozen to a couple thousand or more. As a result, performance testing has become a critical component in the process of deploying a Web application. It has proven to be most useful in (but not limited to) the following areas:

·         Capacity planning

·         Bug fixing

Capacity Planning

How do you know if your server configuration is sufficient to support two million visitors per day with average response time of less than five seconds? If your company is projecting a business growth of 200 percent over the next two months, how do you know if you need to upgrade your server or add more servers to the Web farm? Can your server and application support a six-fold traffic increase during the Christmas shopping season?

Capacity planning is about being prepared. You need to set the hardware and software requirements of your application so that you'll have sufficient capacity to meet anticipated and unanticipated user load.

One approach in capacity planning is to load-test your application in a testing (staging) server farm. By simulating different load levels on the farm using a Web application performance testing tool such as WAS, you can collect and analyze the test results to better understand the performance characteristics of the application. Performance charts such as those shown in Figures 1, 3, and 4 can then be generated to show the expected Response Time, throughput, and utilization at these load levels.

In addition, you may also want to test the scalability of your application with different hardware configurations. For example, load testing your application on servers with one, two, and four CPUs respectively would help to determine how well the application scales with symmetric multiprocessor (SMP) servers. Likewise, you should load test your application with different numbers of clustered servers to confirm that your application scales well in a cluster environment.

Although performance testing is as important as functional testing, it's often overlooked. Since the performance requirements of a system are not as straightforward as its functional requirements, getting performance testing right is more difficult.

The effort of performance testing is addressed in two ways:

  • Load testing
  • Stress testing

 

Load testing

Load testing is a widely used industry term for the effort of performance testing. Here, load means the number of users or the volume of traffic on the system. Load testing is defined as testing to determine whether the system is capable of handling the anticipated number of users.

 

In load testing, virtual users are simulated to exhibit real user behavior as closely as possible. Even user think time, the pause a user takes before entering data, is emulated. Load testing is carried out to verify that the system performs well for the specified load limit.

 

For example, suppose an online-shopping application anticipates 1,000 concurrent user hits at its peak period, and the peak period is expected to last 12 hours. The system is then load tested with 1,000 virtual users for 12 hours. These tests are carried out in levels: first 1 user, then 50 users, 100 users, 250 users, 500 users, and so on until the anticipated limit is reached. The testing effort stops at exactly 1,000 concurrent users.

 

The objective of load testing is to check whether the system performs well for the specified load. The system may be capable of accommodating more than 1,000 concurrent users, but validating that is outside the scope of load testing; no attempt is made to determine how many more concurrent users the system can service. Table 1 illustrates this example.
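A minimal sketch of such a stepped load test with emulated think time is shown below. The target URL, think-time range, and step duration are assumptions, and a real tool such as WAS multiplexes virtual users far more efficiently than one thread per user:

```python
# Stepped load test: ramp virtual users up to the anticipated limit,
# emulating think time between requests.
import random
import threading
import time
import urllib.request

TARGET_URL = "http://localhost:8080/shop"   # hypothetical application
USER_LEVELS = [1, 50, 100, 250, 500, 1000]  # stop at the specified limit
STEP_DURATION = 60                          # seconds per level (assumption)

def virtual_user(stop_event):
    while not stop_event.is_set():
        try:
            urllib.request.urlopen(TARGET_URL, timeout=10).read()
        except Exception:
            pass                            # failures analyzed separately
        time.sleep(random.uniform(2, 8))    # emulated user think time

for level in USER_LEVELS:
    stop = threading.Event()
    users = [threading.Thread(target=virtual_user, args=(stop,))
             for _ in range(level)]
    for u in users:
        u.start()
    time.sleep(STEP_DURATION)               # hold the load at this level
    stop.set()
    for u in users:
        u.join()
    print(f"completed step at {level} concurrent users")
```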

 

 

Stress testing

Stress testing is another industry term for performance testing. Though load testing and stress testing are often used synonymously for performance-related efforts, their goals differ.

 

Unlike load testing, where testing is conducted for a specified number of users, stress testing is conducted with a number of concurrent users beyond the specified limit. The objective is to identify the maximum number of users the system can handle before breaking down or degrading drastically. Since the aim is to put more stress on the system, user think time is ignored and the system is exposed to excess load. The goals of load and stress testing are listed in Table 2. Refer to Table 3 for the inferences drawn from the performance testing efforts.

 

Let us take the same online-shopping example to illustrate the objective of stress testing. Stress testing determines the maximum number of concurrent users the online system can service, which may be beyond 1,000 users (the specified limit). However, the maximum load the system can handle may turn out to be the same as the anticipated limit. Table 1 illustrates this example.

 

Stress testing also determines how the system behaves as the user base increases: it checks whether the system degrades gracefully or crashes abruptly when the load goes beyond the specified limit.
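A minimal sketch of such a stress ramp follows: it keeps adding concurrent users beyond the specified limit, with no think time, until an assumed failure threshold (error rate or worst-case Response Time) is crossed:

```python
# Stress ramp: grow concurrent users past the specified limit until the
# system degrades. Thresholds, step size, and URL are all assumptions.
import concurrent.futures
import time
import urllib.request

TARGET_URL = "http://localhost:8080/shop"  # hypothetical application
MAX_RESPONSE_TIME = 5.0                    # seconds (assumed cut-off)
MAX_ERROR_RATE = 0.05                      # 5% failures (assumed cut-off)

def one_request(_):
    start = time.perf_counter()
    try:
        urllib.request.urlopen(TARGET_URL, timeout=30).read()
        return time.perf_counter() - start, False
    except Exception:
        return time.perf_counter() - start, True

users = 1000                               # start at the specified limit
while True:
    # Fire `users` requests concurrently with no think time.
    with concurrent.futures.ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(one_request, range(users)))
    error_rate = sum(1 for _, err in results if err) / users
    worst = max(rt for rt, _ in results)
    if error_rate > MAX_ERROR_RATE or worst > MAX_RESPONSE_TIME:
        print(f"system degraded at ~{users} concurrent users")
        break
    users += 250                           # step further beyond the limit
```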

                                    Table 1: Load and stress testing of the illustrative example

Load Testing: 1 user → 50 → 100 → 250 → 500 → … → 1,000 users (duration: 12 hours)

Stress Testing: 1 user → 50 → 100 → 250 → 500 → … → 1,000 users → beyond 1,000 → … → maximum users (duration: 12 hours)


Table 2: Goals of load and stress testing

Load testing:

  • Testing for the anticipated user base
  • Validates whether the system can handle load up to the specified limit

Stress testing:

  • Testing beyond the anticipated user base
  • Identifies the maximum load the system can handle
  • Checks whether the system degrades gracefully or crashes abruptly

 

Table 3: Inferences drawn from load and stress testing

Load Testing:

  • Is the system available?
  • If yes, is the system stable?

Stress Testing:

  • Is the system available?
  • If yes, is the system stable?
  • If yes, is it moving toward an unstable state?
  • When will the system break down or degrade drastically?
