Protecting Web Transactions and Services Through Transaction Management

As more and more business is conducted via the Web, high-performance, 100-percent-reliable application transactions become mission-critical, not only to success but to survival. Cisco conducts 75 percent of its annual business over the Web, while Amazon.com and E*Trade conduct all of theirs there.



For E*Trade in particular, the success of each transaction is of paramount importance, since it executes orders to buy or sell securities whose prices can be extremely volatile. A failure to execute a buy or sell order, or even a delay, could result in grave losses for E*Trade's customers. Indeed, losing a single customer's transaction could well mean losing the customer.

Just one minute of downtime can cost a site as much as $10,000, according to the Standish Group, a research consultancy. By that count, a two-hour blackout carries a price tag of $1.2 million.

Perhaps even more detrimental is the potential for losing future business by alienating customers. A study by the former Zona Research (now Sageza) reported that the average online customer will wait roughly eight seconds for a page to download completely before leaving the site. Where do these online customers go? Research from Jupiter Communications suggests that in a business-to-consumer (B2C) scenario, 46 percent of these impatient prospects go to competitors.

With all that's at stake, it is critical for IT and e-business managers to keep their fingers on the pulse of their application transactions, continuously monitoring performance and reliability to ensure that mission-critical applications run at peak performance around the clock. As the next section explains, this monitoring is best done in a top-down fashion, using a solution that delivers insight into the application infrastructure.

Application performance monitoring (APM) delivers that insight through a lens that cuts horizontally across the transaction value chain, examining each link and pinpointing bottlenecks and weak spots faster than the traditional "stovepipe" approach. The stovepipe approach relies on device-centric tools that monitor individual components in isolation, without regard to each link's contribution to the transaction the end user experiences, and it often results in misdiagnosis and finger-pointing.

The difference between the two approaches is discussed below.

The End-User Application Transaction Perspective

The end-user application transaction perspective is the best place to start when monitoring the performance of the online business. This is where the rubber meets the road. It isn't enough to measure uptime or other resource parameters of the system, because these parameters may still be within acceptable limits even when the end user is receiving unacceptable performance.

Actively generating "ghost" transactions that mimic typical end-user scenarios on a 24x7 basis provides accurate, around-the-clock information on how customers perceive application performance. Such monitoring allows quick reaction to impending problems, so they can be solved before they become disasters.
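
A minimal sketch of such a probe, written in Python, might look like the following. The storefront URL is a hypothetical placeholder, and the eight-second threshold simply echoes the Zona Research figure cited earlier; a production monitor would run from multiple locations and feed an alerting system rather than print to a console.

    import time
    import urllib.request

    # Hypothetical storefront URL; substitute the application's real entry point.
    PROBE_URL = "https://shop.example.com/"
    THRESHOLD_SECONDS = 8.0  # the eight-second patience figure cited above

    def run_ghost_transaction(url: str) -> float:
        """Time one synthetic request, end to end, as a browser user would see it."""
        start = time.monotonic()
        with urllib.request.urlopen(url, timeout=30) as response:
            response.read()  # pull the full page body, not just the headers
        return time.monotonic() - start

    if __name__ == "__main__":
        # Run on a schedule, 24x7; here, one probe every five minutes.
        while True:
            try:
                elapsed = run_ghost_transaction(PROBE_URL)
                if elapsed > THRESHOLD_SECONDS:
                    print(f"ALERT: page took {elapsed:.1f}s, over the {THRESHOLD_SECONDS:.0f}s limit")
                else:
                    print(f"OK: {elapsed:.2f}s")
            except Exception as exc:
                print(f"ALERT: transaction failed outright: {exc}")
            time.sleep(300)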

Analysis and actions based on such proactive monitoring are far more useful than relying on the response times of actual transactions made by real users. Administrators who depend on real-user data learn of problems only when real users are already experiencing them, which places them in a reactive mode, able only to respond to end-user complaints. Active 24x7 ghost-transaction monitoring detects middle-of-the-night problems when they occur, instead of waiting for users to discover them in the morning.

Inside the Online Application Infrastructure--Hierarchical Data Model

Figure 1 depicts a common online IT infrastructure geared toward serving Web-based applications to customers over the Internet.


Figure 1

On the left are customers accessing the application via a Web browser. They enter the online business through an optional DMZ (demilitarized zone). The DMZ typically provides security: for example, it ensures that only certain TCP ports are allowed through and that critical resources such as Web servers, application servers and enterprise databases are protected from hackers. Gateways to external services allow the internal system to interface with outside services such as Visa or MasterCard payment services.
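
To illustrate the port-policy point, the following Python sketch (with a hypothetical host name and an assumed allow-list of ports 80 and 443) checks which TCP ports are reachable from outside the DMZ and flags any that the policy should have blocked:

    import socket

    HOST = "shop.example.com"          # hypothetical public-facing host
    ALLOWED = {80, 443}                # assumed DMZ policy: HTTP and HTTPS only
    CANDIDATES = [22, 25, 80, 443, 1433, 3306]

    for port in CANDIDATES:
        try:
            # A successful connection means the DMZ let this port through.
            with socket.create_connection((HOST, port), timeout=3):
                status = "open"
        except OSError:
            status = "closed/filtered"
        violation = status == "open" and port not in ALLOWED
        print(f"port {port}: {status}" + ("  <-- policy violation" if violation else ""))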

A B2B application is a variation in which businesses, rather than consumers, access the resources of the online business through either the Internet or an extranet. The external services in this case may take on a more important role as links to suppliers, partners, affiliates and so on. Although our examples are drawn from the B2C case, the discussion that follows applies to both B2C and B2B scenarios.

Online application transactions originating at the customer's Web browser thread through the online application infrastructure. They traverse different paths and consume network, system and application resources as they do their work. With an end-to-end APM approach, it's possible to visualize the wave of transactions coming in from the Internet and gradually rippling through the different components of the online infrastructure. As each component is traversed and its resources are consumed, overall performance can degrade.

It helps to think through the transaction cause-and-effect data model hierarchically. A typical application transaction breaks down into 25 or more subtransactions, which in turn rely on the shared application infrastructure: applications, middleware, firewalls, load balancers, servers and network devices. Thinking in terms of this top-down transaction hierarchy is valuable in automating the troubleshooting process.

For example, the end user might be trying to buy a book from Amazon.com. This transaction consists of several dozen smaller subtransactions: displaying the initial Web page, searching for and finding the title, placing it in the shopping cart, proceeding to checkout, selecting the delivery option, calculating the tax, entering and validating the credit card number, plus underlying steps such as DNS lookups and database subtransactions. Each subtransaction adds its own delay to the overall picture, as does the use of resources in each infrastructure component traversed.
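
To make the hierarchy concrete, here is a minimal Python sketch of such a breakdown. The subtransaction timings and resource mappings are invented for illustration; the point is that a top-down model lets a monitoring tool sum the delays and single out the largest contributor automatically.

    from dataclasses import dataclass, field

    @dataclass
    class Subtransaction:
        name: str
        seconds: float                 # delay this step contributed
        resources: list = field(default_factory=list)  # infrastructure it relies on

    # Hypothetical breakdown of a "buy a book" transaction (timings invented).
    steps = [
        Subtransaction("display home page",    1.2, ["load balancer", "web server"]),
        Subtransaction("DNS lookup",           0.1, ["DNS server"]),
        Subtransaction("search for title",     2.8, ["app server", "database"]),
        Subtransaction("add to shopping cart", 0.9, ["app server"]),
        Subtransaction("calculate tax",        0.6, ["app server"]),
        Subtransaction("validate credit card", 3.5, ["payment gateway"]),
    ]

    total = sum(s.seconds for s in steps)
    slowest = max(steps, key=lambda s: s.seconds)
    print(f"end-to-end time: {total:.1f}s")
    print(f"largest contributor: {slowest.name} ({slowest.seconds:.1f}s) "
          f"via {', '.join(slowest.resources)}")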

Any solution to this issue should include the recording of complex multi-URL, multiform Web transactions, including handling secure sessions, dynamic cookies, pop-up windows and session IDs. These recorded transactions can be replayed periodically from one or many locations on the Internet, thereby constantly measuring the customer or partner Web experience.
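
As a rough illustration of replaying such a recorded transaction, the sketch below uses Python's third-party requests library, whose Session object carries cookies and session IDs across steps automatically. The URLs, form fields and credentials are placeholders rather than a real recorded script.

    import time
    import requests  # third-party HTTP library, assumed available

    def replay_recorded_transaction() -> float:
        """Replay a multi-URL, multiform script and return its total elapsed time."""
        start = time.monotonic()
        with requests.Session() as s:  # persists cookies and session IDs across steps
            s.get("https://shop.example.com/", timeout=30)
            s.post("https://shop.example.com/login", timeout=30,
                   data={"user": "ghost-user", "password": "ghost-pass"})
            s.get("https://shop.example.com/search",
                  params={"q": "networking"}, timeout=30)
            s.post("https://shop.example.com/cart/add",
                   data={"sku": "12345"}, timeout=30)
        return time.monotonic() - start

    if __name__ == "__main__":
        # Replay periodically from one or many vantage points on the Internet.
        print(f"recorded transaction replayed in {replay_recorded_transaction():.2f}s")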

The performance variables within the application infrastructure can number anywhere from 5,000 to 50,000, spanning firewalls, Web servers, application servers, databases, load balancers, IP services and network devices. If application transactions start to slow down, the solution can quickly pinpoint the likely cause. In addition, acting proactively on a few emerging trends can prevent application problems before they occur.

Let's say, for example, that a transaction that normally takes 25 seconds has, for several hours, been taking more than 200 seconds. This abnormal behavior would immediately be linked in real time with the performance and resource-utilization numbers for the different IT elements in the online infrastructure. If, for example, the database subtransaction was slow during the same period, that correlation is important. A performance management solution should go further and also correlate any abnormalities associated with the database itself, thereby further pinpointing what may be causing the problem.
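
A minimal sketch of this kind of correlation, using invented hourly samples and Python's statistics.correlation function (Python 3.10 or later), might rank candidate infrastructure metrics by how strongly they track the end-user slowdown:

    from statistics import correlation  # requires Python 3.10+

    # Invented hourly samples: end-user response time alongside candidate metrics.
    response_time = [25, 26, 24, 210, 230, 205, 25, 24]  # seconds
    metrics = {
        "database server CPU %": [30, 32, 31, 96, 98, 95, 33, 30],
        "web server CPU %":      [40, 42, 41, 45, 44, 43, 42, 40],
        "firewall throughput %": [55, 54, 56, 57, 55, 54, 56, 55],
    }

    # Rank each metric by how strongly it tracks the slowdown (Pearson's r).
    for name, series in sorted(metrics.items(),
                               key=lambda kv: correlation(response_time, kv[1]),
                               reverse=True):
        print(f"{name}: r = {correlation(response_time, series):+.2f}")

With these sample numbers, the database server's CPU utilization would come out on top, matching the diagnosis described next.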

In the example we're discussing, monitoring detects abnormal resource utilization on the database server during the same period in which end-user response time was slow; in other words, the database server was overutilized exactly when users were waiting. This evidence points to the database server as the bottleneck and the primary cause of the delayed response to the end user.

This same approach can be used to plan the future capacity of the application infrastructure. With a complex IT infrastructure, it's often very difficult to tell which resource is in danger of reaching capacity. An end-to-end approach monitors all network, system and application resources simultaneously. By tying this monitored information to business metrics, such as the number of page views or click-throughs, the operations staff can link the utilization of each resource to the business load. This, in turn, allows better budget management and assures that IT departments upgrade the correct resource, getting the maximum benefit for each incremental dollar spent.
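
As a minimal sketch of that projection (all numbers invented), a least-squares fit of resource utilization against a business metric can estimate the load at which a resource saturates:

    # Fit database CPU utilization against daily page views, then project
    # the load at which the resource hits 100 percent utilization.
    page_views  = [100_000, 150_000, 200_000, 250_000]  # daily page views
    db_cpu_util = [22.0, 33.0, 44.0, 55.0]              # percent, same days

    n = len(page_views)
    mean_x = sum(page_views) / n
    mean_y = sum(db_cpu_util) / n
    slope = (sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(page_views, db_cpu_util))
             / sum((x - mean_x) ** 2 for x in page_views))
    intercept = mean_y - slope * mean_x

    saturation_views = (100.0 - intercept) / slope
    print(f"database CPU saturates near {saturation_views:,.0f} page views/day")

With these sample numbers, the projection lands at roughly 455,000 page views per day, telling the operations staff which resource to upgrade first as traffic grows.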

Summary

This paper has described an application transaction-centric approach to monitoring and managing the application infrastructure used to provide high-performance, highly reliable online solutions. This approach starts by looking at the application transaction from an end-user perspective and delivers insight by selectively drilling down to uncover the cause of poor performance.

This transaction performance management approach complements the traditional stovepipe approach, in which individual parts of the value chain are monitored independently, mostly for availability. Not only does the transaction management approach help identify and fix problems quickly, it also enables intelligent, proactive capacity planning.

Copyright © 2002 ProactiveNet Software

About the Authors

Atul Garg, the CTO of ProactiveNet, has more than 18 years of experience in networking and network management at TCSI, Bay Networks and HP. While at Bay, he developed the long-term vision and architecture that led to the development of intelligent network trending and monitoring applications. As a project manager at HP, he initiated and led the development of key components of HP OpenView.

Ronald Schmidt was the co-founder and CTO of SynOptics Communications and Bay Networks. He serves on the boards of several Silicon Valley high-tech companies, including ProactiveNet.

About ProactiveNet

ProactiveNet provides real-time, end-to-end performance management solutions for online applications and their associated infrastructures. ProactiveNet offers a minimally invasive, holistic approach to monitoring, identifying, diagnosing and resolving performance problems for online applications. For more information, visit the company's Web site at www.proactivenet.com.