Systems and Business Service Management
Protecting Web Transactions and Services Through Transaction Management
Atul Garg, ProactiveNet
Ronald Schmidt, SynOptics Communications
As more and more business is conducted via the Web, high-performance, 100-percent-reliable
application transactions become mission-critical, not only to success but to
survival. Cisco conducts 75 percent of its annual business over the Web, while Amazon.com and E*Trade conduct all of theirs there.
For E*Trade in particular, the success of each transaction is of paramount
importance, since it executes orders to buy or sell securities, which can be
extremely volatile. A failure to execute a buy or sell order, or even a delay,
could result in grave losses to E*Trade's customers, and losing a customer's transaction could well mean losing the customer.
A downtime of just one minute could cost sites as much as $10,000, according
to the Standish Group, a research consultancy. By that count, a two-hour blackout
carries a price tag of $1.2 million.
Perhaps even more detrimental is the potential for losing future business by
alienating customers. A study done a while back by the former Zona Research (now
Sageza) reported that the average online customer will wait roughly eight seconds
for a page to completely download before leaving the site. Where are these online
customers going? Research from Jupiter Communications suggests that in a business-to-consumer
(B2C) scenario, 46 percent of these impatient prospects go to competitors.
With all that's at stake, it is critical for IT and e-business managers to keep their fingers on the pulse of their application transactions, continuously monitoring their performance and reliability to make sure they are up to par. Such monitoring ensures that the mission-critical application is functioning at peak performance around the clock.
As the next section explains, this monitoring is best done in a top-down fashion,
using a solution that delivers application infrastructure insight.
Application performance monitoring (APM) delivers insight through a lens that cuts horizontally across the transaction value chain, examining each link and pinpointing bottlenecks and weak links faster than the traditional "stovepipe" approach. The stovepipe approach relies on device-centric tools to monitor individual components of the chain without regard to the value each link contributes to the transaction delivered to the end user, which often results in misdiagnosis and finger-pointing. The difference between the two approaches is discussed below.
The End-User Application Transaction Perspective
The end-user application transaction perspective is the best place to start
when monitoring the performance of the online business. This is where the rubber
meets the road. It isn't enough to measure uptime or other resource parameters
of the system, because these parameters may still be within acceptable limits
even when the end user is receiving unacceptable performance.
Actively generating "ghost" transactions that mimic typical real end-user scenarios on a 24x7 basis yields accurate, around-the-clock information on how customers perceive application performance. Such monitoring allows quick reaction to impending problems, so they can be solved before they become disasters.
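Such a ghost-transaction probe can be sketched in a few lines. The target URL, probe interval and the eight-second threshold below are illustrative assumptions, not details of any particular product:

```python
import time
import urllib.request

# Hypothetical target and threshold; a real deployment would load these
# from a monitoring configuration rather than hard-code them.
TARGET_URL = "https://www.example.com/storefront"
THRESHOLD_SECONDS = 8.0  # the "eight-second rule" cited above

def probe(url, opener=urllib.request.urlopen):
    """Issue one ghost transaction and return its response time in seconds."""
    start = time.monotonic()
    with opener(url, timeout=30) as response:
        response.read()  # drain the body so timing covers the full download
    return time.monotonic() - start

def monitor(url, interval_seconds=300):
    """Probe around the clock, alerting when a probe breaches the threshold."""
    while True:
        try:
            elapsed = probe(url)
            if elapsed > THRESHOLD_SECONDS:
                print(f"ALERT: {url} took {elapsed:.1f}s "
                      f"(limit {THRESHOLD_SECONDS}s)")
        except OSError as exc:
            print(f"ALERT: {url} unreachable: {exc}")
        time.sleep(interval_seconds)
```

In practice such probes would run from several Internet vantage points at once, so that a single slow path is not mistaken for an application-wide problem.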
Analysis and actions based on such proactive monitoring are far more useful
than reliance on measuring the response times of actual transactions made by
real users. Administrators who rely on real user data become alerted to problems only when real users are actually experiencing them. This places them in a reactive mode, able only to respond to end-user complaints. A 24x7 active ghost-transaction monitoring regime detects middle-of-the-night problems when they occur, instead of waiting for users to discover them in the morning.
Inside the Online Application Infrastructure--Hierarchical Data Model
Figure 1 depicts a common online IT infrastructure geared toward serving Web-based
applications to customers over the Internet.
On the left are customers accessing the application via a Web browser. They
enter the online business through an optional DMZ (demilitarized zone). The
DMZ is typically used to provide security: e.g., to make sure only certain TCP
ports are allowed through and to ensure that critical resources in Web servers,
application servers and enterprise databases are protected from hackers. Gateways
to external services allow the internal system to interface with outside services
such as Visa or MasterCard payment services.
A B2B application is a variation where, instead of consumers, businesses access
the resources of the online business through either the Internet or an extranet.
The external services in this case might take on a more important role as a
link to suppliers, partners, affiliates, etc. Although we have used examples
in the B2C case, the discussion that follows applies to both B2C and B2B scenarios.
Online application transactions originating at the customer's Web browser thread
through the online application infrastructure. They traverse different paths
through this infrastructure and use network, system and application resources
as they do their work. With an end-to-end APM approach, it's possible to visualize
the wave of transactions coming in from the Internet and gradually rippling
through the different components of the online infrastructure. As each component
is traversed and resources are utilized, the overall performance may diminish.
It helps to think through the transaction cause-effect data model hierarchically.
A typical application transaction breaks down into 25 or more subtransactions,
and these in turn rely on the shared application infrastructure: applications,
middleware, firewalls, load balancers, servers and network devices. Thinking
in terms of this top-down transaction hierarchy is valuable in automating the diagnosis of performance problems.
For example, the end user might be trying to buy a book from Amazon.com. This
transaction consists of several dozen smaller subtransactions: displaying the
initial Web page, searching for the title, finding it, placing it in the shopping
cart, proceeding to the checkout, selecting the delivery option, calculating
the tax, entering the credit card number, validating it, DNS lookup, database
subtransaction, etc. Each subtransaction adds its own time delay to the overall
picture, as does the use of resources in each infrastructure component traversed.
Any solution to this issue should include the recording of complex multi-URL,
multiform Web transactions, including handling secure sessions, dynamic cookies,
pop-up windows and session IDs. These recorded transactions can be replayed
periodically from one or many locations on the Internet, thereby constantly
measuring the customer or partner Web experience.
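A minimal replay harness for such a recording might look like the sketch below. The step names and URLs are hypothetical; a real recording would also replay cookies, form posts and session IDs captured from a live session:

```python
import time
import urllib.request

# A recorded transaction replays as an ordered list of named subtransactions.
RECORDED_TRANSACTION = [
    ("home page",    "https://shop.example.com/"),
    ("title search", "https://shop.example.com/search?q=networking"),
    ("checkout",     "https://shop.example.com/checkout"),
]

def replay(steps, opener=urllib.request.urlopen):
    """Replay each subtransaction in order, returning per-step timings."""
    timings = {}
    for name, url in steps:
        start = time.monotonic()
        try:
            with opener(url, timeout=30) as response:
                response.read()
        except OSError:
            timings[name] = None  # step failed; flag it rather than abort
            continue
        timings[name] = time.monotonic() - start
    return timings

def slowest_step(timings):
    """Name the subtransaction adding the largest delay, if any succeeded."""
    completed = {k: v for k, v in timings.items() if v is not None}
    return max(completed, key=completed.get) if completed else None
```

Per-step timings, rather than a single end-to-end number, are what make it possible to attribute a slowdown to one subtransaction in the hierarchy described above.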
Performance variables within the application infrastructure can be numerous:
5,000 to 50,000, including firewalls, Web servers, application servers, databases,
load balancers, IP services and network devices. If the application transactions
start to slow down, the solution can quickly pinpoint the possible cause. In
addition, dealing proactively with a few trends may prevent application problems
in the future.
Let's say, for example, that a transaction normally takes 25 seconds, but for
several hours the time increased to more than 200 seconds. This abnormal behavior
would immediately be linked in real time with performance delivery and resource
utilization numbers for the different IT elements in the online infrastructure.
For example, if the database subtransaction was slow at the same time, that
correlation is important. A performance management solution should go further
and also correlate any abnormalities associated with the database, thereby further
pinpointing what may be causing the problem.
In the example we're discussing, monitoring detects abnormal resource utilization on the database server during the same period that the end-user response time was slow. This evidence points to the database server as the bottleneck and the primary cause of the delayed response to the end user.
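One simple way to make this kind of correlation concrete is to compare the response-time series against each candidate resource's utilization series over the same window. The sample figures below are invented for illustration:

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical samples from the slowdown window: end-user response times
# alongside utilization of two candidate resources.
response_times = [25, 40, 120, 210, 205, 60]   # seconds
db_cpu         = [30, 45, 85, 97, 95, 50]      # percent utilized
web_cpu        = [35, 33, 40, 36, 38, 34]      # percent utilized

# The resource whose utilization tracks the slowdown most closely is the
# leading bottleneck candidate.
suspects = {"database server": pearson(response_times, db_cpu),
            "web server":      pearson(response_times, web_cpu)}
bottleneck = max(suspects, key=suspects.get)
```

Here the database server's utilization rises and falls with the response time while the web server's stays flat, so the automated correlation singles out the database, matching the manual diagnosis above. Correlation alone does not prove causation, but it narrows thousands of variables down to a handful worth inspecting.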
This same approach can be used to plan for the future capacity of the application
infrastructure. With a complex IT infrastructure, it's often very difficult
to tell which resource is in danger of being used to capacity. An end-to-end
approach monitors all network, system and application resources simultaneously.
By tying this monitored information to business metrics, such as the number
of page views or click-throughs, the operations staff can link the utilization
of the different resources to the load. This technique further allows better management of the budget and assurance that IT departments are upgrading the correct resource and getting the maximum benefit for each incremental dollar spent.
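A rough sketch of this capacity-planning technique: fit a resource's utilization against a business-load metric, then project the load at which that resource reaches a headroom target. All numbers here are hypothetical:

```python
def linear_fit(load, utilization):
    """Least-squares fit: utilization ~= slope * load + intercept."""
    n = len(load)
    mx = sum(load) / n
    my = sum(utilization) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(load, utilization))
             / sum((x - mx) ** 2 for x in load))
    return slope, my - slope * mx

# Hypothetical history: page views per minute versus database CPU utilization.
page_views = [100, 200, 300, 400, 500]
db_cpu_pct = [12, 22, 33, 41, 52]

slope, intercept = linear_fit(page_views, db_cpu_pct)

# Projected load at which the database reaches 80 percent utilization,
# a common headroom target before an upgrade is planned.
capacity_limit = (80 - intercept) / slope
```

Repeating the fit for each monitored resource shows which one will saturate first as business volume grows, which is exactly the question an upgrade budget has to answer.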
This paper has described an application transaction-centric approach to monitoring
and managing the application infrastructure used to provide high-performance,
highly reliable online solutions. This approach starts by looking at the application
transaction from an end-user perspective and delivers insight by selectively
drilling down to uncover the cause of poor performance.
This transaction performance management approach is complementary to the traditional
stovepipe approach, where individual parts of the value chain are monitored
independently, mostly for availability. Not only does the transaction management
approach help identify and fix problems quickly, it also allows you to do advance
proactive capacity planning in an intelligent fashion.
Copyright © 2002 ProactiveNet Software
About the Authors
Atul Garg, the CTO of ProactiveNet, has more than 18 years of experience in
networking and network management at TCSI, Bay Networks and HP. While at Bay,
he developed the long-term vision and architecture that led to the development
of intelligent network trending and monitoring applications. As a project manager
at HP, he initiated and led the development of key components of HP OpenView.
Ronald Schmidt was the co-founder and CTO of SynOptics Communications and Bay
Networks. He serves on the boards of several Silicon Valley high-tech companies.
ProactiveNet provides real-time, end-to-end performance management solutions for online applications and their associated infrastructures. ProactiveNet offers a minimally invasive, holistic approach to monitoring, identifying, diagnosing and resolving performance problems for online applications. For more information, visit the company's Web site at www.proactivenet.com.