Business Activity Monitoring (BAM)
Proactive IT Strategy: Why You Need One
By Charley Rich, VP of Marketing & Product Mgmt., Nastel
According to a recent Aberdeen Group report, nearly half of the 158 organizations
surveyed do not have the ability to measure the business impact caused by insufficient
application performance, nor do they detect problems before users are impacted.
Organizations across all industries are losing millions of dollars suffering
business process disruption, risking customer desertion and utilizing expensive
talent to fix issues after users are impacted. The question these businesses
need to ask themselves is: "Why fight fires when you can take away the
kindling before they start?"
A large part of answering this question is making sure support detects problems
when they first show symptoms, before they affect users.
This gives these firms a chance to act and not just react. Companies typically
view their production environment as a series of silos: network, web server
farm, application servers, middleware messaging, databases and mainframe.
Each has separate management tools that result in a stovepipe view -- one that
can be misleading regarding the state of your applications and their impact
on your business. Managing via silos leads to an overburdened service desk and
millions of dollars in potential firefighting after the fact.
In fact, many of these firms practice what is often called the blame game.
When a serious problem occurs, IT management brings the following groups into
a single room: network group, web server farm, application servers, application
availability monitoring, database group management and development. Each group
might say the problem is not theirs but instead is another group's fault.
Typically, management orders the various groups to stay in the room until the
problem is resolved. Sessions can last as long as eight hours. This is a very
expensive way to resolve problems. Not surprisingly, the results are less than
stellar: service levels to customers decrease, a high number of tickets are
opened at the service desk, and when results are tabulated 65 percent of their
problems are identified by customers before support is aware of them.
To curb wasteful spending, more and more companies are adopting a business
transaction management strategy -- an effective, strategic approach to end-to-end
monitoring of applications and processes called Business Transaction Performance
(BTP). The strategy's goal is to improve efficiency and squeeze stealth waste
from your business through the identification and remediation of business transaction
The concept is this: provide an auto-discovered end-to-end view of your business
transactions, visualize this stitched-together topology in a UI, analyze for
latency and its causes, automate problem resolution, report on results, and
improve business process efficiency. BTP is the next step in the evolution of
the discipline, known as application management, application monitoring or transaction
Getting to the heart of the problem
One of the main keys to a proactive IT strategy is discovering problems as
close to real time as possible. Having a BTP solution in place lets businesses
correlate real-time operational, transaction management and business performance
data to gain comprehensive, 360-degree situational awareness
across their entire enterprises. From a centralized dashboard, users should
be able to view and resolve latency or operational issues quickly and easily.
The predictive capabilities of such strategies are in place to manage risk
and prevent business process disruption, which can cause heavy financial burdens
on your company. Do you want to fix problems when they show up or when they
blow up? Clearly, it is far less expensive to handle the former than
the latter. Not only does BTP proactively save money that would otherwise be
spent on defects, slowed production and recalls, but it also comes at a significantly
lower cost than the alternative.
This alternative is one of the industry's best-kept secrets: when an application
runs slower than necessary, "throw hardware at it until the problem goes
away." With unprecedented scrutiny on IT expenses, contemporary businesses
can no longer rely on over-provisioning of hardware to accommodate performance.
Today, improvement in service comes from multi-tier visibility,
which yields low latency, high performance and (most importantly) happier customers.
Prebuilt policies can take this a step further and provide automatic remediation
of problems without user interaction.
Applications speak with each other via transactions and messages, and much
of this communication takes place over middleware messaging. This is often because
the process orchestration of business process management systems spawns these
messages and transactions in order to carry out your business processes. Therefore,
it is essential to have deep visibility into the middleware layer in order
to "see" and "hear" all the conversations that are occurring.
Without this capability, your view of your business will have serious blind
spots which could lead to the wrong conclusions and decidedly non-optimal decisions.
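As a rough illustration of what "seeing" those conversations can involve, the sketch below groups events reported by different tiers under a shared correlation identifier and computes the latency of each hop. The event fields, tier names and function names are hypothetical and are not taken from any particular product.

```python
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    correlation_id: str  # hypothetical ID carried by every message of one transaction
    tier: str            # e.g. "web", "middleware", "database"
    timestamp: float     # seconds on a clock shared by all tiers (an assumption)


def stitch(events):
    """Group events by correlation ID and order each group by time."""
    paths = defaultdict(list)
    for event in events:
        paths[event.correlation_id].append(event)
    return {cid: sorted(evts, key=lambda e: e.timestamp) for cid, evts in paths.items()}


def hop_latencies(path):
    """Return (from_tier, to_tier, seconds) for each consecutive pair of events."""
    return [
        (a.tier, b.tier, b.timestamp - a.timestamp)
        for a, b in zip(path, path[1:])
    ]


# Events arrive out of order and interleaved, as they would from separate tools.
events = [
    Event("tx-1", "web", 0.0),
    Event("tx-1", "database", 2.5),
    Event("tx-1", "middleware", 0.5),
    Event("tx-2", "web", 1.0),
]
paths = stitch(events)
tx1_hops = hop_latencies(paths["tx-1"])
```

A transaction whose path stops at one tier, like `tx-2` here, is the kind of gap that a silo-by-silo view would not reveal.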
Business normal vs. business abnormal
By embracing a BTP approach, users gain enough perspective to accurately define
"business normal" and "business abnormal" states based on
hard-won knowledge of their business. This enables extraordinarily accurate
decisions and resultant automation based on user-defined business normal conditions.
When the engine compares the data that it has instantly discovered, correlated
and analyzed to these definitions, the system automatically recognizes symptoms
of potential problems before they have a chance to materialize and derail business
For example, banks in the U.S. must reconcile with the Federal Reserve by the
end of the day. At 4:00 p.m., your BTP system might detect that 20 transactions
are in flight with an average duration of 10 minutes and that you will not clear
these transactions in time. No tickets are open and no users are impacted.
However, you proactively recognize there will be a problem and can either act
yourself or let the system respond using predefined automation before the problem
actually has impact.
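The bank scenario can be sketched as a simple projection. The 4:00 p.m. time, the 20 in-flight transactions and the 10-minute average come from the example; the 5:00 p.m. cutoff, the number of parallel processing lanes, the calendar date and the function names are assumptions made only for illustration.

```python
import math
from datetime import datetime, timedelta


def projected_finish(now, in_flight, avg_minutes, parallel_lanes):
    """Estimate when the backlog clears, assuming equal-length batches."""
    batches = math.ceil(in_flight / parallel_lanes)
    return now + timedelta(minutes=batches * avg_minutes)


def will_miss_cutoff(now, cutoff, in_flight, avg_minutes, parallel_lanes):
    """True if the projected finish time falls after the cutoff."""
    return projected_finish(now, in_flight, avg_minutes, parallel_lanes) > cutoff


# From the example: 4:00 p.m., 20 transactions in flight, 10 minutes each.
now = datetime(2024, 1, 2, 16, 0)      # the date itself is arbitrary
# Assumed for illustration: a 5:00 p.m. cutoff and 2 transactions processed at a time.
cutoff = datetime(2024, 1, 2, 17, 0)
finish = projected_finish(now, in_flight=20, avg_minutes=10, parallel_lanes=2)
at_risk = will_miss_cutoff(now, cutoff, in_flight=20, avg_minutes=10, parallel_lanes=2)
```

Under these assumptions the backlog would clear at 5:40 p.m., so `at_risk` is true well before any ticket is opened. A real system would likely estimate duration from live data rather than a fixed average.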
From this 360-degree situational awareness "lens," it becomes possible
to view your applications veering towards a business-abnormal condition long
before they get there and users are adversely impacted.
Defining "business normal" is what gives the system its predictive
capabilities. But to handle the largest and most dynamic environments, the BTP
system has to be able to process millions of rules per second. In order to do
this, it is crucial to run the BTP system in conjunction with a complex event
processing engine. That way, the system will not back up, and businesses with
high volumes of messages and transactions can easily recognize the patterns
that appear in "business abnormal" conditions.
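A user-defined "business normal" condition can be thought of as a rule evaluated continuously over a stream of measurements. The sketch below is a deliberately small stand-in for that idea, not a complex event processing engine: it keeps a sliding window of transaction latencies and reports "business abnormal" when the rolling mean exceeds a limit. The class name, threshold and window size are hypothetical.

```python
from collections import deque


class BusinessNormalRule:
    """Flags 'business abnormal' when the rolling mean latency exceeds a limit."""

    def __init__(self, max_mean_seconds, window_size):
        self.max_mean_seconds = max_mean_seconds
        # deque with maxlen drops the oldest sample automatically
        self.window = deque(maxlen=window_size)

    def observe(self, latency_seconds):
        """Record one transaction latency; return True if the state is abnormal."""
        self.window.append(latency_seconds)
        mean = sum(self.window) / len(self.window)
        return mean > self.max_mean_seconds


# Hypothetical rule: mean latency over the last 3 transactions must not exceed 2 seconds.
rule = BusinessNormalRule(max_mean_seconds=2.0, window_size=3)
states = [rule.observe(x) for x in [1.0, 1.0, 1.0, 4.0, 4.0]]
```

Here the first slow transaction does not trip the rule, but the second one does, which is the difference between reacting to a single spike and recognizing a pattern.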
Conditions leading to the BTP development
The country's recent economic woes have hit the financial industry hard, and
cries for reform are not going unheard. One of the most crucial factors for
a competitive electronic financial business platform is low latency. The difference
in latency between winners and losers can be measured in mere nanoseconds.
Some IT personnel for financial organizations are skeptical as to whether current
systems can handle increasing loads. Time is money, and BTP methodology, coupled
with a complex event processing engine, has the potential to keep transactional
data in the financial industry moving in real time.
The other key desirable on Wall Street is the concept of best execution. Without
appropriate visibility, there is no way to know if your trades have achieved
An important result of the visibility that BTP provides is the knowledge of
the detailed path your transactions have taken without presenting blind spots.
This knowledge can produce surprising results. Many users discover that their
application performance problems are not initially due to resource constraints
but instead are caused by transactional misbehavior or logic errors resulting
from unexpected actions. These are sometimes referred to as "normal accidents."
As many highly transactional environments are extraordinarily complex, not
every permutation can be tested. As a result, in a production environment, a
transaction may act in a novel way, certainly not delivering best execution
and likely contributing to aggregate latency.
Other recessionary factors include increased governance, regulation and government
ownership for investment banks, brokerages and other organizations. BTP strategies
increase speed and accuracy for transactions.
Partially due to recessionary pressures to make use of what you already have
and to greater realization of their power and practicality, mainframes have
moved into the 21st century. Because mainframes are an essential part of business
transaction environments, businesses require the ability to track complex
transactions across all tiers of their IT infrastructures, including mainframes,
whether they are running z/OS or zLinux. Remember: no blind spots! A true,
comprehensive BTP strategy offers this support.
A lesson in efficient manufacturing operations
An industry-leading technology infrastructure manufacturing/distribution company
powered operations with a variety of application and transaction types, including
SOA, Web Services, integrations, Point of Sale, manufacturing, services and
support, and others. Altogether, the operations team supported 300+ applications
that leverage IBM MQ (which was put in place to support the manufacturing floor,
integrations between sales and manufacturing systems, equipment servicing, and
several other applications).
The company had developed an in-house solution for monitoring queues on MQ
(around 65,000 daily) that was outdated and expensive and often did not notify
the team when transactions broke, at a cost of between $750,000 and $1.5
million per Severity 1 problem. Especially for a large company like
this one, numbers like these are at stake when you don't have an effective monitoring
platform that provides 360-degree situational awareness.
To replace the old system, the manufacturer/distributor put a well-known commercial
BTP solution in place. The BTP solution provided a unique and comprehensive
situational awareness that combined transactional, operational and business
KPI data and correlated it using a complex event processing engine to proactively
notify and take action before users are impacted.
When all was said and done, based on the solution's ability to predict problems,
Severity 1 problems were reduced from 15 per year to one in the past year and
a half, saving the company $15.5 million annually. On top of that, the company
was able to save $350,000 annually in staff fees. Trouble tickets were reduced
by 70 percent, customer satisfaction improved, and Tier 1 was able to handle
many more customer problems than before.
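The reported saving can be sanity-checked against the per-incident cost range quoted for Severity 1 problems. The sketch below assumes that "one in the past year and a half" annualizes to two-thirds of an incident per year and that every avoided incident would have cost somewhere in the quoted range; neither assumption is stated in the case study.

```python
def avoided_cost(before_per_year, after_per_year, cost_per_incident):
    """Annual cost avoided by reducing the incident rate."""
    return (before_per_year - after_per_year) * cost_per_incident


before = 15          # Severity 1 problems per year, from the case study
after = 1 / 1.5      # assumption: one incident in 18 months, annualized

low = avoided_cost(before, after, 750_000)      # low end of the quoted cost range
high = avoided_cost(before, after, 1_500_000)   # high end of the quoted cost range

print(f"Estimated annual saving: ${low:,.0f} to ${high:,.0f}")
```

Under these assumptions the estimate runs from roughly $10.75 million to $21.5 million per year, so the reported annual figure sits inside the range implied by the article's own numbers.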
By identifying pain points before they were reached, the company was able to
save itself time, hassle and money. This could only be achieved by putting
cutting-edge technology in place as a method for proactive application performance