Proactive IT Strategy: Why You Need One

According to a recent Aberdeen Group report, nearly half of the 158 organizations surveyed cannot measure the business impact of insufficient application performance, nor can they detect problems before users are impacted.

Organizations across all industries are losing millions of dollars to business process disruption, risking customer desertion and using expensive talent to fix issues after users are impacted. The question these businesses need to ask themselves is: "Why fight fires when you can take away the kindling before they start?"

A large part of answering this question is making sure support detects problems when they first show symptoms, before they have any real impact. This gives these firms a chance to act and not just react. Companies typically view their production environment as a series of silos: network, web server farm, application servers, middleware messaging, databases and mainframe.

Each has separate management tools that result in a stovepipe view -- one that can be misleading regarding the state of your applications and their impact on your business. Managing via silos leads to an overburdened service desk and millions of dollars in potential firefighting after the fact.

In fact, many of these firms practice what is often called the blame game. When a serious problem occurs, IT management brings the following groups into a single room: network group, web server farm, application servers, application availability monitoring, database group management and development. Each group might say the problem is not theirs but another group's fault.

Typically, management orders the various groups to stay in the room until the problem is resolved. Sessions can last as long as eight hours. This is a very expensive way to resolve problems. Not surprisingly, the results are less than stellar: service levels to customers decrease, a high number of tickets are opened at the service desk, and when results are tabulated, 65 percent of problems turn out to have been identified by customers before support was aware of them.

To curb wasteful spending, more and more companies are adopting a business transaction management strategy called Business Transaction Performance (BTP), a strategic approach to end-to-end monitoring of applications and processes. The strategy's goal is to improve efficiency and squeeze stealth waste from your business by identifying and remediating business transaction latency.

The concept is this: provide an auto-discovered, end-to-end view of your business transactions, visualize this stitched-together topology in a UI, analyze for latency and its causes, automate problem resolution, report on results, and improve business process efficiency. BTP is the next step in the evolution of the discipline variously known as application management, application monitoring or transaction monitoring.

Getting to the heart of the problem

One of the main keys to a proactive IT strategy is discovering problems as close to real time as possible. A BTP solution lets businesses correlate real-time operational, transaction management and business performance data, giving them comprehensive, 360-degree situational awareness across the entire enterprise. From a centralized dashboard, users should be able to view and resolve latency or operational issues quickly and easily.

The predictive capabilities of such strategies exist to manage risk and prevent business process disruption, which can place a heavy financial burden on your company. Do you want to fix problems when they show up or when they blow up? Clearly, it is far less expensive to handle the former than the latter. Not only does BTP proactively save money that would otherwise be spent on defects, slowed production and recalls, but it also comes at a significantly lower cost than the alternative.

This alternative is one of the industry's best-kept secrets: when an application runs slower than necessary, "throw hardware at it until the problem goes away." With unprecedented scrutiny on IT expenses, contemporary businesses can no longer rely on over-provisioning hardware to accommodate performance. Today, general improvement in service comes from multi-tier visibility, resulting in low latency, high performance and (most importantly) happier customers. Prebuilt policies can take this a step further and provide automatic remediation of problems without user interaction.

Applications speak with each other via transactions and messages with much of this communication transpiring over middleware messaging. This is often due to business process management systems whose process orchestration spawns these messages and transactions in order to actuate your business processes. Therefore, it is essential to have a deep introspection of the middleware layer in order to "see" and "hear" all the conversations that are occurring. Without this capability, your view of your business will have serious blind spots which could lead to the wrong conclusions and decidedly non-optimal decisions.
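To make the idea of "seeing" every conversation concrete, here is a minimal Python sketch of stitching per-tier observations into end-to-end transaction paths. It assumes each tier emits a record carrying a shared transaction ID plus start and end timestamps. The Hop record, the tier names and the timings are all hypothetical, and a real BTP product would discover these hops automatically instead of reading them from a hand-built list.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Hop(NamedTuple):
    """One observation of a transaction inside a single tier (hypothetical shape)."""
    txn_id: str
    tier: str
    start_ms: int
    end_ms: int


def stitch(hops: List[Hop]) -> Dict[str, List[Hop]]:
    """Group hops by transaction ID and order each path by start time."""
    paths: Dict[str, List[Hop]] = defaultdict(list)
    for hop in hops:
        paths[hop.txn_id].append(hop)
    for path in paths.values():
        path.sort(key=lambda h: h.start_ms)
    return dict(paths)


def slowest_tier(path: List[Hop]) -> Tuple[str, int]:
    """Return the tier that spent the longest on this transaction, with its duration."""
    worst = max(path, key=lambda h: h.end_ms - h.start_ms)
    return worst.tier, worst.end_ms - worst.start_ms


def blind_spots(path: List[Hop], expected_tiers: List[str]) -> List[str]:
    """List expected tiers that never reported this transaction."""
    seen = {h.tier for h in path}
    return [tier for tier in expected_tiers if tier not in seen]


# Illustrative data only: the records arrive out of order, as they might in practice.
hops = [
    Hop("T1", "database", 130, 170),
    Hop("T1", "web", 0, 20),
    Hop("T1", "middleware", 20, 130),
    Hop("T2", "web", 5, 15),
]
paths = stitch(hops)
print(slowest_tier(paths["T1"]))
print(blind_spots(paths["T2"], ["web", "middleware", "database"]))
```

In this toy data set, transaction T1 spends most of its time in the middleware tier, and T2 is visible only at the web tier. The second case is the kind of blind spot the paragraph above warns about.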

Business normal vs. business abnormal

By embracing a BTP approach, users gain enough perspective to accurately define "business normal" and "business abnormal" states based on hard-won knowledge of their business. This enables extraordinarily accurate decisions and resultant automation based on user-defined business normal conditions. When the engine compares the data that it has instantly discovered, correlated and analyzed to these definitions, the system automatically recognizes symptoms of potential problems before they have a chance to materialize and derail business processes.

For example, banks in the U.S. must reconcile with the Federal Reserve by the end of the day. At 4:00 p.m., your BTP system might detect that 20 transactions are in flight with an average duration of 10 minutes and that you will not clear these transactions in time. No tickets are open and no users are impacted…yet. However, you proactively recognize there will be a problem and can either act yourself or let the system respond using predefined automation before the problem actually has impact.
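The projection in the bank example can be sketched in a few lines of Python. This is a deliberately crude model, not how any particular BTP product works: it assumes transactions clear in batches of a fixed size, each batch taking the average duration. The 6:00 p.m. cutoff, the concurrency parameter and the calendar date are assumptions made for illustration. Only the 4:00 p.m. time, the 20 in-flight transactions and the 10-minute average come from the example above.

```python
import math
from datetime import datetime, timedelta


def projected_clear_time(now: datetime, in_flight: int,
                         avg_minutes: float, concurrency: int = 1) -> datetime:
    """Estimate when all in-flight transactions will have cleared.

    Assumes work proceeds in batches of `concurrency` transactions,
    each batch taking `avg_minutes` (a simplification).
    """
    batches = math.ceil(in_flight / concurrency)
    return now + timedelta(minutes=batches * avg_minutes)


def will_miss_deadline(now: datetime, deadline: datetime, in_flight: int,
                       avg_minutes: float, concurrency: int = 1) -> bool:
    """True if the projected clear time falls after the deadline."""
    return projected_clear_time(now, in_flight, avg_minutes, concurrency) > deadline


now = datetime(2024, 1, 2, 16, 0)       # 4:00 p.m.; the date is arbitrary
deadline = datetime(2024, 1, 2, 18, 0)  # hypothetical 6:00 p.m. cutoff

if will_miss_deadline(now, deadline, in_flight=20, avg_minutes=10):
    print("At risk: projected clear time is",
          projected_clear_time(now, 20, 10).strftime("%H:%M"))
```

Under these assumptions, 20 transactions cleared one at a time need 200 minutes, so the projected finish is 7:20 p.m. and the check fires at 4:00 p.m., before any ticket exists. Raising the assumed concurrency to four brings the projection back inside the cutoff, which is exactly the kind of what-if an operator or an automated policy could act on.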

From this 360-degree situational awareness "lens," it becomes possible to view your applications veering towards a business-abnormal condition long before they get there and users are adversely impacted.

Defining "business normal" is what gives the system its predictive capabilities. But to handle the largest and most dynamic environments, the BTP system has to be able to process millions of rules per second. In order to do this, it is crucial to run the BTP system in conjunction with a complex event processing engine. That way, the system will not back up, and businesses with high volumes of messages and transactions can easily recognize the patterns that appear in "business abnormal" conditions.
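As a rough illustration of what a "business normal" definition might look like, the Python sketch below expresses each rule as a named predicate over a snapshot of metrics and reports which rules the snapshot violates. It is a toy, not a complex event processing engine: it evaluates one snapshot at a time and does nothing to address throughput. The rule names, metric names and thresholds are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metrics = Dict[str, float]


@dataclass(frozen=True)
class Rule:
    """A user-defined 'business normal' condition (hypothetical structure)."""
    name: str
    is_normal: Callable[[Metrics], bool]


def abnormal_rules(rules: List[Rule], metrics: Metrics) -> List[str]:
    """Return the names of the rules that the current metrics violate."""
    return [rule.name for rule in rules if not rule.is_normal(metrics)]


# Thresholds are illustrative; in practice they come from knowledge of the business.
RULES = [
    Rule("queue_depth_under_1000", lambda m: m["queue_depth"] < 1000),
    Rule("avg_latency_under_500ms", lambda m: m["avg_latency_ms"] < 500),
]

snapshot = {"queue_depth": 1800, "avg_latency_ms": 220}
violations = abnormal_rules(RULES, snapshot)
print(violations)
```

Keeping the definitions as data makes it straightforward to attach an automated response to each rule name. Scaling the same idea to very high rule and event volumes is where the dedicated event processing engine described above comes in.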

Conditions leading to the development of BTP

The country's recent economic woes have hit the financial industry hard, and cries for reform are not going unheard. One of the most crucial factors for a competitive electronic financial business platform is low latency. The difference in latency between winners and losers can be measured in mere nanoseconds.

Some IT personnel at financial organizations are skeptical as to whether current systems can handle increasing loads. Time is money, and BTP methodology, coupled with a complex event processing engine, has the potential to keep transactional data in the financial industry flowing in real time.

The other key desirable on Wall Street is the concept of best execution. Without appropriate visibility, there is no way to know if your trades have achieved best execution.

An important result of the visibility that BTP provides is knowledge of the detailed path your transactions have taken, with no blind spots. This knowledge can produce surprising results. Many users discover that their application performance problems are not initially due to resource constraints but are instead caused by transactional misbehavior or logic errors resulting from unexpected actions. These are sometimes referred to as "normal accidents."

As many highly transactional environments are extraordinarily complex, not every permutation can be tested. As a result, in a production environment, a transaction may act in a novel way, certainly not delivering best execution and likely contributing to aggregate latency.

Other recessionary factors include increased governance, regulation and government ownership for investment banks, brokerages and other organizations. BTP strategies increase speed and accuracy for transactions.

Partly due to recessionary pressures to make use of what you already have, and partly due to greater realization of their power and practicality, mainframes have moved into the 21st century. Because mainframes are an essential part of business transaction environments, businesses require the ability to track complex transactions across all tiers of their IT infrastructures, including mainframes, whether they are running z/OS or zLinux. Remember: no blind spots! A true, comprehensive BTP strategy provides this support.

A lesson in efficient manufacturing operations

An industry-leading technology infrastructure manufacturing/distribution company powered operations with a variety of application and transaction types, including SOA, Web Services, integrations, Point of Sale, manufacturing, services and support, and others. Altogether, the operations team supported 300+ applications that leverage IBM MQ (which was put in place to support the manufacturing floor, integrations between sales and manufacturing systems, equipment servicing, and several other applications).

The company had developed an in-house solution for monitoring queues on MQ (around 65,000 daily) that was outdated and expensive and often did not notify the team when transactions were broken, resulting in between $750,000 and $1.5 million in costs per Severity 1 problem. Especially for a large company like this one, numbers like these are at stake when you don't have an effective monitoring platform that provides 360-degree situational awareness.

To replace the old system, the manufacturer/distributor put a well-known commercial BTP solution in place. The BTP solution provided a unique and comprehensive situational awareness that combined transactional, operational and business KPI data and correlated it using a complex event processing engine to proactively notify the team and take action before users were impacted.

When all was said and done, based on the solution's ability to predict problems, Severity 1 problems were reduced from 15 per year to one in the past year and a half, saving the company $15,500,000 annually. On top of that, the company was able to save $350,000 annually in staff fees. Trouble tickets were reduced by 70 percent, customer satisfaction improved, and Tier 1 was able to handle many more customer problems than before.
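As a rough consistency check on these figures, the short Python calculation below combines the reported drop in Severity 1 incidents with the per-incident cost range quoted earlier ($750,000 to $1.5 million). It is back-of-the-envelope arithmetic on the article's own numbers, not the company's accounting. It assumes one incident in 18 months annualizes to two-thirds of an incident per year and that every avoided incident would have cost an amount within the quoted range.

```python
# Figures taken from the article.
INCIDENTS_BEFORE_PER_YEAR = 15
INCIDENTS_AFTER = 1
YEARS_OBSERVED_AFTER = 1.5
COST_PER_INCIDENT_LOW = 750_000
COST_PER_INCIDENT_HIGH = 1_500_000
REPORTED_ANNUAL_SAVINGS = 15_500_000

# Annualize the post-deployment incident rate (an assumption of this sketch).
incidents_after_per_year = INCIDENTS_AFTER / YEARS_OBSERVED_AFTER
avoided_per_year = INCIDENTS_BEFORE_PER_YEAR - incidents_after_per_year

savings_low = avoided_per_year * COST_PER_INCIDENT_LOW
savings_high = avoided_per_year * COST_PER_INCIDENT_HIGH

print(f"Avoided incidents per year: {avoided_per_year:.2f}")
print(f"Implied savings range: ${savings_low:,.0f} to ${savings_high:,.0f}")
print("Reported figure within range:",
      savings_low <= REPORTED_ANNUAL_SAVINGS <= savings_high)
```

Under those assumptions, roughly 14.3 incidents are avoided each year, which implies savings somewhere between about $10.75 million and $21.5 million. The reported $15.5 million sits inside that range.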

By identifying pain points before they were reached, the company was able to save itself time, hassle and money. This could only be achieved by putting cutting-edge technology in place as a method for proactive application performance management.