By Vijay Manwani, Chief Technology Officer, BladeLogic
Organizations have invested millions into the data center to make the transition from
mainframe and client/server architectures to the new distributed computing paradigm.
This seismic shift in how data center infrastructure is deployed and used has provided IT
organizations with substantial cost and efficiency benefits. However, to make a truly
meaningful impact on the business, organizations require a state where business policies
and service-level agreements drive dynamic and automatic optimization of the IT
infrastructure, creating a highly agile, business-driven IT environment. Unfortunately, the
hard reality is that while this is a noble goal, a number of significant obstacles must be
addressed before it can be attained. These obstacles can be categorized into three areas: exploding complexity and cost; inconsistent quality of service; and escalating security risks.
Exploding Complexity and Cost
Inherent to distributed computing architectures is an exponential increase in the number
of devices, applications and configuration elements, or "knobs," that need to be managed
compared to the traditional client/server platform.
More devices to manage - A single distributed application can span multiple types
of servers, such as web, application and database servers. Coupled with the ability
to develop new applications far more rapidly, this has dramatically increased the
total number of devices in the data center.
More complex configurations - Hundreds if not thousands of discrete
configuration elements, such as files, parameters within configuration files, vendor-
and OS-specific packages, and processes, must be tracked and managed on an ongoing basis.
Moreover, since an application relies on multiple types of servers, making a change
requires not only an understanding of the configuration elements and the
associated dependencies across and within each server tier but also the sequence in
which changes must be made to maintain application integrity.
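The sequencing problem described here is, at bottom, a dependency-ordering problem, which can be solved with a topological sort. A minimal Python sketch, where the change names and their dependencies are invented for illustration:

```python
# Hypothetical sketch: ordering configuration changes across server tiers.
# The change names and dependencies below are illustrative only.
from graphlib import TopologicalSorter

# Map each change to the changes that must land before it
# to preserve application integrity.
change_dependencies = {
    "update_web_config": {"update_app_config"},  # web tier follows app tier
    "update_app_config": {"update_db_schema"},   # app tier follows database
    "update_db_schema": set(),                   # database changes go first
}

def plan_changes(deps):
    """Return a safe execution order for a set of inter-dependent changes."""
    return list(TopologicalSorter(deps).static_order())

order = plan_changes(change_dependencies)
```

Real dependencies span tiers and hosts, but the principle is the same: make the dependency graph explicit, then derive the change sequence from it rather than from tribal knowledge.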
More specialization required - Distributed applications can extend over different
devices with different operating systems. For example, a supply-chain application
may have web servers running on Windows/Intel-based devices and database
servers running on larger UNIX machines. Managing these applications requires not
only application-specific skills but also specialists in both Windows and UNIX
operating systems. Consequently, an administrator today can easily cost an
organization over $100K per year in fully loaded salary and benefits. A major Wall
Street financial institution estimates that the operating cost of managing a server is
eight to nine times the capital cost of that device.
Strict regulatory requirements - Regulations such as Sarbanes-Oxley, SAS 70, and
HIPAA pressure organizations to implement strict management controls, placing a
significant burden on IT organizations to allocate resources to document and track
what changes are made in the data center, and when.
Inconsistent Quality of Service
There are two strongly held axioms in the data center. One, there is an inverse correlation
between the rate of change and the level of stability of data center infrastructure, and
two, there is an inverse correlation between the number of people who touch the
infrastructure and the availability and integrity of that infrastructure. Given the attributes
of the distributed computing model, a key challenge that IT managers face today is how
to turn those axioms upside down.
High rate of change causes instability - In the client/server world, application and
infrastructure changes happen once or twice a quarter. Conversely, changes to
distributed applications can occur as frequently as several times a week. Studies by
Gartner have shown that 80% of all downtime is due to misconfigurations or
operator error. Hence, distributed applications are by nature far more unstable.
Many groups involved - Due to the many technologies that support distributed
applications, multiple IT groups (e.g., Help Desk, UNIX, Windows, Application, Security
and Networking), each with its own expertise in a particular technology, have to be
involved when changes are made. This requires extensive planning and coordination
between groups. Because it is so difficult and time-consuming to execute all
required changes, invariably not all tasks can be completed within regular
maintenance windows.
Poor documentation of server configurations - Due to the dynamic nature of
distributed applications and the number of people involved in making changes, it is
extremely difficult to track the current state of configurations and to identify
deviations from the "gold" configuration standard, if one even exists. For example,
when new application updates get migrated from development to QA to
production, each environment has differences in configurations which often cause
unintended and very negative consequences.
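Catching this kind of drift mechanically amounts to diffing each environment's live settings against the gold standard, where one exists. A minimal Python sketch, with invented configuration keys and values:

```python
# Hypothetical sketch: spotting configuration drift between environments
# before promoting an application from QA to production.
def find_drift(reference, actual):
    """Return keys whose values differ, are missing, or are unexpected."""
    drift = {}
    for key in set(reference) | set(actual):
        ref_val = reference.get(key, "<absent>")
        act_val = actual.get(key, "<absent>")
        if ref_val != act_val:
            drift[key] = (ref_val, act_val)
    return drift

gold = {"max_connections": "500", "tls": "enabled", "log_level": "warn"}
prod = {"max_connections": "200", "tls": "enabled"}

drift = find_drift(gold, prod)
```

The hard part in practice is not the diff but maintaining an accurate gold standard and a trustworthy inventory of the actual state, which is precisely what manual change processes fail to provide.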
Hard to align IT to the business - IT groups are organized by specific
technology domains, not by the services they offer. As a result, these groups do
not have complete visibility into the needs of the business. For example, if an
organization needs to expand the capacity of its e-commerce web site due to
seasonality in its business, this requirement initiates a cumbersome process
which ultimately gets translated into complex, low-level operational tasks
executed by many different IT groups.
Escalating Security Risks
The number of security breaches increases every year, due mostly to flawed software with
security holes that are easily exploited and the difficulties associated with tracking changes and identifying compromised servers. As a result, IT organizations are under tremendous pressure to secure their data center infrastructure and keep up with the latest security recommendations.
Hard to balance security versus responsiveness - The dramatic increase in servers, applications and users requires IT organizations to monitor and manage many more security-related configuration elements. The conventional approach has been to
"lock-down" the data center, limiting the amount of change and the number of
personnel involved in managing change. As a result, organizations are faced with
trading off responsiveness for security. Unfortunately, the dichotomy between
traditional automation tools, which only focus on change, and security tools, which
only focus on compliance, has compounded this problem. For this reason, IT
organizations struggle to secure their data center infrastructure in an environment
where a high rate of change is the norm, not the exception.
Difficult to identify and fix compromised servers - Security holes in OS and
application software increase every year, requiring organizations to be very vigilant
about patching servers. Nonetheless, an estimated 90% of all security breaches
exploit existing vulnerabilities and could be prevented if servers were patched on
time. The problem is not that organizations pay only lip service to security but that
it is very hard to identify which security patches have been issued and which
specific servers are affected. Once identified, it is difficult to patch the appropriate
servers quickly without impacting application stability.
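Identifying which servers a given patch affects reduces to joining the list of issued patches against each server's installed package inventory. A simplified Python sketch; the package names and versions are invented, and the naive string comparison stands in for real version semantics:

```python
# Hypothetical sketch: matching published patches against each server's
# installed package versions to find which machines still need patching.
issued_patches = {"openssl": "1.0.2k", "httpd": "2.4.39"}  # patched versions

servers = {
    "web01": {"openssl": "1.0.2a", "httpd": "2.4.39"},
    "db01":  {"openssl": "1.0.2k"},
}

def unpatched(servers, patches):
    """Return {server: [packages still below the patched version]}."""
    report = {}
    for name, installed in servers.items():
        # NOTE: plain string comparison is a stand-in; real tools use
        # package-manager version ordering, not lexicographic order.
        stale = [pkg for pkg, ver in installed.items()
                 if pkg in patches and ver < patches[pkg]]
        if stale:
            report[name] = stale
    return report

report = unpatched(servers, issued_patches)
```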
Poor access controls for data center staff - Many different administrators from
different IT teams touch servers on a daily/weekly basis. Administrators typically get
full "root" access, which has significant ramifications for security. The lack of a
centralized access control mechanism significantly impedes the ability to secure the
data center. Such a control mechanism provides a rich audit trail of actions taken
and limits what administrators can do based on their skills and privileges.
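Such a mechanism boils down to two moves: check each action against the administrator's role, and record every attempt. A minimal Python sketch with invented roles and actions, not a description of any product's access model:

```python
# Hypothetical sketch: centralized role-based access control with an audit trail.
# Roles, users and actions are illustrative only.
import datetime

ROLES = {
    "web_admin": {"restart_web", "edit_web_config"},
    "dba":       {"restart_db", "run_migration"},
}

audit_trail = []

def authorize(user, role, action):
    """Allow the action only if the role permits it; log every attempt."""
    allowed = action in ROLES.get(role, set())
    audit_trail.append({
        "when": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "user": user, "role": role, "action": action, "allowed": allowed,
    })
    return allowed

ok = authorize("alice", "web_admin", "restart_web")
denied = authorize("bob", "web_admin", "run_migration")
```

The point of centralizing the check is that denials are recorded too: the audit trail shows not just what changed, but what someone tried and failed to change.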
The Solution: Holistic Data Center Automation
Given these shortcomings, forward-looking IT organizations now recognize that they
must take a holistic approach to managing infrastructure by employing a comprehensive
data center automation solution that provides the foundation for a truly responsive IT
environment where business policies drive the allocation and optimization of IT resources.
Data center automation solutions not only address the automation requirements of today's complex server and application infrastructure, but also better align IT operations to the needs of the business. These types of solutions provide one platform for provisioning, change, administration and compliance, and offer a wide range of functionality, including:
Modeling and Management of Configuration Items - The ability to treat all types
of configuration items (files, vendor packages, specific parameters in configuration
files, Windows registry settings, and .NET and J2EE components) across all major
operating systems as objects that can be manipulated and managed in one
consistent, secure and seamless manner;
Transaction-Safe Provisioning and Change - The ability to simulate complex
distributed changes up front to prevent problems, and to roll changes back quickly
to recover from unforeseen problems once they are made;
Continuous Compliance Management - The ability to define reference
configurations (e.g., gold standards, security and regulatory policies) and to scan for
and remediate deviations from these reference configurations, ensuring a high level
of infrastructure and application consistency on an ongoing basis;
Service-Oriented Computing - The ability to simplify the complexity of managing a
large number of configuration items by modeling services, so that provisioning of
new servers and applications or scanning and repair of existing non-compliant
servers and applications can occur based on these service models. This enables
business requests to be easily translated into operational tasks.
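The transaction-safe change capability listed above, validating each change and rolling back already-applied ones on failure, can be illustrated in a few lines. This is a hypothetical Python sketch, not a description of any product's implementation:

```python
# Hypothetical sketch: transaction-safe application of a batch of changes.
# A None value stands in for a change that fails validation.
def apply_transaction(state, changes):
    """Apply (key, value) changes; roll everything back if any one fails."""
    undo = []
    try:
        for key, value in changes:
            if value is None:                 # simulated validation failure
                raise ValueError(f"invalid value for {key}")
            undo.append((key, state.get(key)))
            state[key] = value
        return True
    except ValueError:
        for key, old in reversed(undo):       # restore prior values in reverse order
            if old is None:
                state.pop(key, None)
            else:
                state[key] = old
        return False

state = {"port": "80"}
# Second change fails, so the first is rolled back and state is unchanged.
ok = apply_transaction(state, [("port", "8080"), ("tls", None)])
```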
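The service-oriented capability, translating a business request such as "add web capacity" into low-level operational tasks, might look like this in outline. The service names and task lists are invented for illustration:

```python
# Hypothetical sketch: expanding a business-level capacity request
# into per-server operational task lists via a service model.
SERVICE_MODELS = {
    "web_tier": [
        "provision_server",
        "install_httpd",
        "apply_gold_config",
        "register_with_load_balancer",
    ],
}

def expand_request(service, count):
    """Translate 'add <count> servers to <service>' into concrete tasks."""
    tasks = SERVICE_MODELS[service]
    return [(f"{service}-{n}", tasks) for n in range(1, count + 1)]

plan = expand_request("web_tier", 2)
```

Because the request is expressed against the service model rather than against individual machines, the same business-level request scales from two servers to two hundred without changing the operational definition.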
Fundamentally, a highly agile, business-driven IT environment starts with the
implementation of a data center automation solution. By deploying one, IT organizations can:
use one platform to provision, configure and manage all types of servers and
applications in the data center;
foster strong collaboration between IT teams, with different technology and
functional expertise, to accomplish key data center tasks;
increase staff productivity and improve cost structure while supporting a large and
fast growing data center environment;
reduce complexity and better align with the business through the abstraction of IT
configuration components into IT services, where management decisions are made
at the IT service level;
and create an environment that supports an extremely high rate of change while
ensuring consistency and application stability, and where data center resources are
easily repurposed and reallocated based on changing business needs.
About the Author
Vijay is a co-founder and a member of the Board of Directors of BladeLogic, and is responsible for the company’s overall product strategy and direction. Previously, Vijay led all phases of the company’s development efforts, which resulted in BladeLogic’s current product leadership position. Before BladeLogic, Vijay was an entrepreneur-in-residence at Battery Ventures, where he spent the bulk of his time working to launch BladeLogic. Earlier, Vijay was the CTO at Breakaway Solutions, where he was responsible for all technology initiatives in the ASP and eBusiness lines of business. Prior to Breakaway, Vijay was the CTO and co-founder of Eggrock Partners, an ASP/eBusiness consulting firm that was acquired by Breakaway Solutions.
Vijay is recognized as one of the most thoughtful and experienced software technologists in the industry, with a long track record of success. In his 17-year career, Vijay has held technology positions of increasing responsibility at Unisys, TCI and Cambridge Technology Partners. He holds a B.S. degree in Mechanical Engineering from Pune University, India. In addition, he serves on several technical advisory boards, holds two patents and has co-authored a book on eCommerce with two Harvard Business School professors.
About BladeLogic
As the data center has evolved from the client/server to the distributed computing paradigm, solutions for provisioning, change and configuration management have not kept pace. Management processes are mired in a craftsmanship era, where highly skilled personnel use a collection of scripts and point tools to effect change manually. The result: data center infrastructure is essentially hard-wired because it is so complicated and costly to change. Consequently, server utilization rates are exceedingly low, IT resources cannot be quickly repurposed to respond to changing business requirements, and management and support costs account for up to 80% of data center budgets.
BladeLogic was founded by industry veterans who understand these problems, having managed complex, globally distributed computing environments, including eleven data centers on four continents. Through that experience they identified the pressing need for a new data center automation platform to help IT organizations more efficiently provision, configure and manage today’s highly sophisticated data center environment.
By providing the industry’s most comprehensive data center automation solution, more Global 2000 customers use BladeLogic to dramatically cut data center operating costs, reduce security risks and increase IT service quality than any other solution in the market. BladeLogic’s data center automation solutions enable a state where business policies and service-level agreements drive the dynamic and automatic optimization of the IT infrastructure, creating a highly agile, business-driven IT environment.