Enabling Enterprises for Cloud-Scale Deployments

Within the last five years, enterprises have adopted LAMP (Linux, Apache, MySQL, PHP) stacks to deploy both internal and external Web applications. This approach has dramatically reduced the costs of application management and deployment, not only for these organizations but also for their customers, employees and partners.

However, an underlying issue with this approach is the lack of a storage infrastructure adequate to support exploding data requirements. Traditional file servers and storage systems are simply not built to handle the high data demands and scale required by enterprise Web applications. Consider the retail operations of Walmart.com handling more than half a million unique visitors daily, the customer-facing tracking options for FedEx with more than a quarter million daily visitors, or the collaboration and intranet requirements of a company as large as General Electric. These types of applications drive unprecedented needs for fast, economical and scalable file serving.

Enterprise Web applications require a new approach to file system infrastructure, one that enables simple scalability and reliable application performance. This article outlines the enterprise challenges inherent in legacy storage and file systems and describes the key benefits of distributed file systems that are well suited to supporting cloud-scale applications.

Challenges and Requirements: Enterprise Scale-Out Deployments

Enterprises continually struggle to find file serving and storage systems whose cost, performance and feature sets can keep up with unpredictable and rapidly expanding workloads. Because today's applications differ significantly from those of just a few years ago, enterprises face a new set of challenges and requirements.

In 2009, Enterprise Strategy Group conducted a study on scale-out network-attached storage (NAS) solutions. The most frequently mentioned considerations were:

- Faster storage provisioning times
- Improved scalability
- Easier management
- Improved data availability
(Source: http://www.enterprisestrategygroup.com/2009/01/esg-report-scale-out-nas-driving-value-for-rapidly-growing-file-based-storage-environments/)

Rapid provisioning and scale

Whereas applications that served only a small set of users had moderate needs for rapid provisioning and scale, today's Web applications reach unprecedented levels of data growth, both in overall capacity and in the number of files or objects managed. This causes significant pain for IT administrators, who must constantly provision new capacity and performance. If provisioning requires deploying new systems and then manually load balancing across them, administrators will find it impossible to keep up.

In addition, many file serving and storage systems have caps on the number of nodes, overall capacity, or maximum file delivery performance they can achieve. These inherent limitations can quickly cause troublesome and costly roadblocks to effective scaling.

Ease of management

As with provisioning and scale, IT professionals must be able to manage a growing installation without taking on additional management tasks. Conventional systems often operate as "islands of storage," each requiring its own mount point and its own management. The explosion of managed entities easily outpaces a single administrator's ability to keep up, which in turn limits application growth.

Availability

Scale-out applications typically support a large number of end users who require continuous uptime. Solutions that require offline software upgrades, or that cannot undergo online hardware refreshes, negatively impact the application.

Basics of a Cloud-Scale Solution

Cloud-scale file serving and storage solutions support Web companies, service providers and enterprises architecting scale-out Web-based applications that may in turn be offered as services to their customers.

These solutions must scale in a manner that decreases management cost per unit to meet the economic challenges of serving large user bases with potentially explosive amounts of data growth.

True scale-out -- commodity hardware + software, no bottlenecks

As evidenced by Internet giants such as Amazon, Facebook, Google, and Yahoo!, the models that support scale-out applications involve large numbers of commodity server nodes throughout the compute and storage layers. These solutions are architected to handle millions of simultaneous requests across billions of files, and therefore distribute metadata operations across many inexpensive hardware nodes to achieve high throughput at low cost. Distributing metadata this way eliminates the centralized choke points through which every request to locate and retrieve data would otherwise pass, and removes the crippling performance bottlenecks typical of conventional systems.
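
One way to picture metadata distribution without a central lookup server is to hash each file path and let the hash decide which node owns that file's metadata. The Python sketch below is only an illustration under that assumption; the article does not say which placement scheme any particular product uses, and the node names, paths and the metadata_node_for helper are hypothetical.

```python
import hashlib


def metadata_node_for(path: str, nodes: list[str]) -> str:
    """Pick the node that owns metadata for `path` by hashing the path.

    Any client can compute this locally, so no central directory is consulted.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical cluster and workload, for illustration only.
nodes = ["node-a", "node-b", "node-c", "node-d"]
paths = [f"/photos/user{i}/img{j}.jpg" for i in range(100) for j in range(10)]

# Count how many paths land on each node to see how the load spreads.
counts = {node: 0 for node in nodes}
for path in paths:
    counts[metadata_node_for(path, nodes)] += 1

for node, count in sorted(counts.items()):
    print(node, count)
```

Because every client computes the same answer for the same path, lookups spread across all nodes rather than funneling through one server. A plain modulo scheme like this one remaps most paths when the node count changes, so it is best read as a picture of the idea rather than a production design.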

Single management point, single namespace

Scaling performance and capacity seamlessly requires unified management where any node in the system can act as a management node of the entire cluster. This simplifies operations and keeps management overhead constant while accommodating increasing application loads and datasets. By making the entire system available within a single namespace, administrators can manage a single mount point for applications instead of having to manually load balance and reallocate among a set of independent devices.

Withstand drive and node failures automatically

Cloud-scale solutions must withstand the inevitable drive and node failures without disrupting the application. Commodity hardware is inexpensive, but not immune to physical failures. Self-healing systems automatically detect hardware failures and repair the system without loss of data or of access to the affected data. Data replication often plays a key role in enabling this capability, ensuring that data is always available and eliminating single points of failure.
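
The self-healing behavior described above can be sketched as a toy model: each file is kept on a fixed number of nodes, and when a node fails, the system copies every affected file to another live node to restore the replica count. This is a minimal Python illustration under assumed names (Cluster, store, fail_node and the node and file names are all hypothetical), not a description of how any specific product performs repair.

```python
class Cluster:
    """Toy model of replica placement and automatic repair."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self.live = set(nodes)
        self.placement = {}  # file name -> set of nodes holding a copy

    def _load(self, node):
        # Number of file copies currently placed on this node.
        return sum(1 for holders in self.placement.values() if node in holders)

    def _least_loaded(self, exclude=frozenset()):
        # Live nodes ordered by load, with the name as a deterministic tiebreak.
        return sorted(self.live - set(exclude), key=lambda n: (self._load(n), n))

    def store(self, name):
        self.placement[name] = set(self._least_loaded()[: self.replicas])

    def fail_node(self, node):
        # Mark the node dead, then re-replicate every file it held.
        self.live.discard(node)
        for name, holders in self.placement.items():
            if node in holders:
                holders.discard(node)
                self._repair(name)

    def _repair(self, name):
        holders = self.placement[name]
        for candidate in self._least_loaded(exclude=holders):
            if len(holders) >= self.replicas:
                break
            holders.add(candidate)

    def is_available(self, name):
        return len(self.placement[name] & self.live) > 0


cluster = Cluster(["node-a", "node-b", "node-c", "node-d"], replicas=2)
for i in range(10):
    cluster.store(f"file-{i}")

cluster.fail_node("node-b")
print(all(cluster.is_available(name) for name in cluster.placement))
```

With two replicas, a single node failure never removes the last copy of a file, so the application keeps reading while repair restores the second copy elsewhere. A real system would also have to detect the failure and move the data itself, which this model leaves out.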

On-demand capacity

Scale-out cloud applications service users 24 hours each day, making downtime for upgrades an unaffordable luxury. Successful cloud applications deploy flexible storage systems that can be upgraded and expanded seamlessly, while applications continue to access data. Administrators adjust the storage system as the business requires without worrying about scheduling an outage window simply to add capacity or upgrade hardware and software components.
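
Adding capacity without an outage is easier when a new node takes over only a small share of existing data. Consistent hashing is one well-known technique with that property; the article does not name it, so the Python sketch below is an assumption chosen for illustration, and HashRing, its methods, and the node and file names are hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted list of (hash, node) positions on the ring
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node occupies several ring positions to even out load.
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def node_for(self, key):
        # A key belongs to the first ring position after its hash, wrapping around.
        idx = bisect.bisect_right(self._points, (_hash(key), "")) % len(self._points)
        return self._points[idx][1]


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
keys = [f"/data/file{i}" for i in range(2000)]

before = {key: ring.node_for(key) for key in keys}
ring.add_node("node-e")  # expand the cluster while it stays in service
after = {key: ring.node_for(key) for key in keys}

moved = [key for key in keys if before[key] != after[key]]
print(f"{len(moved)} of {len(keys)} files moved to the new node")
```

Only the files that now belong to the new node change owners; everything else stays where it was, so applications can keep reading most data undisturbed while the expansion proceeds in the background.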

The overall strategy involves an investment in software resiliency that ultimately allows applications to transcend hardware implementations. One can envision an application that remains online through numerous hardware refreshes during its lifetime.

Conclusion

Cloud-scale solutions have the ability to service large numbers of end users with explosive data requirements, keep the cost of service delivery to a minimum and provide availability to keep users happy and administrators nimble. Web companies, service providers and enterprises can keep these guidelines in mind when evaluating current and future solutions.

About the Authors

Gary Orenstein is vice president, technical solutions at MaxiScale. Orenstein, who has extensive data center infrastructure and network storage experience, has served in leadership marketing roles at numerous networking and storage companies. In addition to being a regular contributor to GigaOM, Orenstein hosts the podcast The Cloud Computing Show. Orenstein is the author of IP Storage Networking: Straight to the Core. He holds an MBA from the Wharton School at the University of Pennsylvania, and a BA from Dartmouth College.

Mark Balch has served in leadership roles for market-winning storage, networking and data center automation products at early stage ventures and established technology firms. Prior to MaxiScale, he led the flagship Server Automation product line at Opsware, which was acquired by Hewlett Packard. He has held product management and development positions at Topspin Communications (acquired by Cisco), Nishan Systems (acquired by McData/Brocade) and C-Cube Microsystems (acquired by Harmonic). Mr. Balch earned a bachelor's degree in electrical engineering from The Cooper Union.
