Optimizing A Virtual Data Center

It is just not enough these days to have storage virtualization for the sake of storage virtualization. Pooling heterogeneous resources and migrating data from point A to point B while the application is up and running is great, but businesses really need complete solutions -- solutions that provision storage more efficiently as well as virtualize, protect, migrate, dedupe, encrypt, replicate, recover and archive any data source in real time via policy.

There is a definite need for a simple, yet comprehensive, solution that enables a more efficient IT infrastructure and leverages existing assets, policies and procedures, all while reducing the overall costs. This requires building an optimized suite of integrated data services on a common platform.

This holistic approach is called the "optimized data services" (ODS) utility. An ODS utility enables physical abstraction and flexible data movement by virtualizing existing datasets, storage and servers between compute and storage elements. Once resources are virtualized, policies can be created that enforce specific service levels for explicit or pooled datasets. Physical constraints, such as whether volumes reside in the same array, on SAN or non-SAN storage, or on network-attached devices or hosts, should not hinder the grouping of data elements for consistency or recovery purposes.
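
The policy and grouping concepts above can be made concrete with a small sketch. The classes below are purely illustrative Python, not part of any shipping ODS product; they show how datasets living on different arrays or hosts might be grouped into a consistency group and bound to a single service-level policy.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    name: str
    location: str      # physical home, e.g. "array-A/vol3" or "host-12/local-disk"
    size_gb: int


@dataclass
class ServiceLevelPolicy:
    name: str
    rpo_seconds: int   # acceptable data-loss window
    rto_seconds: int   # acceptable recovery time
    replicate_offsite: bool


@dataclass
class ConsistencyGroup:
    """Datasets that must be protected and recovered together, even if they
    sit on different arrays, SAN or non-SAN devices, or hosts."""
    name: str
    policy: ServiceLevelPolicy
    members: List[Dataset] = field(default_factory=list)

    def add(self, dataset: Dataset) -> None:
        self.members.append(dataset)


# Example: a database and its log volume live in different places,
# but are treated as one unit under a single "gold" policy.
gold = ServiceLevelPolicy("gold", rpo_seconds=0, rto_seconds=300, replicate_offsite=True)
erp = ConsistencyGroup("erp-prod", policy=gold)
erp.add(Dataset("erp-data", "array-A/vol3", size_gb=500))
erp.add(Dataset("erp-logs", "host-12/local-disk", size_gb=50))
```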

The thin provisioning capabilities that the ODS solution enables improve storage utilization and add an element of automation to the design, reducing the overall administrative burden. Additionally, capacity expansion for running applications can occur in real time and on demand from the compute resources in question. Recovery time objectives (RTO) and recovery point objectives (RPO) become achievable at minimal cost, driven not by budget constraints but by the service level agreement (SLA) policy applied to the application. This capability can only be achieved if the solution also automatically applies efficiency to data storage and movement. Such efficiency is accomplished through de-duplication and sub-block-level monitoring of all stored data to ensure only unique data is stored and replicated.
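
As a rough illustration of the thin provisioning idea, the sketch below (hypothetical Python, with made-up block sizes and names) presents a volume that advertises its full logical size to the application while consuming physical space only for blocks that are actually written.

```python
class ThinVolume:
    """Advertises a large logical size; allocates physical extents on first write."""

    def __init__(self, logical_size_blocks: int):
        self.logical_size_blocks = logical_size_blocks
        self.allocated = {}                       # block number -> data, allocated on demand

    def write(self, block: int, data: bytes) -> None:
        if not 0 <= block < self.logical_size_blocks:
            raise IndexError("write beyond advertised logical size")
        self.allocated[block] = data              # physical capacity is consumed only here

    def read(self, block: int) -> bytes:
        # Blocks never written read back as zeros; no physical space was used for them.
        return self.allocated.get(block, b"\x00" * 512)


vol = ThinVolume(logical_size_blocks=2_000_000_000)   # roughly 1 TB advertised at 512-byte blocks
vol.write(42, b"application data".ljust(512, b"\x00"))
print(f"{len(vol.allocated)} of {vol.logical_size_blocks} blocks physically allocated")
```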

Focusing on application uptime and rapid recovery is vital in such a design. The solution must also integrate at the application level and provide a simple means to monitor and recover any application, on any platform, from hardware and software failure as well as malicious intent. Continuous protection must be used for critical applications to achieve a zero RPO and guard against corruption and deletion. Using such a solution for critical applications eliminates the need for multiple management elements for protection and replication, such as log shipping and array-based replication. It also removes the requirement for backup applications, clients, servers, media or processes, since protection is continuous and policy based, which can save huge sums of money and time and allows companies to focus on the business rather than on IT technologies. The engine must also be intelligent enough to work seamlessly with, or even enhance, other protection and virtualization solutions such as VMware and VMware Site Recovery Manager, Microsoft Failover Clustering and Data Protection Manager, Oracle Real Application Clusters, SAP BRtools, PolyServe, Platform Computing, Virtual Iron, Citrix, Sybase Replication Server and others.
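
The continuous-protection principle can be sketched as a write journal that allows any earlier recovery point to be reconstituted. The Python below is a toy model, not FalconStor CDP's actual internals; a simple sequence number stands in for the timestamps a real journal would keep.

```python
class CdpJournal:
    """Toy continuous-protection journal: every write is recorded so the
    volume can be rolled back to any earlier recovery point."""

    def __init__(self) -> None:
        self._entries = []        # (sequence, block, data); real journals also keep timestamps
        self._sequence = 0

    def record(self, block: int, data: bytes) -> int:
        self._sequence += 1
        self._entries.append((self._sequence, block, data))
        return self._sequence     # handle the caller can use as a recovery point

    def image_as_of(self, recovery_point: int) -> dict:
        """Replay journaled writes up to the chosen recovery point to produce
        a consistent image for mounting or recovery."""
        image = {}
        for sequence, block, data in self._entries:
            if sequence > recovery_point:
                break
            image[block] = data
        return image


journal = CdpJournal()
checkpoint = journal.record(0, b"good data")    # last known-good write
journal.record(0, b"corrupted!")                # accidental or malicious overwrite
assert journal.image_as_of(checkpoint)[0] == b"good data"
```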

A comprehensive ODS utility needs to provide built-in encryption and off-site replication of all datasets for risk mitigation. Data optimization over WAN links needs to be included to reduce WAN costs. The ability to integrate transparently with tape formats and tape-based archiving is also beneficial, since many organizations are obligated by law or regulation to provide removable media copies. Tape remains well suited to long-term archives because it is low cost and removable, and data should move to tape-based media transparently, based on policy. The media should not require expensive tape hardware or libraries to enable encryption; rather, the solution should encrypt it automatically. Furthermore, data must be searchable for audit purposes, and it must be able to be stored in an immutable fashion for compliance. All datasets also should be de-duplicated so that only a single instance of every data object is stored for archive.
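
A single-instance, encrypted archive can be sketched as a store keyed by content hash, so identical objects are kept only once and everything written is encrypted by the solution itself. The example below is illustrative only; it borrows the third-party Python "cryptography" package's Fernet recipe as a stand-in for whatever cipher and key management a real ODS platform would use.

```python
import hashlib

from cryptography.fernet import Fernet


class EncryptedArchive:
    def __init__(self) -> None:
        self._cipher = Fernet(Fernet.generate_key())   # real systems: external key management
        self._store = {}                               # content hash -> encrypted object

    def archive(self, payload: bytes) -> str:
        digest = hashlib.sha256(payload).hexdigest()
        if digest not in self._store:                  # single instance: duplicates are skipped
            self._store[digest] = self._cipher.encrypt(payload)
        return digest                                  # reference kept in the archive index

    def retrieve(self, digest: str) -> bytes:
        return self._cipher.decrypt(self._store[digest])

    def object_count(self) -> int:
        return len(self._store)


archive = EncryptedArchive()
ref1 = archive.archive(b"quarterly report v1")
ref2 = archive.archive(b"quarterly report v1")   # duplicate object: stored only once
assert ref1 == ref2 and archive.object_count() == 1
```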

When data is stored in native format, ready for application use, or when rapid recovery (within a couple of minutes) is required, however, data de-duplication should not be used. De-duplication usually implies electronic hashing of data into unique objects, so a recovery process would be needed to re-constitute the data. Instead, data should simply be stored more efficiently by monitoring the data stream and eliminating any "white space" within the file system or data blocks written by the application. Only these unique sectors of disk then need to be replicated and stored for recovery at the disaster recovery (DR) site. By simply storing data more efficiently, companies gain the benefits of data de-duplication without the associated overhead or risk, and the datasets themselves are always instantly available for mounting to the same or a different application for recovery, testing or DR. In fact, because data is stored so efficiently, these space-efficient images can retain multiple recovery points for many days, providing the ability to recover applications rapidly to any point in time while saving costs.
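
To make the "store efficiently without hashing" idea concrete, the sketch below (hypothetical Python, with an arbitrary 4 KB block size) inspects a write stream, drops zero-filled white space, and queues only blocks that actually changed for replication to the DR site.

```python
BLOCK_SIZE = 4096
ZERO_BLOCK = b"\x00" * BLOCK_SIZE


def blocks_to_replicate(stream: bytes, already_stored: dict) -> dict:
    """Return only the non-zero blocks that differ from what is already stored."""
    changed = {}
    for offset in range(0, len(stream), BLOCK_SIZE):
        block = stream[offset:offset + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\x00")
        if block == ZERO_BLOCK:
            continue                                   # white space: nothing to store or send
        index = offset // BLOCK_SIZE
        if already_stored.get(index) != block:
            changed[index] = block                     # unique sector: replicate it
    return changed


stored = {}
stream = b"file header".ljust(BLOCK_SIZE, b"\x00") + ZERO_BLOCK * 10   # 11 blocks written
delta = blocks_to_replicate(stream, stored)
stored.update(delta)
print(f"{len(delta)} block(s) replicated out of 11 written")
```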

The ODS utility should be flexible enough to accommodate existing protocols as well as newer ones, so that rapid obsolescence can be avoided. Maintenance must be possible with minimal or no downtime, and a technology refresh of any component should be transparent to running applications. Scalability should be a simple matter of adding more compute resources, connections or ports in a modular fashion, and it must not be limited or hampered by technical issues or artificial resource limitations of the file system, capacity, connectivity or availability.

The ODS utility is more cost effective if it provides these capabilities using the same server and storage infrastructure currently in place. This eliminates the need to purchase proprietary disks or servers to create the solution, and it greatly reduces the learning curve, since the accumulated knowledge of the existing environment is not wasted.

Recovery from failure or disaster should be simple, fast, comprehensive and cost effective, with automation applied wherever possible. Recovery should be simple enough that, at the time of failure, no one has to scramble to figure out how recovery actually works. Accordingly, DR testing should be intrinsic to the design and simplified to the point that following a wizard or script is all operations staff need to know. The ability to provide consistency grouping for recovery across platforms and storage tiers is also a requirement for the ODS utility, because many applications also include data feeds from other applications.
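
As a notional illustration of how simple a DR drill should feel, the script below walks a hypothetical recovery plan in order; the steps and targets are invented for this sketch and would, in a real utility, call into the ODS platform's own interfaces.

```python
RECOVERY_PLAN = [
    ("mount latest consistent image", "erp-prod"),
    ("mount latest consistent image", "erp-feeds"),   # upstream data feed in the same group
    ("start database service", "erp-db"),
    ("start application tier", "erp-app"),
    ("run smoke tests", "erp-app"),
]


def run_dr_test(plan) -> None:
    for step, target in plan:
        # A real runbook would invoke the platform here; this sketch only shows
        # that the sequence itself is all operations staff need to follow.
        print(f"[DR TEST] {step}: {target} ... ok")


run_dr_test(RECOVERY_PLAN)
```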

The ODS utility should be implemented intuitively and rapidly, without requiring weeks or months of professional services to make it work. In fact, it would be beneficial if all you needed to do was take an existing server, grab a USB memory stick that includes all the self-installable software you need, place the stick in an open USB port and reboot the server to create a node that operates within the ODS platform. Repeat this on as many servers as you need for the required performance, and you can build your own platform for optimized data services, or PODS, which provides the critical data services and abstraction.

Companies looking to optimize their data services, create a more services-oriented architecture for their applications and data resources, or move to a cloud computing model should take a hard and critical look at the solutions currently available in the market. Be sure to look for a platform that provides all the capabilities mentioned above, so you can implement it simply, quickly and with peace of mind, knowing that everything is certified, supported and can be managed globally from a single console.

About the Author

Christopher Poelker is Vice President of Enterprise Solutions at FalconStor Software. Prior to joining the company, Poelker was a storage architect at HDS and Compaq. While at Compaq, he built the sales/service engagement model for Compaq StorageWorks and trained VARs and Compaq ES/PS on StorageWorks. His certifications include MCSE, MCT, MASE and A+. Poelker is the author of "Storage Area Networks For Dummies."

About FalconStor Software

FalconStor Software, Inc., a provider of disk-based data protection, delivers comprehensive data protection solutions that facilitate the continuous availability of business-critical data with speed, integrity and simplicity. Its TOTALLY Open™ technology solutions, built upon the IPStor(R) platform, include Virtual Tape Library (VTL) with Deduplication, Continuous Data Protector™ (CDP), Network Storage Server (NSS) and a Replication option for disaster recovery and remote office protection. The products are available from major OEMs and solution providers and are deployed by thousands of customers worldwide, from SMBs to Fortune 1000 enterprises. Headquartered in Melville, N.Y., FalconStor has offices throughout Europe and the Asia Pacific region.