Deduplication has been one of the hottest technologies in the storage industry for almost three years. During that time, it has generated marketing wars, industry consolidation, and comments and controversy from vendors. IT managers in most midrange data centers typically have limited staff and few backup specialists, and it can be hard to figure out how deduplication might fit into their situation. Following are important questions for IT managers to ask as they consider deploying deduplication in a midrange data center.

1. Is data deduplication now a mainstream technology?

Yes. Deduplication appliances have absolutely made the transition from experimental to mainstream. Analysts tell us that a little over 30 percent of IT departments use it for at least part of their data, and vendors now offer products with a couple of technology generations behind them that are optimised for simplified, non-disruptive deployment.

However, this doesn't mean that every solution is equal. Most deduplication vendors go through a learning curve, so it pays to ask about experience, references, and support when evaluating solutions.

2. What does deduplication really do?

Generally, deduplication is a method for finding redundant data at a sub-file level, and substituting a pointer for the repeated data. It can be used to reduce disk requirements as well as the bandwidth needed to transmit data.

There are several different and legitimate ways of doing that. Block-level deduplication is the most typical, but some products find differences between file sets at a byte level. Different approaches may have implications for performance, the amount of working space required, how easily they can support different software applications, and ease of setting up replication. The specific approach is less important than proven results and how well the approach matches the problem you are trying to solve.
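To make the "pointer in place of repeated data" idea concrete, here is a minimal sketch of block-level deduplication in Python. It is illustrative only and does not represent any vendor's implementation: the fixed 4 KB block size, the use of SHA-256 digests as pointers, and the names BlockStore, write, read, and stored_bytes are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; real products differ


class BlockStore:
    """Hypothetical in-memory store that keeps one copy of each unique block."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes

    def write(self, data):
        """Store data and return its 'recipe': the list of block digests."""
        recipe = []
        for start in range(0, len(data), BLOCK_SIZE):
            block = data[start:start + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # Keep the block only if this digest has not been seen before;
            # a repeated block costs just one more pointer in the recipe.
            self.blocks.setdefault(digest, block)
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Rebuild the original data by following the pointers."""
        return b"".join(self.blocks[digest] for digest in recipe)

    def stored_bytes(self):
        """Total bytes of unique block data actually held."""
        return sum(len(block) for block in self.blocks.values())


store = BlockStore()
monday = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
tuesday = b"A" * BLOCK_SIZE + b"C" * BLOCK_SIZE  # only the second block changed
monday_recipe = store.write(monday)
tuesday_recipe = store.write(tuesday)
```

In this toy run the two backups total four blocks, but the store holds only three, because the unchanged first block is kept once and referenced twice. Both backups can still be restored in full from their recipes.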

3. What problems are best addressed by deduplication?

The greatest leverage, and the most widespread adoption, involves backup data. That's natural, since backups contain more redundancy than any other datasets and get retained longer. Most common types of office data, including email, databases, and flat files, benefit from high deduplication rates.

Quantum recently surveyed users of its DXi-Series appliances to quantify the effects of deduplication when it is added to users' backup strategies. Compared to traditional storage systems, users reported an average increase in backup speeds of 125 percent, an 87 percent reduction in failed backups, and a huge change in restore profiles: restores that used to take several hours or days are typically reduced to minutes using deduplication. Costs are also reduced, often dramatically. Users reported that overall removable media costs dropped by an average of nearly half, the costs of retrieving tape from offsite storage were reduced by 97 percent, and the amount of time required to manage backups was reduced by 63 percent.

Users that adopted remote replication for disaster recovery (DR) protection saw an increase in recovery points, while automating the process and eliminating tape (and tape management) in smaller offices.

4. Does it matter what backup software I use?

Most deduplication vendors have tested their systems with different backup applications and achieved effective results. Some vendors can even optimise data storage for more than one backup application. It is worth asking a deduplication supplier whether there are applications that they have optimised around.

Be sure to check for support for specific backup software interfaces. Symantec, for example, has developed an OpenStorage interface that works with backup appliances to provide an additional level of operational advantages: increased performance, better replication management, and even direct, off-line tape creation. Ask deduplication appliance vendors about their strategic relationships with backup application suppliers. You will want to understand how closely they work together, and what their plans are for interoperability and integration in the future.

5. What is the easiest way to implement deduplication?

The choice facing most IT departments is between deploying deduplication appliances or carrying out deduplication within the backup software. There is no universal answer about which approach will be easiest to deploy. There are some guidelines, however.

With appliances, currently the most widespread approach to deduplication, the backup data is all sent to the device and deduplication occurs at the target. Users can add appliances in place of, or alongside, existing backup targets and make very little change to the overall backup methodology. Because the deduplication is carried out on a purpose-built appliance, it never increases the load on backup clients or media servers, and it makes the deployment of operations like replication straightforward. As the most common method, it is also the most mature, which usually means faster deployment and fewer service needs.

With a software approach, the backup application adds deduplication to the other tasks that it carries out, either on backup clients or on media servers. By deduplicating data before it is sent to a target, less data has to be transmitted over the network. The idea is similar to performing compression in software, and in fact deduplication processes almost always include compression as well. Since deduplication is a relatively high-overhead operation, there's a chance that backup operations may slow down, so deployment may require adding new servers or dedicated storage. This tends to increase the cost and complexity of integration.
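The bandwidth saving from deduplicating before data leaves the client can be sketched as follows. This is a simplified illustration under stated assumptions, not how any particular backup application works: the Target class, its missing and store methods, the backup function, and the fixed 4 KB block size are all hypothetical. Note that the hashing runs on the client, which is where the extra overhead mentioned above comes from.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size for the illustration


def split_blocks(data):
    """Cut data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


class Target:
    """Hypothetical backup target that remembers blocks by digest."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes

    def missing(self, digests):
        """Report which of the offered digests the target does not hold yet."""
        return {d for d in digests if d not in self.blocks}

    def store(self, new_blocks):
        self.blocks.update(new_blocks)


def backup(data, target):
    """Client-side step: hash locally, then send only the blocks the target lacks.

    Returns the recipe (list of digests) and the number of block bytes sent.
    """
    blocks = split_blocks(data)
    digests = [hashlib.sha256(b).hexdigest() for b in blocks]  # CPU cost on the client
    needed = target.missing(digests)
    payload = {d: b for d, b in zip(digests, blocks) if d in needed}
    target.store(payload)
    return digests, sum(len(b) for b in payload.values())


target = Target()
first = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
second = b"A" * BLOCK_SIZE + b"C" * BLOCK_SIZE  # one block changed since the first run
_, sent_first = backup(first, target)
_, sent_second = backup(second, target)
_, sent_repeat = backup(second, target)
```

In this toy run the first backup sends both blocks, the second sends only the one changed block, and an unchanged repeat sends no block data at all. The digests themselves still cross the network, which the sketch does not count.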

Either approach can make sense depending on specific circumstances. To decide what is best for you, think about where bottlenecks are in your system today, whether or not your current media servers are underutilized, and what level of integration effort makes sense for your specific situation.

6. Should I eliminate my tape storage altogether?

Although most end users who adopt deduplication reduce their use of removable media, very few eliminate it entirely, and for good reason. Typically, users have roughly three tiers of needs for backup: daily backup and restore, near-term DR protection, and long-term data retention. It makes sense to look at different technologies for each tier and to talk to vendors who understand them.

Daily backup and restore: Many users find that disk read and write profiles give them advantages for day-to-day backup and restore. Deduplication adds the advantage of letting them store data on disk longer so that more restores can take advantage of those profiles.

Near-term DR: Replication enabled by deduplication lets users with multiple sites replace removable media with remote replication for DR. As a result, they see more restore points, reduce costs, and automate what is for many a very manual operation.

Long-term retention: Removable media continues to provide strong economic and security value. Tape consumes the least power, space, and cooling of any storage, making it the preferred medium for long-term retention. New technologies for tape, including encryption and media integrity analysis, have made it more secure and reliable.

7. Where can I get objective advice?

There are lots of ways to get objective advice about which approaches match best to your specific needs. Some independent analysts who spend time talking directly with end users provide very useful and objective information about others' experience. But if you aren't a client of the big-name analysts, there are other options.

One of the best is an experienced reseller. Good resellers, who have a track record of helping IT departments deploy technology, understand the reality of what will work for specific environments, and they have a vested interest in helping you succeed.

You can also talk directly to vendors. If they offer multiple technologies, they are likely to provide a broader view than if they offer only one product. And if you have a vendor that you already trust for backup, it makes sense to see what kind of deduplication options they have.

Editor's note: Quantum is exhibiting at 360°IT, the IT Infrastructure Event held 22nd - 23rd September 2010, at Earl's Court, London. The event provides an essential road map of technologies for the management and development of a flexible, secure and dynamic IT infrastructure. For further information please visit www.360itevent.com.


