
First Look

Krissi Danielson

CopperEye Takes a Hard Look at Data Mining


Listen to the entire podcast Download file

Agenda and Resources

1. The Problem of Data Management

2. Industry Verticals and Data Management Needs

3. Flat Files vs. Relational Databases

4. Case Studies

Read a complete transcript of the podcast here

Learn more at CopperEye's Web Site

Learn more about SOA in Action


A recent ebizQ Webinar guest estimated that by the year 2010, the world's data volume will reach one zettabyte -- a 1 with 21 zeroes after it. That's a lot of data, and it poses a clear challenge for corporations that want to find ways to use that data to business advantage. Bath, U.K.-based CopperEye is one company that hopes to help solve that problem.

Complications and Costs of Data Management

The first challenge of data management is being able to find the data you want, points out CopperEye CEO Kate Mitchell.

"As companies have been continuing to increase the size of the database they are adding larger hardware platforms so that performance remains constant, both on the transaction side and the analytical side," she says.

But databases are growing so large that hardware costs, data center footprints, and power requirements are becoming unmanageable, and companies are starting to have trouble finding highly skilled DBAs to manage everything.

In some industries, traditional solutions can still keep pace with data needs, but in others you cannot just throw money at the problem, Mitchell says. The explosion in data is too dramatic. Furthermore, she points out, regulations are forcing companies to keep more data for longer and to answer specific inquiries quickly, sometimes in 24 hours or less.

Data Management in Industry Verticals

The telecommunications sector tends to have complex data needs, says Mitchell. "They are heavily regulated and in this case, aiding to combat terrorism - they need to find very specific information on individuals that are potential criminal suspects -- to be able to find calls they made, when they made them, where they were, who they called, how often they called."

On the business side, 3G networks are generating large volumes of data, as are services such as audio/video shopping on cell phones. The financial industry also tends to have particularly intensive data needs, and websites that are asked to track visitor information generate massive amounts of data as well.

Scaling with the Data Explosion

Relational databases have weaknesses in managing huge amounts of data, says Mitchell. Relational databases were designed to manage data that's changing. "There's no better approach for concurrency with hundreds or even thousands of users inside an organization and outside an organization and making sure you've got all of the capability for transactional integrity whether that's two-phase commit or row-level locking," she says. "That's exactly what the database was designed for."

But for data that's not changing, such as transaction or event data, you may not need the overhead or complexity of a relational database. An alternative is to keep such static data in a lower-cost, scalable location such as a simple flat file -- but without giving up immediate and precise access to that data.

"That's what CopperEye is finding that our customers are pleased with as a new innovation in managing this vast volume of data without giving up very precise and immediate access to the data," she points out.

CopperEye starts from the flat file and tries to keep its best attributes -- low cost, high scalability, and simplicity -- and combine them with indexing, almost cherry-picking through the IT stack.
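
To make the idea concrete, here is a minimal Python sketch of a flat file paired with an index. It is not CopperEye's implementation, which the podcast does not describe. The function names (append_record, lookup), the comma-separated record layout, the in-memory dictionary index, and the sample phone numbers are all assumptions chosen for illustration. Records are only ever appended, and the index maps each key to the byte offsets of its records, so a lookup seeks straight to the matching lines instead of scanning the whole file.

```python
import os
import tempfile


def append_record(path, index, key, fields):
    """Append one immutable record to the flat file and note its byte offset."""
    line = (",".join([key] + fields) + "\n").encode("utf-8")
    with open(path, "ab") as f:
        offset = f.seek(0, os.SEEK_END)  # the record starts at the current end of file
        f.write(line)
    index.setdefault(key, []).append(offset)


def lookup(path, index, key):
    """Return every record stored for a key by seeking directly to its offsets."""
    records = []
    with open(path, "rb") as f:
        for offset in index.get(key, []):
            f.seek(offset)
            records.append(f.readline().decode("utf-8").rstrip("\n").split(","))
    return records


# Hypothetical call records: caller, timestamp, callee, duration in seconds.
path = os.path.join(tempfile.mkdtemp(), "calls.log")
index = {}  # assumption: a plain dict stands in for a real on-disk index
append_record(path, index, "447700900001", ["2008-01-05T10:15", "447700900002", "62"])
append_record(path, index, "447700900003", ["2008-01-05T10:16", "447700900001", "15"])
append_record(path, index, "447700900001", ["2008-01-05T11:40", "447700900004", "300"])

matches = lookup(path, index, "447700900001")
for record in matches:
    print(record)
```

The design choice this illustrates is that the data file itself stays simple and cheap to grow, while the index alone carries the cost of fast, precise retrieval.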

Case Studies - A Wireless Provider and a Messaging Company

One CopperEye customer, Orange (a UK wireless provider), needed to track anomalous calling patterns for revenue tracking and for compliance with EU regulations. Orange found that storing all that data was not feasible in a relational database model, so it worked with CopperEye to find a way to store the data in a flat file instead.

"We added up the other day that we've handled 500 billion transactions for them over that period of time [seven years]," Mitchell said.

Orange went from storing call records in a relational database, where it could only afford to keep ten percent for two days, to keeping the records in a flat file where it could keep 100 percent for 40 days. "Just to put that in perspective, to meet the new EU mandates, rather than keeping all this data for 40 days, the guideline is to keep it for a minimum of a year and some EU member countries, like Ireland are saying that data needs to be kept for three years," said Mitchell.
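
The scale of that change is easier to see with a little arithmetic. The short Python sketch below uses a made-up measure, percent-days (the percentage of records kept multiplied by the number of days they are kept), purely for illustration. Only the input figures come from Mitchell's account.

```python
def percent_days(percent_kept, days_kept):
    """Illustrative retention measure: percentage of records kept times days kept."""
    return percent_kept * days_kept


relational = percent_days(10, 2)    # 10 percent of call records for two days
flat_file = percent_days(100, 40)   # 100 percent of call records for 40 days

improvement = flat_file // relational   # how many times more retention the flat file gave
one_year_gap = 365 / 40                 # a one-year mandate relative to the 40-day window
three_year_gap = (3 * 365) / 40         # a three-year mandate relative to the 40-day window

print(improvement, one_year_gap, three_year_gap)
```

By this measure the flat file retained 200 times as much call data as the relational setup, yet a one-year mandate would still require roughly nine times the 40-day window.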

In another example, a CopperEye customer called Message Labs had over 13,000 corporate customers with 35 million email accounts and about 8 billion emails per month. The company needed to track these emails across twelve data centers around the world under a service level agreement that Message Labs would locate missing emails within 24 hours.

"Their network help desk would notify their operations people," she said. "Their operations people would actually be trolling through these log files trying to find literally that needle in the haystack; that one email out of the billions that they handle in any particular month."

Customers often wanted turnaround faster than 24 hours. Message Labs implemented CopperEye and got direct access to its log files. Now customers can go online, enter a recipient, sender, or subject, and retrieve information on a particular email within a few seconds. This has improved customer service and lowered costs for Message Labs.

Flat Files and Vocabulary Search

Mitchell is quick to point out that flat files are not a cure-all for data storage needs. "This approach is not for text or word documents or web pages," she says. "CopperEye is focused on data that would otherwise live in the database -- scalar data."

For e-discovery and searches of documents and Web pages, tools from companies like Endeca, Fast, Google, and Yahoo tend to be better optimized. "They pre-build all the indexes based on every important word or relevant word in any particular language," she explains. "That's not possible to do at huge volumes if you've got 15 billion combinations of transaction ID or customer ID based on combinations of alphanumeric characters. So that's one distinction that we make right up front with the prospect."

Executive Summary by Krissi Danielsson



Join ebizQ producer Krissi Danielson for interviews with the innovators, movers and shakers behind emerging enterprise software solutions. Have a solution that qualifies? E-mail Krissi at krissi (at)ebizq.net


Krissi Danielsson is a podcast producer with ebizQ and contributor to ebizQ's SaaSWeek site. View more
