Loraine Lawson shined
a light on an interesting Web data integration project known as the "Deep
Web."
"Alon Halevy, a former computer
science professor at the University of Washington, believes it's integrating
the Web. And by Web he doesn't just mean the crawl-able information we all know
and love. He means the down-deep data hidden in databases connected to the Web."
It's called the Deep Web, and it is the mother of all
integration challenges. Moreover, the
same type of effort is under way at the cloud computing level, where it is called "Intercloud." The Deep Web work retrofits integration on top of
many dynamic information stores, while Intercloud creates
integration between known resources.
"Halevy is now leading a team at --
where else? -- Google that's attempting to solve this. Google's tactic is to
use a program to analyze the contents of every database it encounters on the
Web. Every. Database. The thought process being that you need to know what's in
the database before you can decide whether to search it for information. The
program works by finding a form on a Web page and then guessing at likely query
terms, based on the Web site's content. Once it gets a match, 'the search
engine then analyzes the results and develops a predictive model of what the
database contains.'"
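To make the probing idea concrete, here is a minimal sketch of the loop described in the quote: guess query terms from the page text, submit them to the form, and summarize what comes back. This is not Google's actual code. The hidden records, the `submit_form` stand-in, the stop-word list, and the simple word-count "model" are all assumptions made for illustration; a real crawler would submit HTTP requests to a live form and build a far richer model.

```python
import re
from collections import Counter

# Hypothetical stand-in for a database hidden behind a Web form.
HIDDEN_RECORDS = [
    "used honda civic 2004 low mileage",
    "used toyota corolla 2006 one owner",
    "new ford focus 2009 dealer warranty",
    "used honda accord 2005 leather seats",
]

def submit_form(term):
    """Simulate submitting one query term to the form; return matching records."""
    return [r for r in HIDDEN_RECORDS if term in r.split()]

def candidate_terms(page_text, limit=10):
    """Guess likely query terms from the visible page text, most frequent first."""
    words = re.findall(r"[a-z]+", page_text.lower())
    stop = {"the", "and", "for", "our", "of", "a", "to", "in", "by", "or"}
    counts = Counter(w for w in words if w not in stop and len(w) > 2)
    return [w for w, _ in counts.most_common(limit)]

def probe(page_text):
    """Probe the form with guessed terms and summarize the distinct results."""
    model = Counter()   # word frequencies across every distinct record seen
    seen = set()
    for term in candidate_terms(page_text):
        for record in submit_form(term):
            if record not in seen:
                seen.add(record)
                model.update(record.split())
    return model, len(seen)

page = "Search our inventory of used Honda and Toyota cars. Used cars for every budget."
model, n_records = probe(page)
print(n_records, model.most_common(3))
```

Note that the sketch never surfaces the Ford record, because nothing on the page hints at it. That is the catch with guessing query terms from site content: the model only covers what the guesses happen to reach.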
Truth be told, we've been looking at this kind of technology for
years. However, the explosive growth of
the Web, and the different approaches to Web content databases, have left the opportunities
for integration in the dark. This
project, and others like it, could give us a common integration layer
that spans the major Web systems. That would let businesses leverage the
data much more effectively, since the information becomes available through
non-visual interfaces rather than having to be read from a Web page (by humans or by
screen-scraping middleware).
The cloud computing providers are looking at a similar approach,
but they are coordinating the data integration effort among themselves
(Intercloud). The idea is to provide
access to data residing in other clouds, creating a back-end integration
infrastructure that lets those who use many cloud providers move their information
between them as needed. This should help accelerate
cloud adoption.
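As a rough illustration of what such a back-end layer implies, the sketch below has two providers expose the same export and import interface so that a collection can be copied from one to the other. The `CloudStore` interface, the `InMemoryCloud` class, and the `copy_collection` helper are hypothetical names invented for this example; no real provider API or Intercloud specification is being shown.

```python
from abc import ABC, abstractmethod

class CloudStore(ABC):
    """Hypothetical common interface that each provider would agree to expose."""

    @abstractmethod
    def export_records(self, collection):
        """Return all records in the named collection."""

    @abstractmethod
    def import_records(self, collection, records):
        """Append records to the named collection."""

class InMemoryCloud(CloudStore):
    """Toy provider that keeps its data in a dictionary."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def export_records(self, collection):
        return list(self.data.get(collection, []))

    def import_records(self, collection, records):
        self.data.setdefault(collection, []).extend(records)

def copy_collection(collection, source, target):
    """Copy one collection between any two stores that share the interface."""
    records = source.export_records(collection)
    target.import_records(collection, records)
    return len(records)

cloud_a = InMemoryCloud("provider-a")
cloud_b = InMemoryCloud("provider-b")
cloud_a.import_records("customers", [{"id": 1}, {"id": 2}])
moved = copy_collection("customers", cloud_a, cloud_b)
```

The point of the shared interface is that `copy_collection` never needs to know which provider sits on either end, which is the property that would make moving between clouds practical.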
Both of these efforts are more important than they
appear. If the information that resides
on the Web can be accessed as information, and not as visual content, then the Web
goes to another level of play. Moreover,
if there is a workable data integration layer between the major cloud computing
providers, then cloud computing becomes more practical for the Global 2000.












