There is no doubt that Service-Oriented Architecture (SOA) is taking off. Many large enterprises have investigated it and have adopted it, or will adopt it, as a strategic direction, regardless of the underlying technology. As SOA is applied to more complex tasks involving existing or legacy applications, there is a need for a repository to store metadata about enterprise information and services, as the basis for sophisticated discovery mechanisms. Can the Semantic Web provide a better alternative?
Integrated and Searchable Metadata Repository
Yefim Natis, vice president and distinguished analyst with Gartner, says that a metadata repository is a key enabling technology for SOA, and no long-term enterprise SOA initiative can succeed without an integrated and searchable repository or registry. SOA depends upon loosely coupled services with simple interfaces, and a discovery mechanism that enables consumers to find the services that they need.
In web services, there is a registry of descriptions expressed in the Web Services Description Language (WSDL) that can be queried using the Universal Description, Discovery, and Integration protocol (UDDI). As the more general concept of SOA evolves, many architects believe that this simple registry should be expanded to a sophisticated repository that includes content such as XML schemas and information metadata, as well as service descriptions.
An SOA repository stores and organizes metadata about services and the information that they consume and produce. It is used at design time by architects defining services and their inter-relationships, and by developers programming those services and the information transforms between them. It is used at run time by intelligent services searching for information sources. It improves programming productivity, increases re-use of software assets, and enables an intelligent architecture with dynamic connections between services. And, if metadata for existing applications is developed and included in the repository, it can liberate these applications’ data into the SOA world. With such a repository, SOA can open up the silos, and let the information flow.
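To make the idea concrete, here is a minimal sketch of such a repository as an in-memory Python structure. It is only an illustration under stated assumptions: the class names (`ServiceRecord`, `Repository`), the service names, and the information types are all hypothetical and do not correspond to any particular repository product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceRecord:
    """Metadata about one service: the information it consumes and produces."""
    name: str
    consumes: frozenset
    produces: frozenset


@dataclass
class Repository:
    """A toy in-memory metadata repository (hypothetical, for illustration)."""
    records: dict = field(default_factory=dict)

    def register(self, name, consumes=(), produces=()):
        # Design-time step: an architect or developer records the metadata.
        self.records[name] = ServiceRecord(
            name, frozenset(consumes), frozenset(produces)
        )

    def producers_of(self, info_type):
        # Run-time step: a service looks for sources of some information.
        return sorted(
            r.name for r in self.records.values() if info_type in r.produces
        )

    def consumers_of(self, info_type):
        # Useful for impact analysis: who depends on this information?
        return sorted(
            r.name for r in self.records.values() if info_type in r.consumes
        )


repo = Repository()
repo.register("InventoryService", consumes=["PartNumber"], produces=["StockLevel"])
repo.register("PricingService", consumes=["PartNumber"], produces=["ComponentCost"])
repo.register("QuoteService", consumes=["ComponentCost", "StockLevel"], produces=["Quote"])

print(repo.producers_of("ComponentCost"))
print(repo.consumers_of("PartNumber"))
```

A real repository would add persistence, versioning, and access control; the sketch only shows how the same metadata can serve both design-time and run-time lookups.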
There are SOA repository products on the market. According to Gartner, these range from simple registries focused on SOA software assets starting at around $50,000, to complex repositories focused on legacy modernization and understanding that may cost more than a million dollars. But there is another possible solution that could be freely available: the Semantic Web. Can this provide an open, interoperable, and low-cost discovery mechanism for SOA?
The Semantic Web
Enterprise application integration is one of the goals of the Semantic Web. Tim Berners-Lee, the father of the World-Wide Web, says that the Semantic Web is designed to smoothly interconnect personal information management, enterprise application integration, and the global sharing of commercial, scientific and cultural data. This relates to enterprise data, not human documents, including data in relational databases, XML documents, spreadsheets, and proprietary format data files.
In the Semantic Web, metadata about information and services is not stored in a single, centralized location where it can be queried; it is distributed over the web, where it can be searched. In the context of an enterprise, this could be restricted to the enterprise intranet, but storing metadata on the global web gives wonderful opportunities for integrating information across enterprises, delivering interoperability, and supporting collaboration. And a distributed solution will cope much better than a centralized one with mergers, acquisitions, divestitures, and other organizational changes.
The basic metadata description standard of the Semantic Web is the Resource Description Framework (RDF). This describes objects and their properties. An extension, RDF Schema (RDFS), describes object classes as well. The Web Ontology Language (OWL) goes even further, by describing more complex inter-relationships between objects, classes, and properties, to allow the definition of fully-fledged ontologies. These languages can be used to describe services and the information that they consume and produce, and enterprise data in all its forms. The descriptions can be posted on the Web, where they can be searched, and from where they can be retrieved.
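As a rough illustration of the RDF data model, a description can be thought of as a set of subject, predicate, object triples. The sketch below models triples as plain Python tuples rather than using an RDF library. The `rdf:type` and `rdfs:subClassOf` terms come from the RDF and RDFS vocabularies; everything under the `ex:` prefix, and the `match` helper, are hypothetical names invented for this example.

```python
# Terms from the RDF and RDFS vocabularies, written as prefixed names.
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

# Each triple is (subject, predicate, object). The "ex:" names are hypothetical.
triples = {
    ("ex:Widget", RDFS_SUBCLASS_OF, "ex:Component"),
    ("ex:PricingService", RDF_TYPE, "ex:Service"),
    ("ex:PricingService", "ex:consumes", "ex:PartNumber"),
    ("ex:PricingService", "ex:produces", "ex:WidgetCost"),
    ("ex:WidgetCost", "ex:costOf", "ex:Widget"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Everything stated about the pricing service.
for triple in match(triples, s="ex:PricingService"):
    print(triple)

# All class relationships in the description.
print(match(triples, p=RDFS_SUBCLASS_OF))
```

In practice the triples would be serialized in a standard RDF syntax and published at a URL, and a query language such as SPARQL would take the place of the `match` helper.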
Use of ontologies, and mappings to them from the enterprise metadata, can enable discovery of information and services with descriptions similar to, but not the same as, a given description. So results of a search for “component cost” could include “widget cost” data in an enterprise that uses widgets as components. This capability, known as Semantic Discovery, is much more powerful than the basic discovery mechanism of UDDI but, at present, it exists more in the research lab than in the product catalog.
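The "component cost" example can be sketched in a few lines of Python. This is a simplified illustration, not a description of how any product implements semantic discovery: the ontology, the catalog entries, and the function names are all assumptions made for the example. The ontology is reduced to a subclass hierarchy, and a data set matches a query if it describes the requested class or any of its subclasses.

```python
# Hypothetical ontology: each class maps to its direct superclasses.
SUBCLASS_OF = {
    "Widget": {"Component"},
    "Gadget": {"Component"},
    "Component": {"Part"},
}

# Hypothetical catalog: data set name -> (property, class it describes).
CATALOG = {
    "widget cost": ("cost", "Widget"),
    "widget weight": ("weight", "Widget"),
    "invoice total": ("total", "Invoice"),
}


def is_a(cls, ancestor):
    """True if cls is the ancestor itself or a direct or indirect subclass."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(SUBCLASS_OF.get(current, ()))
    return False


def exact_discovery(prop, cls):
    """Registry-style lookup: the description must match exactly."""
    return sorted(n for n, (p, c) in CATALOG.items() if p == prop and c == cls)


def semantic_discovery(prop, cls):
    """Ontology-aware lookup: subclasses of the requested class also match."""
    return sorted(n for n, (p, c) in CATALOG.items() if p == prop and is_a(c, cls))


print(exact_discovery("cost", "Component"))
print(semantic_discovery("cost", "Component"))
```

With this catalog, the exact lookup for the cost of a `Component` finds nothing, while the ontology-aware lookup returns the "widget cost" data set, because the ontology states that a widget is a kind of component.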
So the Semantic Web can not only do the same job as an SOA repository, but can potentially do it better. The problem is that browsers and search engines for the Semantic Web are by no means as easy to find as those for the ordinary information web. There is little support for RDF and OWL in commercial, enterprise-grade information products. There is a good range of development tools, some free software, and an occasional new commercial product, but the big-name information processing and management product vendors are not yet on board.
For the present, many CIOs will go for dedicated enterprise SOA repositories. They will find suppliers that commit to solutions that meet their immediate requirements, rather than asking their internal teams to build a better solution from software that may be new or unsupported. But those with long-term vision might, at the same time, set up small projects to investigate the Semantic Web, and put pressure on their suppliers to implement support for the open standards RDF and OWL.
There is enormous activity right now in information semantics, and from many different parts of the IT community. Enterprise information management, web services, and e-business specialists are all looking to develop a semantic infrastructure, just as they are all looking to SOA to provide a processing infrastructure. The need for a semantic repository arises independently of SOA, as does the need to relate different data descriptions within and across repositories, whether by ontologies, or more directly by an indexing mechanism such as the Universal Data Element Framework (UDEF). The enterprise repository is where all the different activities converge. It is an enabler for enterprise information management and for e-business, as well as a service discovery mechanism for SOA.
One thing is clear: the discovery mechanism must be based on open standards. As boundaries within and between enterprises become permeable, the enterprise repositories must be interoperable, to enable Boundaryless Information Flow™. WSDL and UDDI are the defining standards for the web services registry. They are good standards, but not sufficiently functional to meet the expanded needs of SOA as it has evolved today. RDF and OWL can meet these needs, and support more advanced semantic discovery features, but they are not yet widely implemented. The Semantic Web has been under development for a long time, but has not yet made the breakthrough to widespread use. The growth of SOA, and its need for sophisticated discovery, could be the Semantic Web’s opportunity.
For more information, please contact Dr. Chris Harding at firstname.lastname@example.org
About the Author
Dr. Chris Harding leads the SOA Working Group at The Open Group, an open forum of customers and suppliers of IT products and services. In addition, he is a Director of UDEF Forum, and manages The Open Group's work on semantic interoperability. He has been with The Open Group for over ten years.
Dr. Harding began his career in communications software research and development. He then spent nine years as a consultant, specializing in voice and data communications, before moving to his current role.
Recognizing the importance of giving enterprises quality information at the point of use, Dr. Harding sees information interoperability as the next major challenge, and frequently speaks or writes on this topic. He is a regular contributor to ebizQ.
Dr. Harding has a PhD in mathematical logic, and is a member of the British Computer Society (BCS) and of the Institute of Electrical and Electronics Engineers (IEEE).
The Open Group is a vendor-neutral and technology-neutral consortium, whose vision of Boundaryless Information Flow will enable access to integrated information within and between enterprises based on open standards and global interoperability. The Open Group works with customers, suppliers, consortia and other standards bodies. Its role is to capture, understand and address current and emerging requirements, establish policies and share best practices; to facilitate interoperability, develop consensus, and evolve and integrate specifications and open source technologies; to offer a comprehensive set of services to enhance the operational efficiency of consortia; and to operate the industry’s premier certification service. Further information on The Open Group can be found at http://www.opengroup.org.