By Eric Roch, National Practice Director, Perficient
A challenge for enterprise-wide SOA is establishing the enterprise semantics (meaning) of data and a canonical (common) format for business objects. The semantics and canonical representations of business objects (such as the common XML schema representing a customer) form the Common Information Model needed to build the enterprise SOA.
The greatest challenge of creating the CIM is determining project scope. Starting with an enterprise scope leads to a long-lived project with little or no business value in its early stages. A limited CIM scope, on the other hand, can result in canonicals that are likely to change as systems that were outside the original scope are considered.
A pragmatic approach to establish the initial scope of the CIM is to define an integration domain and start the CIM within a domain constraint. The domain of the integration, as defined by a context diagram and use cases (also integration design artifacts), establishes the domain model for data integration. Within the context of the domain, multiple data sources are mapped together to a canonical form. Canonical data formats then become part of the CIM. A simple definition of the CIM (for brevity) is a common externally-managed repository of data semantics that promotes reuse.
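As a minimal sketch of mapping multiple data sources to one canonical form within a domain, consider two systems that both hold customer data. The record layouts, field names (CUST_NO, accountId and so on) and the CanonicalCustomer type are hypothetical and chosen only for illustration; a real CIM would normally express the canonical as an XML schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalCustomer:
    """Hypothetical canonical form of the 'customer' business object."""
    customer_id: str
    name: str
    country: str


def from_erp(record: dict) -> CanonicalCustomer:
    # Assumed ERP layout: zero-padded number, padded name, lower-case country.
    return CanonicalCustomer(
        customer_id=record["CUST_NO"].lstrip("0"),
        name=record["CUST_NAME"].strip(),
        country=record["COUNTRY"].upper(),
    )


def from_crm(record: dict) -> CanonicalCustomer:
    # Assumed CRM layout: numeric id and a split name.
    return CanonicalCustomer(
        customer_id=str(record["accountId"]),
        name=f'{record["firstName"]} {record["lastName"]}'.strip(),
        country=record["countryCode"].upper(),
    )


erp_row = {"CUST_NO": "0000012345", "CUST_NAME": "Acme Corp ", "COUNTRY": "us"}
crm_row = {"accountId": 12345, "firstName": "Acme", "lastName": "Corp",
           "countryCode": "US"}

# Both sources converge on the same canonical representation.
same_customer = from_erp(erp_row) == from_crm(crm_row)
```

Each source keeps its own mapping function, so adding a third system to the domain means adding one more mapping rather than touching the existing ones.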
The domain approach does not preclude the use of industry standard schemas (e.g. OAG) or application specific ones (e.g. SAP IDocs). When standards are used, scope is still a concern. For example, an integration domain might constrain the CIM problem to a geography, eliminating the need for a world-wide committee to agree on the world view of business objects within a global corporation. Within the constrained domain, if SAP were the dominant application, then the business objects could look very much like schema representations of IDocs.
The advantage of limiting the scope of the CIM to an integration domain is the early delivery of SOA related projects and application integrations. By decomposing the CIM into domains, it is possible to add to the CIM iteratively as more systems join the enterprise SOA. This, however, presents a new problem, though one that is more easily solved: canonicals are likely to change as the CIM is expanded.
With respect to data transformation, private processes expose services as an XML representation of the native interface. Transformation from the native XML representation to the native interface occurs in the private process. Private business objects are tightly coupled to the application interface; therefore, it is more practical for the developer to map the private business object directly to the application schema that will be consuming the data. Public processes provide a fully enumerated, constrained and documented interface based on the CIM. Transformation between the public interface and the private process occurs in the public process. The concepts of public and private processes are essentially an abstraction mechanism and therefore fit well within SOA. The private processes are fine grained and application specific, while the public processes are coarse grained and encapsulate the fine grained services, thus promoting reuse and hiding complexity and change.
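The layering of public and private processes can be sketched as two functions. This is an illustrative sketch only: the element names (Customer, CustomerRecord, CUST_NO, CUST_NAME) and both function names are assumptions, and a plain dictionary stands in for the call to the native application interface.

```python
import xml.etree.ElementTree as ET


def private_create_customer(native_xml: str) -> dict:
    """Private process: fine grained and application specific.

    Converts the XML representation of the native interface into the
    arguments of the native call (represented here by a dict).
    """
    root = ET.fromstring(native_xml)
    return {child.tag: child.text for child in root}


def public_create_customer(canonical_xml: str) -> dict:
    """Public process: coarse grained, based on the (hypothetical) canonical.

    Transforms the canonical document into the private XML form and
    delegates to the private process.
    """
    canonical = ET.fromstring(canonical_xml)
    native = ET.Element("CustomerRecord")
    ET.SubElement(native, "CUST_NO").text = canonical.findtext("Id")
    ET.SubElement(native, "CUST_NAME").text = canonical.findtext("Name")
    return private_create_customer(ET.tostring(native, encoding="unicode"))


native_call = public_create_customer(
    "<Customer><Id>42</Id><Name>Acme</Name></Customer>"
)
```

Callers only ever see the canonical document; the application specific names stay inside the two functions and can change without affecting consumers of the public interface.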
With respect to the CIM, it is a good practice to separate the XML document manipulation logic from process orchestration. Expressing business processes in terms of the documents to be handled will clutter the business logic with document processing-related logic, which should be abstracted from application developers focused on implementing business logic. It also introduces strong dependencies between the document's schemas and the business logic, which may cause maintainability problems when supporting new versions of the CIM schemas. For example, some integration brokers do not support the late binding of schema and XSLT within a business process.
By separating the document manipulation logic from the processing logic, a developer can switch between various document manipulation mechanisms (e.g. broker mapping, XSLT or a Java transformation) without affecting the processing logic. There is also a clear division of developer skills between orchestration development tools and the external authoring and metadata management tools used to build XML schemas and maps, such as Contivo, Altova, Stylus Studio or Unicorn. A data architect is best suited to build complex XML schemas and, with the right tools, to generate transformations that can be leveraged across multiple technology platforms: brokers, application servers and web servers. In this fashion there is a division of labor that builds the CIM collaboratively: analysts help define the meaning of data (semantics), data architects build schemas and maps, and SOA developers use the schemas and maps.
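One way to picture this separation is to pass the transformation into the process as a parameter. The sketch below is an assumption about how that could look, not a description of any particular broker: the two transform functions merely stand in for alternative mechanisms (a hand-written mapping and a table-driven one), and all names are hypothetical.

```python
from typing import Callable, Dict

Document = Dict[str, str]
Transformer = Callable[[Document], Document]


def handwritten_transform(doc: Document) -> Document:
    # Stands in for one mechanism, e.g. a coded transformation.
    return {"Id": doc["CUST_NO"], "Name": doc["CUST_NAME"]}


def table_driven_transform(doc: Document) -> Document:
    # Stands in for another mechanism, e.g. a map produced by a mapping tool.
    mapping = {"CUST_NO": "Id", "CUST_NAME": "Name"}
    return {mapping[key]: value for key, value in doc.items() if key in mapping}


def onboard_customer(doc: Document, transform: Transformer) -> str:
    """Process logic: knows only the canonical field names."""
    canonical = transform(doc)
    return f"onboarded {canonical['Name']} ({canonical['Id']})"


source = {"CUST_NO": "42", "CUST_NAME": "Acme"}
first = onboard_customer(source, handwritten_transform)
second = onboard_customer(source, table_driven_transform)
```

Because onboard_customer never touches the source field names, swapping one transformation mechanism for another does not require any change to the process logic.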
Separating document manipulation logic from process orchestration also simplifies change management and therefore domain integration. If we iteratively build the CIM by adding integration domains, we must have a strategy for domain integration. For SOA, this strategy includes versioning the schemas, transformations and interfaces (either by proxy or gateway). The proxy approach supports multiple versions of interfaces and schemas. A gateway translates domain semantics between two or more domains. Proxies and gateways are implemented as services with the Enterprise Service Bus (ESB). Versioning techniques in detail are a complex topic left for another day.
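The proxy and gateway roles can be illustrated with a small sketch. Everything here is assumed for the example: the version numbers, the schema change between versions, the field names and the two domain vocabularies. In practice these would be services on the ESB rather than local functions.

```python
def upgrade_v1_to_v2(doc: dict) -> dict:
    # Hypothetical schema change: v2 splits "name" into two fields.
    given, _, family = doc["name"].partition(" ")
    return {"version": "2", "id": doc["id"],
            "givenName": given, "familyName": family}


CURRENT_VERSION = "2"
UPGRADES = {"1": upgrade_v1_to_v2}  # one upgrade step per older version


def proxy(doc: dict) -> dict:
    """Proxy: accepts any supported version and upgrades it step by step."""
    while doc["version"] != CURRENT_VERSION:
        doc = UPGRADES[doc["version"]](doc)
    return doc


# Hypothetical vocabulary differences between two integration domains.
EMEA_TO_AMER_FIELDS = {"vatNumber": "taxId", "postcode": "zipCode"}


def gateway_emea_to_amer(doc: dict) -> dict:
    """Gateway: translates one domain's field names into another's."""
    return {EMEA_TO_AMER_FIELDS.get(key, key): value
            for key, value in doc.items()}


upgraded = proxy({"version": "1", "id": "7", "name": "Ada Lovelace"})
translated = gateway_emea_to_amer({"id": "7", "vatNumber": "GB123"})
```

Older consumers keep sending version 1 documents to the proxy, while the gateway lets each domain keep its own canonical and still exchange data with its neighbors.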
This approach implies that it is acceptable to have multiple canonical representations of a business object such as customer. This happens if there are multiple versions of the canonical or variations of the canonical within integration domains. It turns out there are many practical reasons why this may be necessary, including:
To avoid analysis paralysis over business objects
To support very different views of the same business object (e.g. a customer may look different in different geographies, or between very different systems such as ERP and CRM; these are examples of integration domain boundaries)
To keep schemas practical, since a single schema for a business object may be too large to use
Finally, the CIM will grow to be a large repository with the need for change management, data storage and security. A repository for artifacts and a governance process for change management are required. The governance process should include versioning as well as cleanup, such as the removal of deprecated schemas and elements.
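A very small in-memory sketch shows the kind of operations such a repository and its governance process might offer: publishing a version, deprecating it, and cleaning up. The SchemaRepository class and its method names are hypothetical; a real repository would add persistence, access control and an approval workflow.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SchemaEntry:
    body: str
    deprecated: bool = False


class SchemaRepository:
    """Hypothetical in-memory store of versioned CIM schemas."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, int], SchemaEntry] = {}

    def publish(self, name: str, version: int, body: str) -> None:
        key = (name, version)
        if key in self._entries:
            # Published versions are immutable; changes need a new version.
            raise ValueError(f"{name} v{version} is already published")
        self._entries[key] = SchemaEntry(body)

    def deprecate(self, name: str, version: int) -> None:
        self._entries[(name, version)].deprecated = True

    def latest(self, name: str) -> int:
        """Return the highest version of a schema that is not deprecated."""
        versions = [v for (n, v), entry in self._entries.items()
                    if n == name and not entry.deprecated]
        if not versions:
            raise KeyError(name)
        return max(versions)

    def purge_deprecated(self) -> List[Tuple[str, int]]:
        """Cleanup step: remove deprecated schemas and report what was removed."""
        removed = sorted(key for key, entry in self._entries.items()
                         if entry.deprecated)
        for key in removed:
            del self._entries[key]
        return removed


repo = SchemaRepository()
repo.publish("Customer", 1, "<xs:schema/>")
repo.publish("Customer", 2, "<xs:schema/>")
repo.deprecate("Customer", 1)
```

Refusing to overwrite a published version is a deliberate choice in this sketch: it forces every change through versioning, which is what makes later cleanup of deprecated schemas safe.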
In summary, the following tips should be considered when building the CIM:
Avoid analysis paralysis of business objects by decomposing the problem into domains with a domain integration strategy
Do not create one huge schema for a complex business object, rather consider the business object in the context of domains
Promote a collaborative CIM development process with tooling specific to schema authoring and metadata management
Separate document manipulation logic from business process logic
Design for change and create mechanisms for versioning schemas, transformations and interfaces
Store the CIM in a repository with a governance process for change management
About the Author
During his more than 20 years of experience in Information Technology, Eric Roch has worked in roles such as executive level management, technical architect, and software development in top tier technology organizations including TIBCO Software (Director of Professional Services) and Deloitte Consulting. Recently, Eric served as CTO of a pure-play SOA consultancy and is currently Chief Technologist and National Practice Director for Business Integration at Perficient Inc. (NASDAQ: PRFT). In his role with Perficient, Eric develops strategic plans for business integration and SOA for Fortune 500 companies. He is also responsible for the commercialization of Perficient’s methodology and software for SOA services delivery.
As an IT industry speaker and author, Eric has been invited to speak at numerous industry events and has written for leading industry publications. Eric earned his Bachelor of Science in Computer Systems from City University of Bellevue, Washington, and Master of Science in Management of Technology from the University of Maryland.
Perficient is a leading information technology consulting firm serving clients throughout the United States. We are experts in designing, building and delivering business-driven technology solutions. We help our clients gain competitive advantage by using Internet-based technologies to make their businesses more responsive to market opportunities and threats, strengthen relationships with customers, suppliers and partners, improve productivity and reduce information technology costs.