Common Data Models, or CDMs, are something I've been pushing for years, ever since the EAI days. These days I'm seeing some good thinking around this problem, and some good technology behind it as well. Data is the single most important component of enterprise architecture and data integration, and you need to think carefully about how it is managed for both data integration and business intelligence. In essence, you need a common understanding of the data that spans the enterprise.
Truth be told, most enterprises have no clue what their core metadata is enterprise-wide, much less any sort of CDM for the enterprise. This is a good test for separating those that are ready to be successful with data integration from those that still need to do a great deal of work. Unfortunately, most need to do the work.
How do you build a CDM? It's really an enterprise metadata model that's fully normalized and compartmentalized, so that there is a single, complete, functional, and well-defined schema that spans the enterprise. Moreover, all security, logic, rules, and other important information are defined in the CDM. Creating a CDM is just a matter of understanding the existing data, including semantics, structure, integrity, rules, logic, and physical location, and then working through the logical and physical design of the CDM, including normalization and physical-to-virtual database abstraction.
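To make the mapping step a bit more concrete, here is a minimal sketch in Python of translating records from two source systems into one common entity. Everything in it is an assumption made for illustration: the `Customer` entity, the `crm` and `billing` system names, the field names, and the country-code rule are all hypothetical, and a real CDM would cover far more than a single entity.

```python
from dataclasses import dataclass


# Hypothetical canonical (CDM) entity; the fields are assumptions for illustration.
@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str
    country: str


# Hypothetical per-system mappings: canonical field -> source field.
FIELD_MAPS = {
    "crm": {"customer_id": "cust_no", "name": "full_name", "country": "ctry"},
    "billing": {"customer_id": "AccountId", "name": "AccountName", "country": "CountryCode"},
}


def to_canonical(system: str, record: dict) -> Customer:
    """Translate one source-system record into the canonical Customer shape."""
    mapping = FIELD_MAPS[system]
    values = {canon: str(record[src]).strip() for canon, src in mapping.items()}
    # Example rules kept with the common model rather than in each source system.
    values["country"] = values["country"].upper()
    if not values["customer_id"]:
        raise ValueError(f"{system}: record has no customer identifier")
    return Customer(**values)


crm_row = {"cust_no": "1001", "full_name": "Acme Corp ", "ctry": "us"}
billing_row = {"AccountId": 1001, "AccountName": "Acme Corp", "CountryCode": "US"}

# Both systems describe the same customer once they are expressed in the common model.
print(to_canonical("crm", crm_row) == to_canonical("billing", billing_row))
```

The point of the sketch is only that the semantics and rules live in one place; the source systems keep their own field names and formats.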
While this may sound like a lot of work, I've done it in weeks, as long as the information is there someplace. In some instances you have to do some reverse engineering, and that's really not a problem these days either.
If you don't have a CDM, you should create one. I dare you.













I completely agree with you, David; however, from my experience/PoV, delivering the business case never actually happens. This makes any CDM initiative a fantasy (dreams can come true, but fantasies never happen) :-)
The issue, IMHO, is that most organizations will (1) use consultants/contractors to build their CDM and therefore fail to embed the learning and understanding into the organization (no organizational memory), and (2) fail to put in place the processes, policies, controls, measures, and rewards necessary to embed the management, maintenance, usage, and exploitation of a CDM into Business as Usual (BAU) processes.
For me, until we can embed the understanding and importance of data/information into the organization's culture (e.g. data management, data quality, data governance, metadata management, CDM, et al. are seen as core parts of organizational infrastructure) by IT AND the Business (and their partners), information exploitation will, alas, remain a fantasy for most…
Excellent post... and I agree. The reality is that most enterprises have very little understanding of their DATA, let alone (if they're lucky) any descriptions or rules applied to said data (e.g. metadata). It's mainly an issue of controlling the chaotic influx of data and its growth (especially the EASE of growth these days). Taking the time to document whereabouts/use/descriptions/rules/system use/at rest or in use/etc. is a dream for many organizations.
Also, I'd be careful about socializing the term "CDM" for *C*ommon Data Model... as it's typically been reserved for *C*onceptual Data Modeling, e.g. leveraging various non-standardized notations like CASE*Method/Barker and/or less structured approaches... up to and including boxes and lines in PowerPoint! This is the often-overlooked layer above the other often-overlooked layer, logical modeling. We all know that a DBA can show us a reverse-engineered database model and say: Voila! Our enterprise model (yet it's a snapshot of an atrociously architected physical database design). A core/common logical enterprise model (or a collection of generalized logical models), described in REAL ENGLISH and designed to be understood by many constituencies, technical and non-technical alike, is what's most commonly used for the implementations you describe above.
Again, great post. Speaks to my heart! Ha!
Greg
www.embarcadero.com
http://metafrequency.blogspot.com
David,
I'd love to think that this was possible, but the oil business has tried for nearly 20 years to create a common data model. I see that you worked at Mobil, so you must have come across POSC, EPICENTRE, PPDM, OpenWorks, CDS, Seabed et al. Not one of them has been able to deliver an implementable, enterprise-wide model. Now admittedly the oil business tried to create a CDM by committee, and we all know how well committees work (or, more particularly, fail to work), but if it were possible, wouldn't one of the major software/data vendors in the oil space have delivered something? Well, they both tried and neither has truly succeeded.
So I think the feasibility of creating a CDM depends on the complexity of the enterprise you need to model. Maybe the oil business is more complex than most, but in my line of work anyone who suggests creating a CDM for an oil company has little or no experience of the industry and will quickly lose whatever professional respect they may have earned.
That is not to say that I don't think we need a CDM, just that the not inconsequential intellectual exercise of creating one is a fraction of the political exercise of getting it accepted, and an even smaller fraction of the technical exercise of getting it rolled out into the business.
Ian
David, yes, I agree you could build a functional CDM or MDM for structured information (i.e. information that lives in a DB), but for unstructured information (about 80% of the information that enterprises possess), well, let me just say best of luck ;)
Moreover, for more dynamic information (e.g. RSS feeds, wikis), which is on the up and up, CDM or MDM does not work well at all!
I think the idea is great and should be implemented worldwide. It gives an opportunity to integrate systems whenever required, and it makes data sharing easier. However, it should be properly tested before any action is taken on it.
Maybe I'm alone in thinking this, but IMHO there is all the difference in the world between CDM and MDM. MDM's strength is that it doesn't have to imply a massive common data model. The real role of MDM is to store the master copy of the 'Dublin core' metadata attributes. And nothing more. So the MDM is simply that: the data store you are 100% certain is correct (and you manage its data quality with the utmost care). Meanwhile, other applications can look over their shoulders and refer back to the MDM as necessary. The DI is the bit that allows the 'look over a shoulder' to work between different metadata models.
To give an example: In upstream E&P, for so many workflows, the fundamental entity is the 'well.' So what something like PPDM defines is a good solid place to keep track of your wells - where they are, what they're called, and a bunch of reliable descriptive attributes (good old 'Dublin core' metadata). How people and applications describe the 'same' wells in relation to Lease Bids, Seismic, Drilling or PDEN is, unfortunately, very variable - which is where your MDM should come in and rescue you from all that ambiguity.
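As a rough illustration of that 'look over the shoulder' idea, here is a small Python sketch of a well master store with per-domain aliases. It is not PPDM's schema or any vendor's API; the `WellMaster` and `WellRegistry` classes, the attribute set, the domain names, and the sample well are all hypothetical and chosen only to show the shape of the lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WellMaster:
    """Master record: the one trusted copy of a well's core attributes (illustrative fields)."""
    well_id: str
    name: str
    latitude: float
    longitude: float


class WellRegistry:
    """Hypothetical master store plus a cross-reference of domain-specific names."""

    def __init__(self):
        self._wells = {}    # master well_id -> WellMaster
        self._aliases = {}  # (domain, normalized local name) -> master well_id

    def register(self, well):
        if well.well_id in self._wells:
            raise ValueError(f"duplicate master record: {well.well_id}")
        self._wells[well.well_id] = well

    def add_alias(self, domain, local_name, well_id):
        if well_id not in self._wells:
            raise KeyError(f"unknown master well: {well_id}")
        self._aliases[(domain, local_name.strip().lower())] = well_id

    def resolve(self, domain, local_name):
        """Map whatever a domain calls the well back to the single master record."""
        well_id = self._aliases[(domain, local_name.strip().lower())]
        return self._wells[well_id]


registry = WellRegistry()
registry.register(WellMaster("W-0001", "Example 1", 29.75, -95.36))
registry.add_alias("drilling", "EX-1 ST0", "W-0001")
registry.add_alias("seismic", "example_1", "W-0001")

# Two domains use different names, but both resolve to the same master record.
same = registry.resolve("drilling", "ex-1 st0") is registry.resolve("seismic", "Example_1")
print(same)
```

Each application keeps its own naming; only the small cross-reference and the master attributes need to be managed with care.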
So an MDM is a 'little thing': very targeted, very achievable, and very valuable. CDM - huh - sounds like a megalithic monster to me.