When SOA came on the scene, it promised to revolutionize how data is accessed within applications, across organizations and across the Web: basically, anywhere it was needed.
Promoting the ultimate reuse of data and harnessing the rapid data growth were other promises of SOA. Rather than duplicating data from one system to another, SOA provided cleaner ways to access the data directly and reuse it. It was supposed to turn spaghetti-like webs of disparate systems with one-off, proprietary interfaces into an orchestrated access layer that could ask for data from anywhere and put data back seamlessly, while being more agile to changing business demands.
While SOA has accomplished this, it has also created some new challenges. How is this new data “source” documented? How is it governed? Who’s accountable for maintaining quality and traceability to the back-end databases? At some point, the data in the SOA layer or enterprise service bus has to end up back in the database. Integrating and sharing data is problematic enough already; if no standards are applied in the SOA infrastructure, you can end up back where you started, with time and money wasted.
Data lives in more places than just databases. SOA has been invaluable in enabling its reuse and controlling the data redundancy that can plague organizations. The backbone of Web services and SOA is XML and, more specifically, XML schemas (XSD). XSD development still elicits images of the "wild, wild west" where you build whatever you need with very little thought about reuse and standards. For the most part, XSDs have been created and managed by developers, not data architects. Developers typically work on one project at a time and do not usually think about enterprise-wide standards, or about ensuring that data stored in one place is defined the same way as like data stored everywhere else.
As a result, not only can you have different representations of the same data in the SOA layer, but the version of the same data in the SOA layer can diverge greatly from data in source systems.
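One way to surface this kind of divergence is to scan the schemas for elements that share a name but declare different types. The following is only a minimal sketch under stated assumptions: the two schema files, their element names and the conflicting_elements helper are hypothetical and exist purely for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schemas owned by two different project teams.
SCHEMAS = {
    "orders.xsd": """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="CustomerId" type="xs:int"/>
  <xs:element name="OrderDate" type="xs:date"/>
</xs:schema>""",
    "billing.xsd": """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="CustomerId" type="xs:string"/>
  <xs:element name="OrderDate" type="xs:date"/>
</xs:schema>""",
}


def conflicting_elements(schemas):
    """Return {element name: {file name: declared type}} where the types disagree."""
    declared = defaultdict(dict)
    for filename, text in schemas.items():
        root = ET.fromstring(text)
        for element in root.iter(f"{XS}element"):
            name = element.get("name")
            if name:  # skip element references, which carry no name attribute
                declared[name][filename] = element.get("type")
    return {
        name: by_file
        for name, by_file in declared.items()
        if len(set(by_file.values())) > 1
    }


CONFLICTS = conflicting_elements(SCHEMAS)
for name, by_file in CONFLICTS.items():
    print(name, by_file)
```

In this sketch only CustomerId is reported, because one schema types it as an integer and the other as a string. A real check would also need to compare inline type definitions and namespaces, which are left out here.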
The XSD language also has different standards for how data is typed, and they provide a lot more freedom than database DDL. Precision and scale are optional on most data types. Maximum lengths differ between the XSD and database versions of like data types such as strings, dates and integers. Primary key, foreign key and check constraints are also treated differently. This can lead to drastic differences between the structure of the XSD and the back-end databases. If the source and target rules are not carried over to the XSD definition, it can cause many errors or, even worse, it can result in data loss as the payloads are messaged between systems.
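To make the typing gap concrete, the sketch below compares the length, precision and scale facets declared in an XSD with the limits of the matching database columns. It is an illustration, not a complete validator: the schema fragment, the column rules (imagined as coming from DDL such as VARCHAR(40) or DECIMAL(12,2)) and the helper names are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical XSD fragment: CustomerName is an unrestricted string,
# Region allows more characters than its database column.
XSD_TEXT = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="CustomerName" type="xs:string"/>
  <xs:element name="Balance">
    <xs:simpleType>
      <xs:restriction base="xs:decimal">
        <xs:totalDigits value="12"/>
        <xs:fractionDigits value="2"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <xs:element name="Region">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:maxLength value="50"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
"""

# Hypothetical back-end rules, e.g. taken from
# CustomerName VARCHAR(40), Balance DECIMAL(12,2), Region VARCHAR(20).
DB_COLUMNS = {
    "CustomerName": {"maxLength": 40},
    "Balance": {"totalDigits": 12, "fractionDigits": 2},
    "Region": {"maxLength": 20},
}

CHECKED_FACETS = ("maxLength", "totalDigits", "fractionDigits")


def xsd_facets(xsd_text):
    """Return {element name: {facet name: int}} for top-level elements."""
    root = ET.fromstring(xsd_text)
    result = {}
    for element in root.findall(f"{XS}element"):
        facets = {}
        for restriction in element.iter(f"{XS}restriction"):
            for facet in restriction:
                local = facet.tag.replace(XS, "")
                if local in CHECKED_FACETS:
                    facets[local] = int(facet.get("value"))
        result[element.get("name")] = facets
    return result


def find_mismatches(xsd_text, db_columns):
    """List (element, rule, reason) where the XSD is looser than the database."""
    facets_by_element = xsd_facets(xsd_text)
    problems = []
    for name, rules in db_columns.items():
        facets = facets_by_element.get(name)
        if facets is None:
            problems.append((name, "element", "missing from XSD"))
            continue
        for rule, limit in rules.items():
            declared = facets.get(rule)
            if declared is None:
                problems.append((name, rule, "unbounded in XSD"))
            elif declared > limit:
                problems.append(
                    (name, rule, f"XSD allows {declared}, database allows {limit}")
                )
    return problems


MISMATCHES = find_mismatches(XSD_TEXT, DB_COLUMNS)
for item in MISMATCHES:
    print(item)
```

Here CustomerName is flagged because the schema sets no maximum length at all, and Region is flagged because the schema accepts 50 characters where the column holds 20. Either payload would pass schema validation and then fail or be truncated at the database, which is the data-loss risk described above.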