Is Dirty Data Becoming a Showstopper for SOA?

Joe McKendrick: Is dirty data becoming a showstopper for SOA? SOA projects have typically focused on application services, with data treated as an afterthought. However, if data is outdated, unqualified, or duplicated, SOA-enabled services may be delivering dirty data even faster and to more users than before.

7 Replies

  • Dirty data is an impediment to the proper operation of the enterprise. But dirty data can be packaged very well in SOA services, so I don't think dirty data stops SOA.
    It is true, though, that data has to be normalised and standardised so that services understand it. For that, data transformation and MDM initiatives are worth considering for SOA.

  • So true, but on the other hand, SOA may be a golden opportunity to get the benefits from data quality tools that we haven’t managed to achieve with the technology and approaches seen until now.

    Data Quality functionality deployed as SOA components has a lot to offer:
    • Reuse is one of the core principles of SOA. Having the same data quality rules applied to every entry point of the same sort of data will help with consistency.
    • Interoperability will make it possible to deploy data quality prevention as close to the root as possible.
    • Composability makes it possible to combine functionality with different advantages – e.g. combining internal checks with external reference data.

    Over the last few years I have been on projects implementing data quality as SOA components. The results seem very promising so far.
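As a rough illustration of the reuse and composability points above, here is a minimal sketch in Python. Everything in it is hypothetical and not taken from any particular data quality product: the rule helpers `require` and `in_reference`, the `compose` function, and the `COUNTRY_CODES` set standing in for external reference data. The idea is simply that one composed check is shared by every entry point that handles the same sort of data.

```python
from typing import Callable, Dict, List, Optional, Set

Record = Dict[str, str]
# A rule returns an error message, or None when the record passes.
Rule = Callable[[Record], Optional[str]]


def require(field: str) -> Rule:
    """Internal check: the field must be present and non-blank."""
    def rule(record: Record) -> Optional[str]:
        if not record.get(field, "").strip():
            return f"{field} is missing"
        return None
    return rule


def in_reference(field: str, reference: Set[str]) -> Rule:
    """External check: the field must appear in a reference data set."""
    def rule(record: Record) -> Optional[str]:
        value = record.get(field, "")
        if value not in reference:
            return f"{field} '{value}' not in reference data"
        return None
    return rule


def compose(*rules: Rule) -> Callable[[Record], List[str]]:
    """Combine rules into one reusable check that returns all failures."""
    def check(record: Record) -> List[str]:
        return [msg for rule in rules if (msg := rule(record)) is not None]
    return check


# Hypothetical stand-in for an external reference data service.
COUNTRY_CODES = {"DK", "US", "DE"}

# One shared definition of "a clean customer record".
check_customer = compose(require("name"), in_reference("country", COUNTRY_CODES))


def web_signup(record: Record) -> List[str]:
    """Entry point 1: interactive sign-up uses the shared check."""
    return check_customer(record)


def batch_import(records: List[Record]) -> List[List[str]]:
    """Entry point 2: bulk load uses the very same check."""
    return [check_customer(r) for r in records]
```

Because both entry points call `check_customer`, a change to the rules takes effect everywhere at once, which is the consistency benefit the reuse bullet describes.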

  • Garbage In... Garbage Out; even when the garbage flows through a platinum pipe.

    SOA is that platinum pipe.

    A benefit of SOA is that it might help accelerate the realization of dirty data and raise the visibility of the problem to a level such that something positive might be done about it.

    SOA is that hope.

  • I agree that a SOA initiative represents a golden opportunity for cleaning up that dirty data that you may have been hoarding for decades. Sort it into three piles:

    1. Keep the existing assets and adapt to them;
    2. Extract and rationalize the data, or virtualize it into a more lightweight or better organized asset;
    3. Throw it away - maybe it's not needed anymore (wouldn't that be nice...)

    We see companies accomplish this rationalization by observing the actual usage of data and transactions over a window of time, then making decisions about what data sources are bogging down the progress of development and testing. Those dependencies can be virtualized for predictability and stability while the dirty jobs of moving and rationalizing data are going on.
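To make the three piles concrete, here is a minimal Python sketch of sorting data sources by observed usage. The function name `triage`, the inputs (read counts and error rates per source over the observation window), and the 5% error threshold are all assumptions chosen for illustration; real criteria would come from the organization's own observations.

```python
from typing import Dict


def triage(
    usage_counts: Dict[str, int],
    error_rates: Dict[str, float],
    max_error_rate: float = 0.05,  # hypothetical threshold
) -> Dict[str, str]:
    """Sort each data source into keep / rationalize / discard.

    usage_counts: reads observed per source during the window.
    error_rates:  fraction of bad records per source (0.0 if unknown).
    """
    piles: Dict[str, str] = {}
    for source, reads in usage_counts.items():
        if reads == 0:
            # Nobody touched it during the window: candidate for pile 3.
            piles[source] = "discard"
        elif error_rates.get(source, 0.0) > max_error_rate:
            # Used, but too dirty to keep as-is: pile 2.
            piles[source] = "rationalize"
        else:
            # Used and clean enough: pile 1.
            piles[source] = "keep"
    return piles


observed = triage(
    usage_counts={"crm": 1200, "legacy_orders": 40, "old_fax_log": 0},
    error_rates={"crm": 0.01, "legacy_orders": 0.30},
)
```

A source with zero reads is only a candidate for removal; a window that misses year-end or seasonal jobs would misclassify it, so the window length matters as much as the thresholds.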

  • Dirty data is one of the many reasons why service-oriented architectures (SOA) are so powerful. Gartner studies over the last decade have demonstrated that dirty data “leads to significant costs, such as higher customer turnover, excessive expenses from customer contact processes like mail-outs and missed sales opportunities.” In this day and age, there can be no doubt that the ones and zeros sitting in your databases are corrupted. But what do you do about it?

    Many have suggested that this is an IT issue: the fact that data assets are inconsistent, incomplete, and inaccurate is somehow the responsibility of those responsible for administering the technology systems that power our enterprises. Their solution seems to suggest that the only real way to solve the problem is with a “reset” of the data supply chain: retool the data supply chain, reconfigure the databases, do a one-time scrub of ALL data assets, and set up new rules that somehow prohibit corrupting activities. At best, this has been shown to be a multi-million dollar, multi-year activity for Fortune 2000 class companies. At worst, it is a mere pipe dream that will never occur.

    A more practical solution can be found in SOA, specifically Dirty Data Modernization Services (DDMS). These are highly tailored temporal services designed around the specific Digital Signatures of the dirty data in question. For example, Dirty Data Identification Services use artificial intelligence to identify and target corrupt data sources. Dirty Data Transformation Services use ontological web-based algorithms to transform bad data into better data (not correct data). Other services like Accuracy and Relevance Services can be used on an ongoing basis to aid in mitigating the inclusion of bad or dirty data.

    Human beings, by our nature, do not like change. We often look to rationalize away doing the hard things in life, rather than justifying the discomfort that comes with meaningful change. Dirty data is just one of the reasons you can use if you truly don’t want to get on with a different, often better, solution paradigm. So, rather than treat dirty data as a showstopper, look to it as a catalyst for real, meaningful enterprise change.

  • Misinformation and bad data have been haunting and hobbling organizations -- not to mention entire societies -- since the dawn of time. Nowadays, of course, information travels at the speed of light, and SOA -- enabling interoperability between all types of applications -- becomes the latest mode of transport.

    Data governance needs to be front and center in SOA discussions, because SOA governance ensures that the enterprise is behind the effort. Likewise, data governance helps ensure that the correct version of data is being deployed within the architecture.

    Just as the Internet has shown itself to be a speed-of-light carrier of rumors, gossip, and misinformation, so can a service-oriented architecture within an organization be a speed-of-light carrier of problematic data. If a SOA-based architecture is delivering inaccurate or bad data, it is a very ineffective SOA indeed. Data is often a last consideration in SOA planning, but SOA really won't function properly if it's delivering bad data.

    Neal Fishman, in his book "Viral Data in SOA: An Enterprise Pandemic," likens the effect to a mosquito. A mosquito's claim to fame is that it can pick up viruses and bacteria from any type of organism, and deliver the payload to any other type of organism on the planet. A human and a deer and a bird may not have much in common, but they can all share the same diseases.

    Earlier this year, we saw the founding of the SOA Data Integration Architect Community (SDIAC), an online community focused on the value of data integration and data services in agile architectures such as SOA, promising to bring the data and SOA worlds closer together. The effort is being led by Dave Linthicum, an active member of the ebizQ community here as well. Such an organization will definitely bring support to SOA proponents struggling to address data quality issues, and data architects seeking to service-enable their environments.

  • Dirty data shouldn't stop anything if you handle it correctly. True, garbage in... garbage out, but there are so many things you can do to ensure dirty data can still be of use...

    So to answer, NO...
