Data is any company's lifeblood. If data can't be accessed, arrives slowly,
or is of poor quality when it does arrive, the company pays the
price. SOA provides access points for common functions so that they can be reused
in multiple business processes throughout an enterprise, but the essence of
what those processes are sharing is data. One of the key benefits of embarking
on SOA is that you can treat data sources and applications that store and act
on data as services and combine them into composite applications. This provides
the company with unparalleled data access, efficiency and resiliency to change.
The problem: you then become dependent on the quality of the source data, often
with limited insight into how that data has been defined and what its limitations
are. While services that participate in SOA are supposed
to be self-describing, there are no standards for how deeply the real meaning
of the data is described. As an example, if a customer's name is entered into
the system and an address is requested, that data could easily reside in a dozen
different data silos. Each one could have a slightly different view of what
a customer means. While one would return a company's Texas location as the address
for that customer, another might return the company's California address and
still another might return the CEO's home address. Data governance, including
management of metadata descriptions, is the key to knowing which address is
the right one to return for this particular business process requestor and for
hundreds of other similar situations.
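The conflict can be made concrete with a small sketch. The silos, customer name and addresses below are all hypothetical:

```python
# Illustrative only: three hypothetical data silos, each holding a
# different "address" for the same customer.
crm_silo      = {"XYZ Corp": "100 Commerce Blvd, Austin, TX"}   # Texas plant
billing_silo  = {"XYZ Corp": "200 Harbor Way, San Jose, CA"}    # California office
contacts_silo = {"XYZ Corp": "1 Oak Lane, Palo Alto, CA"}       # CEO's home

def naive_lookup(silo, customer):
    """A service that simply 'returns the address' -- whatever that
    happens to mean to the silo it is wired to."""
    return silo[customer]

# Three services, three different answers to the same question:
answers = {naive_lookup(s, "XYZ Corp")
           for s in (crm_silo, billing_silo, contacts_silo)}
print(len(answers))  # 3 -- one customer, three conflicting addresses
```

Without governance over what "customer address" means, each silo's answer is equally plausible and equally ungoverned.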
Even that example assumes all three addresses are complete and correct. Another
aspect of data governance is data quality monitoring. Data quality is of extreme
importance to every business process and to a company's success as a whole.
In a SOA, data quality is even more essential. Any errors in the data will be
visible globally across the enterprise by any consumer that uses the service
that pulls information from the faulty data source. Using the above example,
if invoices, bills or products are consistently sent to the wrong address, the
company will lose a lot of business. Data quality evaluations to find anomalous
data, and manual or automated remediation of that data, must be an integral
part of any really useful SOA plan. The consumer needs to be able to trust that
the data they request from the service will be both correct and relevant to
their current need.
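A minimal sketch of such an automated quality evaluation, with invented records and purely illustrative rules:

```python
# Sketch of an automated data quality check: flag records whose address
# fields are missing or invalid before they reach service consumers.
def find_anomalies(records):
    """Return (record, reason) pairs that fail simple completeness and
    validity rules. The rules here are illustrative, not a real rule set."""
    anomalies = []
    for rec in records:
        if not rec.get("street") or not rec.get("city"):
            anomalies.append((rec, "incomplete address"))
        elif rec.get("state") not in {"TX", "CA"}:  # known locations only
            anomalies.append((rec, "unknown state"))
    return anomalies

records = [
    {"customer": "XYZ Corp", "street": "100 Commerce Blvd", "city": "Austin", "state": "TX"},
    {"customer": "XYZ Corp", "street": "", "city": "San Jose", "state": "CA"},
    {"customer": "XYZ Corp", "street": "1 Oak Lane", "city": "Palo Alto", "state": "ZZ"},
]
bad = find_anomalies(records)
print(len(bad))  # 2 records flagged for remediation
```

Flagged records could then be routed to manual review or automated remediation before any service exposes them.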
Data as a Service
The need for the data to be relevant to the consumer might encourage a SOA
designer to tightly couple the source of the data with the specific service
that uses it. This can create a brittle situation if services dependent on a
particular data source are unable to adapt when data sources change over time,
or fail to give a global view of the data. A better way is for the data itself
to become a service. Encapsulating the data into a service used by multiple
processes helps with standardization and prevents data duplication. It also
allows the data to slip seamlessly into the composite business processes and
applications as just another service. This concept of Data as a Service provides
data access to any business process in any part of the company, giving the most
efficient possible data flow as well as resiliency over time.
However, delivering data as a service is a challenge in its own right. To participate
in SOA flows, a robust delivery service, exposed in a way that's consumable
as a service but also reliable, is an absolute requirement. And you must be
able to combine data from multiple sources for it to be useful in a SOA context.
If conventional code is used to connect all the hundreds or even thousands of
data sources in an enterprise, limited SOA benefit will likely be seen in the
long run because of the challenge and cost of maintaining those brittle connections.
To avoid that, data services should be built on a middleware platform that
connects to many sources, preferably all of the data sources in the enterprise
from the legacy mainframe COBOL application in the basement to the new SaaS
CRM application in the cloud. Yes, it is possible to incorporate SaaS applications
into an integrated SOA initiative. Internally, there will be less control over
the SaaS application's metadata and content, but it can still be accommodated
by a flexible integration platform. That integration layer should also adapt
easily to change, since it will be the touch point for volatile information
sources. It will have to be able to accommodate complex processes as a unit,
and be compatible with standard SOA service technologies like SOAP, XML and
WSDL. That may seem like a lot to ask, but integration platforms with that level
of flexibility and power exist, and the cost of custom-coding brittle end points,
which must then be constantly repaired, far outweighs the license cost of such a platform.
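As an illustration, a data service's payload might wrap a record drawn from any backend in plain XML that standard SOAP-era tooling can consume. The element names and helper function below are invented for the sketch, not a real service contract:

```python
# Sketch: shaping a customer record, from any backend source, into an
# XML payload a SOA consumer could parse. Element and attribute names
# are invented; a real service would follow its WSDL contract.
import xml.etree.ElementTree as ET

def customer_as_xml(name, addresses):
    """Wrap a customer record in a consumable XML payload."""
    root = ET.Element("customer", attrib={"name": name})
    for purpose, addr in addresses.items():
        loc = ET.SubElement(root, "location", attrib={"purpose": purpose})
        loc.text = addr
    return ET.tostring(root, encoding="unicode")

payload = customer_as_xml("XYZ Corp", {
    "manufacturing": "100 Commerce Blvd, Austin, TX",
    "support": "200 Harbor Way, San Jose, CA",
})
print(payload)
```

The point of the middleware layer is that this shaping happens once, at the platform, rather than being hand-coded for every source.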
Once that's addressed, you need to think about how to expose and manage relevant
metadata. The essential task of metadata is to clarify for everyone using the
Data as a Service what that data actually means. To return to the previous example,
instead of simply having that record defined as "customer," it would
make the data far more reliable and relevant if it were defined as, for instance,
a person, with various fields or metadata properties identifying that person
as, among other things, a customer with the job title of CEO at corporation XYZ,
which has two locations, one in Texas and one in California. Then, the issue
of defining a location becomes relevant. The Texas branch might have multiple
buildings and might be the manufacturing branch of that company, while the California
branch is the service and support branch. This level of metadata would give
the consumer what is needed in order to know which address to return to the
requestor for an invoice, a support request or a letter to the CEO.
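That metadata-driven selection might be sketched roughly as follows; the record layout, role names and purpose-to-role mapping below are all stand-ins for whatever a real governance policy would define:

```python
# Hypothetical metadata-rich record: a person who is, among other
# things, a customer, tied to a corporation with two locations that
# play different roles.
record = {
    "type": "person",
    "roles": ["customer"],
    "job_title": "CEO",
    "company": {
        "name": "XYZ Corp",
        "locations": [
            {"state": "TX", "role": "manufacturing",
             "address": "100 Commerce Blvd, Austin, TX"},
            {"state": "CA", "role": "service-and-support",
             "address": "200 Harbor Way, San Jose, CA"},
        ],
    },
    "home_address": "1 Oak Lane, Palo Alto, CA",
}

def address_for(record, request_purpose):
    """Use the metadata to pick the right address for the requestor.
    The purpose-to-role mapping is a stand-in for a governance policy."""
    if request_purpose == "letter-to-ceo":
        return record["home_address"]
    role = {"invoice": "manufacturing",
            "support": "service-and-support"}[request_purpose]
    for loc in record["company"]["locations"]:
        if loc["role"] == role:
            return loc["address"]

print(address_for(record, "support"))  # the California service branch
```

The same record now answers all three requests correctly, because the metadata carries enough meaning to disambiguate them.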
A related issue is granularity. If you present very fine-grained services,
users must contend with building their own complex flows. Adding a layer of
more complex combined services that are themselves packaged as a service can
give the consumer a robust, easy-to-use application. For example, serving up
a complete order processing application would be far more useful than serving
up each element separately. Similarly, consolidating data services will give
a more global view of data and prevent duplication and conflicts between sources.
From a data governance perspective, the more granular the data service, the
more problematic governance becomes. Consumers must understand each individual
data element, which makes unintentional updates, misuse or insertion of
incorrect data more likely. With a less granular approach, appropriate integration
technology can be leveraged to create more sophisticated, robust services that
are more reliable in a SOA environment.
Data governance of the service metadata is also crucial, as that will be the
main means to manage the plethora of existing services and any new services
or composites that come along. A good metadata management tool, preferably one
compatible with the integration middleware used to tie in the end points, will
help with the most integral aspects of change management for data services and
other services alike: lineage and impact analysis. Knowing how a particular
value came to be, where it originated and what the impact would be on other
systems of altering that data source can give businesses the edge they need
to keep running smoothly.
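Impact analysis over such lineage metadata amounts to a reachability query on a dependency graph; the sources and services below are invented for illustration:

```python
# Sketch: lineage recorded as edges from data sources to the services
# and processes that consume them. Names are illustrative.
lineage = {
    "mainframe_db":     ["customer_service"],
    "saas_crm":         ["customer_service", "marketing_service"],
    "customer_service": ["order_processing", "billing"],
}

def impact(node, graph):
    """Everything downstream of `node`: what is affected if it changes."""
    seen = set()
    stack = [node]
    while stack:
        for child in graph.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

print(sorted(impact("mainframe_db", lineage)))
# ['billing', 'customer_service', 'order_processing']
```

Reversing the edges answers the lineage question instead: where did this value originate?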
Think Globally, Act Pragmatically
To bring all the needed elements of good data governance and solid infrastructure
together, a SOA initiative should start with a good road map: a global plan
for the vision of the enterprise level result desired, including authoritative
data sources, flows, policies and governance. But trying to build all of the
integration end points, service wrappers and messaging systems for an entire
enterprise simultaneously isn't very practical. Global big bang projects have
a historically high rate of failure.
The way to get immediate buy-in and immediate ROI is to show results quickly.
Start with a single pressing business issue and solve it with that overall road
map in mind. Leverage the integration middleware to create microflows that pull
together existing, functional stop-gap integration measures with the more robust
new technologies into cohesive processes, and then expose those as services.
This prevents large amounts of time being spent replacing technologies that
are working and allows the focus to fall on areas that genuinely need help.
As time permits, the older point-to-point interfaces, stored procedures and
other interim measures can be replaced as needed, and other business processes
can be pulled into microflows and exposed as services. Taking one step at a time,
with each step reaping benefit in its own right, will get the enterprise to the
goal of an integrated SOA environment in the shortest possible time with the
fastest return on investment. Just make certain that governance of that
indispensable resource, the company's data, stays front of mind from day one.
About the Authors
David Inbar is Director of Marketing and International Alliances for
Pervasive Integration Products. He has more than 20 years of sales and marketing experience as a consultant and senior executive in the
software industry in the U.S. and Europe. Inbar has extensive expertise
in DBMS, application development and process management, and holds an MBA and a Master's in Electrical Engineering.
Paige Roberts is Technical Content Developer for Pervasive Integration Products. She has worked in the data integration industry for the past twelve years as a support technician, technical writer, trainer, software developer, and consultant.
Pervasive Software (NASDAQ: PVSW) helps companies get the most out of their data investments through embeddable data management and agile data integration software. Pervasive's multi-purpose data integration platform accelerates the sharing of information between multiple data stores, applications, and hosted business systems and allows customers to re-use the same software for diverse integration scenarios. The embeddable PSQL database engine allows organizations to successfully embrace new technologies while maintaining application compatibility and robust database reliability in a near-zero database administration environment. For more than two decades, Pervasive products have delivered value to tens of thousands of customers in more than 150 countries with a compelling combination of performance, flexibility, reliability and low total cost of ownership. Through Pervasive Innovation Labs, the company also invests in exploring and creating cutting edge solutions for the toughest data analysis and data delivery challenges. For additional information, go to www.pervasive.com.