Data Access in SOA Environments

Editor's Note: Learn how to best succeed on your SOA journey right here.



Although SOA is different from traditional architectures, applications in SOA environments still need to access and use data. And it's often SOA experts, not data experts, who design these applications.

As a result, performance issues often appear when applications are deployed. Here are the main ways you can ensure that your data applications perform well in SOA environments.

Data access guidelines for SOA environments

The following checklist helps your database applications perform well in SOA environments:

1. Involve data experts in addition to SOA experts.
2. Decouple data access from business logic.
3. Design and tune for performance.
4. Consider data integration.

Tip #1: Involve data experts in addition to SOA experts.

SOA guidelines are defined by SOA architects, who do a good job of creating and managing reusable services that represent business logic. But SOA architects aren't experts at databases or data access. SOA is about business agility, and it helps achieve that agility by allowing you to build services that multiple applications can reuse. That reuse is what exposes data access code that was designed with only one application in mind.

For example, suppose you design a service to be used in an application that typically has no more than 50 users. When the application is deployed, the performance of the service remains good until other applications are deployed that start to use that same service. Quickly, there are ten times as many users as the service was designed for, and performance takes a nosedive.

This is a problem that occurs over and over in real-world SOA service design -- a service that performed well when it was first deployed breaks down as other applications begin to use it. Designing a service that performs well for 500 users is different from designing one that performs well for 50 users.
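A rough back-of-the-envelope sketch shows why the jump from 50 to 500 users matters. It assumes, purely for illustration, that every request holds a database connection for 20 ms, that the service has 5 connections, and that requests arriving at the same moment are served in waves. The function name and all of the numbers are hypothetical, not measurements.

```python
import math


def worst_case_wait_ms(concurrent_users, connections, hold_ms):
    """Time the last request in a burst waits before it gets a connection."""
    # Requests are served in waves of `connections` at a time; the last
    # request has to wait for every earlier wave to finish.
    waves_ahead = math.ceil(concurrent_users / connections) - 1
    return waves_ahead * hold_ms


for users in (50, 500):
    wait = worst_case_wait_ms(users, connections=5, hold_ms=20)
    print(f"{users} users, 5 connections: last request waits {wait} ms")
```

Under these assumptions the last of 50 simultaneous requests waits 180 ms, while the last of 500 waits 1,980 ms. Nothing in the service changed; only the number of callers did.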

Tip #2: Decouple data access from business logic.

In both traditional architectures (such as object-oriented architectures) and SOA environments, applications depend on technologies such as ODBC, JDBC, and ADO.NET for access to data stored in databases. In traditional architectures, data access code is contained within the application.

Even when using an object-relational mapping tool such as Hibernate to abstract the data layer, data access code remains within the application. This tightly coupled method works because the applications aren't designed to share components with other applications (although code is routinely copied and propagated to other applications). When changes occur that affect data access, the code must be updated everywhere it appears.

In SOA environments, services are designed to be reusable, but we often find that data access has been implemented in the same way it always has, using the familiar, tightly coupled method. Data access code is built into each service that requires access to the database.

Building data access dependencies into services produces the following bad side effects:

  • It forces your business logic experts to become data access experts.
  • It results in complicated deployment scenarios that are hard to maintain.
  • It reduces scalability and performance.

Suppose that you discover a tip that will speed up the performance of a service you've been developing. The next day, you go into work and implement that change in your service. With careful testing, you realize that the change has cut the response time of your service in half and allowed it to scale to many more users.

This is a great benefit for your service, but can you implement the same tip in the thousands of other services that are deployed in your company? Achieving business agility, the real value of SOA, becomes more difficult when you have to modify many services to achieve the same goal across the board.
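One way to picture the decoupled alternative is a shared data access component that services call instead of embedding SQL themselves. The sketch below is only an illustration under assumptions: the `OrderData` class, the two services, and the `orders` table are hypothetical names, and an in-memory SQLite database stands in for whatever database and driver you actually use.

```python
import sqlite3


class OrderData:
    """Hypothetical data access component shared by every service."""

    def __init__(self, connection):
        self._conn = connection

    def find_order(self, order_id):
        # The only place that knows the SQL and the table layout.
        row = self._conn.execute(
            "SELECT id, customer, total FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        if row is None:
            return None
        return {"id": row[0], "customer": row[1], "total": row[2]}


def order_status_service(data, order_id):
    """Business logic: contains no SQL and no driver calls."""
    order = data.find_order(order_id)
    if order is None:
        return "unknown order"
    return f"order {order['id']} for {order['customer']}: {order['total']:.2f}"


def discount_service(data, order_id, percent):
    """A second service reusing the same data access component."""
    order = data.find_order(order_id)
    return round(order["total"] * (1 - percent / 100), 2)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.execute("INSERT INTO orders VALUES (1, 'Acme', 120.0)")
data = OrderData(conn)
print(order_status_service(data, 1))
print(discount_service(data, 1, 10))
```

With this shape, a performance improvement made inside `find_order` reaches every service that calls it, rather than having to be repeated service by service.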

Tip #3: Design and tune for performance.

Although general data access performance tips also apply to your data access code in SOA environments, here are a few that are particularly important for this type of architecture:

  • Reusable services imply multiple users making many connections -- the perfect scenario for connection pooling. Any service with many users that is called often will fail to perform adequately without connection pooling.
  • Reusable services imply that the same statements are executed multiple times -- the perfect scenario for using statement pooling.
  • Be aware that each service that accesses the data services layer (DSL) may have different requirements. For example, one service may retrieve large objects and require tuning for this use, whereas another may load bulk data into tables and require a different tuning approach. Therefore, it's important for your database driver to be tunable.
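The pooling ideas in the list above can be sketched in a few lines. This is a minimal illustration, not a production pool: the `ConnectionPool` class is hypothetical, SQLite stands in for a real database, and sqlite3's per-connection `cached_statements` setting is used as a rough stand-in for driver-level statement pooling. It also serves as an example of a tunable driver option.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Minimal fixed-size pool; real drivers and pool libraries do much more."""

    def __init__(self, database, size, cached_statements=128):
        self._idle = queue.Queue(maxsize=size)
        self.opened = 0
        for _ in range(size):
            # cached_statements is sqlite3's per-connection statement cache,
            # used here as a stand-in for driver-level statement pooling.
            conn = sqlite3.connect(
                database,
                check_same_thread=False,
                cached_statements=cached_statements,
            )
            self._idle.put(conn)
            self.opened += 1

    @contextmanager
    def connection(self, timeout=5.0):
        conn = self._idle.get(timeout=timeout)  # blocks until one is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing it


pool = ConnectionPool(":memory:", size=2)
results = []
for i in range(100):
    with pool.connection() as conn:
        # Same parameterized SQL text each time, so the cached prepared
        # statement can be reused rather than re-parsed.
        results.append(conn.execute("SELECT ? + 1", (i,)).fetchone()[0])
print(pool.opened, "connections served", len(results), "requests")
```

Here 100 requests are served by 2 physical connections. A service that opened and closed a connection per request would pay the connection cost 100 times.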

Tip #4: Consider data integration.

Most companies start implementing SOA slowly, designing simple services that do simple things. For example, the scope of a first effort may be to design a service that looks up an order using an order ID. As developers become more comfortable with SOA, they design services that are more complex. For example, a service that handles the following workflow requires access to different data sources:

  • Retrieves an incoming Electronic Data Interchange (EDI) order
  • Validates client information stored in a database
  • Retrieves the customer number from the database
  • Submits an order to a DB2 database using the customer number
  • Sends a receipt to the client using EDI

Sequentially processing all the data involved in these steps can involve tremendous overhead. Comparing or converting data in different formats requires code to marshal the data from one format to another. Typically, this code changes the data from the XML data model to the relational data model and vice versa.

Eventually, all data used by the service is marshaled to the XML format to create an XML response to a Web Service call. Retrieving all this data from disparate sources can require many network round trips and multiple transformation layers to marshal the data.
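To make the marshaling step concrete, here is a small sketch of the hand-written conversion code this approach tends to require, in both directions. The `orders` table, the element names, and the two helper functions are assumptions chosen for illustration, and an in-memory SQLite database stands in for the real data source.

```python
import sqlite3
import xml.etree.ElementTree as ET


def store_order_xml(conn, xml_text):
    """Marshal an incoming XML order into a relational row."""
    order = ET.fromstring(xml_text)
    conn.execute(
        "INSERT INTO orders (id, customer_number, total) VALUES (?, ?, ?)",
        (
            int(order.get("id")),
            order.findtext("customerNumber"),
            float(order.findtext("total")),
        ),
    )


def order_to_xml(conn, order_id):
    """Marshal one relational row back into an XML response element."""
    row = conn.execute(
        "SELECT id, customer_number, total FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    if row is None:
        return None
    order = ET.Element("order", id=str(row[0]))
    ET.SubElement(order, "customerNumber").text = str(row[1])
    ET.SubElement(order, "total").text = f"{row[2]:.2f}"
    return order


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_number TEXT, total REAL)"
)
store_order_xml(
    conn,
    '<order id="7"><customerNumber>C-1001</customerNumber><total>59.5</total></order>',
)
response = ET.tostring(order_to_xml(conn, 7), encoding="unicode")
print(response)
```

Every additional data source or format in the workflow adds another pair of functions like these, which is where the transformation layers and their overhead come from.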

Let's think about this differently. Most SOA architectures use XML-based requests and responses. XQuery is a query language for XML. Some XQuery products allow you to query data from XML documents or from any data source that can be viewed as XML, such as relational databases. With this type of solution, you can query almost any data source as though it were XML, regardless of how the data is physically stored.
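The idea of querying relational data as though it were XML can be sketched without an XQuery engine. The example below is a stand-in, not XQuery: Python's standard library has no XQuery support, so it uses the limited XPath subset in `xml.etree.ElementTree` over an XML view of a table. The `table_as_xml` helper and the `customers` table are hypothetical, and a real XQuery product would presumably avoid materializing the whole table as this sketch does.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_as_xml(conn, table, columns):
    """Expose a relational table as an XML tree (hypothetical helper)."""
    root = ET.Element(table)
    # Table and column names are interpolated directly, so this sketch is
    # only safe for trusted, hard-coded identifiers.
    for row in conn.execute(f"SELECT {', '.join(columns)} FROM {table}"):
        item = ET.SubElement(root, "row")
        for name, value in zip(columns, row):
            ET.SubElement(item, name).text = str(value)
    return root


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (number TEXT, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("C-1001", "Acme", "Raleigh"), ("C-1002", "Globex", "Boston")],
)

view = table_as_xml(conn, "customers", ["number", "name", "city"])
# An XPath-style predicate over the XML view instead of SQL over the table.
matches = view.findall("./row[city='Boston']")
names = [m.findtext("name") for m in matches]
print(names)
```

The calling code sees only XML-shaped data and an XML-style query, regardless of the fact that the rows live in a relational table.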

Just as SQL is a relational query language and Java is an object-oriented programming language, XQuery is often thought of as a native XML programming language. At the time of this writing, XQuery 1.0 is a W3C Recommendation that you can find at www.w3.org/TR/xquery/.

The XQuery API for Java (XQJ) is designed to support the XQuery language, just as the ODBC, JDBC, and ADO.NET APIs support the SQL query language. The XQJ standard (JSR 225) is being developed under the Java Community Process; you can find it at www.jcp.org/en/jsr/detail?id=225.