
Leveraging Information and Intelligence

David Linthicum

Identifying Data Characteristics for Data Integration


One thing that often seems to be missing from data integration work is a proper definition of the data's characteristics. Here are some suggestions from me.

The first step in identifying and locating information about the data is to create a list of candidate systems. This list makes it possible to determine which databases exist in support of those candidate systems. The next step is to determine who owns the databases and where they are physically located, and to gather relevant design information along with such basics as the brand, model, and revision of the database technology. Finally, the properties and structure of the databases are defined, including database integrity, data latency, and data format.
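The inventory described above can be kept as a simple structured catalog. The following is only a minimal sketch in Python; the class names, field names, and the `databases_missing_owner` helper are all hypothetical and chosen for illustration, not taken from any particular tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatabaseProfile:
    """One database that supports a candidate system (hypothetical fields)."""
    name: str
    owner: str                      # who owns the database ("" if unknown)
    location: str                   # where it physically lives
    vendor: str                     # brand of the database technology
    version: str                    # model/revision of that technology
    integrity_rules: List[str] = field(default_factory=list)
    max_latency_seconds: Optional[float] = None
    data_format: str = "unknown"


@dataclass
class CandidateSystem:
    """A system being considered for integration."""
    name: str
    databases: List[DatabaseProfile] = field(default_factory=list)


def databases_missing_owner(systems: List[CandidateSystem]) -> List[str]:
    """Return the names of databases whose owner has not been identified yet."""
    return [db.name for s in systems for db in s.databases if not db.owner]
```

A catalog like this makes the gaps visible early: any database that comes back from `databases_missing_owner` still needs the ownership question answered before its integrity, latency, and format can be pinned down.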

When analyzing databases for application integration, integrity issues constantly crop up. In order to address these issues, it is important to understand the rules and regulations that were applied to the construction of the database. For example, will the application allow updating customer information in a customer table without first updating demographics information in the demographics table?
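To make the customer and demographics example concrete, here is a small sketch using Python's built-in `sqlite3` module. It assumes the integrity rule is expressed as a foreign key and, for simplicity, shows an insert rather than an update; the table and column names are invented for illustration, and a real database may enforce the same rule in application code or triggers instead.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a customer -> demographics rule."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE demographics (id INTEGER PRIMARY KEY, region TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE customer ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " demographics_id INTEGER NOT NULL REFERENCES demographics(id))"
    )
    return conn


def try_insert_customer(conn: sqlite3.Connection, name: str, demographics_id: int) -> bool:
    """Return True if the write is accepted, False if an integrity rule rejects it."""
    try:
        conn.execute(
            "INSERT INTO customer (name, demographics_id) VALUES (?, ?)",
            (name, demographics_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Probing a copy of the schema this way is one inexpensive method of discovering which rules the database itself enforces, as opposed to rules that live only in the application.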

Data latency, the characteristic of the data that defines how current the information needs to be, is another property of the data that needs to be determined for the purposes of application integration. Such information will allow application integration architects to determine when the information should be copied, or moved, to another enterprise system and how fast.
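A latency requirement can be written down as a maximum age and checked mechanically. This is a minimal sketch; the function name `needs_refresh` and its parameters are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def needs_refresh(
    last_copied_at: datetime,
    max_latency: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True when the target copy is older than the latency requirement allows."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_copied_at > max_latency
```

For instance, a customer record that must never be more than fifteen minutes out of date would be checked with `max_latency=timedelta(minutes=15)`, while a nightly reporting copy might tolerate `timedelta(hours=24)`.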

Another identifying component of data is data format. How information is structured, including the properties of the data elements within that structure, can be gleaned from knowledge of the data format. Likewise, the length, data type (character or numeric), and name of each data element, and the type of information stored (binary, text, spatial, and so on), are additional characteristics of the data that may be determined from its format.
Data format conflicts must be resolved within application integration technologies such as integration servers.
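The format characteristics listed above (name, data type, length) are enough to detect many conflicts before any data moves. The sketch below is illustrative only: `ElementFormat` and `format_conflicts` are hypothetical names, and a real integration server would track many more properties.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ElementFormat:
    """Format of a single data element (hypothetical, simplified)."""
    name: str
    data_type: str  # e.g. "character" or "numeric"
    length: int


def format_conflicts(source: ElementFormat, target: ElementFormat) -> List[str]:
    """List the format conflicts that must be resolved before copying source to target."""
    conflicts = []
    if source.data_type != target.data_type:
        conflicts.append(
            f"{source.name}: type {source.data_type} -> {target.data_type}"
        )
    if source.length > target.length:
        conflicts.append(
            f"{source.name}: length {source.length} exceeds target {target.length}"
        )
    return conflicts
```

An empty list means the element can be copied as-is; anything else is a conflict that needs a conversion rule.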

Different structures and schemas existing within the enterprise must be transformed as information is moved from one system to another. The need to resolve these conflicts in structure and schema makes knowing the structure of the data at both the source and target systems vital, as it is relevant to population of the metadata layer.
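Once the source and target structures are both known, the transformation itself can be described as a mapping from target fields to source fields plus a conversion. The following is a rough sketch under that assumption; the field names and the `transform` function are made up for the example.

```python
from typing import Any, Callable, Dict, Tuple

# target field -> (source field, conversion function)
FieldMapping = Dict[str, Tuple[str, Callable[[Any], Any]]]


def transform(record: Dict[str, Any], mapping: FieldMapping) -> Dict[str, Any]:
    """Build a target-schema record from a source-schema record."""
    target = {}
    for target_field, (source_field, convert) in mapping.items():
        target[target_field] = convert(record[source_field])
    return target


# Hypothetical mapping between a legacy customer table and a newer schema.
customer_mapping: FieldMapping = {
    "customer_name": ("CUST_NM", str.strip),
    "zip_code": ("ZIP", lambda z: str(z).zfill(5)),  # numeric -> 5-char text
}
```

With this mapping, `transform({"CUST_NM": " Acme ", "ZIP": 2139}, customer_mapping)` yields a record in the target schema with a trimmed name and a zero-padded, character-typed ZIP code. The mapping table is exactly the kind of information that belongs in the metadata layer.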

More on this later.


David Linthicum is the CTO of Blue Mountain Labs and an internationally known distributed computing and application integration expert.
