Data Quality as a Management Tool
The problem of maintaining high-quality data is emerging as one of the primary information management issues for the new decade. Industry surveys report that over 70 percent of major companies experience substantial volumes of defective data and suffer losses that include the failure to accurately bill or collect receivables. As the amount of data collected and used increases, so does the occurrence of errors, and with it the potential for misinterpreting results derived from aggregating and merging erroneous data.
In addition, and perhaps more important in the business intelligence and B2B arenas, more and more organizations are seeking to automate the interchange of information (for example, through enhanced, real-time CRM systems or supply chain applications). In this B2B environment, the benefits of automation and of higher processing volumes can be realized only if the participants agree on the validity of the information being exchanged.
To this end, it is critical that companies institute a data quality program as the centerpiece of their information exchange framework.
The concept of data quality management has evolved from the elimination of mailing list duplicates and list merges/purges to a more sophisticated knowledge management process that monitors and corrects data as part of a business's production data flows.
Data Quality Defined: Fitness for Use
What does data quality mean? Almost everyone has a different view of what data quality is, whether that means duplicate elimination, address standardization, customer record enhancement, validity of patient records, etc. Since each definition is geared toward an individual's view of what is "good" and what is not, we can define data quality in terms of fitness for use: the level of data quality is determined by the consumers of the data, according to whether it meets or beats their expectations. In practice, this means identifying a set of data quality objectives associated with any data set and then measuring that data set's conformance to those objectives.
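As a minimal sketch of what "measuring conformance to objectives" could look like, the Python below scores a small data set against a set of named rules and compares each score with the level the data consumers expect. Everything in it is an illustrative assumption rather than part of any particular tool: the function names (conformance, fit_for_use), the sample customer records, the two rules, and the threshold values are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Objective = Callable[[Record], bool]


def conformance(records: List[Record],
                objectives: Dict[str, Objective]) -> Dict[str, float]:
    """Return, for each named objective, the fraction of records that satisfy it."""
    if not records:
        raise ValueError("conformance is undefined for an empty data set")
    return {
        name: sum(1 for r in records if rule(r)) / len(records)
        for name, rule in objectives.items()
    }


def fit_for_use(scores: Dict[str, float],
                expectations: Dict[str, float]) -> bool:
    """True only if every objective meets or beats the consumers' expected level."""
    return all(scores[name] >= level for name, level in expectations.items())


# Hypothetical customer records, invented for illustration only.
records = [
    {"customer_id": 1, "zip": "10001", "email": "a@example.com"},
    {"customer_id": 2, "zip": "", "email": "b@example.com"},
    {"customer_id": 3, "zip": "9021", "email": None},
    {"customer_id": 4, "zip": "60614", "email": "d@example.com"},
]

# Hypothetical objectives: each is a yes/no rule applied to a single record.
objectives = {
    "zip_is_five_digits": lambda r: isinstance(r.get("zip"), str)
    and len(r["zip"]) == 5
    and r["zip"].isdigit(),
    "email_present": lambda r: bool(r.get("email")),
}

# Hypothetical expectation levels set by the consumers of the data.
expectations = {"zip_is_five_digits": 0.9, "email_present": 0.7}

scores = conformance(records, objectives)
acceptable = fit_for_use(scores, expectations)
```

Keeping the rules separate from the expectation levels reflects the fitness-for-use idea: the same data set and the same rules can be acceptable to one group of consumers and unacceptable to another, depending on the levels each group sets.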
Fitness for use implies that, within any operational context, the data in use meets or beats user expectations. For a more rigorous view, we can look at fitness in terms of three aspects of data quality: