Modern Database Management: Chapter 10 Problem and Exercise 2



What types of data pollution/cleansing problems might occur with the Fitchwood OLTP system data?

 

The Solution

Most of the data pollution problems mentioned in the chapter could occur with this data set. The most likely concerns are missing data, duplicate data, and inconsistencies across source systems: for example, different primary keys for the same policy, or different hire dates for the same agent (which might be legitimate if the agent was hired on different dates to work in different product lines). It is also possible that different systems have different rules for creating computed values (for example, various insurance products might have different rules for using face value and commissions to calculate the amount paid to agents). Territories might have different geographical boundaries across the source systems. Other issues are possible as well.
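As a rough illustration of the hire-date inconsistency described above, the following Python sketch groups agent records from several source systems and reports any agent who appears with more than one hire date. The source-system names, agent names, and row layout are hypothetical and are not taken from the Fitchwood case. A flagged agent is only a candidate for review, since the differing dates may be legitimate.

```python
from collections import defaultdict
from datetime import date

# Hypothetical extracts: (source_system, agent_name, hire_date).
# None of these values come from the Fitchwood case; they are illustrative only.
agent_rows = [
    ("life", "Pat Lee", date(2015, 3, 1)),
    ("auto", "Pat Lee", date(2018, 6, 15)),
    ("life", "Sam Roy", date(2020, 1, 10)),
    ("auto", "Sam Roy", date(2020, 1, 10)),
]


def conflicting_hire_dates(rows):
    """Map each agent with more than one distinct hire date to a sorted list of those dates."""
    dates_by_agent = defaultdict(set)
    for _source, name, hired in rows:
        dates_by_agent[name].add(hired)
    return {
        name: sorted(dates)
        for name, dates in dates_by_agent.items()
        if len(dates) > 1
    }


conflicts = conflicting_hire_dates(agent_rows)
for name, dates in conflicts.items():
    print(name, [d.isoformat() for d in dates])
```

Matching agents by name is itself an assumption; in practice the sources may not share a reliable key, which is one of the problems listed in this answer.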

 

The data pollution/cleansing problems that might occur in the Fitchwood Insurance Company system include:

  • Misspelt names and addresses, odd formats for customer names and addresses
  • Impossible or erroneous effective dates in the Policy table or dates of hire in the Agent table
  • Fields used for purposes for which they were not intended
  • Mismatched addresses and area codes
  • Missing data
  • Duplicate data
  • Inconsistencies (e.g., different addresses) in values or formats across sources
  • Different primary keys across sources
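Several of the problems listed above can be detected with simple profiling checks before the data is loaded. The sketch below, which assumes a hypothetical list of policy rows with made-up field names (`policy_id`, `customer`, `effective_date`), flags missing customer names, duplicate policy IDs, and effective dates outside a plausible window. The window bounds are assumptions chosen for the example, not rules from the case.

```python
from collections import Counter
from datetime import date

# Hypothetical policy rows; field names and values are illustrative only.
policies = [
    {"policy_id": "P1", "customer": "Ann Fox", "effective_date": date(2021, 5, 1)},
    {"policy_id": "P2", "customer": "", "effective_date": date(2022, 2, 1)},
    {"policy_id": "P1", "customer": "Ann Fox", "effective_date": date(2021, 5, 1)},
    {"policy_id": "P3", "customer": "Bo Diaz", "effective_date": date(2999, 1, 1)},
]


def profile_policies(rows, earliest, latest):
    """Return policy IDs grouped by the kind of data quality problem detected."""
    id_counts = Counter(row["policy_id"] for row in rows)
    return {
        # Missing data: empty or absent customer name.
        "missing_customer": [
            row["policy_id"] for row in rows if not row.get("customer")
        ],
        # Duplicate data: the same policy ID appears more than once.
        "duplicate_ids": sorted(pid for pid, n in id_counts.items() if n > 1),
        # Impossible dates: effective date outside the assumed plausible window.
        "out_of_range_dates": [
            row["policy_id"]
            for row in rows
            if not (earliest <= row["effective_date"] <= latest)
        ],
    }


report = profile_policies(
    policies, earliest=date(1950, 1, 1), latest=date(2030, 12, 31)
)
for problem, ids in report.items():
    print(problem, ids)
```

Checks like these only surface candidates; deciding whether a flagged row is truly wrong still requires knowledge of the business rules in each source system.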
