Handling a correction vs a change in SCD type II dimension.

View previous topic View next topic Go down

Handling a correction vs a change in SCD type II dimension.

Post  M. Khan on Tue Jul 24, 2012 9:16 pm

What approach to architect the data architecture and ETL to handle a correction instead of a change in SCD type II dimension, e.g. customer realizes that his address is not correct when he receives his first bill. This update to the customer data is not a change but a correction. Should it be handled as a special adhoc ETL job especially when a source system does not indicate the reason of update and would it requires to add audit tables to capture the correction ...

M. Khan

Posts : 11
Join date : 2012-07-24

View user profile

Back to top Go down

Re: Handling a correction vs a change in SCD type II dimension.

Post  ngalemmo on Tue Jul 24, 2012 10:45 pm

Other than adding a 'reason for change' kind of attribute, you shouldn't do anything different.

The whole point of a type 2 is to store states of a dimension as it appeared in the DW. The fact that you will know is was one thing one day, then corrected to something else at some point in the future is the whole point of having a type 2. The fact that is was a correction versus a change is immaterial. As you state you don't have a reason of change attribute, I would put my efforts into changing the source system so you know why the row was updated and storing that information in the warehouse. It is time better spent than developing a 'special' process to handle this type of change. If you need to report current attribute values, you adjust your query against the dimension to retrieve the most current row. There are various techniques to do this.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Handling a correction vs a change in SCD type II dimension.

Post  M. Khan on Wed Jul 25, 2012 3:02 pm

"The fact that is was a correction versus a change is immaterial."

I see your point of capturing the reason of change in dimensional model from the source system, assuming the source system provides this info. If we leave bad record in the dimension, does it not violates major rule of data quality that provides the foundation to the any data warehouse.

M. Khan

Posts : 11
Join date : 2012-07-24

View user profile

Back to top Go down

Re: Handling a correction vs a change in SCD type II dimension.

Post  ngalemmo on Wed Jul 25, 2012 5:23 pm

No. Data quality is an operational issue, not a data warehouse issue. Sure, a data warehouse may get bad data, but the data warehouse is not where it should be fixed. The data warehouse needs to be accurate, not correct.

A type 2 reflects the state of the dimension at the time of the transaction. It is important that this not be 'corrected' otherwise you compromise accuracy. If the user wants to see the current state of the dimension, (i.e. after it has been corrected) then you do a self join on the type 2 dimension to locate the current row (that is what the 'current flag' attribute is all about).

If the business is only concerned about the current state of the dimension, don't implement it as a type 2.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Handling a correction vs a change in SCD type II dimension.

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum