Handling "Unknown" Dimension atrributes that are Type 2 changes

View previous topic View next topic Go down

Handling "Unknown" Dimension atrributes that are Type 2 changes

Post  meb97me on Wed Aug 25, 2010 10:58 am

In the data warehouse we're building, we've got an Employee dimension that has a self referencing attribute called ReportsTo which stores the surrogate key of their line manager who is also in the DW (in most cases). This attribute is a type 2 change as we want to capture historical heirachical changes in the company, however some people aren't always assigned a valid manager (its a really crappy system that i wont bore you with) but suffice to say they can store invalid managers rather than not speicfying one (such as someone at the top of the tree)

Obvioulsy as we process thir dimension record their ReportsTo would come up as an "unknown"/-1 and so assuming we put this into the dimension table any facts that occur involving this employee would be stored against this employees surrogate key. Now assuming we get them to fix the ReportsTo in the source system this would come through as type 2 change and so a new dimensional record would be created and any new events with this employee would be against that. Now the problem as i see it would lie in the cube as events would be rolled up against an "Unknown" manager.

Would it be better not to load these dimensions until they have fixed the "unknown" reports to as other wise it would give inaccurate historical reporting. ie treat it almost like a fact record as in "its not going in the DW unless the values are correct and valid".

hopefully that kind of makes sense

cheers for any thoughts

meb97me

Posts : 34
Join date : 2010-07-28

View user profile

Back to top Go down

Re: Handling "Unknown" Dimension atrributes that are Type 2 changes

Post  ngalemmo on Wed Aug 25, 2010 11:54 am

Even though it is a type 2, it doesn't mean you MUST create a new row when something changes. The situation you describe is one where it makes sense to simply update in place.

What is not clear is what happens if other attributes, of which you are also tracking history, change between the time the employee is first loaded and the management hierarchy is cleaned up. You will need to work out how you want to update the dimension in such instances. The ultimate goal is to try to capture as accurate and useful history as possible and your update process should do whatever it needs to do to achieve that.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Handling "Unknown" Dimension atrributes that are Type 2 changes

Post  meb97me on Tue Sep 07, 2010 6:20 am

thanks again for a great response ngalemmo we'll adapt our ETL process as necessary

meb97me

Posts : 34
Join date : 2010-07-28

View user profile

Back to top Go down

Re: Handling "Unknown" Dimension atrributes that are Type 2 changes

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum