Analyzing measures from two fact tables where records do not necessarily match due to SCD2 changes.

View previous topic View next topic Go down

Analyzing measures from two fact tables where records do not necessarily match due to SCD2 changes.

Post  ian.coetzer on Wed Nov 16, 2011 7:16 am

Hi,

We have for example two fact tables, these fact tables are at the same grain.
However one of the dimensions (conformed) has been designed to implement simple SCD type 2 change tracking.

The problem is that records are not necessarily inserted at exactly the same time, since the 2nd fact table's business event takes longer to start.
So we do not yet have measures for the 2nd fact table in the source table at the time that the first fact table is populated.

This is fine, records come in the following day and everything matches up again for that time period and totals on reports balance etc.

However some of the attributes on one of the common dimensions have been implemented as SCD type 2, this means the dimension record is expired and a new one inserted.

Now again we have unbalanced fact tables due to the fact that a record in the first fact table is linked to an expirement record in the dimension, and then when the second fact table is populated it of course links to the new dimension record which is active and was inserted due to a SCD type 2 change.



avatar
ian.coetzer

Posts : 57
Join date : 2010-05-31
Age : 36
Location : South Africa

View user profile

Back to top Go down

Re: Analyzing measures from two fact tables where records do not necessarily match due to SCD2 changes.

Post  ngalemmo on Wed Nov 16, 2011 11:49 am

What is it you want to do? If you want the attributes to be the same, which 'same' do you want? The same as the first fact, second fact or current regardless of fact?
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

RE: Analyzing measures from two fact tables where records do not necessarily match due to SCD2 changes.

Post  zoom on Wed Nov 16, 2011 12:28 pm

Is expired dim record tied to first fact table correct or the second fact table tied to a new dim record is correct because of SCD2 ? Look at the detail data came in to load first fact and the dimention and second fact and the same dim. If dim data changed from day one (loading first fact) to day second (loading second fact) then it is behaving correctly. This is not a data modeling issue, but timing issue to process the data.

zoom

Posts : 97
Join date : 2010-08-23
Location : Chicago

View user profile

Back to top Go down

Re: Analyzing measures from two fact tables where records do not necessarily match due to SCD2 changes.

Post  ian.coetzer on Wed Nov 16, 2011 2:48 pm

Hi,

It is not timing because the second fact table is dependant on a completely different business process that is suppose to only come through at a later stage.

What I think must happen is that all the fact records that are linked to the original (now expired) dimension record must be updated with the new surrogate key of the new active dimension record.

However this would then change this from a transactional fact table to an accumulating fact table (in a way)...
And this is not right from a dimensional modelling perspective, and it would potentially have to scan through 35 million + records to update the ones that changed? this would produce the results we want from a reporting perspective but I can't get myself to do this because it seems wrong to me on so many levels? we then also lose what the dimension record looked like at the time of the fact record insertion which is also important to keep intact! I cannot seem to find a solution that makes sense.
avatar
ian.coetzer

Posts : 57
Join date : 2010-05-31
Age : 36
Location : South Africa

View user profile

Back to top Go down

Re: Analyzing measures from two fact tables where records do not necessarily match due to SCD2 changes.

Post  John Simon on Thu Nov 24, 2011 6:03 pm

If this is something you do often, why not just store the natural key in both fact tables for that one dimension?

John Simon

Posts : 186
Join date : 2009-05-29
Location : Melbourne, Australia

View user profile http://jsimonbi.wordpress.com

Back to top Go down

Re: Analyzing measures from two fact tables where records do not necessarily match due to SCD2 changes.

Post  LAndrews on Thu Nov 24, 2011 6:20 pm


With SCD-2, the dimension key is dependent upon the transaction date the fact represents (which may be different than the date the fact was loaded).

It sounds like one of your facts can be considered "late-arriving", which means you'll need to modify your ETL process to look up the surrogate keys based on the transaction date, rather than just using the "current" record.

If the 2 facts represent a different transaction date, then having different surrogate keys is legitimate SCD2 behavior.


LAndrews

Posts : 132
Join date : 2010-05-13
Location : British Columbia, Canada

View user profile

Back to top Go down

Re: Analyzing measures from two fact tables where records do not necessarily match due to SCD2 changes.

Post  ngalemmo on Thu Nov 24, 2011 9:50 pm

ian.coetzer wrote:
What I think must happen is that all the fact records that are linked to the original (now expired) dimension record must be updated with the new surrogate key of the new active dimension record.

Sounds like you would rather have a type 1 dimension. Which is why I asked the original question.

When you combine facts, you combine them on attributes, not keys (unless you are creating an aggregate table, in which case it really doesn't matter that the keys are different, you just wind up with more rows). If you want to combine the facts on current dimensional values, you need to self join on the type 2 dimension to locate the current row. You do not change keys or in any way alter what you have.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Analyzing measures from two fact tables where records do not necessarily match due to SCD2 changes.

Post  hang on Thu Nov 24, 2011 9:51 pm

I think Andrews is right on the point. Your ETL should always be able to load fact table with proper dimension keys for transactions that take place a few days ago. In your case the second fact table load will contain transactions with early dates. For ETL, always use SCD dates to get the correct dimension keys, so that there is only one version of ETL for both historical rebuild and daily incremental load.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Analyzing measures from two fact tables where records do not necessarily match due to SCD2 changes.

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum