Preserving prior history for late arriving dimensions
Page 1 of 1 • Share •
Preserving prior history for late arriving dimensions
Can't find much about this topic anywhere else so hopefully someone can help me.
Our warehouse has late arriving dimensions for an Employee. Their department changes, for example, can be backdated in the systems and so we have to account for this and adjust the effective dates in the SCD and the relevant surrogate keys in the affected sales fact table.
This is covered nicely in the "Data Warehouse Toolkit" under Health Care, so that bit's fine.
Now what I need to do is allow users to report on the historic version of the truth so that any reports that are rerun for a prior month end will reflect the totals reported at the end of that month (and so will not take into account the backdating).
What I intend to have is an aggregate table built on top of the sales fact which is refreshed daily. At the end of the month I will siphon this off into a monthly snapshot table and add a month dimension surrogate key. This will give me the correct totals by employee at the end of each month.
The problem I will have is that after the snapshot is taken, an employee's department move gets backdated (and the relevant dimension rows) and the corresponding rows in the snapshot table will then point to the incorrect department.
Do I :-
a) add additional "Prior history department" columns to the employee dimension to the record
b) adjust the surrogate keys in the snapshot table in the same way I would with the fact tables
c) forget this idea as a lost cause
Sorry for the long winded explanation, hopefully my issue is clear.
Thanks,
B
Our warehouse has late arriving dimensions for an Employee. Their department changes, for example, can be backdated in the systems and so we have to account for this and adjust the effective dates in the SCD and the relevant surrogate keys in the affected sales fact table.
This is covered nicely in the "Data Warehouse Toolkit" under Health Care, so that bit's fine.
Now what I need to do is allow users to report on the historic version of the truth so that any reports that are rerun for a prior month end will reflect the totals reported at the end of that month (and so will not take into account the backdating).
What I intend to have is an aggregate table built on top of the sales fact which is refreshed daily. At the end of the month I will siphon this off into a monthly snapshot table and add a month dimension surrogate key. This will give me the correct totals by employee at the end of each month.
The problem I will have is that after the snapshot is taken, an employee's department move gets backdated (and the relevant dimension rows) and the corresponding rows in the snapshot table will then point to the incorrect department.
Do I :-
a) add additional "Prior history department" columns to the employee dimension to the record
b) adjust the surrogate keys in the snapshot table in the same way I would with the fact tables
c) forget this idea as a lost cause
Sorry for the long winded explanation, hopefully my issue is clear.
Thanks,
B
Bisquite- Posts: 1
Join date: 2010-02-07
Re: Preserving prior history for late arriving dimensions
If you must carry both versions of the fact, then add a current flag and effective/expiration date columns to the fact table. When you need to restate a fact, expire the old one and insert a new row. The current flag is not absolutely necessary (you could look for unexpired rows) but it does make it easier and more efficient to filter current state in queries, particularly if you DBMS supports bitmap indexes.

ngalemmo- Posts: 320
Join date: 2009-05-16
Location: Los Angeles

Permissions of this forum:
You cannot reply to topics in this forum





