Late Arriving Facts

View previous topic View next topic Go down

Late Arriving Facts

Post  dwbi_rb on Thu Mar 18, 2010 7:41 am

Hello All,
Need some inputs regarding handling of late arriving facts:

I have a demographics related fact table that holds the following measures - Population_Count, Birth_Count, Death Count.

The behaviour of the Population_Count is that it stores the count as of the refresh date. However, the Birth and Death Counts are for a particular period - (monthly).

Now the fact table mentioned above is populated from a Level 1 data mart which is refreshed daily and holds data at a lower granularity than Level 2. The Level 2 data mart is refreshed monthly. As a result, there are mismatches in the Counts.
Example:
For the period 1 JAN - 31st JAN the Population Count = 200 for Level 1 mart refreshed as on 31st JAN and Level 2 mart refreshed on 31st JAN. However, Level 2 will not be refreshed until the 28th FEB, whereas Level 1 will be refreshed daily. As a result there will be an obvious mismatch between the count in Level 1 and LEvel 2. Say in 15th Feb new data is received in LEvel 1 for Births/Deaths in Jan. This should be reflected in the LEvel 2 as well.

What are the best ways of resolving this issue?


Thanks in advance!

dwbi_rb

Posts : 17
Join date : 2009-02-19

View user profile

Back to top Go down

Re: Late Arriving Facts

Post  Mj1978 on Thu Mar 18, 2010 9:41 am

You are talking of two kinds of snapshot data in a single fact table.
Transactional(Daily) for BIRTH_COUNT and DEATH_COUNT and
Monthly for POPULATION.
What I believe is the grain of this fact table is not defined appropriately.

If the Level 2 Datamart is scheduled to be loaded monthly basis then I would suggest to create a monthly snapshot fact table and load this table for the level 2 Datamart. This way you can have access to both Transactional and Monthly snapshots at any point of time by the means of these Two Fact tables.

But If you really want to maintain those two grains in the same fact table then you can have a Routine or SQL procedure which can update the BIRTH_COUNT and DEATH_COUNT fields in the Level 2 Data mart every time the LEVEL 1 data mart is refreshed. In this case the LEVEL 2 Data MArt will no longer remain an absolute monthly snapshot of the data.

Thanks
avatar
Mj1978

Posts : 8
Join date : 2010-03-10

View user profile

Back to top Go down

Re: Late Arriving Facts

Post  ngalemmo on Fri Mar 19, 2010 3:37 pm

Now the fact table mentioned above is populated from a Level 1 data mart which is refreshed daily and holds data at a lower granularity than Level 2. The Level 2 data mart is refreshed monthly. As a result, there are mismatches in the Counts.

Yes, of course they don't match. What were you expecting to happen? What is the 'level 2' summary supposed to represent?

It's very common to have month-end snapshots of things in a data warehouse. If that is what the 2nd table is supposed to be, so be it. If it is a snapshot and the user want's current data, why can't they query the first table (the one that is updated daily)?

If the second table is not a monthly snapshot, but is supposed to be a summary of the first table, then update it daily... or eliminate it altogther.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Late Arriving Facts

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum