Storing data that changes periodically - should I use a periodic snapshot fact?

View previous topic View next topic Go down

Storing data that changes periodically - should I use a periodic snapshot fact?

Post  min.emerg on Thu Jun 09, 2011 3:59 am

Hi everyone,

I need to make additions to the data warehouse that I'm working on, and I am hoping for some guidance.

Some context on the data that I'm working with:

In simple terms, the system that will form the source of this data tracks how employees are allocated to various projects. From 200901, Employee X allocates 40% of her time to Project A and 60% to Project B. In 201102, she changes roles and her time is now split 20% to Project B and 80% to Project C. The tact table needs to show employee allocations to each project they are working on for a given time (at a month level), so that these allocations for each employee can be reported on a monthly basis.

This does not seem transactional to me - as such, a transactional fact does not look like the correct approach. Would a periodic snapshot fact be more suited for this?

My confusion comes from the fact that from 200901 to 201101, the first allocation of Employee X is valid. From 201102, the allocation changes, and is now valid until a new allocation is created. Do I somehow fill in the gaps between 200901 and 201101 in the fact able, and have a record for each month during this time period, even though the allocation hasn't changed, or do I only record instances where allocations change and use logic in my queries to work out what allocations were in between these records?

Thanks.

min.emerg

Posts : 39
Join date : 2011-02-25

View user profile

Back to top Go down

Re: Storing data that changes periodically - should I use a periodic snapshot fact?

Post  min.emerg on Thu Jun 09, 2011 8:44 am

I've spent some more time researching the problem (on this forum) and have found a few possible solutions:

1. Take a monthly snapshot of the data, and create a new record in the fact table for all Employees that are active (regardless of whether their allocations have changed since the last month). We could therefore have 12 records of exactly the same allocation data for an employee but for 12 different months (if their allocations have stayed the same for that 12 month period).

2. Use a type-2 fact, with two date fields added to the fact table to determine when a particular record was active.

A question comes to mind - aren't you limited in the types of aggregation that can be done using method 2? If you want to sum an Employees allocation for the last three months, this might be difficult as there is only one record, and not three identical records as there would be using method 1.

min.emerg

Posts : 39
Join date : 2011-02-25

View user profile

Back to top Go down

Re: Storing data that changes periodically - should I use a periodic snapshot fact?

Post  ngalemmo on Thu Jun 09, 2011 9:45 am

Option 2, the accumulating snapshot, does make some calculations more difficult, but it is more compact. Its more difficult to maintain, requiring you to update expiration dates on superceded rows. It is also open to interpretation as users can choose what date they want to look at... not necessarily the month-end.

For this particular application, a periodic snapshot, with monthly rows, is probaly the better choice. Its easy to understand, not open to interpretation (explicity shows the 'month' numbers based on business rules that determine when such numbers are counted) and simple to use.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Storing data that changes periodically - should I use a periodic snapshot fact?

Post  min.emerg on Thu Jun 09, 2011 10:50 am

Thanks again ngalemmo for your help.

I was leaning towards the periodic snapshot approach - we will possibly be adding 100k records to the fact table every month (at the most), and will partition the data after a year, so a million or so records is easy enough to work with. It also makes life easier if they somehow want to update a record in the past. Using the type-2 fact approach would mean that we have to insert a record somewhere in the middle and change the dates accordingly, whereas with periodic snapshots we only have to updated a single record.

min.emerg

Posts : 39
Join date : 2011-02-25

View user profile

Back to top Go down

Re: Storing data that changes periodically - should I use a periodic snapshot fact?

Post  BoxesAndLines on Fri Jun 10, 2011 9:23 am

I like the periodic snapshot idea as well. Especially when the volumes are lower.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Storing data that changes periodically - should I use a periodic snapshot fact?

Post  min.emerg on Fri Jun 10, 2011 10:06 am

Is a rolling window useful when loading a periodic snapshot fact?

There are scenarios where adjustments can be made in the future for past entries into the fact (late arriving fact data). If I want these late arriving facts to be processed into the fact table, could I use a rolling window based off a record's updated date to determine whether data has been updated or is newly inserted since data was last processed? Bearing in mind that I will be taking a snapshot monthly - I also want to include updates that were made to previous months.

min.emerg

Posts : 39
Join date : 2011-02-25

View user profile

Back to top Go down

Re: Storing data that changes periodically - should I use a periodic snapshot fact?

Post  BoxesAndLines on Fri Jun 10, 2011 3:44 pm

For snapshots, I do not generally apply late arriving facts. That's the beauty of the snapshot. It is what it is.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Storing data that changes periodically - should I use a periodic snapshot fact?

Post  hang on Fri Jun 10, 2011 6:00 pm

BoxesAndLines wrote:For snapshots, I do not generally apply late arriving facts. That's the beauty of the snapshot. It is what it is.
Fully agreed. If you update the snapshot facts, it will not be the true picture at particular point of time. But why bother updating? the future snapshots will cover the adjustments anyway.

I also prefer the periodic snapshot in this case, as it is more cube friendly for trend analysis than the accumulating ones.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Storing data that changes periodically - should I use a periodic snapshot fact?

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum