Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Large Fact Table and Maintaining Periodic Snapshot: Practice

3 posters

Go down

Large Fact Table and Maintaining Periodic Snapshot: Practice Empty Large Fact Table and Maintaining Periodic Snapshot: Practice

Post  buzzer75 Thu Jun 18, 2009 8:30 pm

I would like some opinions with my approach here. I am trying to replace an overkill lift and load ETL process that basically replicated the entire universe of dataset every period instead of doing just Delta Load. In this delta approach, I have a stage table with new fact rows and I merge it to the target base table to load delta. I also a current flag to the new and old records. For reporting purposes, the current picture is all we need and I use the flag. For weekly and monthly view if needed, I have a summary like table that gets loaded periodically to point me to the weekly or monthly view.

FactTable (daily currrent view is a just selection of all currrent records)
FactKey inv_id locid Sumval IS_CURRENT Dateid
1000 990 100 50 N 20090101
1001 991 101 25 N 20090101
1002 992 102 75 Y 20090101
----------- ---------------- ----------------------------
1003 990 102 55 Y 20090102 -------> new changes inserted and flag set to mark current
1004 991 100 30 Y 20090102

For a monhthly view, I have a MonthlyInvoiceSnapshot table that gets loaded from my fact table above at the end of the month picking all current records for say invoice fact
Monthid FactKey inv_id
M1 1000 990
M1 1001 991
M1 1002 992
-----------------------
M2 1003 990
M2 1004 991
M2 1002 992

Similarly I could have weekly, quarterly and yearly look. Lets just say all Dimension tables are TYPE1 and therefore no need to keep history. Is there something I am overlooking? I would like to hear some opinions

Thanks
Buz

buzzer75

Posts : 2
Join date : 2009-06-18

Back to top Go down

Large Fact Table and Maintaining Periodic Snapshot: Practice Empty Re: Large Fact Table and Maintaining Periodic Snapshot: Practice

Post  ngalemmo Fri Jun 19, 2009 1:39 pm

The first part is ok, but I don't understand the purpose of the monthly snapshot. It does not appear to be a summary of any kind because you are including the invoice number in the table. How is it different than just querying the base fact table?
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Large Fact Table and Maintaining Periodic Snapshot: Practice Empty Why have separate tables?

Post  JoeSalvatore Fri Jun 19, 2009 2:52 pm

Most DBMS's have optimizers built into their query plan generation engines that will recognize true aggregate queries versus a detailed fact table and will redirect to an aggregate view that will provide the results.
Often to use this functionality you will create an Indexed View (or your DBMS flavor equivalent) that contains the appropriate aggregate query of the detailed fact table.
The point of this is it minimizes the ETL effort as it would require only designing and implementing ETL load for the base detailed fact table.
A great further reference for this and other aggregation stratgies can be found in "Mastering Data Warehouse Aggregates - Solutions for Star Schema Performance" by Christopher Adamson [Wiley 2006 ISBN 0-471-77709-9].
Additionally, this all assumes that you are not using any OLAP or other analysis engines that already provide this behavior.
JoeSalvatore
JoeSalvatore

Posts : 4
Join date : 2009-06-19

Back to top Go down

Large Fact Table and Maintaining Periodic Snapshot: Practice Empty Re: Large Fact Table and Maintaining Periodic Snapshot: Practice

Post  buzzer75 Fri Jun 19, 2009 10:14 pm

Thank you so much for the responses. To answer the question on the snapshot, I may have used a bad example. I think of this snapshot as a hook table that will point me to the fact key in the main fact table as of the period in question. This will avoid date logic complications and also as you pointed out, I can leverage join indexes and have some good performance for retrieving historic data.

I do plan on type2 with some dimension tables where profile changes are required to capture. My main goal is to get rid of massive lift and load (version) etl that basically is a overkill in my case. With this delta process, I can still answer those rare AS OF questions. For majority of my need, I believe the current picture will suffice.

buzzer75

Posts : 2
Join date : 2009-06-18

Back to top Go down

Large Fact Table and Maintaining Periodic Snapshot: Practice Empty Re: Large Fact Table and Maintaining Periodic Snapshot: Practice

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum