Periodic snapshot fact tables with sparse data
Page 1 of 1 • Share •
Periodic snapshot fact tables with sparse data
Description of the business and data model: my client runs a typical service oriented business; employees provide services to several clients, who pay according on the number of hours they use.
Description of the problem: One of the tables that I have designed is a monthly periodic snapshot fact table which records the number of hours provided and the associated cost (amount), with several dimensions attached like client, contract, office, type of service and a few others. Usually periodic snapshot fact tables are dense (there are actual transaction facts to load for any combination of the dimensions), but in this case this is not true: for example, a typical client only buys 2-3 services, while the services dimension contains more than 200 different kinds of services.
Question: should I load the periodic snapshot with a record for every combination of dimension values, even when there are not actual data? Of course this means loading a large number of records, 90% of which would contain all zeroes on the measures.
Considerations: since the snapshot is mainly used to build cross-tab reports for a given year with the 12 months on the columns, some combinations of dimensions must always be filled, otherwise the report might come out with some missing months on the column headers (this might depend on the tool used, we use Pentaho and we have observed this behavior). As a minimum, it is necessary to create at least one record per month per client, even if the client didn’t buy any services on some months.
I wanted to know if any of you have used a similar technique when dealing with periodic snapshot tables with sparse data and if there are some risks that I haven’t considered in doing so.
Description of the problem: One of the tables that I have designed is a monthly periodic snapshot fact table which records the number of hours provided and the associated cost (amount), with several dimensions attached like client, contract, office, type of service and a few others. Usually periodic snapshot fact tables are dense (there are actual transaction facts to load for any combination of the dimensions), but in this case this is not true: for example, a typical client only buys 2-3 services, while the services dimension contains more than 200 different kinds of services.
Question: should I load the periodic snapshot with a record for every combination of dimension values, even when there are not actual data? Of course this means loading a large number of records, 90% of which would contain all zeroes on the measures.
Considerations: since the snapshot is mainly used to build cross-tab reports for a given year with the 12 months on the columns, some combinations of dimensions must always be filled, otherwise the report might come out with some missing months on the column headers (this might depend on the tool used, we use Pentaho and we have observed this behavior). As a minimum, it is necessary to create at least one record per month per client, even if the client didn’t buy any services on some months.
I wanted to know if any of you have used a similar technique when dealing with periodic snapshot tables with sparse data and if there are some risks that I haven’t considered in doing so.

Andrea Vincenzi- Posts: 8
Join date: 2009-02-04
Age: 57
Location: Rome (Italy)

Re: Periodic snapshot fact tables with sparse data
I have a "sparse" daily snapshot for inventory data. Probably not as sparse as your scenario, but records with a 0 quantity on hand are not stored. Because not every item is stored at every storage location, not storing a record for each item for each storage location eliminates a large number of rows from the daily snapshot.
However, I have a business user who wants to see "zero" records. He does a cross-tab report of items by storage locations, and he wants every possible combination to have a value. Rather than store the zero records in the snapshot, I gave him a view that synthesizes them. In essence the view does a cross-tab of items and storage locations and then left joins the daily snapshot to get the non-zero values when they exist. From the user's perspective it is as if there is a record for every combination stored in the daily snapshot fact table.
You might be able to do something similar with clients, services, and months.
However, I have a business user who wants to see "zero" records. He does a cross-tab report of items by storage locations, and he wants every possible combination to have a value. Rather than store the zero records in the snapshot, I gave him a view that synthesizes them. In essence the view does a cross-tab of items and storage locations and then left joins the daily snapshot to get the non-zero values when they exist. From the user's perspective it is as if there is a record for every combination stored in the daily snapshot fact table.
You might be able to do something similar with clients, services, and months.
Last edited by VHF on Wed Oct 26, 2011 6:01 am; edited 2 times in total (Reason for editing : typo; spelling)
VHF- Posts: 220
Join date: 2009-04-28
Location: Wisconsin, US
Re: Periodic snapshot fact tables with sparse data
Good idea. Depending on your database, you may also want to consider doing a UNION ALL instead of a join.

ngalemmo- Posts: 1732
Join date: 2009-05-15
Location: Los Angeles

Similar topics» 3110 data blk error
» MCX IEOD DATA MAY 2012
» EEPROM DATA FILES,UNIVERSAL TV BOARD
» NIFTY FUTURE IEOD ONE MIN DATA [ 2006 ]
» Periodic snapshot fact tables with sparse data
» MCX IEOD DATA MAY 2012
» EEPROM DATA FILES,UNIVERSAL TV BOARD
» NIFTY FUTURE IEOD ONE MIN DATA [ 2006 ]
» Periodic snapshot fact tables with sparse data
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum