Reducing number of rows in fact tables
4 posters
Page 1 of 1
Reducing number of rows in fact tables
Hello
We have a Kimball style DW with a FactDailyBalancesSnapshot table. The fact table is growing at quite a rate as it is a daily snapshot of every (bank) account. A lot of the accounts, however, have a static balance during the life of their term. These so called Term Accounts are similar to savings accounts, but with with a maturity date. Once started, not much happens to these accounts if held till maturity.
My question is this: What are the consequnces of us minimising the number of rows in the fact table by only ETL'ing accounts where the daily balance has changed?
One issue that I can think of is the Analysis Services Cube not being to handle the dates for which there are no balances. The grain of the fact table is daily and there will be missing days - will the Cube magic work/cope with this?
Thanks in advance.
We have a Kimball style DW with a FactDailyBalancesSnapshot table. The fact table is growing at quite a rate as it is a daily snapshot of every (bank) account. A lot of the accounts, however, have a static balance during the life of their term. These so called Term Accounts are similar to savings accounts, but with with a maturity date. Once started, not much happens to these accounts if held till maturity.
My question is this: What are the consequnces of us minimising the number of rows in the fact table by only ETL'ing accounts where the daily balance has changed?
One issue that I can think of is the Analysis Services Cube not being to handle the dates for which there are no balances. The grain of the fact table is daily and there will be missing days - will the Cube magic work/cope with this?
Thanks in advance.
amir2- Posts : 29
Join date : 2010-07-29
Re: Reducing number of rows in fact tables
You could store a begin/end date in the balance fact and only add rows when the balance changes. It is easy enough in SQL to select balances for a specific date.
I do not know what SSAS would do with it if you need to load multiple dates into a cube. You would probably need some form of view to revert it to the full daily snapshot for SSAS.
I do not know what SSAS would do with it if you need to load multiple dates into a cube. You would probably need some form of view to revert it to the full daily snapshot for SSAS.
Re: Reducing number of rows in fact tables
Partition the fact by day. Drop daily history after a month a carry month end snapshots after that. Adjust parameters as needed.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Reducing number of rows in fact tables
Theoretically, effective dated fact can produce periodic snapshot fact through some form of view (by cross join with date dim) as ngalemmo suggested. However it really comes down to the performance.
With SSAS, the view will be materialized in the cube, so effectively it is still daily snapshot. Maybe cube can compress the cells more efficiently than relational database engine.
I would combine both approaches, one of which has been proposed by B&L. Use effective dated fact to produce daily snapshot on the fly if required (say within a particular month), and monthly snapshot for trend analysis over many years.
When you are interested in daily trend, you always limit you time frame as you don't want to be overwhelmed by hundreds of dates, so the view will be quick once constrained by a short period of time. On the other hand, monthly snapshots can also give you big picture instantly over years history.
With SSAS, the view will be materialized in the cube, so effectively it is still daily snapshot. Maybe cube can compress the cells more efficiently than relational database engine.
I would combine both approaches, one of which has been proposed by B&L. Use effective dated fact to produce daily snapshot on the fly if required (say within a particular month), and monthly snapshot for trend analysis over many years.
When you are interested in daily trend, you always limit you time frame as you don't want to be overwhelmed by hundreds of dates, so the view will be quick once constrained by a short period of time. On the other hand, monthly snapshots can also give you big picture instantly over years history.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Reducing number of rows in fact tables
Thanks guys.
We set the grain of the fact table at daily because the Kimball books recommend atomic level grains. For a bank account, daily balance is the most logical lowest grain. Also, our users want at least weekly data, so we thought for the sake of four extra rows (Monday - Thursday) per account, we might as well go for the more accommodating solution.
I have checked with the Business again and they want at least one year's worth of data, so we won't be able to get rid of the daily records after a month - we need to wait at least one year.
The above suggestions don't seem straight forward to implement (?) and require more clever SQL etc. Is this one of those cases where hardware (extra disk, memory, etc) will provide a better solution than the software equivalent?
We set the grain of the fact table at daily because the Kimball books recommend atomic level grains. For a bank account, daily balance is the most logical lowest grain. Also, our users want at least weekly data, so we thought for the sake of four extra rows (Monday - Thursday) per account, we might as well go for the more accommodating solution.
I have checked with the Business again and they want at least one year's worth of data, so we won't be able to get rid of the daily records after a month - we need to wait at least one year.
The above suggestions don't seem straight forward to implement (?) and require more clever SQL etc. Is this one of those cases where hardware (extra disk, memory, etc) will provide a better solution than the software equivalent?
Last edited by amir2 on Thu Nov 10, 2011 12:04 pm; edited 1 time in total (Reason for editing : Typos)
amir2- Posts : 29
Join date : 2010-07-29
Re: Reducing number of rows in fact tables
Why has it got anything to do with hardware, and what is hard about implementing it? You can just treat the fact table like SCD dimension and only add new rows if the balance has changed. So that's your base fact table. You may take weekly and monthly snapshot and store the data in another fact table, by querying the base fact table, to cater for trend analysis by the cube. You don't loose any data at daily grain.
Last edited by hang on Fri Nov 11, 2011 7:10 am; edited 1 time in total (Reason for editing : typo)
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Reducing number of rows in fact tables
I take back my last post. B&L and Hang are right. I did the math. If a fact row takes 40 bytes and you have 1 million accounts, you would need 14.6GB to store a year's worth of daily snapshots. Maybe 2-3 times that depending on your database and indexing requirements. So, even at 42GB, it is not a big deal.
If the bank has many millions of accounts, they can certainly afford suitable hardware to support it.
If the bank has many millions of accounts, they can certainly afford suitable hardware to support it.
Similar topics
» Fact and dimension tables - avoiding same number of rows in both
» Number of Columns in Fact Tables vs. Dimension Tables
» Converting Great Plains SOP tables to fact rows
» Modelling question - Multiple fact tables with one to many relationships and querying from many fact tables
» Unknown number of relationships from dimension to fact until fact loaded
» Number of Columns in Fact Tables vs. Dimension Tables
» Converting Great Plains SOP tables to fact rows
» Modelling question - Multiple fact tables with one to many relationships and querying from many fact tables
» Unknown number of relationships from dimension to fact until fact loaded
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum