Reducing number of rows in fact tables

View previous topic View next topic Go down

Reducing number of rows in fact tables

Post  amir2 on Wed Nov 09, 2011 11:03 am

Hello

We have a Kimball style DW with a FactDailyBalancesSnapshot table. The fact table is growing at quite a rate as it is a daily snapshot of every (bank) account. A lot of the accounts, however, have a static balance during the life of their term. These so called Term Accounts are similar to savings accounts, but with with a maturity date. Once started, not much happens to these accounts if held till maturity.

My question is this: What are the consequnces of us minimising the number of rows in the fact table by only ETL'ing accounts where the daily balance has changed?

One issue that I can think of is the Analysis Services Cube not being to handle the dates for which there are no balances. The grain of the fact table is daily and there will be missing days - will the Cube magic work/cope with this?

Thanks in advance.

amir2

Posts : 29
Join date : 2010-07-29

View user profile

Back to top Go down

Re: Reducing number of rows in fact tables

Post  ngalemmo on Wed Nov 09, 2011 11:15 am

You could store a begin/end date in the balance fact and only add rows when the balance changes. It is easy enough in SQL to select balances for a specific date.

I do not know what SSAS would do with it if you need to load multiple dates into a cube. You would probably need some form of view to revert it to the full daily snapshot for SSAS.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Reducing number of rows in fact tables

Post  BoxesAndLines on Wed Nov 09, 2011 2:40 pm

Partition the fact by day. Drop daily history after a month a carry month end snapshots after that. Adjust parameters as needed.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Reducing number of rows in fact tables

Post  hang on Wed Nov 09, 2011 7:29 pm

Theoretically, effective dated fact can produce periodic snapshot fact through some form of view (by cross join with date dim) as ngalemmo suggested. However it really comes down to the performance.

With SSAS, the view will be materialized in the cube, so effectively it is still daily snapshot. Maybe cube can compress the cells more efficiently than relational database engine.

I would combine both approaches, one of which has been proposed by B&L. Use effective dated fact to produce daily snapshot on the fly if required (say within a particular month), and monthly snapshot for trend analysis over many years.

When you are interested in daily trend, you always limit you time frame as you don't want to be overwhelmed by hundreds of dates, so the view will be quick once constrained by a short period of time. On the other hand, monthly snapshots can also give you big picture instantly over years history.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Reducing number of rows in fact tables

Post  amir2 on Thu Nov 10, 2011 12:03 pm

Thanks guys.

We set the grain of the fact table at daily because the Kimball books recommend atomic level grains. For a bank account, daily balance is the most logical lowest grain. Also, our users want at least weekly data, so we thought for the sake of four extra rows (Monday - Thursday) per account, we might as well go for the more accommodating solution.

I have checked with the Business again and they want at least one year's worth of data, so we won't be able to get rid of the daily records after a month - we need to wait at least one year.

The above suggestions don't seem straight forward to implement (?) and require more clever SQL etc. Is this one of those cases where hardware (extra disk, memory, etc) will provide a better solution than the software equivalent?


Last edited by amir2 on Thu Nov 10, 2011 12:04 pm; edited 1 time in total (Reason for editing : Typos)

amir2

Posts : 29
Join date : 2010-07-29

View user profile

Back to top Go down

Re: Reducing number of rows in fact tables

Post  hang on Thu Nov 10, 2011 9:20 pm

Why has it got anything to do with hardware, and what is hard about implementing it? You can just treat the fact table like SCD dimension and only add new rows if the balance has changed. So that's your base fact table. You may take weekly and monthly snapshot and store the data in another fact table, by querying the base fact table, to cater for trend analysis by the cube. You don't loose any data at daily grain.


Last edited by hang on Fri Nov 11, 2011 7:10 am; edited 1 time in total (Reason for editing : typo)

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Reducing number of rows in fact tables

Post  ngalemmo on Fri Nov 11, 2011 5:16 am

I take back my last post. B&L and Hang are right. I did the math. If a fact row takes 40 bytes and you have 1 million accounts, you would need 14.6GB to store a year's worth of daily snapshots. Maybe 2-3 times that depending on your database and indexing requirements. So, even at 42GB, it is not a big deal.

If the bank has many millions of accounts, they can certainly afford suitable hardware to support it.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Reducing number of rows in fact tables

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum