Type 2 Fact table - Durable surrogate key?

View previous topic View next topic Go down

Type 2 Fact table - Durable surrogate key?

Post  carbonite on Mon Jul 13, 2009 11:24 am

I'm working on a DW design and I'm faced with fact tables that mostly contain data on bank account balances and values of securities (bonds, derivatives, etc...). Some of these balances change on a daily basis while others might remain constant for long periods of time. These types of facts didn't seem to fit the typical fact table model (where new records would be inserted for each transaction), so I was considering a "Type 2 Fact Table" scheme like was described here (see the last post by VHF): http://forum.kimballgroup.com/dimensional-modeling-and-data-architecture-f6/item-costs-dimension-or-fact-t45.htm#662.

Here an example of the schema I'm considering:

Dimension Table (Security Dimension)

SEC_ID - Primary key - surrogate key
SEC_ID_ORIG - durable surrogate key - will always retain value of security's first sec_id
HOLDING_ID - Natural Key - varchar field from source system
EFF_DATE_BEGIN - Date this record is effective
EFF_DAT_END - Date this record is no longer effective
CUR_REC_IND - Indicates whether a record is most current or not (Y/N)
Dimension attributes...

Fact table (Balance Fact)

SEC_ID - Foreign key
SEC_ID_ORIG - Foreign_Key
EFF_DATE_BEGIN - Date this record is effective
EFF_DAT_END - Date this record is no longer effective
CUR_REC_IND - Indicates whether a record is most current or not (Y/N)
ACCT_BAL - Account balance for this period of time (can vary from one day to many months)
Several other facts...

This would allow the tables to be joined on either SEC_ID or SEC_ID_ORIG. In both cases it will be necessary to either filter by the effectives dates or on the current record indicator. The business users are fairly adept at SQL and they've expressed interest at being able to query with correct partitioning of history (using SEC_ID) as well as combining current dimensional attributes with historical facts (using SEC_ID_ORIG).


My question is whether this use of a durable surrogate key is a good idea or if there's a better design. I'm more concerned with ease of querying and ETL rather than space issues (it's not a huge amount of data). I'd really appreciate some opinions on this design. Thank you!


Last edited by carbonite on Mon Jul 13, 2009 11:28 am; edited 1 time in total (Reason for editing : spelling)

carbonite

Posts : 3
Join date : 2009-07-03

View user profile

Back to top Go down

Re: Type 2 Fact table - Durable surrogate key?

Post  BoxesAndLines on Mon Jul 13, 2009 5:05 pm

Have you considered a daily snapshot? That normally works well with balances like you are talking about.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Type 2 Fact table - Durable surrogate key?

Post  ngalemmo on Mon Jul 13, 2009 9:04 pm

BoxesAndLines wrote:Have you considered a daily snapshot? That normally works well with balances like you are talking about.

As B&L said, a snapshot is typically the way to go. I also question why the security dimension needs to be a type 2? CUSIP and SEDOL(and others) are stable identifiers for a security. If a security changes, new identifiers are assigned. Corporate actions are usually tracked using a fact table, not by maintaining dimensional history.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Type 2 Fact table - Durable surrogate key?

Post  carbonite on Tue Jul 14, 2009 8:46 am

Thanks for your replies, those are some good points.

I did consider a daily snapshot, but I was hesitant to use that design because some loans and securities may change value only a few times a year, but have maturities 10 years (or more) in the future. On the other end of the spectrum, there are some discount notes that will only be active for a few days. Business users want these matured loans or securities to remain in the fact tables so that they can query them (their balance or current par would be 0 upon maturity). One major drawback of the daily snapshot would seem to be that I'd have include all of the matured loans and securities in every snapshot. Although the number of active securities at any point in time isn't especially large, including all matured ones (there's many overnight and short term notes) would greatly increase the size. There's about 3,000 active securities and 10,000 actives loans right now.

That's why I was considering the Type 2 Fact Table. Securities would only get a new record in the fact table when the balance/par value changed and the final record's EFF_DATE_END would be some date far in the future.

As for why the security dimension is a Type 2, you're entirely correct that the majority of attributes of a security are static. However, there are some fields from the trading system that do change, such as status (ACTIVE, CANCELLED, MATURED, RETIRED, etc...). It's also not uncommon for some fields in the trading system to be corrected or modified between trade and settle date and it's necessary to retain this history. But overall this dimesion is quite slow-changing and close to static.


Last edited by carbonite on Tue Jul 14, 2009 8:50 am; edited 1 time in total (Reason for editing : clarify)

carbonite

Posts : 3
Join date : 2009-07-03

View user profile

Back to top Go down

Re: Type 2 Fact table - Durable surrogate key?

Post  BoxesAndLines on Tue Jul 14, 2009 10:49 pm

Snapshot still sounds pretty good. 13K rows per fact snapshot is easily managed. I have multiple snapshot facts that have over 4-6M rows. You could carry a full year of snapshots before you would have to consider deleting data. If you partition by snapshot date, performance would excellent.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Type 2 Fact table - Durable surrogate key?

Post  VHF on Wed Jul 15, 2009 12:41 pm

The original design looks good to me! Make sure to set EFF_DAT_END to a high value (sufficiently in the future, something like #12/31/2999#) for the current record.

Ralph Kimball calls this pattern “Instantaneous Balances” in the Dimensional Modeling in Depth class. Having a Type2 Dimension combined with a Type2 Fact does make things a little more complex, but the proposed design address the issues with SKs/durable SKs correctly.

Although daily snapshots are certainly a viable alternative, the Type2 Fact table offer an advantage in significantly reduced storage when the data changes relatively rarely. (And while disk may be cheap, fewer records in the DW generally leads to better query performance.) This design also supports the ability to query for mid-day balances, which is not possible with a daily snapshot.

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

View user profile

Back to top Go down

Re: Type 2 Fact table - Durable surrogate key?

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum