Factless Fact table to model 1:M relationships between Type 2 SCD

View previous topic View next topic Go down

Factless Fact table to model 1:M relationships between Type 2 SCD

Post  kallison on Tue Aug 23, 2011 3:43 pm

I have a requirement to provide point in time data between 2 dimensions; client and product. Both are type 2 SCDs. I was going to use a factless fact table to model the 1:M relationship between clients and products (1 client can have many products). Is this an appropriate approach when our reporting tool requires a star schema? Will a new row need to be added to the fact table with every type 2 change to either dimension? In simple terms I'm anticipating the data to look as follows:

dim_client (sur_client_key, eff_date, end_date, current_val, client_id, client_name, client_state)
1, 1/1/1800, 6/1/2011, N, 10, ABC, FL
2, 6/2/2011, 12/31/9999, Y, 10, ABC, GA

dim_product (sur_product_key, eff_date, end_date, current_val, product_id, product_nm)
5, 1/1/1800, 12/31/2010, N, 20, P1)
6, 1/1/2011, 12/31/9999, Y, 20, P1a)

ff_client_product (sur_client_key, sur_product_key, eff_date, end_date)
1,5, 1/1/1800, 12/31/2010
1,6, 1/1/2011, 6/1/2011
2,6, 6/2/2011, 12/31/9999

Thanks in advance.

kallison

Posts : 5
Join date : 2011-08-23

View user profile

Back to top Go down

Re: Factless Fact table to model 1:M relationships between Type 2 SCD

Post  hang on Tue Aug 23, 2011 7:23 pm

I guess a simple and effective approach to handle the relationship between product and client is to use periodic snapshot factless fact table using date key instead of effective date pair. If the table is not massive, you could do daily snapshot, otherwise do it as frequently as practically needed.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Factless Fact table to model 1:M relationships between Type 2 SCD

Post  ngalemmo on Tue Aug 23, 2011 9:28 pm

Wasn't this discussed in an earlier thread? It sounds familiar.

When a new row is added to a type 2 dimension you do not need to add a new row to the fact table. For type 2 the basic technique is to do a self join (using the natural key) on the dimension table to locate the desired version of the dimension row.

There are other techniques discussed in other threads that allow you the same functionality that avoids the self-join by using two keys for each dimension (one a 'point in time' key and the other a 'current' key).

You would only need effective/expiration dates on the fact if the customer/product relationship has a timeframe (i.e. it can expire). You would only add new rows if a new customer/product relationship was created or if a previously expired relationship is renewed.

A way to look at it is the fact tables purpose is to record the existance of a relationship. The nature of the dimensions (i.e. what version you wish to apply) can be resolved within the dimensions themselves.


Last edited by ngalemmo on Tue Aug 23, 2011 9:32 pm; edited 1 time in total
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Factless Fact table to model 1:M relationships between Type 2 SCD

Post  kallison on Tue Aug 23, 2011 9:30 pm

Thanks for the idea, however, we are operating in a real-time environment with a 5 minute latency window so periodic snapshots are not really an option. Since these truly are slowly changing dimensions that's why I think I need to make the factless fact table be a coverage table with effective and end dates. I've seen differing opinions though on whether or not a new row needs to be added to the fact table with each new insert to the dimension(s). Since I need to provide point in time and historical reporting (of the data in the dimensions) I thought the use of the pair of dates would handle this. Thoughts?

kallison

Posts : 5
Join date : 2011-08-23

View user profile

Back to top Go down

Re: Factless Fact table to model 1:M relationships between Type 2 SCD

Post  kallison on Tue Aug 23, 2011 9:40 pm

When this technique is used:
When a new row is added to a type 2 dimension you do not need to add a new row to the fact table. For type 2 the basic technique is to do a self join (using the natural key) on the dimension table to locate the desired version of the dimension row.

What is used as the FK in the fact table? the PK or the natural key?

kallison

Posts : 5
Join date : 2011-08-23

View user profile

Back to top Go down

Re: Factless Fact table to model 1:M relationships between Type 2 SCD

Post  ngalemmo on Tue Aug 23, 2011 10:02 pm

The PK. Fact tables should never hold natural keys (unless it is a degenerate dimension). That gives you the point-in-time relationship. The current dimension row is found using a self join (the dimension table with itself) on the natural key, filtering for the current row.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Factless Fact table to model 1:M relationships between Type 2 SCD

Post  kallison on Wed Aug 24, 2011 6:46 am

What is the reason for not writing a new row to the fact with each new row to the SCD(s)? It seems like the use of the self join could be avoided if that were done and if the fact table contained effective and end dates it would be very easy to get to data at any specific point in time.

kallison

Posts : 5
Join date : 2011-08-23

View user profile

Back to top Go down

Re: Factless Fact table to model 1:M relationships between Type 2 SCD

Post  ngalemmo on Wed Aug 24, 2011 11:47 am

kallison wrote:What is the reason for not writing a new row to the fact with each new row to the SCD(s)? It seems like the use of the self join could be avoided if that were done and if the fact table contained effective and end dates it would be very easy to get to data at any specific point in time.

The reason is to avoid a big mess. Given a fact table with a multitude of dimensional references, when and how do you generate a new row? Do you do it during dimension maintenance? Every time a dimension row changes you generate new fact rows that referenced the superceded row? What if 2 or more dimensions changed on the same fact during the update process? Do you wind up creating multiple fact rows or somehow accumulate the changes?

Also, as I mentioned, there are techniques to eliminate the self-join and do not require updating the fact table.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Factless Fact table to model 1:M relationships between Type 2 SCD

Post  kallison on Wed Aug 24, 2011 12:48 pm

Although the solution may be complex I think it allows for both point in time and historical reporting. I came across Kimball Design Tip 50 which also talks about this solution. Have people found this to not be a practical implementation?

From Desing Tip 50:
We can use a factless
fact table to capture the relationship between the customer dimension and mini-dimensions over
time. We load a fact row in the factless fact table whenever there is a Type 2 change to the base
customer dimension or a change in the relationship between the base dimension and the mini-
dimensions. The factless fact table contains foreign keys for the base customer dimension and each
of the four mini-dimensions when the row is loaded. We then embellish this design with two dates,
row effective and row expiration, to locate a customerís profile at any point in time.

kallison

Posts : 5
Join date : 2011-08-23

View user profile

Back to top Go down

Re: Factless Fact table to model 1:M relationships between Type 2 SCD

Post  ngalemmo on Wed Aug 24, 2011 1:15 pm

This has more to do with the mini-dimensions rather than the type 2. Mini (aka junk) dimensions are type 1 and represent values at a point in time only. There are no 'versions' as in a type 2 dimension. The only way to record a change relating to attributes in those dimensions is to reference another dimension row, either by updating the fact row or, preferably, expiring the old row and creating a new one. Since you have no option other than updating the fact table, you might as well do it for the type 2 as well so queries are consistent (i.e. current row always references current dimension versions).

But if mini-dimensions are not involved, you don't need to do it that way. If size of the fact table and time and complexity to update it are not a concern, you could certainly do it that way.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Factless Fact table to model 1:M relationships between Type 2 SCD

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum