Fact-to-Fact Joins

View previous topic View next topic Go down

Fact-to-Fact Joins

Post  turbotortuga on Wed Nov 06, 2013 12:42 pm

I have come across a situation where two tables coming from the source system have become facts with degenerated dimensions in the DW/BI system.
I know that it the Data Warehouse Toolkit it mentions to never issue SQL that would join two fact tables together across "the fact table's foreign keys".
But what about Natural Keys?
The PK of table "A" from the source system became the NK in the DW/BI fact.
Table "B" that also comes from the source system has table "A" PK in it.
When I bring over table "A" and table "B" to the warehouse as facts with degenerated dimensions. I am also bringing table "A" PK as "Table_A_ID"
Is it alright to then join these two facts in the Warehouse using table "A" NK and table "B" reference to table "A" (Table_A_ID)?
Or should I create some sort of bridge dimension that would map these keys to a PK,SK key in the dim?

Example of SQL Join using Natural Keys:


Thanks,
RGC
avatar
turbotortuga

Posts : 13
Join date : 2013-11-04

View user profile

Back to top Go down

Re: Fact-to-Fact Joins

Post  ngalemmo on Wed Nov 06, 2013 1:10 pm

A natural key and a surrogate key are effectively the same thing.  The notion of a foreign key from one fact table to another is simply not a dimensional design.

You can integrate two facts based on common dimensions (actual or degenerate) but there is not a foreign key relationship between the two.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Fact-to-Fact Joins

Post  turbotortuga on Wed Nov 06, 2013 1:15 pm

Alright, so in this scenario I would just leave them as is and allow the analysts to join the two if they ever need it.

Thanks for your help
RGC
avatar
turbotortuga

Posts : 13
Join date : 2013-11-04

View user profile

Back to top Go down

Re: Fact-to-Fact Joins

Post  ngalemmo on Wed Nov 06, 2013 1:38 pm

The relationship between any two fact tables is inherently many-to-many and the point of integration can be any combination of common dimensions. So the general pattern is to aggregate the facts individually along common dimensions and join the resulting sets along those dimensions. Doing a UNION ALL is an alternate way combine two facts.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Fact-to-Fact Joins

Post  turbotortuga on Wed Nov 06, 2013 3:36 pm

Yes I remember reading something about that.
So the order goes as follows:
1. Aggregate each fact with the common dimensions among them.
2. Join the resulting set.

I will communicate this in the documentation of the warehouse.

Thanks again.
RGC
avatar
turbotortuga

Posts : 13
Join date : 2013-11-04

View user profile

Back to top Go down

Re: Fact-to-Fact Joins

Post  marcin.wizgird on Fri Nov 15, 2013 7:23 am

I wonder whether it's sensible to add this to data warehouse documentation. If the data is accessed with any frontend reporting tools - such as for instance OBIEE, SAP BO -these tools by themselves reinforce such a behavior. They query each fact table separately and then consolidate the results based on common dimension values.
avatar
marcin.wizgird

Posts : 4
Join date : 2013-05-07

View user profile

Back to top Go down

Re: Fact-to-Fact Joins

Post  turbotortuga on Fri Nov 15, 2013 8:56 am

Thank you for you sharing that.

Since the business analysts have access to query dims and facts directly in the warehouse.
I will communicate this to them and the difference of querying data using the tools you mentioned vs directly using SQL to fetch data from tables in the warehouse.

RGC
avatar
turbotortuga

Posts : 13
Join date : 2013-11-04

View user profile

Back to top Go down

Re: Fact-to-Fact Joins

Post  Giuseppe78 on Sat Jan 18, 2014 1:08 pm

rcedeno wrote:I have come across a situation where two tables coming from the source system have become facts with degenerated dimensions in the DW/BI system.

Hi all,
I'm facing the same situation but we're still in design phase.

The business requirement is to build a data warehouse on top of a warehouse management system to fulfill two basics needs:

  1. Store data coming from each warehouse in a common place and analyze them together (the wms system unfortunately consist in a stand alone installation for each physical warehouse so you need to login to two systems if you want to query transactions related to two different warehouses)
  2. Provide so-called "free analysis" at transaction level granularity (single order line, single physical movement, singe parcel, single shipment ...)


The most common questions the system should answer will be sort of "How many order lines in one parcel?, How many picking to fulfill an order?, Orders shipped in two or more distinct shipments".

Assuming 1:N relationship between what could be intended as a fact (e.g.: orders, shipments, ...) I do not see any other way to model that than using degenerated dimensions in each fact and query facts together; moreover, since the front end tool will be Qlikview, I need to take care of product-specific constraints in order to avoid issues (such as circular references)

Any suggestion on that would be really apprecieted.

Giuseppe.

Giuseppe78

Posts : 2
Join date : 2014-01-18

View user profile

Back to top Go down

Re: Fact-to-Fact Joins

Post  ngalemmo on Sat Jan 18, 2014 6:36 pm

Looking at the order-to-cash process; orders, fulfillment, and invoicing are independent and distinct business processes.  Facts dealing with fulfillment need to be collected and stored independent of the others.  What you need to address is ensure fulfillment facts have the correct dimensionality.

This dimensionality will certainly include some degenerate dimensions, such as order # and order line #.  And a single fact table should be sufficient to address the types of queries you describe.  

Where you get into 'querying facts together' is when comparing what was shipped against what was ordered… how accurate was the fulfillment, did substitutions take place, etc…  As this can be a fairly common analysis, one approach is to create an aggregate of order, fulfillment, and invoicing fact to provide a compete picture of the process.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Fact-to-Fact Joins

Post  Giuseppe78 on Sun Jan 19, 2014 10:57 am

That makes perfectly sense!

Unfortunately this implies the need to make assumptions since the businesss requirement "just show me everything at transaction level in order to allow all kind of analysis" cannot drive modeling choiches.

Just as an example, consider a subset of the as-is model where pink rectangle could be the source of dimensions while the green one the source for facts:


What else can be done other than linking facts directly?

Consider that they choose Qlikview as front end: postponing modeling decision and query the model in a particular way at reporting time is not an option since the tool, unless using aliases in column names, will create relationship where column names match across tables.

Besides that, as a general thought, I spent (and still spend) lot of time in reading books but I've never been able to spot something to address those kind of situation in which the datawarehouse is just needed to extend the historical depth and integrate all data coming from legacy systems.

Best regards

Giuseppe78

Posts : 2
Join date : 2014-01-18

View user profile

Back to top Go down

Re: Fact-to-Fact Joins

Post  ngalemmo on Tue Jan 21, 2014 12:17 pm

Take a step back from your computer and clear your mind.  You are applying relational thought to a dimensional model.

In a dimensional model, fact tables are independent of each other.  The only relationships a fact maintains are to dimensions.  These dimensions may be physical tables or degenerate dimensions in the fact.

When a pick list is issued, does the data not indicate what order and line it is for?  These are dimensions of the pick, and recorded in the fulfillment fact.

When you combine facts, you do so across common dimensions.  Fulfillment has an order # and line #, orders has an order # and line # does it not?

When you diagram star schema you never draw relationships between facts because the relationship is not absolute.  Facts are combined across common dimensions at query time.  That statement says nothing about which common dimensions are actually used… all?... some?… it doesn't matter.  This is different than a relational model where such relationships are fixed and absolute.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Fact-to-Fact Joins

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum