Multiple grain in a parent-child-child relationship

View previous topic View next topic Go down

Multiple grain in a parent-child-child relationship

Post  KKumar on Fri Jul 29, 2011 10:30 am

Hi,

I have this ER model which has parent-child relationship that is too deep
Project -> Project Bldg -> Bldg Inspection -> Inspection Results

In this event there are 3 processes
Process 1: Grouping buildings in a project
Process 2: Multiple Inspection of each bldg on project start
Process 3: Multiple results on an inspection for each bldg

I have identified 3 fact table for each process

Fact 1:Project -> Project Bldg
Fact 2:Project Bldg -> Bldg Inspection
Fact 3: Bldg Inspection -> Ispection Results
Dimensions : Date, Bldg, Results Lookup, Inspection Status

My question is in Fact 2 I will end up in having only records in the "Bldg Inspection" and foreign keys from the Fact 1. Do I have to repeat the same facts in Fact 1 to Fact 2? The same goes for Fact 3

Please advise
Thanks

KKumar

Posts : 22
Join date : 2011-07-29

View user profile

Back to top Go down

Re: Multiple grain in a parent-child-child relationship

Post  VHF on Fri Jul 29, 2011 11:56 am

I see possible dimensions for Date, Project, Building, Inspection, Status, and Result.

Each fact table needs to have FK pointers to all the applicable dimensions. So I would envision the fact tables looking like this:

Fact1: Project, Building
Fact2: Project, Building, Inspection, DateScheduled, DatePerformed, Status
Fact3: Project, Building, Inspection, Result

I think the answer to your question about repeating facts is 'sort of'. For example, for each inspection represented by a row in Fact2 you will eventually have a least one row in Fact3 representing the results... and maybe multiple rows if there are multiple results for a given inspection.

In dimensional modeling, each fact table needs to be able to stand alone. There should never be a (direct) parent-child relationship between fact tables. When querying the DW you should never directly join two fact tables. Fact tables can be joined indirectly by using attributes of conformed dimensions.





Last edited by VHF on Fri Jul 29, 2011 11:57 am; edited 1 time in total (Reason for editing : clarification)

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

View user profile

Back to top Go down

Re: Multiple grain in a parent-child-child relationship

Post  KKumar on Fri Jul 29, 2011 4:27 pm

It means that if I have 30 attributes for inspection and 40 attributes for bldg then repeating it in Fact 3 is OK. Don't you think we are repeating the same facts in 2 tables; and I will have to repeat the ETL code that I build for Fact 1 to Fact2 and Fact3. Also what happens if the bldg needs to be enhanced when the business changes then I will have to deal with 3 fact table to be taken care of.

KKumar

Posts : 22
Join date : 2011-07-29

View user profile

Back to top Go down

Re: Multiple grain in a parent-child-child relationship

Post  VHF on Fri Jul 29, 2011 4:38 pm

Attributes should exist only in the dimension tables. The fact tables should include only a FK pointer to the primary surrogate key in the dimension record. So no matter how many attributes you might have for a building, you should only have one FK in each fact table ("BuildingKey") which would typically be a 32-bit integer.

One of the hard parts of getting started with data warehousing/dimensional modeling is giving up some of normalization concepts of relational design. Some information in a DW is deliberatley denormalized to improve query performance and ease of use, although the fact tables themselves are highly normalized and very efficent.

It is possible you might be able to implement your design with a single fact table (Fact3).

Do you have any Kimball reference books?

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

View user profile

Back to top Go down

Re: Multiple grain in a parent-child-child relationship

Post  KKumar on Fri Jul 29, 2011 5:40 pm

I was thinking on the same lines to have one fact table but then I will have to force 3 processes (project creation, inspection and results) into one fact table. What would be the industry norm in these type of case?

By the way, I am planning to get hold of a kimball reference book. Can you please suggest a good one?
Thanks

KKumar

Posts : 22
Join date : 2011-07-29

View user profile

Back to top Go down

Re: Multiple grain in a parent-child-child relationship

Post  VHF on Fri Jul 29, 2011 5:49 pm

It is normal to have a fact table represent a particular business process. This is called determing the grain of the fact table. That said, a variety of queries can frequently be fullfilled off a single fact table.

It is very common to replace a parent-child relationship in a source system with single fact table at the most detailed level in the DW. For example, orders are frequently represented as Header-Detail in the OLTP system, but in the DW are most often represented by a single fact table at the line level. Queries both about individual line items (products sold, etc.) as well as information associated with the header (customer demographics, etc.) can be handled by this single fact table.

The Data Warehouse Toolkit would be a good book to start.



Last edited by VHF on Mon Aug 01, 2011 9:25 am; edited 1 time in total (Reason for editing : typo)

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

View user profile

Back to top Go down

Re: Multiple grain in a parent-child-child relationship

Post  KKumar on Fri Jul 29, 2011 6:57 pm

Thanks this is very helpful

KKumar

Posts : 22
Join date : 2011-07-29

View user profile

Back to top Go down

Re: Multiple grain in a parent-child-child relationship

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum