Transitive Dependency
4 posters
Page 1 of 1
Transitive Dependency
There are many dimensional tables in our organzation has got transitive dependency in them and we are forced to adopt different SCD types like type 1 and type 2
I will explain my scenario with an example(see table below) ,the Item dimension has got many attributes but i picked up only those where we have issues.In the below table dept,category,acct_dept,brand are all denormalized from source system and made into Item.All type 1 Item related attributes are changed based on the Item_nbr but other transitive dependent attributes are dependent on ther respective _nbr column for ex: dept_desc depends on dept_nbr,category_desc depends on category_nbr.We dont want the desc as Type 2 as there is no point in maintaining history for desc but we want the dept_nbr,category_nbr,acct_dept_nbr,brand_nbr to be type 2.In nutshell any descrption changes we want to overwrite them.
[tableborder="1"]
Problem statement
As per your theory type 1 attrbutes should change based on the natural key (i.e Item_nbr here) but in this case its changing based on its corresponding _nbr column which is against Type 1 SCD definiton
Proposed Solution
Solution Approach 1:
Change transitive dependent attributes based on its corresponding _nbr column.
Drawback
Additional overhead for the ETL team to load based corresponding _nbr column
Solution Approach 2:
Make separate tables to handle the type 1 and type 2 separately as per the article below("Type 2 with "Current" Overwrite")
http://www.intelligententerprise.com/info_centers/data_warehousing/showArticle.jhtml?articleID=59301280
I am very much sure that you must have encountered transitive dependent attributes in the dimension ,can you please explain me what is your solution and best way to handle this as these tables are very huge .Please let me know.
Thanks in advance
I will explain my scenario with an example(see table below) ,the Item dimension has got many attributes but i picked up only those where we have issues.In the below table dept,category,acct_dept,brand are all denormalized from source system and made into Item.All type 1 Item related attributes are changed based on the Item_nbr but other transitive dependent attributes are dependent on ther respective _nbr column for ex: dept_desc depends on dept_nbr,category_desc depends on category_nbr.We dont want the desc as Type 2 as there is no point in maintaining history for desc but we want the dept_nbr,category_nbr,acct_dept_nbr,brand_nbr to be type 2.In nutshell any descrption changes we want to overwrite them.
Columnname | Datatype | SCD |
Item_SK_Id | integer | na |
Item_nbr | integer | Fixed |
Item1_desc | varchar | type1 |
dept_nbr | integer | type2 |
dept_desc | varchartype1 | |
category_nbr | integer | type2 |
category_desc | varchar | type1 |
acct_dept_nbr | integer | type2 |
acct_dept_desc | varchar | type1 |
brand_nbr | integer | type2 |
brand_desc | varchar | type1 |
Problem statement
As per your theory type 1 attrbutes should change based on the natural key (i.e Item_nbr here) but in this case its changing based on its corresponding _nbr column which is against Type 1 SCD definiton
Proposed Solution
Solution Approach 1:
Change transitive dependent attributes based on its corresponding _nbr column.
Drawback
Additional overhead for the ETL team to load based corresponding _nbr column
Solution Approach 2:
Make separate tables to handle the type 1 and type 2 separately as per the article below("Type 2 with "Current" Overwrite")
http://www.intelligententerprise.com/info_centers/data_warehousing/showArticle.jhtml?articleID=59301280
I am very much sure that you must have encountered transitive dependent attributes in the dimension ,can you please explain me what is your solution and best way to handle this as these tables are very huge .Please let me know.
Thanks in advance
radsampath- Posts : 8
Join date : 2009-02-03
Re: Transitive Dependency
As i understand the primary key to capture the chnage is item_nbr in your table -
Step1 : Approach should be first load the tables columns where you want to capture history(Type2) for the day and for columns you expect to be tracked. I would ignore all columns which you treat as type1 during this load.
Step2 : Load the TYPE1 columns simply updating using the corresponding key values with the latest descriptions as you dont care about history of descriptions.
Hope this helps.
Good Luck.
Step1 : Approach should be first load the tables columns where you want to capture history(Type2) for the day and for columns you expect to be tracked. I would ignore all columns which you treat as type1 during this load.
Step2 : Load the TYPE1 columns simply updating using the corresponding key values with the latest descriptions as you dont care about history of descriptions.
Hope this helps.
Good Luck.
DWBI- Posts : 4
Join date : 2009-02-03
Transitive Dependency
Thanks for your response.Before going into how to load the data i want to make sure how to handle the design of transitive dependencies in general.Is it the right design to keep them in same dimension ?Do i need to keep them same table or is it needs to be outtriggered as separate table as per the article below.What is the best practice in handling the transitive dependency with different SCD's
http://www.intelligententerprise.com/info_centers/data_warehousing/showArticle.jhtml?articleID=59301280
http://www.intelligententerprise.com/info_centers/data_warehousing/showArticle.jhtml?articleID=59301280
radsampath- Posts : 8
Join date : 2009-02-03
Re: Transitive Dependency
Thats a good question but it will need stats of data in your organisation, you can choose to keep in same table if the processing time will not be huge, you can seperate if the type1 has to be performed on this huge dimension and update may take time, you can out trigger if you okay with increased join for your reporting etc .. you can take a call including all this and other factors you can think of before loading.. and how best you will be able to report as well as from processing and maintenance perspective. Hope this makes sense
good luck buddy.
good luck buddy.
DWBI- Posts : 4
Join date : 2009-02-03
Transitive Dependency
Thanks once again for your response.We have around 100000 records and of which 20000 changes in a year.Let me know
radsampath- Posts : 8
Join date : 2009-02-03
Re: Transitive Dependency
We use the following approach as suggested in one of the previous columns
1. Type 2 First
2. Type 1 since you are updating all the rows in that table.
1. Type 2 First
2. Type 1 since you are updating all the rows in that table.
Devendra Naik- Posts : 7
Join date : 2009-02-03
Re: Transitive Dependency
With 100,000 rows and 20,000 changes per year, I see no technical reason why you can't just perform the updates in a simple star dimension table. It will make the ETL slightly more complicated (your stated drawback to solution approach #1 in your original post)... but it is a very slight complication. And it's the job of the ETL system to do complicated things in order to make things work smoothly at query time.
It is OK to do what you're suggesting: Manage descriptions as Type 1 unless the underlying department (_nbr) changes. Most ETL tools have SCD-management widgets that will always do Type1 or Type2 changes on a column. You need to be slightly fancier than that. So you'll have to code the SCD management "by hand", by which I mean you can't use your ETL tool's wizard/widget. But it's not hard, you clearly understand the issues.
Good luck
It is OK to do what you're suggesting: Manage descriptions as Type 1 unless the underlying department (_nbr) changes. Most ETL tools have SCD-management widgets that will always do Type1 or Type2 changes on a column. You need to be slightly fancier than that. So you'll have to code the SCD management "by hand", by which I mean you can't use your ETL tool's wizard/widget. But it's not hard, you clearly understand the issues.
Good luck
Re: Transitive Dependency
I opine to both the comments above and also data vloume you mentioned is moderate (if not less). i would so go ahead and implement
DWBI- Posts : 4
Join date : 2009-02-03
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum