Some Practical Challenges

View previous topic View next topic Go down

Some Practical Challenges

Post  radsampath on Wed May 06, 2009 7:57 am

Hi All

I am facing some challenges in my dimensional modeling and most of these are covered in many books too.But i have practical limitation for that
1st Challenge :Unbalanced Hierarchy

In one of the dimension we are tracking employees and their hierarchy i.e Org structure ;challenge here is the number of employees .Its more than a million employees and mostly unbalanced hierarchy.The solution given in many of the books is to go for Bridge table instead of maintaining it in the same table.The table will be containing parent employee,child employee,number of levels,Bottom flag and Top flag.The challenges are

  • These org structures are changing often and the history of these structure needs to be maintained .There are more than million employees who have reportees.

  • ETL tool is not very sophisticated ;its home grown tool.


There are 2 kinds of employees 1.Reporting Employees means employees who have reportees 2.Non Reporting employees means employees who dont have reportees.Now if i want to find number of Reporting and Non reporting employees at any point of time ;i dont have direct way to handle that.

Is there any better design technique to handle the unbalanced hierarchy structure

2nd Challenge : Relationship between Dimensions

I have 2 dimensions namely membership and cardholder .One membership can have many cardholders and one cardholder can have only one membership so the cardholder is the lowest grain.Both of these dimensions are Type 2 and there is a frequent change in both the dimension attributes

So in order to maintain the relationship between these 2 dimensions i.e number of cardholders for a membership at any point in time
; i am planning to go for factless fact table .It will be containing the surrogate key of the membership and surrogate key of the cardholder and count.
  • Is it the only solution or is there any better way to handle this situation.

  • When do i need to snap the factless fact table;does it needs to snapped when the cardholder changes or does it needs to snapped when the membership changes or when both changes

    I am planning to snap at the lowest grain i.e cardholder changes and to maintain an additional field in the factless fact table to indicate the current one..

Please provide some best practices in this

3rd Challenge:Junk Dimensions

In another situation we are tracking sales and there are 2 grains .1 Sales Transaction and 2.Sales Trans Line .The sales transaction is at the transaction level and the sales trans is at the every item sold in the transaction .The number of transactions are huge and number of sales trans line is in order of billions.We are planning to have Sales Trans Dim ,Sales Trans Fact ,Sales Trans Line fact but not the Sales Trans Dim
due to the volume of sales trans line.If we maintain the sales trans dim also then there will be billion of rows in sales trans line dim and sales trans line fact.Moreover there is no key identifying attributes for the sales trans dim
Now if we ignore sales trans line dim then some of the key attributes such as pricing type,transaction line type needs to be maintained .Its not tightly coupled with the sales trans line.

Is it a good idea to go for Junk dimensions ?

Challenge

  • In the junk dimension we keep the combinations always and this combination itself is coming in millions .In future if we want to add a new field in the dimension then the whole existing records needs to be repeated for all possible values of new attribute


  • ETL tool is not sophisticated ;its home grown


Thanks for sparing time to read this.

radsampath

Posts : 8
Join date : 2009-02-03

View user profile

Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum