Merging of dimensions?
3 posters
Page 1 of 1
Merging of dimensions?
Hello:
I am thinking about a model that has transactions (expenses made by a researcher against a research account) as Fact table surrounded by various dimensions that include account, researcher's info (like name, dept he/she belongs to), sponsor's info, date etc... To see how many researchers belong to one sponsor, one way is to join Researcher_Dim, Sponsor_Dim & Transaction_Fact.
However, do we have to join Transaction_Fact every time to find such info even though we don't need any transaction information? Or can I merge both researcher_dim & sponsor_dim into one dimension like Researcher_Sponsor_Dim. What's the good practice? Please advise.
-zaci
I am thinking about a model that has transactions (expenses made by a researcher against a research account) as Fact table surrounded by various dimensions that include account, researcher's info (like name, dept he/she belongs to), sponsor's info, date etc... To see how many researchers belong to one sponsor, one way is to join Researcher_Dim, Sponsor_Dim & Transaction_Fact.
However, do we have to join Transaction_Fact every time to find such info even though we don't need any transaction information? Or can I merge both researcher_dim & sponsor_dim into one dimension like Researcher_Sponsor_Dim. What's the good practice? Please advise.
-zaci
zaci- Posts : 11
Join date : 2015-09-16
Re: Merging of dimensions?
The more I read the more I lean towards merging of the tables (perhaps, the wrong route).
One of the fundamentals of DW is to design the star schema in such a way that tables are de-normalized (to minimize the number of joins). In my example (research related data in a university), I have a couple of dimensions (SPONSOR_DIM & RESEARCHER_DIM). Right now, I feel that merging these two tables into one makes more sense than having two different dimensions.
RES_SPON_DIM
1 RES_SPON_SID
2 RES_ID
3 SPON_ID
4 RES_NAME
5 RES_ADDRESS
6 SPON_NAME
7 SPON_LOC
While my gut says that RESEARCHER_DIM should be different from SPONSOR_DIM, what am I missing in the above design?
-zaci
One of the fundamentals of DW is to design the star schema in such a way that tables are de-normalized (to minimize the number of joins). In my example (research related data in a university), I have a couple of dimensions (SPONSOR_DIM & RESEARCHER_DIM). Right now, I feel that merging these two tables into one makes more sense than having two different dimensions.
RES_SPON_DIM
1 RES_SPON_SID
2 RES_ID
3 SPON_ID
4 RES_NAME
5 RES_ADDRESS
6 SPON_NAME
7 SPON_LOC
While my gut says that RESEARCHER_DIM should be different from SPONSOR_DIM, what am I missing in the above design?
-zaci
zaci- Posts : 11
Join date : 2015-09-16
Re: Merging of dimensions?
You could. There are various other approaches.
One way is with a 'factless' fact table. This is a table containing dimension associations, dates but no measures. It is used to record the state of dimension relationships over time.
Another is to store a FK in the dimension. To avoid snow flaking the fact would carry both dimension keys. You would use the FK to do dimension only reporting.
One way is with a 'factless' fact table. This is a table containing dimension associations, dates but no measures. It is used to record the state of dimension relationships over time.
Another is to store a FK in the dimension. To avoid snow flaking the fact would carry both dimension keys. You would use the FK to do dimension only reporting.
Re: Merging of dimensions?
Thank you ngalemmo.
"Another is to store a FK in the dimension. To avoid snow flaking the fact would carry both dimension keys. You would use the FK to do dimension only reporting."
Can you please elaborate? PKs of SPONSOR_DIM & RESEARCHER_DIM are stored in the fact table. How do you join dimension to dimension without joining fact table? I am a little confused. All I am trying to acheive is dimension only reporting without joining fact, is that even possible? If so, what's the best approach.
-zaci
"Another is to store a FK in the dimension. To avoid snow flaking the fact would carry both dimension keys. You would use the FK to do dimension only reporting."
Can you please elaborate? PKs of SPONSOR_DIM & RESEARCHER_DIM are stored in the fact table. How do you join dimension to dimension without joining fact table? I am a little confused. All I am trying to acheive is dimension only reporting without joining fact, is that even possible? If so, what's the best approach.
-zaci
zaci- Posts : 11
Join date : 2015-09-16
Re: Merging of dimensions?
I would think how you model it would depend on the relationship between Sponser and Researcher.
My guess is each researcher can have multiple sponsers, and each sponser can have multiple researchers (many-to-many).
Many-to-many relationships are typically modeled as a bridge table and/or a fact table (as ngalemmo suggested).
My guess is each researcher can have multiple sponsers, and each sponser can have multiple researchers (many-to-many).
Many-to-many relationships are typically modeled as a bridge table and/or a fact table (as ngalemmo suggested).
LAndrews- Posts : 132
Join date : 2010-05-13
Location : British Columbia, Canada
Re: Merging of dimensions?
As LAndrews pointed out, the fk from researcher to sponsor only works if a researcher has only one sponsor. Any other situation requires a bridge or a fact table.
A bridge probably would not work as I imagine the relationships can be complex, with dependencies on project, time, etc... The factless fact would store only the relationships, not the activity, giving you a smaller fact to work with. However, if the fact table isn't very big to begin with, you will probably find using the existing fact for that type of reporting may be all you need to do. The downside of using the transactional fact is you can only report relationships that had transactions during the time period of the query.
A bridge probably would not work as I imagine the relationships can be complex, with dependencies on project, time, etc... The factless fact would store only the relationships, not the activity, giving you a smaller fact to work with. However, if the fact table isn't very big to begin with, you will probably find using the existing fact for that type of reporting may be all you need to do. The downside of using the transactional fact is you can only report relationships that had transactions during the time period of the query.
Similar topics
» Linking or Merging Similar dimensions from Different Source Systems
» Match merging
» Merging two Data Mart tables
» Merging customer data from disparate sources to create a master customer dimension
» Dealing with deduplication, merging records, etc in the warehouse layer
» Match merging
» Merging two Data Mart tables
» Merging customer data from disparate sources to create a master customer dimension
» Dealing with deduplication, merging records, etc in the warehouse layer
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|