Same attribute in multiple Dimensions ?

View previous topic View next topic Go down

Same attribute in multiple Dimensions ?

Post  prw on Mon Jun 02, 2014 8:35 am

I have taken over a current DW and have come across a design where the same attribute has been modelled in different dimensions which I have never seen in a model before or modelled myself and therefore wonder whether the design is correct.

Consider a fact table made up of policies. Each policy has an associated Product of which there is a Dimension (DimensionA).
The business key is ProductCode and there is a Product Description.
The ProductCode is itself determined from the combination of CategoryCode and GroupCode, which themselves have associated descriptions e.g. ProductCode CatAGrp1 is associated with Category CatA & Group Grp1, ProductCode CatBGrp1 is associated with Category CatB & Group Grp1, ProductCode CatAGrp2 is associated with Category CatA & Group Grp2.........and so on.

This is currently modelled in one Dimension which I see as correct and looks as follows (the descriptions aren't relevant so I've just described with xxxxxx):
Product Code, ProductDescription, CategoryCode, CategoryDescription, GroupCode, GroupDescription
CatAGrp1, xxxxxxx, CatA, xxxxxxx, Grp1, xxxxxxx
CatAGrp2, xxxxxxx, CatA, xxxxxxx, Grp2, xxxxxxx
CatBGrp1, xxxxxxx, CatB, xxxxxxx, Grp1, xxxxxxx

However, GroupCode is also subdivided into SubGroups with an associated Description so e.g. Grp1 can have SubGrp1A, SubGpr1B, SupGrp1C.
The current model therefore has included another Dimension (DimensionB) which has included the GroupCode already defined in DimensionA and its associated SubGroup attribute with the SubGroup defined as the BusinessKey.

I'm attempting to understand why this has been done. My view is that the business wish to drill down from Group to SubGroup and this is why GroupCode has been included in DimensionB. If this drill down wasn't required then SubGroup could exist on a Dimension on its own. However, by adding GroupCode to DimensionB we now have the same attribute in two Dimensions which from my understanding is not a good modelling technique.

My first thought was to remove DimensionB from the model and add SubGroup to DimensionA.
However, by doing so ProductCode no longer becomes unique so SubGroup should stay in DimensionB, GroupCode should be removed from DimensionB (so that it now exists in only one Dimension) and a bridge table created between DimensionA and DimensionB which links Group and SubGroup.
What do others think ?

prw

Posts : 11
Join date : 2012-09-14

View user profile

Back to top Go down

Re: Same attribute in multiple Dimensions ?

Post  nick_white on Mon Jun 02, 2014 10:24 am

My thoughts...
It all depends on what your business' reporting requirements are - there's rarely one right answer and often a lot more right answers than there would be when designing a 3NF model.
In a dimensional model there is nothing wrong with duplicating data as long as there is valid reason for doing so.
If a Group (and a Category, for that matter) have relationships to entities other than Product then you could consider moving them out of your Product Dim into their own Dimensions - giving you reusable (conformed) dimensions which would avoid duplication of data. If you did this then you'd have to add additional FKs to your Fact table.

For Sub-groups you could create a SubGroup Dim and have a bridge table to your Group Dim. You could also include the Group details in the Subgroup Dim table (assuming a subgroup relates to only one group) - which allows for easy roll-up from sub-group to group (assuming that's a requirement). You could also create an 'aggregated' version of this Dim that just holds the Group information (c.f. a Date Dim and a Month Dim) if you need to relate sub-groups to one fact and groups to another (as you do in the example you'e given): this is a example where data duplication is a good thing.
From the example you have given, including policies/products and sub-groups in the same report wouldn't make any sense (how would you display these on one report?) - so sub-groups probably aren't relevant to this star. Without knowing the design of a star in which sub-groups would be relevant it's impossible to comment on the best way of modelling them

Then again, if category and group just consist of code/name pairs and no other attributes (or very few other) and they are only ever used in the context of Products then you can treat them as Product attributes and include them in the Product Dim (which is basically what you have at the moment).

So to summarise,
1) Keep your Dims conformed i.e. you have a single Product Dim that can be used with any fact that has a relationship to Product. When I say you can duplicate data, I am definitely not saying that you can break the principle of creating conformed dimensions!
2) But as long as you stick to point 1 you can also duplicate data across different dimensions if that makes your model quicker to query and easier to understand: you can take groups of attributes from your Product Dim and also put them in another Dim if you need to



nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: Same attribute in multiple Dimensions ?

Post  prw on Tue Jun 03, 2014 3:59 am

Thanks for the reply.

Useful to know that attributes can be duplicated across dimensions if it supports the business model.

Category doesn't have a relationship with other entities to my knowledge so I didn't move it out into its own dimension. Further the business requirements are to drill down from Category to Product so it makes sense to leave it in DimensionA along with Product. Additionally, the business wish to be able to associated Product with Group so again I believe it makes sense to also have this in DimensionA.

SubGroups probably wasn't the best terminology for me to use in the example, but essentially a Group can have 1 or more values e.g. think of Group as a Team and SubGroup as Agents within the team.
There is a possible requirement for the business to drill down from Category to Product to SubGroup (each Product has an associated Group) which is why my thought of the Bridging table to DimensionB which just contains SubGroups. The current model doesn't support this ability, but I think you're correct in saying that this couldn't be done with the new model I have described anyway (although I'm not 100% sure on that).

I'm leaning towards leaving the model as is.
DimensionA will contain Category, Product and Group and will allow drill down from Category to Product and also allow the business to associate a Group to a Product.
DimensionB will contain Group and SubGroup and will allow the business to drill down from Group to SubGroup.
Both Dimensions will link to the Policy Fact table.
Although the Group attribute is repeated across Dimensions, I think this is acceptable as the only reason it is included in Dimension A, is that the business can still associate this attribute with the Product.

Just to clarify, which my original example perhaps didn't.
A Category is made up of one or more Products and a Product can only exist in one Category.
Each Product can only be sold by one Group (hence the association of Product with Group), though a Group can sell more than one Product (the business would wish to know which Groups are selling which Products).
A Group can consist of one or more SubGroups, which I should have perhaps named Agents which sell the Product.

prw

Posts : 11
Join date : 2012-09-14

View user profile

Back to top Go down

Re: Same attribute in multiple Dimensions ?

Post  nick_white on Tue Jun 03, 2014 5:00 am

From what you are saying, it appears that there is a hierarchical relationship between Category and Product (each Product has one, and only one, 'parent' Category) - therefore putting them in the same Dimension makes sense and as Product is at the lower level it should be a Product Dimension, rather than a Category Dimension.

However, I am less sure about including Group in this Product Dimension. If every policy is sold by a subgroup then linking the Policy fact to the subgroup Dim (which includes the parent group for each subgroup record) allows you to report on which Products are sold by which group - therefore I would not include Group in the Product Dimension as it is unnecessary duplication.

The downside to this is it does not allow you to report on which group can sell which products (it would only tell you which group has sold which products); this may or may not be a reporting requirement for you. If it is, then keeping Group in the Product Dim allows you to report on this easily (by just querying this one Dim in isolation) but is obviously not a 'star' design (not that it has to be - no need to introduce complexity for the sale of it!).
If there are other interesting relationships a Product has then you might want to consider creating a "Product Configuration" fact table that had links to the Product Dim, a Group Dim (aggregated version of the Subgroup Dim) and any other relevant Dims. This might be a factless fact table unless there are useful measures you want to record about the Product (base cost, for example?).

If you do include include Group in your Product Dim then I would hide it from your end-users when the Dim is being used in the context of the Policy Fact - otherwise they will see two Group fields (one from the Product Dim and one from the Subgroup Dim) which will cause confusion

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: Same attribute in multiple Dimensions ?

Post  prw on Tue Jun 03, 2014 5:28 am

Nick, I think you are spot on with your understanding and your answer has confirmed clarification of the business requirement for me.

"From what you are saying, it appears that there is a hierarchical relationship between Category and Product (each Product has one, and only one, 'parent' Category) - therefore putting them in the same Dimension makes sense and as Product is at the lower level it should be a Product Dimension, rather than a Category Dimension."
- Agreed.

"However, I am less sure about including Group in this Product Dimension. If every policy is sold by a subgroup then linking the Policy fact to the subgroup Dim (which includes the parent group for each subgroup record) allows you to report on which Products are sold by which group - therefore I would not include Group in the Product Dimension as it is unnecessary duplication."
- Agreed, however by not including Group in the Product Dimension, the only way to link Groups with Products is via the Fact Policy sales.
This I think is exactly what you are saying in the paragraph that followed, in that removing this duplication allows you to analyse which Products have been sold by which group (via Fact Policy Sales) but not which Products can be sold by which Group (because the only link would be via the Fact Policy Sales table). This is a business requirement so again as you have indicated then it makes sense to keep the Group in the Product Dimension even though we have duplication.

"If there are other interesting relationships a Product has then you might want to consider creating a "Product Configuration" fact table that had links to the Product Dim, a Group Dim (aggregated version of the Subgroup Dim) and any other relevant Dims. This might be a factless fact table unless there are useful measures you want to record about the Product (base cost, for example?)."
- I hadn't considered this and will investigate further. Is there an example from the DW Toolkit ?

"If you do include include Group in your Product Dim then I would hide it from your end-users when the Dim is being used in the context of the Policy Fact - otherwise they will see two Group fields (one from the Product Dim and one from the Subgroup Dim) which will cause confusion"
- Agreed.

prw

Posts : 11
Join date : 2012-09-14

View user profile

Back to top Go down

Re: Same attribute in multiple Dimensions ?

Post  nathanjones77 on Tue Jun 03, 2014 6:31 am

You say that a product may only be sold by one Group (Team) and each Team is made up of agents - I assume we're talking insurance here. It sounds like to me that there is a 'primary' Team and a 'actual' Team (given it's quite likely an Agent who has just made an Auto sale could also then sell a home policy on the same visit or call, even if it isn't meant to happen).
What is the difference in values between the two - a quick query including both Group fields across the primary fact table should sort that out. If there are differences then they clearly have different meanings and should be named separately.

nathanjones77

Posts : 11
Join date : 2014-06-03
Location : Zurich

View user profile

Back to top Go down

Re: Same attribute in multiple Dimensions ?

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum