Star vs Snowflake with many different attributes

View previous topic View next topic Go down

Star vs Snowflake with many different attributes

Post  JoyMcCall on Thu Apr 24, 2014 9:17 am

My fact table has many different code attributes ( > 20) that need translation descriptions. Would it be best to keep these as individual keys(dimensions) on the Fact table to keep the Star schema or break them off into a snowflake dimension? Either way I would end up with 20 dimensions or should they all be contained in one dimension? I am trying to keep performance in mind as I build this model.

JoyMcCall

Posts : 3
Join date : 2014-04-24

View user profile

Back to top Go down

Re: Star vs Snowflake with many different attributes

Post  nick_white on Thu Apr 24, 2014 9:58 am

Are these different attributes or a single attribute that is multi-valued or a combination of the two?
Is there a fixed number of attributes that a fact could have (even if some of them may be null for a particular fact record)?
Do you already have a design for your fact/dimensions - your post suggests that you do but in that case I'm confused why adding text descriptions to codes would result in a change to your design?

If you could provide a bit more detail around exactly what you are modelling it would help when trying to come up with some solutions. Thanks

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: Star vs Snowflake with many different attributes

Post  JoyMcCall on Thu Apr 24, 2014 1:08 pm

I am creating the data model now for our OLTP system. it contains a Fact for an Insurance Policy that has attributes such as Billing Method, Company Code, Permission To Order Credit, Reinstatement Type Code, Print Option, Transaction Reason Code. These attributes are codes - many different values, for each transaction in the Fact. They have translations/descriptions associated with them. thanks for your help!

JoyMcCall

Posts : 3
Join date : 2014-04-24

View user profile

Back to top Go down

Re: Star vs Snowflake with many different attributes

Post  JoyMcCall on Thu Apr 24, 2014 1:09 pm

let me clarify. I am creating a data warehouse from the OLTP system and creating the data model for the new dimensional data warehouse. The model is not for the OLTP system.

JoyMcCall

Posts : 3
Join date : 2014-04-24

View user profile

Back to top Go down

Re: Star vs Snowflake with many different attributes

Post  ngalemmo on Thu Apr 24, 2014 1:45 pm

When you have a bunch of codes that have no particular place to go, common practice is to combine them into one or more junk dimensions. The natural key to the dimension is the codes themselves, so you would have one row for each unique combination of codes.

The number of dimension tables is driven by the cardinality and correlation of the codes. One table for each code is too many, and one table for all the codes is probably too few. If the cardinality is very low, combining uncorrelated columns is not a big deal as you would not encounter too many combinations. If cardinality is high, combining it with highly correlated columns will not result in excessive numbers of combinations. You only create rows for combinations that actually exist.

As far as the descriptions of the codes go, you will need a list of values table(s) to support ETL so you can populate the description when adding a new row in the junk dimension, as well as maintaining the dimension should there be a change in a description.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Star vs Snowflake with many different attributes

Post  nick_white on Thu Apr 24, 2014 1:58 pm

OK - I have a better understanding now. My suggestions would be:
1) Codes/descriptions - if both are meaningful to your end users then include them both. If only the descriptions are meaningful to end users then whether you include the codes is a decision you would take based on things like technical support being easier for you if they are there - but you would hide them from the end users by using DB Views and/or configuring your reporting toolset

2) There is nothing technically wrong with having them all (codes and descriptions) in a single Dimension - a dimension can have 100s of columns. Whether it makes sense to group them into a single dimension or split them really depends on how correlated/related they are.
One of the examples you give is "company code" - does this mean that there is a concept of "company" in your model with attributes besides this code? If so then it may be a good design decision to split these out into a separate Company Dimension
Billing Method, Reinstatement Type Code, Transaction Reason Code all sounds like attributes of the transaction event and so might go in a Transaction Dimension
Attributes of the Policy itself (type of policy, etc.) might go in a Policy Dimension

Hope this helps?

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: Star vs Snowflake with many different attributes

Post  BoxesAndLines on Thu Apr 24, 2014 8:37 pm

If the business uses the code to drill down to the fact table, then add as a standalone dimension. If they just use it for reporting (not an entry point), then a junk dimension will work fine.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Star vs Snowflake with many different attributes

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum