Large Mini dimension alternate ?

View previous topic View next topic Go down

Large Mini dimension alternate ?

Post  VTK on Wed Dec 28, 2011 7:02 pm

We are planning to create a Mini dimension for our customers with following fields as these fields would make the Cust Dim(Type-2) huge.
Cust State, Cust 3 digit zip(929 unique values), Age(120 unique values), Rating Age(15 unique values), Gender(3 unique values) and this is exploding as 929*120*15*3 would come around 5 millions which is very large for Mini dimension. Should we outrigg the 3 digit zip from Cust dim or cust mini dim ? What other options do we have ?

Thanks for your time...

VTK

Posts : 50
Join date : 2011-07-15

View user profile

Back to top Go down

Re: Large Mini dimension alternate ?

Post  ngalemmo on Wed Dec 28, 2011 7:34 pm

When designing mini (aka junk) dimensions you need to consider correlation and cardinality. State and zip code have very high correlation, so it makes sense to put them together. Age and sex have very low correlation, but also have low cardinality. Creating two tables will give you two fairly small manageable tables. However, when you put them together into one table, you have five attributes with no correlation and, in combination, high cardinality... not a good way to do things.

As far as zip code goes, I would not truncate it to 3 digits if you have all five. Why lose information that may be useful later (i.e. geographic analysis)?
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Large Mini dimension alternate ?

Post  VTK on Thu Dec 29, 2011 1:38 am

Thanks for your reply. If we create two mini dims now we have totally 3 dims for Customer and we have to use all of them wherever we need customer info(assuming they will be using these fields). Correct ? I am just wondering if that's a good practice. Is this how others are handling this ? I guess it would not be just me who faced this scenario and others would have faced similar issue when dealing with Customers.

VTK

Posts : 50
Join date : 2011-07-15

View user profile

Back to top Go down

Re: Large Mini dimension alternate ?

Post  Jeff Smith on Thu Dec 29, 2011 3:45 pm

Why is a mini-dimension considered a junk dimension? Mini-dimensions can be created for any point of a hierarchy. If your Geographic dimension starts at Census Block Group and rolls to Census tract, County, State, you can add a 2nd surrogate key at say County, roll fact data up to the county_skey and then create a view from the geo dimension that included county and state.

I would not recommend doing this on very long dimension such as a customer or member dimension unless you can create a materialized view of the mini-dimension otherwise performance could be really slow.

Zip Code information can be tricky. You can purchase data that rolls Zip Code up to City, County, and State (it's not 100% accurate, but many people use it). If this is the case, I would put Zip Code in it's own dimension along with City, County, and state as it can be used with more than just the customer.

Age is another element that I tend to put in it's own dimension table. I usually want to know the age of the customer at the time of a transaction. If people aren't careful, they can match the customers current age with activity from years ago. It's OK to put gender on the Customer as it doesn't change very often, but if you need to aggregate data to Gender, consider putting it in a junk dimension. You could put Gender and Age together in the same dimension - giving you a dimension with 360 rows.

Again, I tend not to create mini-dimensions off really big dimension tables. There are times when it can't be avoided. We had a member dimension in which the relationship code of the member to subscriber was part of the unique identifier of the member. The challenge was that many reports had data at the relationship level. Creating a mini-dimension on the member dimension wasn't practical because the member dimension had millions of rows. Instead, I created a Relationship Dimension and denormalized it within the member dimension.

Jeff Smith

Posts : 471
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Large Mini dimension alternate ?

Post  ngalemmo on Thu Dec 29, 2011 4:31 pm

They were referred to as 'junk' dimensions in the first Toolkit book. The second version of the book uses the term 'mini'. Both are essentially the same thing.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Large Mini dimension alternate ?

Post  ngalemmo on Thu Dec 29, 2011 4:35 pm

kaps wrote:Thanks for your reply. If we create two mini dims now we have totally 3 dims for Customer and we have to use all of them wherever we need customer info(assuming they will be using these fields). Correct ? I am just wondering if that's a good practice. Is this how others are handling this ? I guess it would not be just me who faced this scenario and others would have faced similar issue when dealing with Customers.

Yes, you would include FK's to those dimensions when applicable. It's not a big deal, as smaller dimensions help performance in most cases.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Large Mini dimension alternate ?

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum