Junk Dimension
+2
ngalemmo
dbadwh
6 posters
Page 1 of 1
Junk Dimension
In our data model, for customer dimension, we have customer related attributes like name,age, dob, etc. separately and the other transaction related dimension attributes in a separate junk dimension? Is it a good practice or should both the values should be in a single dimension?
dbadwh- Posts : 31
Join date : 2011-09-30
Re: Junk Dimension
If the attributes are not related to the customer, then a junk dimension is one way to deal with them. If they are related to customer, sometimes placing them in a junk dimension is appropriate. Junk dimensions can serve as a 'poor man's' type 2, providing point in time attribute values without having to implement a type 2 for the primary dimension.
Re: Junk Dimension
If you already have an existing customer dimension then it makes little sense to create a junk dimension of customer attributes unless you are trying to solve another problem.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Junk Dimension
Customer is one of the biggest dimensions in a warehouse, and in my opinion using seperate dimensions and facts is always a beneficial approach. If you try to hold all the information about customer in a single dimension table, it becomes a rapidly changing dimension with millions of rows.
You say age is one of the attributes of your customer dimension. I think you don't have to store age column because you have "date of birth", and age can be calculated easily using a simple database function. My company has 30 millions of customers and storing age column in customer dimension means 30 millions of new rows each year. Using some age labels like "young customers" may be more efficient, because business is generally uses age column only for determination of these kind of labels.
This is a useful article if you have a problem with big dimension tables like customer etc. I benefited from this article while modeling the customer subject area.
http://www.kimballgroup.com/html/articles_search/articles1999/9908bIE.html?TrkID=IE199908_1
You say age is one of the attributes of your customer dimension. I think you don't have to store age column because you have "date of birth", and age can be calculated easily using a simple database function. My company has 30 millions of customers and storing age column in customer dimension means 30 millions of new rows each year. Using some age labels like "young customers" may be more efficient, because business is generally uses age column only for determination of these kind of labels.
This is a useful article if you have a problem with big dimension tables like customer etc. I benefited from this article while modeling the customer subject area.
http://www.kimballgroup.com/html/articles_search/articles1999/9908bIE.html?TrkID=IE199908_1
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 43
Location : Istanbul, Turkey
Re: Junk Dimension
Hi,
could you please tell us more about the attributes, what type of attributes you are trying to store in a Junk dimension, how many distinct values are there.
Are you going to do any reporting on those attributes?
thanks
could you please tell us more about the attributes, what type of attributes you are trying to store in a Junk dimension, how many distinct values are there.
Are you going to do any reporting on those attributes?
thanks
hkandpal- Posts : 113
Join date : 2010-08-16
Re: Junk Dimension
It depends on the number of customers.
I tend to keep customer dimensions fairly narrow - customer specific information that that don't change much. Customer Dimensions can be very, very long. I don't like to join to it unless customer level information is needed. I prefer to keep other attribute information in a seperate dimension if possible. I am more likely to need the customer's Zip Code or State information rather than the Customer, or even the market segmentation name. If I can put market segment on a Customer Attribute table (as long as the junk dimension doesn't become obscenely large), then a query that rolls up information to the Market Segment will run much faster than if I have to go through the Customer Dimension which may have tens of millions of rows.
I tend to keep customer dimensions fairly narrow - customer specific information that that don't change much. Customer Dimensions can be very, very long. I don't like to join to it unless customer level information is needed. I prefer to keep other attribute information in a seperate dimension if possible. I am more likely to need the customer's Zip Code or State information rather than the Customer, or even the market segmentation name. If I can put market segment on a Customer Attribute table (as long as the junk dimension doesn't become obscenely large), then a query that rolls up information to the Market Segment will run much faster than if I have to go through the Customer Dimension which may have tens of millions of rows.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Similar topics
» bridge table and junk dimension on customer dimension (bank/credit union)
» modelling Product dimension for Pizza outlet
» Should I use a degenerate dimension or create a junk dimension?
» "Junk" dimension looking more like a "Header" dimension
» Should this be a degenerate dimension or a junk dimension?
» modelling Product dimension for Pizza outlet
» Should I use a degenerate dimension or create a junk dimension?
» "Junk" dimension looking more like a "Header" dimension
» Should this be a degenerate dimension or a junk dimension?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|