1 or 2 Dimensions - Customer & Shipping Address

View previous topic View next topic Go down

1 or 2 Dimensions - Customer & Shipping Address

Post  tonytiger on Tue Sep 22, 2009 2:45 am

I have read the Toolkit book and am doing a little dimensional modelling exercise, but I have come across an issue which I am not sure about.

I have a Customer dimension but I am not sure whether or not to include the shipping address in the Customer dimension. Each Customer typically (95% of the time) has one shipping address with fields that change slowly (I'll use type 2 SCD). But 5% of the time, customers have multiple shipping addresses. If I use a type 2 SCD for these occasionally rapidly changing address fields, then my ETL process will create new rows each time the shipping address changes.

Is the principle of a type 2 SCD that once a dimension changes it should not change back to a previous value - ie. it acts as a historical log? Or is fine for a new fact table entry to use an "older" version of type 2 SCD?

As an alternative, I was considering creating a seperate ShippingAddress dimension, but this dimension will have the same qty of rows as the Customer dimension *plus* rows for each time a shipping address changes.

Any help for my understanding is appreciated.

Thanks in advance.

Tony

tonytiger

Posts : 2
Join date : 2009-09-21

View user profile

Back to top Go down

Re: 1 or 2 Dimensions - Customer & Shipping Address

Post  ngalemmo on Tue Sep 22, 2009 2:23 pm

It depends on the types of analysis, but, generally speaking, a full address has little analytic value except in extraordinary circumstances (such as product recalls). So, what I generally do is identify what data has significant analytic value and which data is there for completeness. Address is such a case.

I would create a separate address dimension to store all addresses. The fact table would have two foreign keys, one for customer address and one for shipping address. The latter would reference the same address row when product is shipped to the customer address. There would be a separate, independent customer dimension which may contain current address or those location attributes (such as state and zip code) that may be needed for analysis. If the customer dimension is very large or if there is a need to maintain point-in-time location attributes, they can be placed in a much smaller junk dimension and referenced from another foreign key in the fact table.

If you are sourcing from a proper ERP system, there should not be an issue with business keys for the addresses. But if there are, or if your order source is from a less formal system where the addresses are simply order header information, then you need to 'invent' a key to locate addresses in the dimension. I would suggest generating a CRC from the address string and use it as a non-unique hash index to the address table. This gives you a lightweight alternate index that will narrow the search to 3-4 rows without having to create an index on the entire address. You, of course, would still have the unique surrogate primary key for use in fact tables.


Last edited by ngalemmo on Wed Sep 23, 2009 11:10 am; edited 1 time in total
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: 1 or 2 Dimensions - Customer & Shipping Address

Post  tonytiger on Wed Sep 23, 2009 5:31 am

Thank you for the thoughtful advice. What you suggest sounds like a very sensible way to proceed.

tonytiger

Posts : 2
Join date : 2009-09-21

View user profile

Back to top Go down

Re: 1 or 2 Dimensions - Customer & Shipping Address

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum