location dimension

View previous topic View next topic Go down

location dimension

Post  scabral on Mon Aug 05, 2013 3:24 pm

Hi,

I have a location dimension which contains County, State, and Zip fields.

I have a fact table that only has County and State values, so how do i do a lookup to the location dimension to only get 1 row for the combination of County and State? Right now it will return multiple rows each with a different surrogate key.

thanks
scott

scabral

Posts : 58
Join date : 2012-05-02

View user profile

Back to top Go down

Re: location dimension

Post  ngalemmo on Mon Aug 05, 2013 6:45 pm

You don't (or you shouldn't). Create a city dimension.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: location dimension

Post  scabral on Mon Aug 05, 2013 10:22 pm

ngalemmo,

I'm a bit confused. My fact data has only County and State. What would be contained in a City dimension? How would I do a lookup to find the right combination of County and State?

scabral

Posts : 58
Join date : 2012-05-02

View user profile

Back to top Go down

Re: location dimension

Post  BoxesAndLines on Mon Aug 05, 2013 10:58 pm

He meant to say create a country/state dimension.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: location dimension

Post  ngalemmo on Tue Aug 06, 2013 8:45 am

BoxesAndLines wrote:He meant to say create a country/state dimension.

Yes.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: location dimension

Post  scabral on Tue Aug 06, 2013 8:49 am

got it.

thanks!

scabral

Posts : 58
Join date : 2012-05-02

View user profile

Back to top Go down

Re: location dimension

Post  VHF on Thu Aug 08, 2013 3:26 pm

You could also add rows to your existing dimension for unique County/State combinations with the zip code left blank (or populated with a * or some other special value.) You would point to those rows when loading facts that don't have a specific zip code.

That way you only have a single location dimension, which could be advantages if you want to query across fact tables.

On the other hand, you add some complexity by having entries both with and without zip code in your you location dimension.

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

View user profile

Back to top Go down

Re: location dimension

Post  KimballFan on Wed Jan 15, 2014 8:23 pm

Is solution covered in any of the Kimball Books? Are there any issues other than handling the complexities? The reason I ask is, our old Data Warehouse has separate dimensions for situations like this. We recently had some consultants come in to build a prototype for a new DW. In their design they combined 2 of our dimensions into a single location dimension as you described. I thought it was a pretty slick way to handle fact table with different levels of granularity for the same dimension but, was wondering if there are any 'Gotchas'. In the past when I've deviated from the standard Kimbal Dimensional Model for something that seemed better or more logical, I've regretted it.
Thanks

KimballFan

Posts : 11
Join date : 2014-01-15
Location : Tucson

View user profile

Back to top Go down

Re: location dimension

Post  BoxesAndLines on Thu Jan 16, 2014 9:05 am

The gotcha is what the OP stated. Which row do I select when may dimension rows are applicable? An analogy is picking a date for a monthly snapshot fact when joining to the date dimension. Any of the days of a given month can be applicable.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Location Dimension

Post  nick_white on Thu Jan 16, 2014 12:11 pm

Hi,

just as an FYI, according to Kimball we should not be creating Location Dimensions:
"Geographic information is naturally handled as attributes within multiple dimensions, not as a standalone location dimension or outrigger."

There is a page on this in Ch. 11 - Telecommunications.

But assuming you have a valid business reason for creating this dimension, is the reason you are facing this challenge due to the issue of grain? You have a fact table with a declared grain of county/state (plus other attributes presumably) but you are trying to join it to a dimension with a grain of zip/county/state. This would never work cleanly - though as discussed in this chain there are possible workarounds. Instead you need to roll up your zip/county/state location dimension to create a new county/state dimension and then join your fact table to that new location dimension.
This is the same design principle where if you create a monthly snapshot/aggregate fact table you would join it to a month dimension and not a date dimension (as the 'time' grain of the fact is now month and not day).

Regards,

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: location dimension

Post  ngalemmo on Thu Jan 16, 2014 12:25 pm

It depends on what information you have or the intent (such as creating aggregates). Certainly in telecommunications you have precise geographic information all over the place. In the original case, if all you have is ZIP or County, then you wind up implementing two dimensions, one for ZIP the other for County. You can't really piggyback county or state off ZIP as ZIP codes do not conform to political boundaries.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: location dimension

Post  BoxesAndLines on Thu Jan 16, 2014 2:08 pm

I've done lots of work for telco's and we used geography all over the place. I'd have to say this is one of the few times I would have to disagree with the Doc.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: location dimension

Post  KimballFan on Mon Mar 17, 2014 2:59 pm

BoxesAndLines wrote:The gotcha is what the OP stated.  Which row do I select when may dimension rows are applicable?  An analogy is picking a date for a monthly snapshot fact when joining to the date dimension.  Any of the days of a given month can be applicable.  

Selecting which record to use is not an issue in this design. The Dimension was created with an additional set of records at the higher dimension level. In this example, the lowest level is Service Point (This is a Utility Industry model). For the most part, each service point represents where a meter is installed at a Premise (Service Address, abount 700,000 rows). The highler level in the dimension is the Premise. So, the table has a record for every Service Point which contains details about the premise. There is an Additional record added for each Premise where the service point information is blank. THe table contains a Primary Surrogate Key (Location Key), a Premise Key and an SP Key. So, if a fact table is at the Service Point level, it joins to the record where the Location Key = the Service Point Key, if a fact record is at the Premise Level, it will join to the record where the Location Key = the Premise Key. I did find a brief paragraph in Kimball's most current Data Warehouse tookit where, it seemed like it was alluding to this design and recommended against it. I left that book at home today so, I don't have the exact quote or page but, I believe it was in Chapter 7. As I stated earlier, when I've deviated from Dimensional Modelling standards, I've come to regret it and, usually too late to go back and fix it without a lot of effort. The 1 gotcha I see is, If 2 tables are joined to the same dimension and, thos fact tables are linking to different grains of the dimension, it might be difficult to enforce how this data would be used by the end user. We could aggregate the table that is at the lower level but, why have the lower level if you can't use it. I realize that handling fact tables at multiple grains is an issue even when they are at the same dimension level, this just seems to compound that.

KimballFan

Posts : 11
Join date : 2014-01-15
Location : Tucson

View user profile

Back to top Go down

Re: location dimension

Post  nick_white on Tue Mar 18, 2014 5:56 am

If I have understood correctly, you have put Service Points and Premises in the same Dimension? If so then I'm pretty sure this is wrong - they are two separate entities and so should be in separate Dims. It sounds like there is a hierachical relationship between Service Point (child) and Premise (parent); if this is the case then you wouldn't need to assign the Premise SK to the fact table as the Fact-Service Point relationship covers the Fact-Premise relationship anyway.
If you create an aggregate fact table at the premise level then you would not include the Service Point Dim but you would include the Premise Dim.
This is the same principle as the Date/Month issue - if you have an aggregate table at the Month level you don't assign a (random) Date Dim SK to it, you create a Month Dim and assign the relevant Month Dim SK to each fact record.

Regards,

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

.

Post  KimballFan on Tue Mar 18, 2014 10:25 am

nick_white wrote:If I have understood correctly, you have put Service Points and Premises in the same Dimension? If so then I'm pretty sure this is wrong - they are two separate entities and so should be in separate Dims. It sounds like there is a hierachical relationship between Service Point (child) and Premise (parent); if this is the case then you wouldn't need to assign the Premise SK to the fact table as the Fact-Service Point relationship covers the Fact-Premise relationship anyway.
If you create an aggregate fact table at the premise level then you would not include the Service Point Dim but you would include the Premise Dim.
This is the same principle as the Date/Month issue - if you have an aggregate table at the Month level you don't assign a (random) Date Dim SK to it, you create a Month Dim and assign the relevant Month Dim SK to each fact record.

Regards,
Thanks for the response! My sentiments exactly. Our old Data Warehouse had separte dimensions for these entities as well as the PERSON, ACCOUNT and SERVICE AGREEMENT entities that were also rolled into one dimension in this new model. I was just looking for confirmation as I'm trying to convince my boss that we need to revise this design before moving foreward. We are in the very early stages of implementing a new model and I want a solid foundation to build on. There is a reference to this structure on page 302 of Kimball's latest book but, it didn't go into much detail as to the impact of doing this.

KimballFan

Posts : 11
Join date : 2014-01-15
Location : Tucson

View user profile

Back to top Go down

Re: location dimension

Post  davej on Wed Mar 19, 2014 12:39 pm

I am presuming that when you add a location dimension this has no effect on the address fields in the Customer dimension, right?

davej

Posts : 6
Join date : 2014-02-17

View user profile

Back to top Go down

Re: location dimension

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum