Are compound PKs ok for a dimension table?
4 posters
Page 1 of 1
Are compound PKs ok for a dimension table?
Would like to get some opinions on this...
I've always been of the opinion that dimension tables should have a single surrogate primary key column, and that the 'grain' of the dimension would be defined by its natural key. However, I am now thinking that may not always be the case.
For example, to do sales analysis in the retail business, there are attributes that depend on an intersection of dimensions. Calendars, holidays, promotions may be diverse, particularly with national and multinational chains. It is not unreasonable to see date and location being factors in any of these dimensions. Wither the dimension table itself has one or more columns in the PK doesn't impact the dimension, but it can impact the fact table(s).
Assuming the sales fact table has date and location FKs to the corresponding dimensions, an implementation of both standard and local calendars (the latter dependent on location), would require a third FK if a single column PK was the rule. If you defined it with a compound PK, the existing date and location FKs on the fact will suffice. Thinner fact tables, particularly when they are huge, can have significant performance advantages. Using compound PKs also allow you to add new dimensions (such as the local calendar) at a later date without impacting the fact table.
Comments are welcome.
I've always been of the opinion that dimension tables should have a single surrogate primary key column, and that the 'grain' of the dimension would be defined by its natural key. However, I am now thinking that may not always be the case.
For example, to do sales analysis in the retail business, there are attributes that depend on an intersection of dimensions. Calendars, holidays, promotions may be diverse, particularly with national and multinational chains. It is not unreasonable to see date and location being factors in any of these dimensions. Wither the dimension table itself has one or more columns in the PK doesn't impact the dimension, but it can impact the fact table(s).
Assuming the sales fact table has date and location FKs to the corresponding dimensions, an implementation of both standard and local calendars (the latter dependent on location), would require a third FK if a single column PK was the rule. If you defined it with a compound PK, the existing date and location FKs on the fact will suffice. Thinner fact tables, particularly when they are huge, can have significant performance advantages. Using compound PKs also allow you to add new dimensions (such as the local calendar) at a later date without impacting the fact table.
Comments are welcome.
Re: Are compound PKs ok for a dimension table?
Hi,
I’m still of the opinion, that dimension tables should have surrogate keys.
As to the mentioned problem, I used to have ONE calendar dimension which contains not only the standard calendar data, but the location specific data as well.
The calendar dimension looks like:
# Id ( Surrogate )
Date
Month Id
Month Name
Week Nu
Day of week etc
and
Location Id
Weekday Indicator
Holiday Indicator,
Special Event etc
So the granularity of this dimension is not calendar date, but calendar date per location.
Several times the granularity of location are not the same in the fact and dimension tables. For instance locations means in the calendar dimension Country, while location in the fact table means City.
In this solution there are just the 2 FKs in the fact table.
I’m still of the opinion, that dimension tables should have surrogate keys.
As to the mentioned problem, I used to have ONE calendar dimension which contains not only the standard calendar data, but the location specific data as well.
The calendar dimension looks like:
# Id ( Surrogate )
Date
Month Id
Month Name
Week Nu
Day of week etc
and
Location Id
Weekday Indicator
Holiday Indicator,
Special Event etc
So the granularity of this dimension is not calendar date, but calendar date per location.
Several times the granularity of location are not the same in the fact and dimension tables. For instance locations means in the calendar dimension Country, while location in the fact table means City.
In this solution there are just the 2 FKs in the fact table.
gvarga- Posts : 43
Join date : 2010-12-15
Re: Are compound PKs ok for a dimension table?
Yes, PKs for dimensions are always surrogate.
What I mean is, assume you already have a calendar dimension with its PK and a location dimension with its own PK. There is also a requirement to store and maintain attributes in another dimension table that are dependent on both the calendar and location. Do you define a table with date/location as the natural key and assign a unique single column surrogate PK and carry it in the fact or do you define the table using both the calendar dimension PK and the location dimension PK as a compound PK to the new dimension?
There are problems using a compound key. As you mentioned, a 'location' could be anything and the attributes relating to location and date may not be applicable to all locations. It would be very difficult to implement a 'does not apply' row to such a table, meaning you would need to perform outer joins as a 'cya' move. It also doesn't work very well if one of the base dimensions is type 2. But, I suspect it may be a useful solution in specific circumstances.
What I mean is, assume you already have a calendar dimension with its PK and a location dimension with its own PK. There is also a requirement to store and maintain attributes in another dimension table that are dependent on both the calendar and location. Do you define a table with date/location as the natural key and assign a unique single column surrogate PK and carry it in the fact or do you define the table using both the calendar dimension PK and the location dimension PK as a compound PK to the new dimension?
There are problems using a compound key. As you mentioned, a 'location' could be anything and the attributes relating to location and date may not be applicable to all locations. It would be very difficult to implement a 'does not apply' row to such a table, meaning you would need to perform outer joins as a 'cya' move. It also doesn't work very well if one of the base dimensions is type 2. But, I suspect it may be a useful solution in specific circumstances.
Re: Are compound PKs ok for a dimension table?
Hi,
Although I don’t like the rules of thumb, in the case of dimensions I insist applying them. ( No composite key, no natural key, but single surrogate key). I used to say that I can distance myself from the unforseeable changes of the source systems just this way.
Although I don’t like the rules of thumb, in the case of dimensions I insist applying them. ( No composite key, no natural key, but single surrogate key). I used to say that I can distance myself from the unforseeable changes of the source systems just this way.
gvarga- Posts : 43
Join date : 2010-12-15
Re: Are compound PKs ok for a dimension table?
I would think of it as a bridge table situation. My understanding is that a simple type 1 bridge does not need a dedicated surrogate key to connect to the fact table. The primary connection seems to be one of FKs in the bridge.
In case of location specific holidays, you would have a location SK and date key that only points to exceptions (holidays) in the bridge. I would think of the holidays as a multivalued dimension on the location key in the fact. If the factual date key is in the list of holidays for the location, then the date is a holiday date. Effectively it is a left outer join on two keys to a small bridge (I hope).
With SCD implication, a primary surrogate key in the bridge may be necessary as you might need an effective dated bridge and you may not want to apply the date range constraint in the left join statement, similar concept to a typical SCD2 SK in a fact table.
In case of location specific holidays, you would have a location SK and date key that only points to exceptions (holidays) in the bridge. I would think of the holidays as a multivalued dimension on the location key in the fact. If the factual date key is in the list of holidays for the location, then the date is a holiday date. Effectively it is a left outer join on two keys to a small bridge (I hope).
With SCD implication, a primary surrogate key in the bridge may be necessary as you might need an effective dated bridge and you may not want to apply the date range constraint in the left join statement, similar concept to a typical SCD2 SK in a fact table.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Are compound PKs ok for dimension tables
Absolutely.
The unique key of a dimension table is its natural key which will have one or more columns.
You then add surrogate key ( an integer column) that serves as PK for the dimension.
Regards
Arif Zaman
The unique key of a dimension table is its natural key which will have one or more columns.
You then add surrogate key ( an integer column) that serves as PK for the dimension.
Regards
Arif Zaman
azaman- Posts : 2
Join date : 2012-02-29
Re: Are compound PKs ok for a dimension table?
Having read Kimall's latest book 'The Kimball Group Reader', p339 'Multi-Enterprise Calendar Dimension', I believe the componund PK is permissible for the localised calendar scenario. Kimball called it 'Snowflake Calendar Subdimension' with both date key and organisation key as PK. The subdimension has 1-1 relationship with base date dimension having subdimension constrained by specifying a single organisation.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Similar topics
» Pros and cons of consolidated dimension table Vs. many dimension table ?
» joining dimension table to dimension and again fact table
» Can a dimension table directly link to another dimension table?
» Large Dimension table compared to fact table?
» joining dimension table to dimension and again fact table
» Can a dimension table directly link to another dimension table?
» Large Dimension table compared to fact table?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum