Primary Key of Fact Table

Page 1 of 2 1, 2  Next

View previous topic View next topic Go down

Primary Key of Fact Table

Post  kurtv on Tue Feb 17, 2009 1:57 pm

When designing a fact table, is it best practice to make a composite primary key out of all the foreign keys?

Thanks,
Kurt V

kurtv

Posts: 2
Join date: 2009-02-03

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  DilMustafa on Tue Feb 17, 2009 9:10 pm

Yes it is. Readers please, also comment on the idea of including timestamps as part of the PK along with these surrogate FK's.

DilMustafa

Posts: 48
Join date: 2009-02-03
Location: Calgary, Canada

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  BrianJarrett on Wed Feb 18, 2009 3:42 am

In general I'd say yes. However, in most of my designs I do it a bit differently. In almost every model I've built I have at least one many-to-many relationship between my fact and a dimension. In this case I use a bridge table between the fact and dimension table to address this relationship. What I do is create a unique constraint on the foreign key fields and allow that to be an "alternate key", acting as a composite primary key. I then generate a normal surrogate primary key for the fact table. That surrogate key goes into a bridge table, along with a dimension surrogate key and a divisor, for any of the many-to-many dimensions.

Using this primary key along with the alternate key on the fact table allows me to represent the entire fact record with a single key in the bridge table. If I didn't do this then I'd have to insert every foreign key of my composite primary key into the bridge table for every record joining to the many-to-many dimension. If I had 10 composite keys in my fact table I'd have to have 10 records for a single dimension record join. That turns into a modeling nightmare very quickly and the code to retrieve data from the many to many dimension is pretty brutal.

There might be a different way to address this but I haven't found it. This method works nicely, is very clean, and is very easy to implement.

As far as including timestamps as part of the primary key; I'm not sure what benefit that provides. I usually include some housekeeping fields (record date, who modified) but that timestamp (record date) isn't needed to make the composite key unique. Can you provide some more detail on why this field would be necessary? Maybe I'm missing something.

BrianJarrett

Posts: 61
Join date: 2009-02-03
Age: 38
Location: St. Louis, MO

View user profile

Back to top Go down

Primary key in a fact table?

Post  slaborda on Wed Feb 18, 2009 6:58 am

Hi all,

My Database teacher always said... for all questions in IT the answer is "It depends".
The main idea that i want transmit you is.. There are not absolute ideas, it depends of each concret situation.

In most cases i use surrogate keys as PK in fact tables, in this cases i built an "alternate key" with the set of foreing keys.

Why we use Pk's/AK in fact tables? who access the fact table using the associated pk/ak index?
We use PK/AK just in order to be sure that there are not data incoherencies?
Why this control must be in the data structure? Always the DWH must be a semantical model?

In fact, we are designers and we can try for each problem our customized solution. You can use PK's in fact tables, but you must have arguments for do this. If you haven't don't use it.

Best regards,

Sergi

slaborda

Posts: 3
Join date: 2009-02-18
Age: 34
Location: Barcelona, Via lactea, earth, Europe, Spain, Barcelona

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  BrianJarrett on Wed Feb 18, 2009 7:33 am

I'm glad to see someone else doing this too. I normally do it for all fact tables, just in case I have a many-to-many dimension relationship. It also provides consistency across the model.

BrianJarrett

Posts: 61
Join date: 2009-02-03
Age: 38
Location: St. Louis, MO

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  warrent on Wed Feb 18, 2009 11:24 am

There are also some other reasons to assign a surrogate key to the fact table, like updates, indexes, and debugging. See the following design tips for more info:
  • Design tip 81
  • Design tip 84
--Warren

warrent

Posts: 38
Join date: 2008-08-18

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  DilMustafa on Thu Feb 19, 2009 10:09 pm

Hi Brian,

For my curosity, why dont you use PK of the bridge table as part of the key in fact table.

http://www.intelligententerprise.com/010810/412warehouse1_1.jhtml

DilMustafa

Posts: 48
Join date: 2009-02-03
Location: Calgary, Canada

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  BrianJarrett on Fri Feb 20, 2009 3:41 am

dmustafa wrote:Hi Brian,

For my curosity, why dont you use PK of the bridge table as part of the key in fact table.

http://www.intelligententerprise.com/010810/412warehouse1_1.jhtml

I read the article but I think the bridge table Ralph talks about is a special kind of bridge; a "helper" bridge table. This one acts more like a fact table between two different dimensions with a many-to-many relationship between each other.

What I've been talking about is a single dimension that has a many-to-many relationship directly with the fact table. For example; I used to work for a portrait photography company. On a given order line item we could have a single photo product that contained pictures from many different photo sessions. Each session could also be attached to multiple order line items as well.

You mention using the primary key of the bridge table as part of the key in the fact table. The type of bridge I'm talking about doesn't have a single primary key; instead the primary key of this bridge table is a composite key, consisting of the surrogate key from the fact table and the surrogate key from the dimension table. Creating a surrogate key on the fact table simplifies the bridge table, allowing us to represent the fact record with a single key rather than the composite key (made up of all the foreign keys that make the record unique). Uniqueness on the fact table is maintained by an "alternate key", a unique constraint built on the foreign keys that make the record unique.

In the article, as I understood it, Ralph was using the surrogate from one dimension in the fact table, then joining it to another dimension using a helper table. So this fact table still maintained a one-to-many relationship with a single dimension but that dimension then took on a many-to-many relationship with another dimension (via the helper table). I think these are two different kinds of modeling requirements.

Let me know if I've misunderstood something and thanks for the feedback.

BrianJarrett

Posts: 61
Join date: 2009-02-03
Age: 38
Location: St. Louis, MO

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  DilMustafa on Fri Feb 20, 2009 9:27 am

Thanks for the reply. I appreciate your help. You are the most active member on this forum for sure.

Now getting back to M:M between dimension and fact. Kimball recommends creating a group tables. He quotes an example where many salesrep can be assigned to an order. Kimball creates a salesrepgroup dimension with salesrep key and salesrep group key and uses the salesrep group key in the fact table. ETL process will do lookups and create a group if does not exist based on salesrep combos. I will appreciate your opinion on this. Once again thanks for the effort and time.

DilMustafa

Posts: 48
Join date: 2009-02-03
Location: Calgary, Canada

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  BrianJarrett on Fri Feb 20, 2009 10:09 pm

I've read about that approach in the Toolkit but haven't needed to use it in any of my models yet. I'm sure the group comes in handy in specific situations but for a simple many-to-many I've never found it necessary. I guess I'd say that if the group itself doesn't need to be named and identified then it's probably not necessary to design for it. That may be oversimplifying it though.

BrianJarrett

Posts: 61
Join date: 2009-02-03
Age: 38
Location: St. Louis, MO

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  DilMustafa on Sat Feb 21, 2009 9:09 pm

Thanks Brian... It was a great discussion. I do like your approach for modeling M:M relationships.

Thanks once again,

Dil Mustafa

DilMustafa

Posts: 48
Join date: 2009-02-03
Location: Calgary, Canada

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  BrianJarrett on Sat Feb 21, 2009 10:50 pm

No problem; I love talking about this stuff. Although there is a solid and consistent approach to dimensional modeling there will always be some exceptions that will require us to be creative. This board is a great place to learn from each other. I know I already have.

BrianJarrett

Posts: 61
Join date: 2009-02-03
Age: 38
Location: St. Louis, MO

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  kurtv on Mon Feb 23, 2009 9:32 am

Perhaps the answer to when to use groups lies in how often the group would be reused.

Kurt

kurtv

Posts: 2
Join date: 2009-02-03

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  juz_b on Wed Feb 25, 2009 9:27 pm

We always assign a generic surrogate key to all our tables. This is a pretty clean and efficient approach to help facilitate UPDATES for the ETL process (as Warren had suggested). And like Brian had mentioned, we also use it a lot for M:M references in a bridge table. Having to deal with composite keys in these situations can get very inefficient and cumbersome.

The other point I'd like to share is that we define alternate keys strictly for enforcing record uniqueness. This is typically defined by the grain of the Fact table, which may or may not include all of your FKs.

Hope this helps.

juz_b

Posts: 17
Join date: 2009-02-06

View user profile

Back to top Go down

Re: Primary Key of Fact Table

Post  monsieur_arrie on Tue Apr 21, 2009 8:44 am

I have found, in some cases, I have more than 16 FK's in my fact table, which is the largest number of columns allows in a composite index or key.

In cases like this, I need to do careful thinking about my fk indexing strategy.

In these cases, I always adding a surrogate PK for no other reason than it makes a row individually identifiable. I have a J Celko view of that, a table should always contain uniquely identifiable rows. If not, deleting and updating becomes impossible (before everyone leaps on me about updating fact rows, I know it is not recommended)

monsieur_arrie

Posts: 2
Join date: 2009-04-21

View user profile

Back to top Go down

Page 1 of 2 1, 2  Next

View previous topic View next topic Back to top

- Similar topics

Permissions in this forum:
You cannot reply to topics in this forum