Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Primary Key of Fact Table

+6
juz_b
warrent
slaborda
BrianJarrett
DilMustafa
kurtv
10 posters

Go down

Primary Key of Fact Table Empty Primary Key of Fact Table

Post  kurtv Tue Feb 17, 2009 5:57 pm

When designing a fact table, is it best practice to make a composite primary key out of all the foreign keys?

Thanks,
Kurt V

kurtv

Posts : 2
Join date : 2009-02-03

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  DilMustafa Wed Feb 18, 2009 1:10 am

Yes it is. Readers please, also comment on the idea of including timestamps as part of the PK along with these surrogate FK's.

DilMustafa

Posts : 49
Join date : 2009-02-03
Location : Calgary, Canada

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  BrianJarrett Wed Feb 18, 2009 7:42 am

In general I'd say yes. However, in most of my designs I do it a bit differently. In almost every model I've built I have at least one many-to-many relationship between my fact and a dimension. In this case I use a bridge table between the fact and dimension table to address this relationship. What I do is create a unique constraint on the foreign key fields and allow that to be an "alternate key", acting as a composite primary key. I then generate a normal surrogate primary key for the fact table. That surrogate key goes into a bridge table, along with a dimension surrogate key and a divisor, for any of the many-to-many dimensions.

Using this primary key along with the alternate key on the fact table allows me to represent the entire fact record with a single key in the bridge table. If I didn't do this then I'd have to insert every foreign key of my composite primary key into the bridge table for every record joining to the many-to-many dimension. If I had 10 composite keys in my fact table I'd have to have 10 records for a single dimension record join. That turns into a modeling nightmare very quickly and the code to retrieve data from the many to many dimension is pretty brutal.

There might be a different way to address this but I haven't found it. This method works nicely, is very clean, and is very easy to implement.

As far as including timestamps as part of the primary key; I'm not sure what benefit that provides. I usually include some housekeeping fields (record date, who modified) but that timestamp (record date) isn't needed to make the composite key unique. Can you provide some more detail on why this field would be necessary? Maybe I'm missing something.
BrianJarrett
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 50
Location : St. Louis, MO

Back to top Go down

Primary Key of Fact Table Empty Primary key in a fact table?

Post  slaborda Wed Feb 18, 2009 10:58 am

Hi all,

My Database teacher always said... for all questions in IT the answer is "It depends".
The main idea that i want transmit you is.. There are not absolute ideas, it depends of each concret situation.

In most cases i use surrogate keys as PK in fact tables, in this cases i built an "alternate key" with the set of foreing keys.

Why we use Pk's/AK in fact tables? who access the fact table using the associated pk/ak index?
We use PK/AK just in order to be sure that there are not data incoherencies?
Why this control must be in the data structure? Always the DWH must be a semantical model?

In fact, we are designers and we can try for each problem our customized solution. You can use PK's in fact tables, but you must have arguments for do this. If you haven't don't use it.

Best regards,

Sergi

slaborda

Posts : 3
Join date : 2009-02-18
Age : 46
Location : Barcelona, Via lactea, earth, Europe, Spain, Barcelona

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  BrianJarrett Wed Feb 18, 2009 11:33 am

I'm glad to see someone else doing this too. I normally do it for all fact tables, just in case I have a many-to-many dimension relationship. It also provides consistency across the model.
BrianJarrett
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 50
Location : St. Louis, MO

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  warrent Wed Feb 18, 2009 3:24 pm

There are also some other reasons to assign a surrogate key to the fact table, like updates, indexes, and debugging. See the following design tips for more info:
  • Design tip 81
  • Design tip 84
--Warren
warrent
warrent

Posts : 41
Join date : 2008-08-18

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  DilMustafa Fri Feb 20, 2009 2:09 am

Hi Brian,

For my curosity, why dont you use PK of the bridge table as part of the key in fact table.

http://www.intelligententerprise.com/010810/412warehouse1_1.jhtml

DilMustafa

Posts : 49
Join date : 2009-02-03
Location : Calgary, Canada

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  BrianJarrett Fri Feb 20, 2009 7:41 am

dmustafa wrote:Hi Brian,

For my curosity, why dont you use PK of the bridge table as part of the key in fact table.

http://www.intelligententerprise.com/010810/412warehouse1_1.jhtml
I read the article but I think the bridge table Ralph talks about is a special kind of bridge; a "helper" bridge table. This one acts more like a fact table between two different dimensions with a many-to-many relationship between each other.

What I've been talking about is a single dimension that has a many-to-many relationship directly with the fact table. For example; I used to work for a portrait photography company. On a given order line item we could have a single photo product that contained pictures from many different photo sessions. Each session could also be attached to multiple order line items as well.

You mention using the primary key of the bridge table as part of the key in the fact table. The type of bridge I'm talking about doesn't have a single primary key; instead the primary key of this bridge table is a composite key, consisting of the surrogate key from the fact table and the surrogate key from the dimension table. Creating a surrogate key on the fact table simplifies the bridge table, allowing us to represent the fact record with a single key rather than the composite key (made up of all the foreign keys that make the record unique). Uniqueness on the fact table is maintained by an "alternate key", a unique constraint built on the foreign keys that make the record unique.

In the article, as I understood it, Ralph was using the surrogate from one dimension in the fact table, then joining it to another dimension using a helper table. So this fact table still maintained a one-to-many relationship with a single dimension but that dimension then took on a many-to-many relationship with another dimension (via the helper table). I think these are two different kinds of modeling requirements.

Let me know if I've misunderstood something and thanks for the feedback.
BrianJarrett
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 50
Location : St. Louis, MO

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  DilMustafa Fri Feb 20, 2009 1:27 pm

Thanks for the reply. I appreciate your help. You are the most active member on this forum for sure.

Now getting back to M:M between dimension and fact. Kimball recommends creating a group tables. He quotes an example where many salesrep can be assigned to an order. Kimball creates a salesrepgroup dimension with salesrep key and salesrep group key and uses the salesrep group key in the fact table. ETL process will do lookups and create a group if does not exist based on salesrep combos. I will appreciate your opinion on this. Once again thanks for the effort and time.

DilMustafa

Posts : 49
Join date : 2009-02-03
Location : Calgary, Canada

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  BrianJarrett Sat Feb 21, 2009 2:09 am

I've read about that approach in the Toolkit but haven't needed to use it in any of my models yet. I'm sure the group comes in handy in specific situations but for a simple many-to-many I've never found it necessary. I guess I'd say that if the group itself doesn't need to be named and identified then it's probably not necessary to design for it. That may be oversimplifying it though.
BrianJarrett
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 50
Location : St. Louis, MO

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  DilMustafa Sun Feb 22, 2009 1:09 am

Thanks Brian... It was a great discussion. I do like your approach for modeling M:M relationships.

Thanks once again,

Dil Mustafa

DilMustafa

Posts : 49
Join date : 2009-02-03
Location : Calgary, Canada

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  BrianJarrett Sun Feb 22, 2009 2:50 am

No problem; I love talking about this stuff. Although there is a solid and consistent approach to dimensional modeling there will always be some exceptions that will require us to be creative. This board is a great place to learn from each other. I know I already have.
BrianJarrett
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 50
Location : St. Louis, MO

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  kurtv Mon Feb 23, 2009 1:32 pm

Perhaps the answer to when to use groups lies in how often the group would be reused.

Kurt

kurtv

Posts : 2
Join date : 2009-02-03

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  juz_b Thu Feb 26, 2009 1:27 am

We always assign a generic surrogate key to all our tables. This is a pretty clean and efficient approach to help facilitate UPDATES for the ETL process (as Warren had suggested). And like Brian had mentioned, we also use it a lot for M:M references in a bridge table. Having to deal with composite keys in these situations can get very inefficient and cumbersome.

The other point I'd like to share is that we define alternate keys strictly for enforcing record uniqueness. This is typically defined by the grain of the Fact table, which may or may not include all of your FKs.

Hope this helps.

juz_b

Posts : 17
Join date : 2009-02-07

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  monsieur_arrie Tue Apr 21, 2009 11:44 am

I have found, in some cases, I have more than 16 FK's in my fact table, which is the largest number of columns allows in a composite index or key.

In cases like this, I need to do careful thinking about my fk indexing strategy.

In these cases, I always adding a surrogate PK for no other reason than it makes a row individually identifiable. I have a J Celko view of that, a table should always contain uniquely identifiable rows. If not, deleting and updating becomes impossible (before everyone leaps on me about updating fact rows, I know it is not recommended)

monsieur_arrie

Posts : 2
Join date : 2009-04-21

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  BoxesAndLines Tue Apr 21, 2009 1:48 pm

I update rows in accumulating snapshot fact tables all the time.
BoxesAndLines
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

Back to top Go down

Primary Key of Fact Table Empty Primary Key and Clustered Index on the Fact Table

Post  Vincent Rainardi Wed Feb 24, 2010 7:10 pm

http://dwbi1.wordpress.com/2010/02/24/primary-key-and-clustered-index-on-the-fact-table/
just a thought

Vincent Rainardi

Posts : 6
Join date : 2009-02-03
Location : London

http://www.datawarehouse.org.uk

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  John Simon Thu Feb 25, 2010 1:19 am

Personally, I don’t have any primary keys in my fact tables as I see no advantages and plenty of disadvantages to the approach.
Having a surrogate primary key makes the table wider, giving me less rows per page, thus impacting performance. This is especially the case if I have billions of records in my fact table. I want as few columns as possible to improve query performance. If want to do a delete or update I can use ROW_NUMBER() OVER (PARTITION BY ...

Having the primary key as a composite key of all my foreign keys will impact my ETL process because it needs to test for uniqueness, thus slowing down the load. If I design my ETL process correctly then I shouldn’t need to test for uniqueness using referential integrity.

John Simon

Posts : 186
Join date : 2009-05-29
Location : Melbourne, Australia

http://jsimonbi.wordpress.com

Back to top Go down

Primary Key of Fact Table Empty Re: Primary Key of Fact Table

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum