Nullability & Performance
Hello,
I have a number of measures in my fact table, but they are not loaded together; e.g., measure 1 may arrive in today's load while measure 2 comes two days later (applied as an update to the existing record in the fact).
This implies that all the measures in my fact table must be nullable. Will this affect performance, and if so, how severe will the effect be?
Thank you
DBADS- Posts : 7
Join date : 2010-08-22
Re: Nullability & Performance
Nulls won't affect performance, but they do raise questions about the design of the fact table itself.
Unless this is an aggregate or a snapshot of some sort, you generally want to design fact tables that reflect the business process and capture individual events in toto as they occur. For example, in a sales cycle, orders, shipments and invoices are independent events that occur at different times. Each event would be reflected in its own fact table.
Any fact table that incorporated all three would need to be an aggregate of some sort. Designing a warehouse that maintains only such an aggregate not only overcomplicates the load processes, but also limits the level of analysis that can be performed.
Without knowing what you are modeling, it is not clear whether such a late-arriving update to measures is appropriate or not.
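To make the trade-off concrete, here is a minimal sketch using SQLite through Python. The schema (a hypothetical `fact_sales` with `order_amt` and `ship_amt`) is invented for illustration; it shows that aggregates simply skip nulls, but that completing a row later forces an UPDATE against the fact, whereas one fact table per event needs only inserts.

```python
# Sketch: one wide fact with nullable, late-arriving measures vs. one fact
# per business event. Schema and numbers are hypothetical; SQLite in memory.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Wide fact: order_amt loads on day 1; ship_amt arrives days later as an UPDATE.
cur.execute("""CREATE TABLE fact_sales (
    order_id  INTEGER PRIMARY KEY,
    order_amt REAL,
    ship_amt  REAL)""")
cur.execute("INSERT INTO fact_sales VALUES (1, 100.0, NULL)")  # ship_amt unknown
cur.execute("INSERT INTO fact_sales VALUES (2, 50.0, 20.0)")

# Aggregate functions ignore NULLs, so queries work before the update arrives.
total_ship = cur.execute("SELECT SUM(ship_amt) FROM fact_sales").fetchone()[0]
print(total_ship)  # 20.0

# ...but the late-arriving measure means updating an existing fact row:
cur.execute("UPDATE fact_sales SET ship_amt = 40.0 WHERE order_id = 1")

# Separate event facts avoid updates entirely: each event is a plain INSERT.
cur.execute("CREATE TABLE fact_orders    (order_id INTEGER, order_amt REAL)")
cur.execute("CREATE TABLE fact_shipments (order_id INTEGER, ship_amt  REAL)")
cur.execute("INSERT INTO fact_orders    VALUES (1, 100.0)")
cur.execute("INSERT INTO fact_shipments VALUES (1, 40.0)")  # inserted when it occurs
```

The point of the second pair of tables is that a shipment that arrives two days after the order is just a new row in `fact_shipments`, not an update to a half-filled row.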
Re: Nullability & Performance
Would a better approach be to create a fact table with a single measure column and a "measure dimension" that describes each measure? That way you never store nulls in the fact table. The downside is that the fact table can grow dramatically in row count.
Updating every row of a big fact table would take a long time - some databases may have a hard time completing the update.
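The "measure dimension" idea above can be sketched as follows, again with an invented schema in SQLite. Each (row, measure) pair is its own fact row, so a late-arriving measure is an insert rather than an update, and the wide shape is reconstructed at query time with a pivot:

```python
# Sketch of the tall "measure dimension" design; schema is hypothetical.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# One row in the dimension per kind of measure.
cur.execute("CREATE TABLE dim_measure (measure_key INTEGER PRIMARY KEY, name TEXT)")
cur.executemany("INSERT INTO dim_measure VALUES (?, ?)",
                [(1, "order_amt"), (2, "ship_amt")])

# One fact row per (order, measure); no NULL measures are ever stored.
cur.execute("CREATE TABLE fact_tall (order_id INTEGER, measure_key INTEGER, value REAL)")
cur.executemany("INSERT INTO fact_tall VALUES (?, ?, ?)",
                [(1, 1, 100.0),   # order 1: order_amt known
                 (2, 1, 50.0),    # order 2: both measures known
                 (2, 2, 20.0)])   # a late ship_amt would be another INSERT here

# Pivot back to a wide view at query time; missing measures reappear as NULL.
rows = cur.execute("""
    SELECT order_id,
           SUM(CASE WHEN measure_key = 1 THEN value END) AS order_amt,
           SUM(CASE WHEN measure_key = 2 THEN value END) AS ship_amt
    FROM fact_tall
    GROUP BY order_id
    ORDER BY order_id
""").fetchall()
print(rows)  # [(1, 100.0, None), (2, 50.0, 20.0)]
```

Note the trade-off the posts describe: the nulls have not disappeared, they have just moved from storage into the query result, and the fact table now holds one row per measure instead of one row per event.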
Jeff Smith- Posts : 471
Join date : 2009-02-03