Dates in many dimensions

View previous topic View next topic Go down

Dates in many dimensions

Post  rf001 on Wed Dec 29, 2010 11:42 am

Hello,

I am new to DW. Currently, I am working on a Human Resource database. There are many tables in the source systems which stores dates e.g. dates about contract start, finish, Vacations of an employee, temporary service suspension and many more.

In this situation is it better to use a Date dimension and reference it in the dimensions which stores the dates or I should simply include the dates in the dimensions?
thanks,
rf
avatar
rf001

Posts : 23
Join date : 2010-12-16

View user profile

Back to top Go down

Re: Dates in many dimensions

Post  Jeff Smith on Wed Dec 29, 2010 2:20 pm

I prefer to use a snowflake design when it comes to dates on other dimensions.

I figure there are 3 possible options. 1) Snowflake to a date dimension, 2) Put the date on the dimension table, 3) Mixed, depending upon how it will be used.

In my opinion, consistency very important so option 3 ranks at the bottom. There are times when I need the extended information in the date dimension to be avaliable for the date in the non-date dimension, which would exclude option 2. That leaves option 1.

In my database, a calendar date is only found in the Date Dimension.


Jeff Smith

Posts : 471
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Dates in many dimensions

Post  hang on Wed Dec 29, 2010 5:50 pm

If there is no time (hour, minute et.) involved, I agree with Jeff, have FK for most date attributes in dimensions. Not quite sure about things like birthdate, I would leave it as calendar date.

If the time of the day is significant, use Datetime field directly instead of trying to snowflake the dimension for the time field in any form. Of course you would still snowflake other date fields in the same dimension. This arrangement even applies to fact tables unless the time unit is well defined. The worst thing is to extend a date dimension into a date-time dimension only because some attributes involve time parts. Create a separate time unit dimension if the granularity does not go below a second and you need some descriptions for different time scales, otherwise just leave it as Datetime type.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Dates in many dimensions

Post  rf001 on Mon Jan 17, 2011 8:10 pm

Thanks a lot for replies. It was very helpful.
avatar
rf001

Posts : 23
Join date : 2010-12-16

View user profile

Back to top Go down

Re: Dates in many dimensions

Post  ngalemmo on Tue Jan 18, 2011 2:53 pm

There are dates that are just simply dates and dates that you need date dimension attributes for analysis. Birthdate, usually falls into the former, so would something like hire date.

In the latter, if these dates are significant, they more than likely belong in the fact table relating to the event. For example, in an employee action fact table, hiring would be one such action, so the hire date would be the effective date of that action (hire). But this does not mean it is the sole place for that value. Obviously, hire date (and rehire date) should also be attributes in the dimension for reference purposes.

If you were to do a report of how many people were hired during a period, you would look for hire actions in the action fact table and report on effective date (which references the date dimension), not by running a report off the employee dimension.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Dates in many dimensions

Post  hang on Tue Jan 18, 2011 8:31 pm

I just came across a design that has a fact table with 13 date keys and 30 other dimension keys. The date dimension contains more than 200,000 rows dating from 18500101 to 24991231. I think someone was so confident that the system could be still useful in next a few centuries.

Now there is a reporting requirement that many calendar dates are needed in the fact, so that I have to join the fact with date dimension a few times. As the date dimension in this case is not a relatively small dimension, each join will have performance penalty. I could also avoid the annoying joins by self deriving the date from the smart surrogate key, but not sure if the conversion would outperform the joins.

When I saw this fact table, my initial impression is Centipede fact, as the number of dimensions far exceeds Kimball’s rule of thumb figure 25. Redesigning the fact structure and have many dates wrapped up in some combined dimensions will reduce the number of joins significantly.

However it raises a question about the guideline on what data type we should use for date fields in dimension and fact tables. Generally we should use date keys in the fact to cater for non-existence and other unusual date meanings. Whereas in dimensions, we have options of date key or calendar date because there would not be any referential issue as in fact table. Sometimes snowflaking all the date attributes in a dimension may make queries unnecessarily complicated by too many joins with date dimension and hence defeat one of the important purposes of dimensional modelling, ease of use.

I am kind of turning against my previous post, and possibly Jeff’s as well, advocating using date FKs in dimensions. It would simplify a lot of queries if we just use straight calendar date as date attributes in dimensions which would allow NULL value as well, and if necessary, connect it to date dimension by the calendar date instead of date key.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Dates in many dimensions

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum