Attendance data and Fact Table

View previous topic View next topic Go down

Attendance data and Fact Table

Post  PJK on Thu Sep 27, 2012 4:48 am

hi,
We have a requirement to design a fact table for employee attendance and leave.
From the source system, daily clocked time is in the form of number of hours.
Leave information is in the form of from_date and to_date.

Reporting requirements include,
1) Show Emp Leave details for the period ie "From date" and "To date"
2) "Number of days of leave" for arriving at derived measures

one of the approaches we can think of is,
Generate a fact transaction for each date as applicable in the leave duration.

ie. if the leave period is "02-Jan-2012 Half day to 05-Jan-2012 Full day" then
generate separate fact entries for Earned Leave as,

Emp _Id ==== Date === Leave ==== Hours
==================================================
E_01 ==== 02-Jan-12 === On Duty ==== 4
E_01 ==== 02-Jan-12 === Earned Leave ====
E_01 ==== 03-Jan-12 === Earned Leave ====
E_01 ==== 04-Jan-12 === Earned Leave ====
E_01 ==== 05-Jan-12 === Earned Leave ====
E_01 ==== 06-Jan-12 === On duty ==== 8

But considering the number of days of leave this may make the Fact table quite large.

What will be the most effective design approach here?

thanks,
Pullai

PJK

Posts : 4
Join date : 2012-09-26

View user profile

Back to top Go down

Re: Attendance data and Fact Table

Post  min.emerg on Thu Sep 27, 2012 7:49 am

Hi Pullai

I would first determine whether you'll be building a transactional or periodic-snapshot fact. For transactional facts, I understand that a record is inserted only when an event takes place (i.e.: an insurance claim is submitted, a credit card transaction is captured). For periodic snapshots, there isn't really an event that occurs - instead, the state of the data is recorded for a particular period in time (i.e.: it is snapshotted).

I would argue that you can consider time logged as an event (i.e.: transactional), and that a time will be associated with it (i.e.: the employee logs time for a particular day). That being said, I would have a single measure in the fact that stores the number of hours logged for a particular day. The type of time being logged would be a dimension with specific types of time (i.e.: time worked, sick leave, annual leave, public holiday, unpaid leave, study leave etc.) What I have found useful with storing time at an hour level is that it was easy for me to introduce tasks to the fact, whereby people assign time not at a day level, but at a task level (i.e.: today I spend 2 hours on development, 3 hours on training, and 3 hours in meetings. You'd have multiple fact records per person per day as a result, if they did more than one task during the day).

In terms of people taking leave for an extended period of time, I would enter a record into the fact for each of the days that they were on leave for, and log the time for each day as 8 hours (or however many number of hours a standard working is). So if someone is on leave for 5 days, enter 5 records into the fact, with the measure being set as 8 (hours) for each record.

I wouldn't worry too much about the number of records that this approach will generate. If you're modeling an organization with 10 000 employees, you'll have at most 365 x 10,000 = 3,650 000 records per year, which isn't very much at all. Worry when you get into the hundreds of millions or billions. And even then, there are ways to manage it.

min.emerg

Posts : 39
Join date : 2011-02-25

View user profile

Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum