Finding Active Employee for a specific period of time.

View previous topic View next topic Go down

Finding Active Employee for a specific period of time.

Post  arnayj on Tue Mar 01, 2011 4:28 am

Background: We have a Employee table which contains the Employee Status which is bench/work. Employee is in bench if he is in the company and in work when he is visiting the client. The current table updated accordingly. The salary of the employee is also increased by the time and it is also updated on the below table.
Employee(EmployeeName, EmployeeStatus, CreateDate, UpdateDate, Salary).

Query: We have to find the "number of in Work Employees", "Sum of the salary of total active employee", "Number of on Bench Employees" for a period of time. We have to consider the last status of the employee for the calculation.
E.g. one employee changes as
(EmployeeName, EmployeeStatus ,CreateDate ,UpdateDate ,Salary)
(A1, On Bench, 5 Jan 2010, 5 Jan 2010, $5000)
(A1, In Work, 5 Jan 2010, 5 Feb 2010, $5000)
(A1, On Bench, 5 Jan 2010, 5 Jun 2010, $5000)
(A1, On Bench, 5 Jan 2010, 01 Jul 2010, $8000)
(A1, In Work, 5 Jan 2010, 08 Oct 2010, $8000)
So the result set for this will be for each quater will be as below (Quater starts from Jan)
End of Q1 - A1 was on work with monthly salary of $5000
End of Q2 - A1 was on bench with monthly salary of $5000
End of Q3 - A1 was on bench with monthly salary of $8000
End of Q4 - A1 was on work with monthly salary of $8000
Again
starting from Q1 To end of Q2 - A1 was on bench with saraly of $5000
starting from Q1 To end of Q4 - A1 was on work with saraly of $8000

Ask: How to create the fact for the same to show the above results.

Thanks,
Arnay

arnayj

Posts : 5
Join date : 2011-03-01

View user profile

Back to top Go down

Re: Finding Active Employee for a specific period of time.

Post  VHF on Thu Mar 03, 2011 2:04 pm

This topic really belongs in Dimensional Modeling and Data Architecture rather than in Technical Architecture -- perhaps a moderator can move it. In the mean time I'll take a stab at it...

When designing a fact table, the Kimball method requires that we determine the "grain" of the fact table, i.e. what business process does a record in the fact table represent? It is also required to determine the dimensionality of the fact table--i.e. to what dimension tables should each fact record be related?

Employees would typically exist as dimension. The dimension would contain a surrogate key (SK), the natural key(s) such as employee name, and additional attributes that apply to the employee (such as start date, department, status, or salary.) If it is desired to track changes to these attributes over time, the dimension must be an SCD (Slowly Changing Dimension) Type 2 dimension. With SCD Type 2, you create a new record for the employee whenever any of the attributes changes.

A fact table would contain transactions relating to the employee. For example, commissions paid to an employee. However, in this case you don't really have a transaction to record in the fact table. The changes you want to track are all changes to employee attributes. So it is possible to address your requirements without a fact table at all--just a single employee dimension table!

Here's what our employee dimension (DimEmployee) might look like:

SKNameStatusSalaryBeginEffectiveDateEndEffectiveDateIsCurrent
1A1On Bench$50005 Jan 20104 Feb 2010N
2A1In Work$50005 Feb 20104 Jun 2010N
3A1On Bench$50005 Jun 201030 Jun 2010N
4A1On Bench$80001 Jul 20107 Oct 2010N
5A1In Work$80008 Oct 201031 Dec 2999Y

Every time an attribute of the employee changes, a new record (with a new surrogate key) is created for that employee. The previous record is updated to set the IsCurrent flag to 'N' and to set the EndEffectiveDate to one day less than the date the new record takes effect. For the current record, the EndEffectiveDate could be left NULL, but it is more useful to set it to a far future date to make querying simpler and more consistant.

With this SCD Type 2 dimension, you can can do your query for any point in time as follows. For the end of 2010-Q3 it would be:

SELECT Status, COUNT(SK) AS [Number of Employees], SUM(Salary) AS [Total Salary]
FROM DimEmployee
WHERE #31 Sep 2010# BETWEEN BeginEffectiveDate AND EndEffectiveDate
AND Status <> 'Inactive'
GROUP BY Status

This will give you the count and total slaray for each employee status for any given date.

(The query assumes a value of 'Inactive' would be used for terminated employees. When an employee is terminated, a new current record would be created with a status of 'Inactive' and the previous record for that employee updated to set EndEffectiveDate and set IsCurrent = 'N'. The previous records for that employee would of course retain whatever status was in effect at the time that record was current. With SCD Type 2, whenever an employee attribute such as status or salary changes, a new record is created--existing records are not modified except to update the EndEffectiveDate and IsCurrent flag on the previous record for that employee.)

That should do it... no fact table needed with this approach! In the future you might add a fact table to record billable hours for each employee, or revenue generated by each employee, or you might record actual paychecks. In any of these cases, you'll also want a date dimension. (A date dimension could also be useful to you now for finding the last day of each quarter.) See Kimball books for more information.


VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

View user profile

Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum