Do I need multiple fact tables or dimensions

View previous topic View next topic Go down

Do I need multiple fact tables or dimensions

Post  Scott on Mon Mar 07, 2016 11:36 am

I am designing an Admissions data mart for a college. Each application has a stage which needs to be tracked (App Received, Interview, Offer Made, Withdrawn etc.). I have designed a fact to be able to do point in time analysis

FactApp
Course (FK)
Student (FK)
Status/Stage (FK)
AppNumber (Degenerate)
Notes (FK)
EffectiveDate (DateTime)
App (always 1)
ActiveApp (1 or 0)

An applicant will also have an interview, possibly more than one and this also has a status (Invited, Attended, Did not Attend, rearranged etc.) that needs to be tracked. Do I need a fact for this too? The same goes for Application Offer, it will have a status (Offer made, Offer Accepted)

I have thought having these 3 transactional fact and then a 4th fact which has one row per application and pivots the various stages. The difficulty is that there can be more than one interview and even more than one offer and I understand it is not wise to join fact tables together?

This is my first post, so apologies if its not well written. Any help would be much appreciated. Thanks

Scott

Posts : 17
Join date : 2016-03-07

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  zoom on Mon Mar 07, 2016 4:03 pm

You need 1 transaction fact table. Stage and Status, and studentů etc. are your dims. Could you explain what 3 fact tables you are talking about?
You have a EffectiveDate in the Fact table, so having more than 1 interview with same student should not be an issue. Based on that date, you can select most current interview for a student and show that data in your report.

zoom

Posts : 97
Join date : 2010-08-23
Location : Chicago

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  Scott on Tue Mar 08, 2016 4:12 am

Thank you for your reply.

Interviews are created in a separate section and will need to record different information such as interview date, interview time, interview staff, interview location, student, course, interview status.

Offers are created in a different table and needs to records offer date, student, course, offer status.

The EffectiveDate is the source data updated date, so that I can do point in time analysis of the state at that time.

So my idea was

FactApp
Course (FK)
Student (FK)
Status/Stage (FK)
AppNumber (Degenerate)
Notes (FK)
EffectiveDate (DateTime)
App (always 1)
ActiveApp (1 or 0)

FactOffer
EffectiveDate (DateTime)
OfferDate (FK)
Student (FK)
Course (FK)
Status (FK)
OfferNote (FK)
AppNumber (Degenerate)

FactInterview
EffectiveDate (DateTime)
Date (FK)
Time (FK)
Student (FK)
Course (FK)
Staff (FK)
Location (FK)
AppNumber (Degenerate)
Status (FK)

How could this fit in 1 transactional fact table or would 3 tables be the right approach? Thank you

Scott

Posts : 17
Join date : 2016-03-07

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  zoom on Tue Mar 08, 2016 8:31 am

What is the reason to keep them separate? This should be designed as a pipeline of a process.... starting point is an interview and end point is making an offer. I could not find a pipeline approach design example in Kimball's design tips, but he describe it in detail in his book. Based on the information you provided, you need a accumulating snapshot fact table to capture this start and end process. Please read about accumulating snapshot fact table in Kimball's design tips at:

http://decisionworks.com/2010/12/design-tip-130-accumulating-snapshots-for-complex-workflows/

zoom

Posts : 97
Join date : 2010-08-23
Location : Chicago

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  Scott on Tue Mar 08, 2016 9:28 am

Thanks for the link.

I was thinking of deriving a pipeline accumulating snapshot from these facts with one row per application process for a current view of the application. I wasn't sure how to deal with storing multiple Interview slots if its one line per Application?

The reason I was going to have the transactional facts is to be able to do point-in-time comparison. For example - number of apps this year vs. this time last year (including what status they were in at the time).

I was keeping the Overall Application, Interviews & Offer separate because I thought they were recording different facts with different information. The overall application has details like the course, learner and overall status, status date. The interview slots have specific information such as interview date,time, location, staff and status. I forgot to mention the Interview section holds Interviews, Assessments & Trials. Unfortunately these are differentiated by the status with a prefix of the type.

The application has an overall status (stage) - 'App Received', 'Interview', 'Offer', 'Accepted', 'Suspended' or 'Withdrawn'.
The interviews will have a status of 'Interview Invite', 'DNA', 'Re-arranged' (duplicated with prefix of Int, Assess or Trial eg. 'Assess Invite', 'Assess DNA' etc.)
The offer will have a status of 'Offer Made', 'Accepted', 'Rejected'

Apologies if this is not very clear, for all the reading I have done, I just can't quite get it in my head exactly how this needs to be modelled.

Scott

Posts : 17
Join date : 2016-03-07

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  zoom on Tue Mar 08, 2016 11:02 am

You can have multiple interviews for an application in the fact table. If multiple interviews for an application are apart by a day then a specific date keep them apart and unique. If multiple interviews for an application is happening in a single day then add date/time to the fact table and it will keep them apart and unique. Kimball has a whole chapter on education in his book so it should provide more detail on this subject.

If multiple interviews for an application happen by mistake or by an error, then how do they get corrected on the source system ( i-e by deleting them or by flagging them as invalid). You have to involve your business user to correct such errors.

zoom

Posts : 97
Join date : 2010-08-23
Location : Chicago

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  Scott on Tue Mar 08, 2016 11:23 am

My understanding is that the accumulating pipeline fact table would have one row per application. That's why I wasn't sure how to represent multiple interview slots in that single row.?

Am I right in thinking I need a transactional fact to be able to say this time last year there were x amount of applications at Interview stage compared with x amount this year?

If the Overall Application has its own Stage/Status and specific application information, and each interview has its own status and interview specific details, and the offer has its own status and offer specific details, do I need 3 facts to record these and a history of their status change?

I have read the Education section for admissions tracking in the data warehouse toolkit which is very useful. Kimball uses a Pipeline Accumulating Fact. This seems really useful for calculating lag between stages and having one row per application, however I can't see how this enable you to do point-in-time analysis. Also the pipeline accumulating fact only has a column for one interview, whereas I need to see a history or slots and status changes.






Scott

Posts : 17
Join date : 2016-03-07

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  zoom on Tue Mar 08, 2016 12:48 pm

To do a point-in-time analysis, you add start date and end date into the fact table which works as SCD type 2. Here is more information to read about it:

http://www.kimballgroup.com/2012/05/design-tip-145-time-stamping-accumulating-snapshot-fact-tables/

You are confusing yourself with fact grain or level of detail it can have. In my previous post I explained to you that your level of detail for a fact is not just the application but a date too. You do need 1 transaction fact table and then 1 accumulating Snapshot Fact table. If source system has normalized data (kept interview, offer etc. separate) then you de-normalize that data in your Trans fact table. Once that is done , then create your accumulating Snapshot Fact table.

zoom

Posts : 97
Join date : 2010-08-23
Location : Chicago

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  Scott on Wed Mar 09, 2016 4:18 am

That's really useful information thanks. I wasn't aware of using a SCD type 2 on a fact table.

How would I store it in one transaction fact table? I think I'm struggling with the de-normalizing into a single Trans Fact here. Do I enter a new row for when an application is created, and then a new row when an interview or offer is created? Then subsequently when the status changes on either of these statuses? I'm struggling to understand how I store them all in one fact when they use different fields.

Would it be like..

Timestamp
Course (FK)
Student (FK)
AppNumber (Degenerate)
ApplicationStage (FK)
InterviewDate (FK)
InterviewTime (FK)
InterviewStaff (FK)
InterviewLocation (FK)
InterviewStatus (FK)
OfferDate (FK)
OfferStatus (FK)

If so, when the application is in the initial 'received' stage, before an interview or offer, would the interview and offer fields just be empty?

Scott

Posts : 17
Join date : 2016-03-07

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  zoom on Wed Mar 09, 2016 9:20 am

Creating a de-normalized fact table is going to be a challenge for you so you can keep them separate. Here I explain how would your accumulating Snapshot Fact table work. Please pay attention to date columns in the example.

--- Day 1 Joe apply

Timestamp change it to create_date = 3/9/2016
Update_date = 3/9/2016
Course (FK) = null
Student (FK) = Joe
AppNumber (Degenerate) = 1
ApplicationStage (FK) = 'App Received'
InterviewDate (FK) = null
InterviewTime (FK) = null
InterviewStaff (FK) = null
InterviewLocation (FK) = null
InterviewStatus (FK) = null
OfferDate (FK) = null
OfferStatus (FK) = null
Start_date = 3/9/2016
End_date = 1/1/3500 ( good practice is to use date in the future to represent a current row so use 1/1/3500)
Current_flag = Y


--- Next day Joe got interview so you close that above row in the table.


create_date = 3/9/2016
Update_date = 4/1/2016
Course (FK) = null
Student (FK) = Joe
AppNumber (Degenerate) = 1
ApplicationStage (FK) = 'App Received'
InterviewDate (FK) = null
InterviewTime (FK) = null
InterviewStaff (FK) = null
InterviewLocation (FK) = null
InterviewStatus (FK) = null
OfferDate (FK) = null
OfferStatus (FK) = null
Start_date = 3/9/2016
End_date = 4/1/2016
Current_flag = N

--- Now Joe got interview on 4/1/2016. You insert a new row with interview info:

create_date = 4/1/2016
Update_date = 4/1/2016
Course (FK) = null
Student (FK) = Joe
AppNumber (Degenerate) = 1
ApplicationStage (FK) = 'Interview
InterviewDate (FK) = 4/1/2016
InterviewTime (FK) = 9:30
InterviewStaff (FK) = Sam
InterviewLocation (FK) = Room 1
InterviewStatus (FK) = 'Interview Invite'
OfferDate (FK) = null
OfferStatus (FK) = null
Start_date = 4/1/2016
End_date = 1/1/3500
Current_flag = Y


zoom

Posts : 97
Join date : 2010-08-23
Location : Chicago

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  Scott on Wed Mar 09, 2016 9:34 am

Thank you for this explanation. Sorry if you are having to repeat yourself, but just so that im clear in my head -

In terms of my transaction fact tables, I would have one for the overall application with a new row when the stage changes. A fact for the interviews, with a new row when the interview status or details changes or when a new interview slot is created, and a fact for the offers with a new row for each status or details change?

Then derive from these 3 facts an accumulating snapshot fact as per your explanation by acting like a SCD type 2 using start and end dates.

Scott

Posts : 17
Join date : 2016-03-07

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  zoom on Thu Mar 10, 2016 7:56 am

Yes, you got it right.

zoom

Posts : 97
Join date : 2010-08-23
Location : Chicago

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  Scott on Thu Mar 10, 2016 7:59 am

Thank you for your help

Scott

Posts : 17
Join date : 2016-03-07

View user profile

Back to top Go down

Re: Do I need multiple fact tables or dimensions

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum