Cube design Planned and Actual data?

View previous topic View next topic Go down

Cube design Planned and Actual data?

Post  djphatic on Sat Apr 21, 2012 6:38 am

I am about to begin venturing into the world of OLAP cubes/SQL Analysis Services.

The database I will be doing ETL from is an operating theatre system which records data on sessions (operating lists) and operations. Planned and actual data is recorded for each of these in separate tables. Any sessions or operations that don't go ahead and are cancelled are flagged appropriately in the planned tables, when sessions and operations go ahead a related record is created in the actual tables.

I'm thinking the design of the cube would have 2 fact tables, sessions and operations, which are related on the session number.
Dimensions would be locations, specialties, consultants, time, session type, operation type.

The planned and actual data can exist in the same fact tables by the use of joins on the data, though I'm wondering whether it would be best to have 2 cubes one for planned and one for actual or have 4 fact tables in the same cube?

We do have a DW from an external supplier which has a module (cubes) pre-built for operation theatre data. There design has 2 cubes one for sessions and one for operations, though the planned and actual data exist within the same fact tables for each cube. Though they are currently not being used to due various reasons and having looked at the schema it's not obviously not purpose built for our operating theatre database.

EDIT: Further information on schema and cube design

The schemas are designed in the following way in the OLTP system I will be extracting from:

Sessions: Each session has a planned session record, this record contains the date, theatre, consultant, anaesthetist etc. When a session has been planned and goes ahead an associated record is created in the actual session table, this has a 1 to 1 relationship. Almost the same data is recorded here, by default the values are the same as the planned though they could differ.

Where a planned session is cancelled the statusflag field is changed on the planned session record, a record would not exist in the actual session table once cancelled. There are also placeholder planned sessions which simply have a value for the primary key, a date and some flags set. There is no information on theatre, consultant, anaesthetist as these sessions are used for emergencies.

Operations: Each operation has a planned operation, this record is related to a planned session record. A planned session can have many operations. As with sessions, when an operation occurs an associated record is created in the actual operation table, this has a 1 to 1 relationship. The actual operation has a relationship with the planned session record also.

Where an operation is cancelled the statusflag field is changed on the planned operation record, a record would not exist in the actual operation table once cancelled.

Measures: Proposed measures are as follows

Planned Sessions: No. of planned sessions, No. of cancelled sessions, No. of reallocated sessions, Planned session time, Planned session operating time*, Planned operation in session*
Actual Sessions: No. of actual sessions, Actual session time, Sessions started late, Sessions started early, Sessions finished late, Sessions finished early, Actual session operating time*, Actual operation in session*, Gap time in session*

Planned Operations: No. of planned operations, No. of cancelled operations, Planned operating time,
Actual Operations: No. of actual operations, No. of abandoned operations, Actual operating time, Gap time between prev. operation

* these are aggregates of the measures for operations in the sessions. At present I am aggregating the values in my SSIS package and storing the values in fields in the planned/actual session fact tables. I'm not sure whether this is necessary.

Dimensions: Proposed dimensions are as follows

Specialty: All measure groups
Theatre: All except planned operation, though this could be implemented to all measure groups.
Consultant: All measure groups
Anaesthetist: All measure groups
Hospital Site: All measure groups
Anaesthetic Type: Operation measure groups
Session Type: All session measure groups
Date: All measure groups

There maybe some more dimensions implemented after but these are the initial ones.


Last edited by djphatic on Thu May 03, 2012 12:57 pm; edited 1 time in total (Reason for editing : Further information on schema and cube design)

djphatic

Posts : 20
Join date : 2012-04-21

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  Mike Honey on Mon Apr 23, 2012 12:01 am

Hi djphatic,

I'd lean towards a single cube with all your fact data together. This is basically essential if anyone ever wants to compare the fact data (e.g. Planned vs Actual). Most common query tools (e.g. Excel, SSRS, MDX) struggle to combine data from multiple cubes. There are also design and processing efficiencies, although these are less important IMO.

Good luck!
Mike
avatar
Mike Honey

Posts : 185
Join date : 2010-08-04
Location : Melbourne, Australia

View user profile http://www.mangasolutions.com

Back to top Go down

Re: Cube design Planned and Actual data?

Post  djphatic on Mon Apr 23, 2012 8:14 am

Mike Honey wrote:I'd lean towards a single cube with all your fact data together.

Hi Mike

Thanks for your reply. Can I clarify that you are suggesting 1 cube which would have 4 fact tables Planned Sessions, Actual Sessions, Planned Operations and Actual Operations?

djphatic

Posts : 20
Join date : 2012-04-21

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  Mike Honey on Mon Apr 23, 2012 7:56 pm

Hi djphatic,

Yes that sounds like a good starting point to me. I'd be unlikely to try to combine the Planned & Actual fact table pairs, as I'd expect them to require differing dimensionality (if not now, at some point in the future).

If this doesn't match the physical fact table structure, it's OK (IMO) to do some manipulation via T-SQL Views which you feed into SSAS e.g. unioning or partitioning. Views are a good "insulation" layer in any case.

Good luck!
Mike
avatar
Mike Honey

Posts : 185
Join date : 2010-08-04
Location : Melbourne, Australia

View user profile http://www.mangasolutions.com

Back to top Go down

Re: Cube design Planned and Actual data?

Post  djphatic on Tue May 01, 2012 4:07 pm

I've been doing some work on this today.

The Fact tables for the Planned and Actual Sessions have the same dimensions:
  • Date
  • Specialty
  • Consultant
  • Surgeon
  • Anaesthetist
  • Theatre
  • Session Type
  • Session Identifier (AM/PM)

Visual Studio is advising that I unify the measure groups for these fact tables to improve performance.
I assume this means to join the 2 fact tables together into 1 or is there another alternative to this?

I didn't quite follow what you mean by using a T-SQL view? Did you mean instead of merging the 2 fact tables into 1 physical table, use a view to join the tables instead?

Thanks

djphatic

Posts : 20
Join date : 2012-04-21

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  djphatic on Tue May 01, 2012 4:30 pm

Mike Honey wrote:This is basically essential if anyone ever wants to compare the fact data (e.g. Planned vs Actual).

Say if I wanted to compare Planned Theatre to Actual Theatre for operations, wouldn't I need 2 dimensions one for Planned Theatre and one for Actual Theatre?

djphatic

Posts : 20
Join date : 2012-04-21

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  Mike Honey on Tue May 01, 2012 7:25 pm

djphatic wrote:Visual Studio is advising that I unify the measure groups

This is only happening because your Dimension Usage definitions are the same for the Fact tables. I would assume that will not be the case as your model matures, so I would not attempt to unify those Facts. For example you mentioned "sessions or operations that don't go ahead and are cancelled are flagged appropriately in the planned tables" - this implies a dimension unique to your Planned data.

With separated Fact tables you will have separated Planned and Actual measures. So the Planned Theatre to Actual Theatre analysis you described below would probably involve Measures like "Planned Operations" and "Actual Operations" vs a common dimension/attribute for "Theatre".

Good luck!
Mike
avatar
Mike Honey

Posts : 185
Join date : 2010-08-04
Location : Melbourne, Australia

View user profile http://www.mangasolutions.com

Back to top Go down

Re: Cube design Planned and Actual data?

Post  djphatic on Wed May 02, 2012 3:13 am

Hi Mike

Thanks for your reply.

Perhaps it is my lack of understanding as this is my first venture into OLAP cubes/SSAS but wouldn't "Cancelled Sessions" by a measure rather than a dimension. "Cancellation Reason", which I forgot about, would be a dimension?

The way I was thinking with the Actual vs. Planned Theatre analysis is that the data would display the name of the planned theatre and actual theatre and then a total of that combination.

For example:
Planned Theatre = Theatre 1
Actual Theatre = Theatre 2
Total = 10

I would then be able to drillthrough the total figure to find the actual session record ID and/or further details using a SSRS report.

I think I will update my original post with more information about the schemas I am working with to provide a better context. This isn't as straight forward as I thought it would be. Thanks.

djphatic

Posts : 20
Join date : 2012-04-21

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  hkandpal on Wed May 02, 2012 7:42 am

Hi,

how are the planned and actual session related, can you have one planned and more than one actual session or vice versa ? Are all the measures same in planned and actual ?

thanks

hkandpal

Posts : 113
Join date : 2010-08-16

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  djphatic on Thu May 03, 2012 12:59 pm

hkandpal wrote:Hi,

how are the planned and actual session related, can you have one planned and more than one actual session or vice versa ? Are all the measures same in planned and actual ?

thanks

Hi, thanks for your reply.

I've updated my original post. The planned and actual sessions have a 1 to 1 relationship. The measures will be different for planned and actual as cancellations come from planned data, along with planned times, whereas measures from the actual data would be actual times, operations etc. used for calculations.

djphatic

Posts : 20
Join date : 2012-04-21

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  ngalemmo on Thu May 03, 2012 2:48 pm

djphatic wrote:Hi Mike

Thanks for your reply.

Perhaps it is my lack of understanding as this is my first venture into OLAP cubes/SSAS but wouldn't "Cancelled Sessions" by a measure rather than a dimension. "Cancellation Reason", which I forgot about, would be a dimension?

The way I was thinking with the Actual vs. Planned Theatre analysis is that the data would display the name of the planned theatre and actual theatre and then a total of that combination.

For example:
Planned Theatre = Theatre 1
Actual Theatre = Theatre 2
Total = 10

I would then be able to drillthrough the total figure to find the actual session record ID and/or further details using a SSRS report.

I think I will update my original post with more information about the schemas I am working with to provide a better context. This isn't as straight forward as I thought it would be. Thanks.

Would not 'planned' and 'actual' be a dimension of the cube? Or separate sets of complementary measures? Why would the theater be different between planned and actual? Are you not planning for an actual theater?
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Cube design Planned and Actual data?

Post  hang on Fri May 04, 2012 12:51 am

I think for a start, there should be two fact tables for planned and actual fact. I would think of the planned fact as a coverage fact table that has all the predetermined dimension context for each session/operation. Whereas actual fact table will reflect what actually happened to the session/operation which may have different version of context from planned.

However I would allocate session fact down to operation level having session and operation number as degenerate dimension (DD). I would not touch planned coverage fact table for cancelled operation as I think it should belong to actual fact. I guess I could introduce an additive measure (count) to the actual fact table and have it set to 1 for all the implemented operations and 0 for cancelled ones.

As Mike suggested, I would definitely put all your data in the scope into a single cube. You may create a single fact view to consolidate two fact tables if you want to handle the logic in TSQL instead of MDX.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  djphatic on Fri May 04, 2012 3:20 am

ngalemmo wrote:Would not 'planned' and 'actual' be a dimension of the cube? Or separate sets of complementary measures? Why would the theater be different between planned and actual? Are you not planning for an actual theater?

Hi, thanks for your reply.

I've looked into this further and I think it is a data quality issue more than anything and would not necessarily be required. In the client system the default values of the actual session are those of the planned session, though the user can change them if required. I believe this functionality is tailored more towards personnel changing rather than the location. If the location were to change then the planned record would need to be changed so the session would appear in the correct place in the client software.

djphatic

Posts : 20
Join date : 2012-04-21

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  djphatic on Fri May 04, 2012 4:35 pm

djphatic wrote:The way I was thinking with the Actual vs. Planned Theatre analysis is that the data would display the name of the planned theatre and actual theatre and then a total of that combination.

For example:
Planned Theatre = Theatre 1
Actual Theatre = Theatre 2
Total = 10

A better example than the one above would be ward before operation and ward after operation, this may not be the same. I'm not sure how I would model this, the ward before operation belongs to the planned operation and ward after operation in the actual operation. I think I would be best storing the planned ward in both the planned operation and actual operation fact?

djphatic

Posts : 20
Join date : 2012-04-21

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  djphatic on Fri May 11, 2012 9:25 am

Perhaps I need to tweak my fact tables but I'm wondering how I can achieve the following:

At present I have 2 fact tables for operations, planned and actual. Each of the fact tables is related to the theatre dimension i.e. planned theatre and actual theatre. In certain situations the planned theatre is unknown.

I'd like to be able to start with the number of planned operations where the theatre was unknown, then break these down by the theatre the operation actually took place in.

I've tried this with 2 dimensions, planned theatre and actual theatre but this doesn't work. Do I require a fact dimension containing the unique operationID which would be used as a referenced dimension?

The alternative approach I can think of is making the planned theatre value available in the actual theatre fact table.

djphatic

Posts : 20
Join date : 2012-04-21

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  hang on Fri May 11, 2012 5:57 pm

I am not sure if you have ever read my response to your problem. In your business case, there are some basic but important concepts you may need to understand.

1. Degenerate dimension (DD). When you have an ID for an event, eg. operation ID and session ID, the ID becomes a DD in a fact table, as operation itself is a fact table and its related attributes should be modeled as dimensions.

2. Coverage factless fact. When you have a planned event, like planned utility usage, or in your case, planned operation, a simple but effective solution is to model it as a coverage factless fact, separating from actual fact table. The theatre is a conformed dimension used by two fact tables.

3. Fact allocation. When you have two facts, one at header level and the other at item level, like your session and operation, the best practice is to allocate header fact down to the item fact, so that you can conveniently connect header fact to all the measures and dimensions at item level, without trying to conform two fact tables.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  djphatic on Sat May 12, 2012 9:11 am

Thanks for your reply. I did read your original post but initially did not understand.
hang wrote:1. Degenerate dimension (DD). When you have an ID for an event, eg. operation ID and session ID, the ID becomes a DD in a fact table, as operation itself is a fact table and its related attributes should be modeled as dimensions.
Agree and understand the above.
2. Coverage factless fact. When you have a planned event, like planned utility usage, or in your case, planned operation, a simple but effective solution is to model it as a coverage factless fact, separating from actual fact table. The theatre is a conformed dimension used by two fact tables.
The above makes sense with a coverage factless fact but how do I deal with cancelled operations/cancelled sessions. A record will not exist for these in the actual operations/actual sessions as they did not occur. To get the number of cancelled operations/cancelled sessions this measure would be in these fact tables which would mean it is no longer a factless fact table? Are you suggested that a record would exist in the actual operation/actual session fact table but all measures would 0 except for a cancellation measure?

How would I use this factless fact table to allow me to do analysis such as where the planned operation theatre was unknown, which theatre did the actual operation take place in? The only way I seem to be able to achieve this is make the planned operation theatre data available in the actual operation fact table?
3. Fact allocation. When you have two facts, one at header level and the other at item level, like your session and operation, the best practice is to allocate header fact down to the item fact, so that you can conveniently connect header fact to all the measures and dimensions at item level, without trying to conform two fact tables.
I did some further reading into the parent-child relationship which suggests as you have done above. Using the example below:

Actual OperationActual Session
DimensionDimension
ConsultantConsultant
SpecialtySpecialty
TheatreTheatre
Session Type
MeasureMeasure
Anaesthetic TimePlanned Session Time (Planned Finish Time - Planned Start Time)
Surgical TimeActual Session Time (Finish time of late operation - start time of first operation)
Operating TimeTotal Operating Time
Gap TimeTotal Gap Time
Minutes Starting Late
Minutes Finishing Late
Utilisation %
Efficiency %

I would actually model this to the operation grain level as follows, using an example of 3 operations in a session:

OperationRecordID000010000200003
OperationOrder123
PlannedSessionID000010000100002
ActualSessionID000010000100001
OperationConsultantAAABBBCCC
SessionConsultantAAAAAAAAA
OperationSpecialty100101102
SessionSpecialty100100100
TheatreXYZXYZXYZ
SessionTypeStandardStandardStandard
Anaesthetic Time101010
Surgical Time202020
Operating Time303030
Gap Time0510
PlannedSessionTime24000
ActualSessionTime25000
Minutes Starting Late500
Minutes Finishing Late005

Please ignore the actual numbers as I know they don't add up.

djphatic

Posts : 20
Join date : 2012-04-21

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  hang on Sat May 12, 2012 6:33 pm

djphatic wrote:Are you suggested that a record would exist in the actual operation/actual session fact table but all measures would 0 except for a cancellation
No, I did not suggest this. However in my first response, I did suggest to treat cancelled operation as a zeroed measure in actual fact table. It's like a student did not turn up or cancelled the class after registered to it. So an absence can be a zeroed attendance fact in an attendance fact table (actual), while scheduled timetable can be regarded as a coverage table. You could have a dimension, or an attribute in a junk if there is one, to indicate whether an operation has been implemented, cancelled (reasons).

Or alternatively, you don't do anything, if the cancellation is the only reason that an operation did not go ahead. It's similar to a product in promotion (coverage) but not in sales fact. You may simply work out cancelled by using NOT EXISTS in the query against actual and planned fact tables.

The point is, don't mix up planned and actual resulting in more non-existence/applicable dimensions at base level. If operation theatre is a dimension, put it in both fact tables and cater for non existence in both. Eventually you can always consolidate the two fact tables in a view by conformed dimensions (eg, theatre), and feed the fact view to your cube.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  djphatic on Thu Jun 07, 2012 3:26 am

hang wrote:You may simply work out cancelled by using NOT EXISTS in the query against actual and planned fact tables.

How can this be achieved within the cube as a measure? By using a view for the dsv rather than individual tables?

djphatic

Posts : 20
Join date : 2012-04-21

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  jchernev on Thu Jun 07, 2012 12:15 pm

Hey hang,

Really good insight here. I''m a bit confused ... can you please explain why you would store the attendance fact in the actual fact table as opposed to the factless coverage fact table? I think I'm being thrown off by the semantics behind the word and I can't wrap my head around the process you're suggesting.

jchernev

Posts : 14
Join date : 2011-12-08

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  hang on Thu Jun 07, 2012 11:16 pm

djphatic wrote:
hang wrote:You may simply work out cancelled by using NOT EXISTS in the query against actual and planned fact tables.

How can this be achieved within the cube as a measure? By using a view for the dsv rather than individual tables?
It depends on what identifies the planned session and operation. Suppose key attributes are SessionID, OperationID, TheatreKey and DateKey, then you may use planned fact to left outer join the actual fact table in your consolidated fact view on the set of keys, as follows:

Select p.*, a.*,
case when a.SessionID is null then 'Canceled' else 'Operated' end as OperationStatus
From PlannedFact p
Left join ActualFact a
On p.SessionID=a.SessionID
And p.OperationID=a.OperationID
and p.TheatreKey =a.TheatreKey
And p.DateKey=a.DateKey

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  hang on Thu Jun 07, 2012 11:20 pm

jchernev wrote:Hey hang,

Really good insight here. I''m a bit confused ... can you please explain why you would store the attendance fact in the actual fact table as opposed to the factless coverage fact table? I think I'm being thrown off by the semantics behind the word and I can't wrap my head around the process you're suggesting.
Attendance is like transaction, about what has actually happened. Whereas coverage fact tables are mostly used for predetermined fact, like product promotion, or facilities utilization coverage factless fact table where you record one row in the fact table for each facility for standard hourly time blocks during each day regardless of whether the facility is being used or not. I would be surprised if the attendance is not modeled as actual fact table, maybe factless if you are only interested in count. If you want to assign a measure to 1 for attendance and 0 for absence in the same table, then it becomes a true fact table instead of factless.

Please refer to Chapter 12 in Kimball's dimensional modeling book where you may find more details about education.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Cube design Planned and Actual data?

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum