Allocating -Parent child relationship

View previous topic View next topic Go down

Allocating -Parent child relationship

Post  dellsters on Wed Feb 11, 2009 4:56 pm

I have a design problem that is related to the Design tip #25 (Dimensional Models for Parent-Child Applications) under design tips. My design issue is that I would like to create a fact at the most granular level (each individual employee that rides the ambulance when an emergency call comes in and dispatched). My problem is what to do with the time that the ambulance takes from the time the ambulance is dispatched to the arrival time. This fact belongs to a higher granular level (each ambulance dispatched).
I'm tempted to allocate this fact and divide it by the number of individual employee that rides the particular ambulance. So for instance, if a calls comes in and the ambulance takes 10 minutes to get to the destination and there are 2 employees that ride the ambulance, then I would divide 10 by 2 and insert 5 minutes for the fact record for each individual. I would take a similar approach for every fact at the higher level such as number of patients being carried, etc. These fact values are meaningless to look at for each fact record since 5 minutes is not a valid representation of the time it took for the ambulance to get from point A to point B. Is this still a good design approach to allocated the facts? Wouldn't this confuse the users? Should I go with 2 separate facts?

dellsters

Posts : 39
Join date : 2009-02-11

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  BrianJarrett on Wed Feb 11, 2009 5:18 pm

I'd argue that allocation is the only way to go here. I'd also say that the business should be the one to make the final decision on the business rules. It might be true that the ride duration at the lowest level of granuarity isn't an officially captured value but it will represent an average duration and it will, more importantly, allow users to aggregate the data to a higher level and still get the right answer.

I'd also be ready to present your recommendations to the business. Odds are they'll default to your judgment.

I'd try to avoid two different fact tables; I think that ultimately gets more confusing for end users.

I'd also try to be clear in the presentation layer that the data is allocated (in layperson speak) to mitigate end user confusion.
avatar
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 43
Location : St. Louis, MO

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  dellsters on Wed Feb 11, 2009 7:16 pm

Bryan,
Thanks for the feedback.

I guess at a higher level, there is a phone call made. Based on the phone call, there might be multiple ambulances dispatched with multiple individuals per ambulance. In the same data mart, would I include a Call dimension per each call made? This dimension table would store information specific to each call such as time of call made, how long the call took before ambulances were dispatched, length of the phone call, etc. I am worried that with this approach, the call dimension table would increase exponentially. I can't think of a better design approach to take care of the call related attributes. Any thoughts on this?

dellsters

Posts : 39
Join date : 2009-02-11

View user profile

Back to top Go down

RE: ALLOCATING -PARENT CHILD RELATIONSHIP

Post  BrianJarrett on Wed Feb 11, 2009 9:14 pm

Actually phone call sounds like a transaction; in other words, a fact. Date, Time, Driver (employee), etc. would all be dimensions on that fact. I don't know your data well enough to say for sure but that's how it appears.
avatar
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 43
Location : St. Louis, MO

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  dellsters on Wed Feb 11, 2009 9:25 pm

It does not care about drivers. The relationship is one phone call to many ambulances dispatched at a certain time for each call and one ambulance dispatched to many employees that were in the ambulance dispatched. If I create one fact for the employees in the ambulance and the ambulances dispatched, and another fact for calls, then I can't roll up the employees dispatched for the particular ambulance dispatched to the call?

Basically they need to track the phone call that comes in and its related dispatched ambulances with the employees the ambulance carries related to the phone call. I thought they all should be in one fact table because of the relationships. Please correct me if I'm wrong.

I hope I made it a little more clear. Please let me know if the data still doesn't make sense.

dellsters

Posts : 39
Join date : 2009-02-11

View user profile

Back to top Go down

RE: ALLOCATING -PARENT CHILD RELATIONSHIP

Post  BrianJarrett on Thu Feb 12, 2009 12:06 am

When I say driver I mean employee. That's why I put it in parentheses to indicate anyone your company pays to do the work. I'm not sure how your company tracks employees.

I don't recall suggesting you use two fact tables.
BrianJarrett wrote:I'd try to avoid two different fact tables; I think that ultimately gets more confusing for end users.
The problem here seems to be the idea that phone call should be a dimension rather than a fact. You mentioned a few metrics that need to be tracked:
dellsters wrote:This dimension table would store information specific to each call such as time of call made, how long the call took before ambulances were dispatched, length of the phone call, etc...
I'd recommend that phone call be modeled as a fact. In this fact you'll store metrics like length of call before dispatch, and length of phone call. Date and time of the phone call would be foreign keys pointing back to your time and date dimension tables.

You'd also have foreign keys for your ambulance and employee dimensions (and maybe others). If the relationship between phone call and either of these two dimensions is a many to many (rather than a one to many) then you can solve that with bridge tables and divisors.

In other words, I see this data mart focused around a phone call fact table with multiple dimensions surrounding it and bridge tables handling the many to many relationships.

Hopefully this makes things a bit clearer. Let me know if it doesn't.
avatar
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 43
Location : St. Louis, MO

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  dellsters on Thu Feb 12, 2009 2:43 am

Brian,
It is clearer, and I appreciate your insight.
I have designed the following:
I have created a call_ambulance_bridge table that has a many to one relationship to the call fact table. This bridge table also has a many to one relationship to the ambulance dimension. In this bridge, there are 2 foreign keys (composite pk) which is the ambulance key (coming from ambulance dim) and the call key (coming from the call fact). This bridge table will have measures such as time to destination, number of patients it carriers, etc.

Now the second part regarding employees:
I have a one to many relationship from the call_ambulance_bridge table to call_ambulance_employee_bridge table. This call_ambulance_employee_bridge table will have the ambulance key and the call key coming from the call_ambulance_bridge table as well as the employee key coming from the employee table. This call_ambulance_employee_bridge table will be a many to one relationship to the employee table.

I have a few concerns with this design:
-Both the call_ambulance bridge and call_ambulance_employee_bridge tables will have more records than the call fact table. Is this a problem?
-The call_ambulance bridge table functions like a fact table because of its measures such as time to destination, or number of patients the ambulance carries. Should I as well name it a fact table?
-The fact that there are 2 bridge tables with a relationship, with one dimension table each, and one of the bridge tables connected to the fact table is a very complicated design for a dimensional model and it can be complicated for the users. Is there an easier approach?

dellsters

Posts : 39
Join date : 2009-02-11

View user profile

Back to top Go down

RE: ALLOCATING -PARENT CHILD RELATIONSHIP

Post  BrianJarrett on Thu Feb 12, 2009 8:03 am

dellsters wrote:I have created a call_ambulance_bridge table that has a many to one relationship to the call fact table. This bridge table also has a many to one relationship to the ambulance dimension. In this bridge, there are 2 foreign keys (composite pk) which is the ambulance key (coming from ambulance dim) and the call key (coming from the call fact). This bridge table will have measures such as time to destination, number of patients it carriers, etc.
The relationship between this bridge table to the fact and ambulance tables sounds perfect. However, I wouldn't put any facts in the bridge table. I'd put those fact inside the fact table. The only thing in the bridge table should be the two foreign keys, a divisor (which is however many bridge rows to single fact row) and maybe some housekeeping fields. If there are, let's say, 3 ambulances to a particular call record then the divisor would be 3 to allow users to eliminate the duplication that comes with a many to many relationship.

dellsters wrote:I have a one to many relationship from the call_ambulance_bridge table to call_ambulance_employee_bridge table. This call_ambulance_employee_bridge table will have the ambulance key and the call key coming from the call_ambulance_bridge table as well as the employee key coming from the employee table. This call_ambulance_employee_bridge table will be a many to one relationship to the employee table.
I wouldn't do it this way. Here I'd create a second bridge table, this one between your fact table and your employee table. This way you can select any fact from your fact table for a particular call and then include both the ambulance dimension and the employee dimension associated with that fact row. The intersection will be all ambulances and all employees that were associated with that phone call. You'll want to include a divisor in this call_employee_bridge table as well.

Also attached to this fact record will be a date dimension and a time dimension.

Here's a quick mockup of what I'd probably do (it would need to be fleshed out of course).


Hope this helps.
avatar
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 43
Location : St. Louis, MO

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  dellsters on Thu Feb 12, 2009 12:18 pm

BrianJarrett wrote:
This way you can select any fact from your fact table for a particular call and then include both the ambulance dimension and the employee dimension associated with that fact row. The intersection will be all ambulances and all employees that were associated with that phone call. You'll want to include a divisor in this call_employee_bridge table as well.
Hope this helps.

Can you please expand on this? How would I associate the ambulance dimension and the employee dimension with that fact row?

Looking at the diagram, the call_employee bridge is missing the ambulance dimension. One call could dispatch, for example, 2 ambulances, with 4 employees in one and 2 employees in the other one. I need to track who goes in which ambulance. How would I do this?

If I include all facts in the fact table, then the granularity changes. Would I allocate the measures related to Call_ambulance bridge and call_employee bridge tables to the fact table? Or are you referring to not allocate measures at the call_ambulance or call_employee level to the fact table and just store the measures as is and the divisor will take care of that? I think this might not work because of the complexity of the data. I think the example I provided above is a good one:
What about in the case that one call fact row dispatches 2 ambulances, one with 4 employees and one with 2 employees. Can you walk me through how the facts at different level would work out and how the divisor would work? For example, measures related to the call (duration of the call before ambulances are dispatched), time each ambulance takes to get to destination (one ambulance might take 5 minutes to get there, the other might take 4 minutes).

dellsters

Posts : 39
Join date : 2009-02-11

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  dharidas on Thu Feb 12, 2009 2:25 pm

Instead of two bridge tables, How about a "Call" Fact table and a "Call Detail" fact table to store information related to the details of the call. This will allow navigation to the detail table from both Call level and through conformed dimensions common to Call and Call Detail.

dharidas

Posts : 5
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  dellsters on Thu Feb 12, 2009 2:43 pm

With that design approach, how would you suggest resolve the different granularity between call_ambulance and call_ambulance_employee differences in the fact for the call detail?

dellsters

Posts : 39
Join date : 2009-02-11

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  dharidas on Thu Feb 12, 2009 2:51 pm

Wouldn\'t the granularity of \"Call Detail\" be at ambulance, employee level? This should help roll data up at the ambulance level (how many employees per ambulance) and at employee level, provided the fact values are decomposed using appropriate divisors (ambulance level facts are divided by number of ambulances dispatched per call and employee level fact values are divided by number of employees per ambulance).


Last edited by dharidas on Thu Feb 12, 2009 3:03 pm; edited 1 time in total

dharidas

Posts : 5
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  dellsters on Thu Feb 12, 2009 2:59 pm

There are measures such as time it took for the ambulance to get to the destination. That type of measures is at the call_ambulance level, not at the call_ambulance_employee level.

dellsters

Posts : 39
Join date : 2009-02-11

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  BrianJarrett on Thu Feb 12, 2009 3:06 pm

dellsters wrote:Looking at the diagram, the call_employee bridge is missing the ambulance dimension. One call could dispatch, for example, 2 ambulances, with 4 employees in one and 2 employees in the other one. I need to track who goes in which ambulance. How would I do this?
Then the grain of your table isn't phone call; it's lower than that (as you mention below). Call it 'dispatch detail' or something like that. Essentially one row in the fact table will represent the intersection you mention above. You have to take your fact table down to the atomic level, whatever that is. That means you'll have multiple fact records referencing the same phone call. Phone call might then become a dimension (or a degenerate dimension if it has no attributes).

dellsters wrote:If I include all facts in the fact table, then the granularity changes. Would I allocate the measures related to Call_ambulance bridge and call_employee bridge tables to the fact table? Or are you referring to not allocate measures at the call_ambulance or call_employee level to the fact table and just store the measures as is and the divisor will take care of that? I think this might not work because of the complexity of the data. I think the example I provided above is a good one:
What about in the case that one call fact row dispatches 2 ambulances, one with 4 employees and one with 2 employees. Can you walk me through how the facts at different level would work out and how the divisor would work? For example, measures related to the call (duration of the call before ambulances are dispatched), time each ambulance takes to get to destination (one ambulance might take 5 minutes to get there, the other might take 4 minutes).
Any allocation you do should be down to the lowest level of granularity. You just need a business rule (which is sounds like you're already on your way to). Again, the allocated measures might not necessarily have a lot of value at that level but it allows us to roll it up accurately and delivers on the promise to slice and dice the data any which way (as Ralph Kimball says).

So to summarize I think I'd get away from the notion that the grain of your data is at the phone call level. It appears to be deeper than that. Once you accurately determine the grain then the divisor for your many to many relationships (if you still have them) can be derived in the ETL process (you'll know exactly how many records will be returned in your many to many query because you can count them as you load them.)
avatar
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 43
Location : St. Louis, MO

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  dharidas on Thu Feb 12, 2009 3:08 pm

I guess I edited my previous post before seeing your reply. Would fact decomposition help? Again, I would be cautious with fact decomposition especially if you have users directly accessing detail data from your fact tables. I am sure there should be a simpler way to achieve what you want, but nothing pops up in my mind now. Will ponder more about this later.

dharidas

Posts : 5
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  dellsters on Thu Feb 12, 2009 3:42 pm

dellsters wrote:I guess at a higher level, there is a phone call made. Based on the phone call, there might be multiple ambulances dispatched with multiple individuals per ambulance. In the same data mart, would I include a Call dimension per each call made? This dimension table would store information specific to each call such as time of call made, how long the call took before ambulances were dispatched, length of the phone call, etc. I am worried that with this approach, the call dimension table would increase exponentially. I can't think of a better design approach to take care of the call related attributes. Any thoughts on this?

I should have been clearer with the requirements when I said the above. What makes it confusing is that there are not 2, but 3 different levels of granularity. At the highest level (phone call) is the individual call. At the second level, there is the call - ambulance level, since one call can dispatch multiple ambulances. And at the third level, there is the call - ambulance - employee level, since there can be multiple employees in each ambulance dispatched.

For example, if there are 2 ambulances dispatched in a particular call, 2 employees in one ambulance, and 3 employees in the other ambulance, I would have 5 records in the fact table if the grain is at the lowest level (call - ambulance - employee). In this case having a call dimension, this dimension will be a very large table and that's one of the things I wanted to avoid. With this approach, I wouldn't need any bridge tables because I'm working in the most atomic level of granularity. Another confusion with this approach is with the measures in the fact table since I'd be allocating multiple levels of granularity in the fact table. For example, measures related to the ambulance (time each ambulance takes to get to destination), this value will be different for each of the ambulances so 2 employees will be allocated to the time the corresponding ambulance takes and the other 2 employees will allocated the time it takes for their corresponding ambulance. So if both ambulances takes 6 minutes to get to destination, 3 records will have 2 minutes each and the other 2 employees will have 3 minutes eac. At the same time, all 5 employees will have the same allocated time at the call level (time the call takes before it dispatches ambulances). So if the call took 10 minutes, each record will have 2 minutes for each employee.

Actually as I'm writing this, I think it might be a better idea to not have a call dimension since I will have measures related to a particular call allocated to the grain level in the fact table. I would have a degenerate dimension in the fact table with the call id in the fact so I can always roll up to the call that way.

Either way, I thought of a second approach that would be less confusing with the allocations in multiple levels of granularity. I know it's not at the most atomic level of granularity, but I was thinking I could have the grain be at the call-ambulance level and have a bridge table to indicate which employees went on which ambulance. I would need to confirm with the business users what level of information they need at the employee level. If all they need to track is which employee went on which ambulance for a particular call, this might be a better design approach? Let me know what you think. Thanks!

dellsters

Posts : 39
Join date : 2009-02-11

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  BrianJarrett on Thu Feb 12, 2009 5:15 pm

I think we're mostly on the same page now. I don't see phone call as a dimension unless there are some useful attributes. For instance, maybe the originating phone number, some demographic or geographic information, stuff like that. If that's the case then a dimension might be in order.

I'd recommend going down to the most atomic level of granularity possible. Even if you get your users to agree on a higher level there's no guarantee they'll never care about lower levels, especially once they get their hands on the data. That's when the questions start. If you start at the lowest level you can always roll up for speed and still provide all the detail available. Anything less than this is a compromise.

If you do go further down than phone call then you're going to have to have a degenerate dimension other than phone call. It'll probably be a key generated during the ETL process unless you can find some key in the source data that will make the record unique down to that level.

I also think your bridge tables will no longer be necessary (at the atomic level) once you go all the way down. Funny how the model works once the grain is accurately determined! Of course once you start rolling up to the phone call level they'll be necessary again. Those aggregates can be built on a case by case business, driven by business needs.
avatar
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 43
Location : St. Louis, MO

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  dellsters on Thu Feb 12, 2009 5:23 pm

I was just concerned about the complexity of the data that might confuse business users. But I agree. I think it's always a good idea to have the grain at the most atomic level since requirements always change and they will probably need something at the most atomic level down the road. Your help on this matter is greatly appreciated! I will continue posting if new rules change the design.

dellsters

Posts : 39
Join date : 2009-02-11

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  dellsters on Sat Feb 14, 2009 1:12 am

The more find out about the requirements, the more I think there are actually 2 different business processes. There are a lot of questions and measures regarding the calls. Actually, most of the questions that the business users need answers to are at the call level. I think the first business process is the call transaction, just like an order placement at Amazon, and the second business process would be the assignment and dispatch of ambulances and the employees that ride the ambulance, assigned and dispatched determined by the call, just like the shipping process at Amazon. Any thoughts?There is just too match allocating and splitting call level numbers to each employee to each ambulance. How do you define whether it's 2 different business process or one?

dellsters

Posts : 39
Join date : 2009-02-11

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  BrianJarrett on Sun Feb 15, 2009 9:27 am

I'm not sure I see two different business processes. From what I know about your data I see only one process; dispatching ambulances. The ambulances just happen to be dispatched due to the receipt of a phone call but it would be the same process if they were dispatched due to the submission of a webform, receipt of a letter in the mail, or by viewing a smoke signal (joking about the second two). I don't see the phone call as a different process, I see it as a different level of analysis within the dispatch process.

Having said that I think the solution to answering the business' analytical needs is to provide summary at a call level. They'll have the call level summary available which will answer most of their questions but they'll also have the atomic level detail for when they want to drill down and see what makes up the call level summary. As long as you allocate properly and deliver it using aggregate navigation your users will navigate to the right level of detail without any special effort on their part, aside from clicking a measure to see its detail.
avatar
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 43
Location : St. Louis, MO

View user profile

Back to top Go down

Re: Allocating -Parent child relationship

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum