Modeling Flight segments

View previous topic View next topic Go down

Modeling Flight segments

Post  rearmo on Mon Apr 23, 2012 8:01 pm

We are trying to analyze the flight activity/experience/preferences of various users of our app, particularly the choices made in flight. Like for e.g.
Seat type : Window/Aisle seat
In Flight Amenities Requested for : In Seat video Player etc.
Aircraft equipment type : 747 …
Meals choice : Vegetarian meal, Special Meal etc.

For this we are working with flight segment level detail of each users trip. The challenge for us is multifold:
1. There will be flight seat details like seat 23-D, 48-G , should these be part of the fact table , since these are textual information ?
2. If we were to create a Flight dimension, that makes up the tuple [Flight Number, equipment type, Seat, meal choice, amenities, special instructions] , there is no standard here because the flight layouts can change and the equipment type of the carrier can also change and likewise meals service can be different.
3. So should Flight amenities, meals choice , equipment type each be separate dimensions ?

Thanks for reading and appreciate any thoughts on this.

rearmo

Posts : 5
Join date : 2012-03-10

View user profile

Back to top Go down

Re: Modeling Flight segments

Post  BoxesAndLines on Mon Apr 23, 2012 9:38 pm

The grain of your fact table sounds like a flight segment.

Textual information belongs in the dimensions. The only exception to this are degenerate dimensions. In this case flight number seems like a plausible DD since I want to easily query segments for a given flight.

Seat dimension is interesting in that the seat number and type are dependent on the aircraft used. Given that most airlines how a limited number of aircraft types, I can see building an aircraft dimension that contains all of the seat information in it.

Do not create a flight dimension. Create smaller dimensions like mentioned in #3.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Modeling Flight segments

Post  rearmo on Tue Apr 24, 2012 2:06 pm

Thanks for your inputs BoxesAndLines!

The issue with Seats is that, there can be several configurations of the same Aircraft say for e.g. 757. Delta itself has 8 different configs with this aircraft and then again United Airlines can have a different seat configuration for the same Boeing 757. And we do not have info on which seat configuration was used for a particular Flight.

So there is my confusion , Seat number in the Flight segment transactions is a textual info and I am not able to build a dimension out of it due to reason above , so does this also become a DD in the flight segments fact table ?

Please share your thoughts on this ! Appreciate it.

rearmo

Posts : 5
Join date : 2012-03-10

View user profile

Back to top Go down

Re: Modeling Flight segments

Post  rearmo on Tue Apr 24, 2012 2:11 pm

One other thing is that the Flight number is being reused for different routes i.e. UA 138 can on a particular day be for route SFO to DEN another day it can be for SAN to SFO and another day it can be for ORD to BWI.

rearmo

Posts : 5
Join date : 2012-03-10

View user profile

Back to top Go down

Re: Modeling Flight segments

Post  ngalemmo on Tue Apr 24, 2012 3:05 pm

Flight number is for all intents and purposes, meaningless, unless you are the airline. I think the airlines play with them so they can market a destination (i.e. a direct flight to x, even though you need to change planes anyway) and see if they can get a rise in sales. It is a good candidate for a degenerate dimension, but I doubt it has much analytic value (if you are not the airline, and if you are... bring in a consultant to look over what you are trying to do. ;- ) )
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Modeling Flight segments

Post  TheNJDevil on Tue Apr 24, 2012 4:10 pm

Taking a step back....Do the seat numbers really matter? Isn't the information about a selected seat really the attributes you want to track?

TheNJDevil

Posts : 68
Join date : 2011-03-01

View user profile

Back to top Go down

Re: Modeling Flight segments

Post  rearmo on Tue Apr 24, 2012 4:42 pm

Yes we do want to track the selected seat and its attributes, but due to the dynamic nature of seat map we are not able to build the seat map by aircraft. And at this point we dont have info on the seat map that was present to the user for that particular flight. (or we dont have info on what was the aircraft seat map for this particular flight on this particular day for this particular schedule)

In future if we do get to integrate that info we want to be able to build the seat map like suggested by BoxesAndLines. So until then would it be a good idea to keep seat number as a Degenerate dimension in the flight segments fact table ?

rearmo

Posts : 5
Join date : 2012-03-10

View user profile

Back to top Go down

Re: Modeling Flight segments

Post  ngalemmo on Tue Apr 24, 2012 5:17 pm

If having a seat map in the future is a possibility, I would go ahead a build a seat dimension so you can retrofit attributes when the info becomes available. You already have position (aisle, middle, window) and maybe cabin class (first, business, economy plus, economy), which is a start. The problem you may have doing this is at minimum, the seat map is dependent on the airline and equipment. Even within equipment, the same aircraft model may be configured differently (such as the 757-200 UAL PS flights between JFK and LAX, SFO). If you can get that distinction from the information you currently receive, so you can create a proper natural key, you could build a dimension now. It will make life a lot easier than trying to incorporate a new dimension later.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Modeling Flight segments

Post  rearmo on Tue Apr 24, 2012 7:06 pm

Thanks everyone for your thoughts and inputs, thats very helpful indeed!

rearmo

Posts : 5
Join date : 2012-03-10

View user profile

Back to top Go down

Passenger Flight Segment Bookings

Post  thomashaughey on Thu May 10, 2012 11:10 am

A few general points first:

Flight is not granular enough for you to collect your data. The natural key of flight is Flight No, Flight Date.

I believe that probably neither is Flight Segment granular enough. Its natural key is Flight No, Flight Data, From Airport Code, To Airport Code. A flight segment is operation of a flight involving a stopover, change of aircraft, or change of airline for a passenger. Or, it is the trip of a flight between one airport and another.

[NOTE: In the following examples I am using the natural keys because the natural key demonstrates the granularity and identity of the data. It is very important to understand the different granularity of the possible solutions. Let’s put aside the surrogate key discussion for now because it doesn't contribute to the solution. Even if you use a surrogate key you still have to put a uniqueness constraint on the natural key.]

However, depending on the answers to the questions below, the grain you need could be :

(1) Flight Segment Booking. If you are tracking specific people on flight segments, then the granularity of the fact is the passenger’s participation on a Flight Segment. The natural key will be something like: Flight No, Flight Date, From Airport Code, To Airport Code, Booking Number (or User ID).

(2) Flight Segment Passenger Type. Here you are tracking types of passengers by flight segment. The natural key is Flight No, Flight Date, From Airport Code, To Airport Code, Passenger Type Code.

(3) Flight Segment. Here you are counting or summarizing the various choices were made on a flight segment, such as count of passengers by meal type. The natural key is Flight No, Flight Date, From Airport Code, To Airport Code. This fact does not contain specific passengers.

Grain 1 Fact Attributes: You track the choices that a specific passenger (user) makes on a flight segment. In this case, you could record that this passenger with User ID 12345 (or passenger name Jane Doe) was in aisle seat 9C, requested a vegan meal and chose pay-per-view TV, etc., etc. The seat, requested meal and media are attributes of the Flight Segment Booking.

Grain 2 Fact Attributes: You are tracking passenger types by flight segment. This grain is less detailed than Case 1 above. The fact attributes are a series of counters. They count the number of passenger of each type who made each choice and would have attributes such as Seat Type Count, Meal Type Count, etc. In this solution you do not have information about individual passengers.

Grain 3 Fact Attributes: You count only for a given flight segment the count of passengers who made each selection and their preference. In this case, the grain is Flight Segment. The natural key for Flight Segment is above. Flight Segment has a series of counters for each preference or choice on that Flight Segment. In this solution, you do not have either passenger or passenger type information.

I think you should keep Grain 1 which means the grain is Flight Segment Booking, with a series of columns in the fact showing what this passenger chose for each column: meal type, seat type, the actual seat, chosen media type, etc., etc. Booking ID or User ID could be part of the key (not both). A passenger can be in only one Flight Segment Booking at a time. If you do not want to carry the common details of booking, you could retain Booking ID as a degenerate (or denormalized) dimension. This grain is very detailed and gives the most flexibility. You can count (except for seat number), the columns up and down passengers to get your answers.

Now as to seat. The seating plan of an aircraft describes the aircraft. So an aircraft could have a seat “9C”, among many others. Each seat has characteristics of say aisle/window, exit/ordinary; preferred/non-preferred. But “Jane Doe’s taking seat 9C” is an attribute of her booking on a particular flight segment. So, in case (1) above, the database row would assert passenger 12345 (Customer Number or User ID) on flight 123 (Flight No) on date 1/1/2012 (Flight Date) leaving JFK (Departure Airport), arriving PDX (Arrival Airport) occupied seat 9C (Seat No), which is a regular aisle seat (Seat Type). If you will eventually have seating plans for each aircraft, then keep the occupied seat number. The fact that its data type is text is immaterial; it describes the seat that a single passenger occupied on a single flight segment. However, it is non-additive. Since right now you do not have aircraft seating plans, you may also have to record in the Flight Segment Booking fact the actual seat number and the seat type. You cannot put this in User because on different flights a User could be in different seats. You can only put in Aircraft Seating Plan the seating arrangement, not who occupied the seats because over time many passengers can be in each seat. The principle behind this discussion is functional dependency, which (informally) says put an attribute with what it describes. You may want to record in User their most popular choices for seat type, etc.

Frankly, I recommend solution Grain 1, which seems to be the one you prefer anyway. Apparently if you collect it at the level of passenger by flight segment it will generate a lot of data. There is a principle that justifies Grain 1: always collect data wider and deeper than you originally think. If you collect too much data, you have a performance and scalability problem – which can be managed a number of ways. But if you collect less data than you need, you are in deep tapioca and it will be very expensive to rectify this. I know. I did this in one organization where I inherited a myopic DW design and it cost $20MM to correct it (half of this was a better platform). If your application will generate a large amount of data, you cannot run this on a hand-me-down server and a left-over DBMS license. So, importantly, consider the technology (hardware and software) it will it run on. Because this will affect your design. If this is an eventual candidate for “BigData”, saving the details will be crucial.

However, another big question for you to address are what kind of response time you require for the queries; what kind of volume growth do you expect to have; what kind of analysis will you do (e.g., will you mine the data or just do OLAP type queries such as this by this by this).

Since this does not seem to be exactly a simple problem, I suggest you step back for a second and model it (however you want) first independent of any predetermined solution, such as a star schema. Use a white board. First understand the issues -- clearly. Then fit them to the pattern. Don’t presume the pattern and then fit your need into it. Understand what the rules, requirements and issues are -- first. Then rearrange it to a dimensional model if that is what you want and need, rather than trying simultaneously to understand the problem and fit it into a dimensional structure.



Last edited by thomashaughey on Thu May 10, 2012 11:16 am; edited 1 time in total (Reason for editing : typos)

thomashaughey

Posts : 2
Join date : 2012-05-10

View user profile

Back to top Go down

Re: Modeling Flight segments

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum