how to design? Vehicle location status fact table

View previous topic View next topic Go down

how to design? Vehicle location status fact table

Post  lprince on Wed Nov 11, 2015 2:58 pm

I'm trying to figure out the best design for the following:

I have a raw data table containing a daily snapshot of vehicle location information that contains state and county (that's the lowest granularity I need) (Date, VehID, CountyName, StateName). I have another table that contains sales district info (DistrictID, CountyName, StateName). I want to use this as part of my data warehouse, which I've tentatively designed as a factless fact table:
DayKey
Statekey
CountyKey
DistrictKey
VehicleKey
LocCount (=1) for easy SQL.

My problem is that the Vehicle may be in more than one county on a daily basis (two max) and any given sales district can include multiple counties as part of its district.
This results in two records for the same vehicleKey but different countykey.

If I want to enable the DW users to be able to get a count of the number of vehicles in a given district on a given day, regardless of county, do I need to make a second fact table that excludes county as a key? So then my ETL job(s) would maintain two fact tables that basically contains the same data. This seems like I've missed something in my reading....

To create the report currently, I run a SQL query that does a group on the raw data tables excluding the County field. I could drop county, but my sales districts are slowly changing dimensions (counties get re-assigned), so I'm struggling with this initial design.

Thanks!

lprince

Posts : 9
Join date : 2015-11-11

View user profile

Back to top Go down

Re: how to design? Vehicle location status fact table

Post  ngalemmo on Wed Nov 11, 2015 9:17 pm

No. Construct the queries to perform a count(distinct vehicleKey).
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: how to design? Vehicle location status fact table

Post  nick_white on Thu Nov 12, 2015 7:48 am

Also, I assume County>District>State is a strict hierarchy on any given date (a county can belong to only one district that can belong to only one state)? If this is the case then all 3 should be in a single Dim table and there should be only one key in the fact table referencing them - this table can be an SCD if the district definition changes. Even if District cannot be included in the hierarchy (because a district crosses state boundaries) count & state should be in a single Dim

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: how to design? Vehicle location status fact table

Post  lprince on Thu Nov 12, 2015 11:11 am

Thank you both for your help!
Technically, it's district, state, county in the hierarchy because sales districts can include all of a state, more than one state including some but not all of the counties in that state, or just be made up of counties within a state. It all depends on how much business is present in that region. But I see your point, I hadn't thought of it that way. I was trying to give the option of slice the reports on a district, state, and county level. I'll make the state/county a single dim, those changes are extremely slow.

This is currently a weekly snapshot/trending report run using a regular SQL query against the source tables using distinct count(district, vehicleID) (if vehicle is in two counties in different districts during the snapshot, it gets counted in that district also - the vehicles aren't ours, think more like a mobile store we sell to). I am struggling with how to design that in a cube and how the output will compare to the existing report. If I have one record for districtkey, state/countykey, vehicleID, then the usual count(*) in the cube counts incorrectly when grouped by districtkey due to the issues above. If I can use a distinct within the cube, I'm happy. This is my first foray in dwh and I wasn't getting enough complete examples to extrapolate the results from the usual "student attendance" cube that I'm finding in all the books. Ultimately, I need the results of this count to be used with the sales cube, for another metric.

lprince

Posts : 9
Join date : 2015-11-11

View user profile

Back to top Go down

Re: how to design? Vehicle location status fact table

Post  zoom on Thu Nov 12, 2015 11:51 am

Donít you need a carís dealer dim? A car cannot exist in two different dealership at a same time so a car most current arrival time to a dealership can be used to resolve your issue (Even 2 dealerships exist in a same district).

If you do not have car arrival time to a dealership then talk to the report user (your stakeholder) how to resolve this issue. BTW you do need one dim for your district hierarchy as it was suggested above.

zoom

Posts : 97
Join date : 2010-08-23
Location : Chicago

View user profile

Back to top Go down

Re: how to design? Vehicle location status fact table

Post  lprince on Thu Nov 12, 2015 12:14 pm

Thank you for your input. I'm not dealing with a dealership situation. The vehicle in question is a "mobile warehouse". The raw source data is generated by a third party containing the location by state, county, and GPS lat/long on a daily interval. There isn't a time available. The vehicle goes to a location and sits and is only likely to be in a maximum of two counties (based on eval of raw data) on any given day. 98% of the time, it's only in one location on a daily basis. I have to figure out how to handle counting what happens the other 2% of the time when it moves either between counties within a district or between districts. If it stays within the district, regardless as to what county, I only want to count the vehicle once. If it moves between districts, I need to count it twice. Thus, I currently execute a distinct query on the raw data to generate the counts and create my report; but I need to move this to a cube to use the results with another cube. If I can use a distinct query within the cube, (I'm not sure what that syntax looks like), I can make that work. So I think I have a handle on the design based on the feedback and my question now becomes, how do I run a distinct query by District on a cube given a design of:

FactTable:
DayKey
DSCKey
VehicleID
LocCount (=1)

DimTable:
DSCKey
District
State
County


lprince

Posts : 9
Join date : 2015-11-11

View user profile

Back to top Go down

Re: how to design? Vehicle location status fact table

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum