Tracking history in huge hierarchies

View previous topic View next topic Go down

Tracking history in huge hierarchies

Post  andriy.zabavskyy on Wed Nov 14, 2012 2:04 pm

Let's review the following dimensional challenge:

Challenge
There is a big organizational tree with the 6 levels: OrgLevel1 -> OrgLevel2 -> .. -> OrgLevel6
Each level has its own set of attributes of type 2 and could be used for facts analysis
The relationships between levels are changing pretty fast: each month up to 10% of members in one level could change their parent members and the history of changes should be kept.
The question is how to store all the data and track all the changes efficiently.

Possible implementation
I have following vision how it could be implemented:

  • Keep separate dimensional table for each OrgLevel with SCD type 2 structure (row_start_date, row_end_date columns etc.)
  • Create a special bridge table to keep relationships between organization levels with time range indicators. So the bridge table would have following structure:
    OrgHierarchyBridgeTable
    BridgeTableSurrogateKey
    OrgLevel_1_BusinessKey
    OrgLevel_2_BusinessKey
    ...
    OrgLevel_6_BusinessKey
    RowStartDate
    RowEndDate
  • Keep in fact table the BridgeTableSurrogateKey


What do you thing of this approach?
Thanks

[i]

andriy.zabavskyy

Posts : 18
Join date : 2011-09-12

View user profile

Back to top Go down

Re: Tracking history in huge hierarchies

Post  ngalemmo on Wed Nov 14, 2012 8:57 pm

Adding effective periods to the bridge table, as well as the source relationships is fine. Problem is, the bridge table is incorrect.

A bridge is not a flattening of the hierarchy. If you flatten it, you just store a flat picture keyed at the lowest level. Its a dimension, not a bridge.

A bridge table is an explosion of the recursive parent/child keys. The table contains every possible parent-child relationship across all levels. Each relationship can be date bound to track versions. A bridge contains a child key, a parent key, child level, parent level or distance, date ranges, and other attributes as appropriate. Since the bridge is every possible relationship pair, you can join at any level and traverse up or down the hierarchy using simple non-recursive SQL.

The update process is to first update the recursive structure (parent/child relationships), then create a new exploded bridge table. Do a match/merge pass of the new exploded bridge table and the current one (full outer join on parent key and child key). Expire old relationships that are not in the new table and add new relationships that are not in the old table.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Tracking history in huge hierarchies

Post  andriy.zabavskyy on Thu Nov 15, 2012 9:05 am

I agree, I used not proper naming - it is not a real bridge table.
But my question if the approach in general will work properly? Are there any disadvantages or difficulties related to it?
I personally don't think we need to build a real bridge table because the depth of the hierarchy is fixed.

Thanks

andriy.zabavskyy

Posts : 18
Join date : 2011-09-12

View user profile

Back to top Go down

Re: Tracking history in huge hierarchies

Post  ngalemmo on Thu Nov 15, 2012 1:07 pm

Both would support similar queries, but an exploded bridge is a more efficient and flexible structure.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Tracking history in huge hierarchies

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum