Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Tracking history in huge hierarchies

2 posters

Go down

Tracking history in huge hierarchies Empty Tracking history in huge hierarchies

Post  andriy.zabavskyy Wed Nov 14, 2012 2:04 pm

Let's review the following dimensional challenge:

Challenge
There is a big organizational tree with the 6 levels: OrgLevel1 -> OrgLevel2 -> .. -> OrgLevel6
Each level has its own set of attributes of type 2 and could be used for facts analysis
The relationships between levels are changing pretty fast: each month up to 10% of members in one level could change their parent members and the history of changes should be kept.
The question is how to store all the data and track all the changes efficiently.

Possible implementation
I have following vision how it could be implemented:

  • Keep separate dimensional table for each OrgLevel with SCD type 2 structure (row_start_date, row_end_date columns etc.)
  • Create a special bridge table to keep relationships between organization levels with time range indicators. So the bridge table would have following structure:
    OrgHierarchyBridgeTable
    BridgeTableSurrogateKey
    OrgLevel_1_BusinessKey
    OrgLevel_2_BusinessKey
    ...
    OrgLevel_6_BusinessKey
    RowStartDate
    RowEndDate
  • Keep in fact table the BridgeTableSurrogateKey


What do you thing of this approach?
Thanks

[i]

andriy.zabavskyy

Posts : 18
Join date : 2011-09-12

Back to top Go down

Tracking history in huge hierarchies Empty Re: Tracking history in huge hierarchies

Post  ngalemmo Wed Nov 14, 2012 8:57 pm

Adding effective periods to the bridge table, as well as the source relationships is fine. Problem is, the bridge table is incorrect.

A bridge is not a flattening of the hierarchy. If you flatten it, you just store a flat picture keyed at the lowest level. Its a dimension, not a bridge.

A bridge table is an explosion of the recursive parent/child keys. The table contains every possible parent-child relationship across all levels. Each relationship can be date bound to track versions. A bridge contains a child key, a parent key, child level, parent level or distance, date ranges, and other attributes as appropriate. Since the bridge is every possible relationship pair, you can join at any level and traverse up or down the hierarchy using simple non-recursive SQL.

The update process is to first update the recursive structure (parent/child relationships), then create a new exploded bridge table. Do a match/merge pass of the new exploded bridge table and the current one (full outer join on parent key and child key). Expire old relationships that are not in the new table and add new relationships that are not in the old table.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Tracking history in huge hierarchies Empty Re: Tracking history in huge hierarchies

Post  andriy.zabavskyy Thu Nov 15, 2012 9:05 am

I agree, I used not proper naming - it is not a real bridge table.
But my question if the approach in general will work properly? Are there any disadvantages or difficulties related to it?
I personally don't think we need to build a real bridge table because the depth of the hierarchy is fixed.

Thanks

andriy.zabavskyy

Posts : 18
Join date : 2011-09-12

Back to top Go down

Tracking history in huge hierarchies Empty Re: Tracking history in huge hierarchies

Post  ngalemmo Thu Nov 15, 2012 1:07 pm

Both would support similar queries, but an exploded bridge is a more efficient and flexible structure.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Tracking history in huge hierarchies Empty Re: Tracking history in huge hierarchies

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum