Accumulating Snap Shot Challenge

View previous topic View next topic Go down

Accumulating Snap Shot Challenge

Post  The Emu on Fri Feb 13, 2009 7:25 am

Hi,

I'm working for a large Internet Service Provider (ISP) and have a DW modelling challenge which I hope people can help me out with.

Their nature of business includes as a fact recording as an accumulating daily snapshot real time partition basic session information of customers (volume for day), and is a periodic snapshot real time partitioning.

As a simplified example, the fact table includes the following:

customer_key
session_start_key
session_end_key
usage_start_key
usage_end_key
status_key
usage_duration
usage_bytes

The session can last (and generally does) a long period of time (ie over 6 months), hence taking a daily "usage" snapshot of the session throught it's "life". This is represented with the usage_start_key and usage_end_key.

For each daily snapshot:
- the session_start_key is start of session
- the session_end_key is end of session or if still open end of snapshot window
- the usage_start_key is start of snapshot window
- the usage_end_key is end of snapshot window
- the status key remains open for all snapshot records except the last when it changes to closed

The business wants the ability to query sessions "open" between two points in time, which is fine because they can take advantage of the usage start/end keys, however at the same time the want a full picture of the session as a whole, but which proves difficult because the end_key and status_key are not the same throughout all snapshot records.

The way I see it there are two solutions, but I might be missing something.

Solution 1 - During each overnight load, if the session is still open use a "big date" value for session_end_key and if the session is closed, update all historical snapshots of the session with the true session end date.

Solution 2 - Create a Header/Line Item facts with different granularity, the header containing session_start_key, session_end_key and session_status, which is continously update, and Line Item fact containing the snapshots. This I guess could double as an aggregate table aswell.

Has anyone had any similar experiences with something like this?

The Emu

Posts : 2
Join date : 2009-02-13

View user profile

Back to top Go down

Re: Accumulating Snap Shot Challenge

Post  dharidas on Fri Feb 13, 2009 4:36 pm

I would go with the second option where daily snapshots are maintained separate from session grain data. You will avoid the overhead of updating history of snapshots every time the user closes a long running session.

dharidas

Posts : 5
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Accumulating Snap Shot Challenge

Post  BrianJarrett on Sun Feb 15, 2009 9:49 am

I don't know a lot about your data but I think I see two different fact tables here. You're already taking a periodic snapshot of the data, the daily periodic snapshot you mentioned in your post. I agree that you should set the end date of your open sessions to a high date (like 12/31/9999 or something similar) but going back and changing that end date is rewriting history. If you think about it, on any particular day where the session was open it was open and always will be. If, for instance, the session was open on 2/10/2009 then ended on 2/11/2009, the fact that the session ended on 2/11/2009 doesn't mean that the session state on 2/10/2009 was suddenly closed. It was open on the tenth and will always have been open on the tenth.

To solve the issue with analyzing the session itself I think you need an additional accumulating snapshot fact table. The accumulating snapshot fact table won't be at a daily grain level; rather it'll be the session as of your most recent update. So on 2/10/2009 a query against this fact table will show the session being open (end date of 12/31/9999) but the same query on 2/11/2009 will show the session as closed. (You could even set an open/closed flag here.) This gives them the view of the session as a whole they're looking for. If they want to know the state of a session on a particular day then they can use the daily periodic snapshot fact table.

As for displaying these new facts in the reporting tool you can have a different grouping of measures for each fact table, one at a daily level and one at the overall level (as of your most recent update).

Let me know if this helps.
avatar
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 43
Location : St. Louis, MO

View user profile

Back to top Go down

Re: Accumulating Snap Shot Challenge

Post  The Emu on Thu Feb 19, 2009 8:32 am

Thanks for your response, it was good help.

Yes I guess your right in that they are two different facts, with different grains; but I'm trying to atleast at the granular level allow the query to be built without having to run two queries and synch at the report level. (Their using Business Objects).

What I've ended up doing is creating a second accumulating snapshot fact table (session) that contains session dates; removing session start and end dates all together from the accumulating daily snapshot, and then joining the two together to accomodate the 1%-ers.

I know joining the 2 fact tables aren't ideal, but the way I see it, this is the only effective way.

Thanks again for your help

The Emu

Posts : 2
Join date : 2009-02-13

View user profile

Back to top Go down

Re: Accumulating Snap Shot Challenge

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum