MDX(or something else) for getting history based on last dimension characteristic for SCD.

View previous topic View next topic Go down

MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  individ on Tue Feb 08, 2011 4:32 pm

Hi everyone,
I am implementing a cube and I run into a problem with slowly changing dimensions(SCD type 2). I guess it might be resolved somehow with MDX or somehow else, maybe you guys can help me to resolve it.
The problem: I have SCD, say Policy with dimension attribute PolicyType. I have a measure, say NumberOfActivePolicies. To simplify, I have 1 policy in my dimension which was activated in January with PolicyType=’Simple policy’. In February policy was amended and PolicyType become ‘Advanced Policy’. When I browse my cube, I have NumberOfActivePolicies = 1 for PolicyType=’Simple policy’ and I have NumberOfActivePolicies = 0 for PolicyType = ‘Advanced Policy’. This result reflect exactly what I have in my fact table: 1 ‘Simple Policy’ was activated and 0 ‘Advanced Policies’ was activated. But this result does not reflect the business situation – after the amendment I have 1 ‘Advanced Policy’ activated and 0 ‘Simple policy’ activated. And now the question: is that possible to count NumberOfActivePolicies not according to the facts, but according to the last dimension characteristic (I mean if I query the cube till January only, I have 1 ‘Simple policy’ and 0 ‘Advanced policy’, but if I query the cube till February I have 0 ‘Simple policy’ and 1 ‘Advanced policy’). How can I implement this logic?
Here is what I have approximately on data level:
DimPolicy:
DimPolicyId BusinessPolicyid PolicyType
100 1 Simple policy
101 1 Advanced policy

FactPolicy
Id DimPolicyId FactDate Activation
1000 100 jan 2010 1
1001 101 feb 2010 0

So if I just summarize all activations and group them by PolicyType I get wrong answer from business point of view

Thanks

individ

Posts : 8
Join date : 2011-02-08

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  LAndrews on Tue Feb 08, 2011 4:54 pm

Seems to me you may have an issue with the definition/grain of your fact.

The measure ("Activation") appears to measure the activation event, not the Active state. In order to get the desired resuls, you need to count the policies by policy type, month etc rather than sum up the activation measure.

clear as mud?

LAndrews

Posts : 132
Join date : 2010-05-13
Location : British Columbia, Canada

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  individ on Tue Feb 08, 2011 5:56 pm

Hi LAndrews, yes, you are right. Basically your suggestion is to keep not event(activation), but actually the final state(active), and count how many active policies we have. This particular case might be resolved like you suggested, there are other cases which cannot be resolved like this. For example, let’s analyze money instead of count:
FactPolicy
Id DimPolicyId FactDate Debit Credit Balance
1000 100 jan 2010 100 60 40
1001 101 feb 2010 80 0 120

Here we have th e1-st record linked to the ‘Simple policy’ and the 2-nd to the ‘Advanced policy’. Now if you ask the cube what was the Debit for ‘Advanced policy’ for the period jan-feb, you will have the answer 80. Which is wrong from the business point of view, and the right answer is 180. So I still do not have understanding how to achieve this 180 result

individ

Posts : 8
Join date : 2011-02-08

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  gvarga on Tue Feb 08, 2011 6:33 pm

In that case you have to query what was the Debit for BusinessPolicyId = 1 for the period jan-feb and you will get the answer 180.


gvarga

Posts : 43
Join date : 2010-12-15

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  individ on Tue Feb 08, 2011 6:43 pm

Hi gvarga, correct, if I will query it for BusinessPolicyId I will get the right answer, but I need to get right answer with grouping by PolicyType, and I actually can easily implement this logic in stored procedure in simple report, but get the right answer from Analysis Service is not so simple (.So actually I am looking for the possibility to calculate dimensions, depending on the query period, take the last one in this period and link all facts to this one.

individ

Posts : 8
Join date : 2011-02-08

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  LAndrews on Tue Feb 08, 2011 6:57 pm


I think you may be overthinking this.

When you build the cube, the logic for the member label needs to be such that the current label is associated with all facts having BusinessPolicyID=1. That will result in the correct label "Advanced Policy".

Jan 2010 |Advanced Policy|100
Feb 2010 |Advanced Policy|80




LAndrews

Posts : 132
Join date : 2010-05-13
Location : British Columbia, Canada

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  individ on Tue Feb 08, 2011 7:28 pm

maybe you are right, it drives me crazy…

yes, the current label is ‘Advanced policy’, but the term ‘current’ depends on your query. If you query jan only, the current is ‘Simple policy’ and for jan it should be(it is easy, cause this what I have in fact table) :
jan 2010 | Simple Policy | 100

If I query jan-feb, the current is ‘Advanced policy’ as you specified

What if I have another record in fact table, (like March 2010 | Super policy | 20) , And if I query jan-march, the current is ‘Super policy’:
Jan 2010 |Super Policy|100
Feb 2010 |Super Policy|80
March 2010| Super policy | 20

The idea is the current is changing depending on your MDX query, and I do not know how to implement it in SASS (I can easily implement it in just a usual report in stored procedure)

individ

Posts : 8
Join date : 2011-02-08

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  Jeff Smith on Wed Feb 09, 2011 1:14 pm

This is an issue of wanting to have the ability to see the world as it was and to restate the world as it is today.

There are 2 options, one is a database design solution, the other is a SQL solution.

The database design solution is to put a "current" dimension key on each row. Add a column called "Current_Dimension_Key". Load the table like you would normally due. If you have 3 instances of a certain value. For the Current_Dimension_Key, assign all 3 instances the surrogate key from most current instance. Then do a self join of the dimension table where Surrogate Key = Current_Dimension_Key.

The SQL solution is to do a self join on the dimension table based on the business key (without the begin and end dates), and to select the most current record.

The database option makes for easier querying and has to be done only once.

This type of stuff happens all the time with aggregates. Take addresses, for example. Some times I want the address of a customer at the time of the transaction, and sometimes I want the current address of the customer based on aggregated data.

Jeff Smith

Posts : 471
Join date : 2009-02-03

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  individ on Wed Feb 09, 2011 1:58 pm

Hi Jeff, all you are saying make sense, and the idea with the current is good, but it does not work for us ( . The complication is that in your example the current is fixed, and is actually the last version of dimension. In my case the current is not fixed, it depends on the time frame of the particular MDX query. I mean, if I query the cube for jan - the current is ‘Simple Policy’; if I query the cube jan-feb - the current is ‘Advanced Policy’; if I query the cube jan-march, the current is ‘Super Policy’.
Actually we are just trying to decide if we going to use SSAS or not and we have already done a lots work in this direction, and this problem just killing the whole project... This logic is actually simple to implement in a usual report, writing a usual stored procedure, but looks like it is impossible implement in SSAS (

individ

Posts : 8
Join date : 2011-02-08

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  hang on Thu Feb 10, 2011 8:35 am

Don't give up! I know it should work if you understand Jeff's approach. Let me see if I can shed some light on this.

Let's say you create view for your fact, and in the view you derive a column called CurrentPolicyKey using two joins with your dimension, one on PolicyKey (SK), another one (self join) on PolicyID (NK) with CurrentStatus='Y'. So the CurrentPolicyKey stays the same value for all the PolicyKey's that share the same PolicyID (achieved by self join). Now the CurrentPolicyKey acts almost like a NK when you aggregate measures, but will point to the current version of the dimension when you need to show latest dimension attributes as column headers.

In SSAS use view as your fact, so the CurrentPolicyKey is achieved on the fly and always point to the dimension version as of today for the column or row headers, even if you try to see the aggregated facts in the past.

If you want show the past dimension attributes, you just need to aggregate on the PolicyID (NK) and the column or row headers should be reflected based on the dimension attributes at the specified point of time.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  individ on Thu Feb 10, 2011 2:43 pm

that is very interesting and something new – the key words here are ‘current’ and ‘aggregate on the PolicyId’. I am not completely understand ‘aggregate on PolicyId’ – how can I do this without showing them up? (cause I do not really want to see them, I need to see aggregation by ‘PolicyType’)

individ

Posts : 8
Join date : 2011-02-08

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  Sideout72 on Thu Feb 10, 2011 4:22 pm

I definitely see Jeff and hang's points. However, there is a disconnect between individ's question and your responses.

Let's take this case where the PolicyType in DimPolicy continues to change back and forth over time for a particular policy. So in January it was "Simple Policy", Febuary it was "Advanced Policy", March it was "Complex Policy", and April it was "Simple Policy". There were transactions against that policy in each of the months:

1/1/2010 | $100
2/1/2010 | $25
3/1/2010 | $50
4/1/2010 | -$75

Therefore, if you were to look at this policy "as of" the end of January, it would be known as a Simple Policy with a total amount of $100.
If you look at this policy "as of" the end of February, it would be known as an Advanced Policy with a total amount of $125.
If you look at this policy "as of" the end of March, it would be known as an Complex Policy with a total amount of $175.
If you look at this policy "as of" the end of April, it would be known as an Simple Policy with a total amount of $100.

Note that we don't care that when the first transaction was created it was during the time it was called a "Simple Policy", because in February the user didn't care - by then it was "Advanced Policy", and the total amount for the advanced policy was $125...

If today is May 1st 2010, there are four entries in the DimPolicy table for the slowly changing dimension, each with a change of the PolicyType, and only the last one (in April marked "Simple Policy") was denoted with CurrentStatus = 'Y'.

The question I have is how you would model this, as the CurrentStatus of the policy changed over time - the correct value of CurrentStatus is really dependent upon the date range you are querying. Almost like a dynamic attribute.

individ - did I come anywhere close to restating your question?



Sideout72

Posts : 4
Join date : 2011-02-09

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  hang on Thu Feb 10, 2011 6:59 pm

individ wrote:I am not completely understand ‘aggregate on PolicyId’ – how can I do this without showing them up?
Aggregates are calculated by the cube along the hierarchies stored in dimensions. The PolicyKey or CurrentPolicyKey is the only entry point for the hierarchy, eg. PolicyType-PolicyID. You never need to store in the fact an attribute at higher level of the hierarchy for the purpose of cube aggregation. And you should not aggregate on any surrogate key in any SCD2 dimensions, as they would render multiple counts or aggregations on a single dimension item.

In SSAS, the process is like this as follows:

1. Configure the necessary hierarchies, eg. PolicyType – PolicyID (or you could concatenate ID and Name)
2. Configure the dimension usage
3. Process the cube
4. Explore the cube
5. Drag and drop the measures to the center of the cube
6. Drag and drop hierarchies to row or column area
7. Drill down you data along the hierarchies.

individ wrote:the correct value of CurrentStatus is really dependent upon the date range you are querying. Almost like a dynamic attribute
It is a dynamic attribute in nature, and that’s why you build a view for the fact table as CurrentPolicyKey for each fact row could change every day to an extreme.

As for what version of dimension attributes should be used for aggregates on period involving multiple dimension changes, it’s up to business. Typically you would choose the latest version ‘as of’ reporting date, or you may nominate the latest version up to the end of aggregation period to represent the whole period. A little extra work for the latter, as you may need to derive a PeriodPolicyKey for the predefined periods.


Last edited by hang on Thu Feb 10, 2011 9:38 pm; edited 1 time in total

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  individ on Thu Feb 10, 2011 8:28 pm

Hi Sideout72 – you described it very well ), thanks! And looks like hang has a point. My brain cannot fully absorb it at once ( - I need to play with it. Will report about outcome!

individ

Posts : 8
Join date : 2011-02-08

View user profile

Back to top Go down

LastNonEmpty?

Post  Mike Honey on Tue Mar 01, 2011 8:58 pm

Hi individ,
You might be able to save yourself a lot of code (maybe this is too late) by changing the Measure AggregationFunction property to: LastNonEmpty. This aggregates normally over most dimensions but takes the value from last date in scope. This requires SQL Enterprise Edition.

This is v simple to implement and performs well. Just beware of the gotcha: the scope is limited by your query so e.g. if you filter to Feb and Policy X's last record was in Jan, it will return null. There are MDX solutions that avoid this issue (and also the Edition requirement) but they can be a lot slower.

Good luck!
Mike
avatar
Mike Honey

Posts : 185
Join date : 2010-08-04
Location : Melbourne, Australia

View user profile http://www.mangasolutions.com

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  individ on Wed Mar 02, 2011 1:28 pm

Hi Mike,
No, it is never too late ) . For now I come up with another solution which is not so good - takes a lots of disk space, lots of processing time and has restrictions, but it works for us. I implemented this one because we have deadlines (and they are already in the past ( ). But I am still looking for good solution and going to reimplement it when I have time.

If you are interested, my solution is to keep the snapshots of the fact table as for end of every month. The snapshot for 1 month is a state of entire fact table(as for the end of this month) from beginning of time till that month(including it). In my mdx query I choose only 1 month, to have the data from beginning of the time till that month. I know, it is a huge table, we can produce the reports only from beginning of time till some month(not for any period of time). We are lucky that we do not have a big database. I posted all data including that snapshot tables and it takes 2G – not so bad. I have those snapshots for every month, only for the last half a year, and only Decembers for every year before(so the finance can do the reports from beginning of time till any past year).
I realize that it is not a good solution and I am looking for a good one. So far I have 3 ideas to check:
1. hang's idea of ‘current’ and ‘aggregate on the PolicyId’
2. thanks Mike - your idea of ‘LastNonEmpty’ property(I do not have Enterprise Edition thought)
3. there might be solution which has something to do with many-to-many relationship
I will check them when I will have time. Thanks guys!

individ

Posts : 8
Join date : 2011-02-08

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  deebeegee on Fri Mar 04, 2011 6:43 am

we have devised a method to cope with a similar demand in the fact tables. ie to be able to see the data every which way.
depending on the amount of SCDs you have it may not create too much data. basically for any transactional fact we add three items of metadata
current_flag (CF)
original_flag (OF)
system_date (SD)

facts are loaded with CF=1, OF=1, SD=GETDATE()

Tran_key OF CF SD Dim_key Value
1 1 1 01/01/10 1 100
2 1 1 05/01/10 1 120

when a SCD2 occurs all previous facts which relate to that dimension get ledgered off and a new record entered with the new dimension key. the old records are updated to CF=0 and the new records are loaded with OF=0

Tran_key OF CF SD Dim_key Value
1 1 0 01/01/10 1 100
2 1 0 05/01/10 1 120
3 0 0 10/01/10 1 -100
4 0 0 10/01/10 1 -120
5 0 1 10/01/10 2 100
6 0 1 10/01/10 2 120

subsequent facts would be loaded as normal

Tran_key OF CF SD Dim_key Value
1 1 0 01/01/10 1 100
2 1 0 05/01/10 1 120
3 0 0 10/01/10 1 -100
4 0 0 10/01/10 1 -120
5 0 1 10/01/10 2 100
6 0 1 10/01/10 2 120
7 1 1 12/01/10 2 50

this now allows someone to see the data in 3 ways (at least)
traditional as it occurred view by filtering on OF=1
current view by filtering on CF=1
as at view by filtering SD<= as at date

works for us

deebeegee

Posts : 1
Join date : 2011-03-04

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  Ashish Mishra on Fri Mar 04, 2011 9:41 am

Hi,

creating a monthly snapshotfact on the policy might help as in that case whatever is status of policy on that period will get associated to the measures

Ashish Mishra

Posts : 11
Join date : 2011-02-22

View user profile

Back to top Go down

Re: MDX(or something else) for getting history based on last dimension characteristic for SCD.

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum