Counting coverages over time
4 posters
Page 1 of 1
Counting coverages over time
Hi All,
I work for a non-profit mental health org. Many of our clients get benefits from the county. These benefits last a year and then a new one is requested. They have different categories of benefit (2X1, 3A1, 3B1).
I need to count the number of active benefits broken out over time. Counting events that happen on a single date is very straight forward but counting things that have a start and an end date are proving to be very difficult in an OLAP cube in Analysis Services.
What I am looking for is something like this:
Sample benefits:
So the 2X1 would be counted 09Q1 through 10Q1 and the 3A1 would be counted for 08Q3 through 09Q3. Ideally when rolled up to the year level the 2X1 from my example would count in 2009 and 2010 because it was active in both years.
Another example of how we want to use this is: we get a report from the county that compares active benefits from year to year for the same quarter. Management wants me to duplicate this “report card” in our own systems so we can compare our numbers with the county's.
I have a factless degenerate fact table (FactBenefits) that I can link date (DimDate) on benefit start and/or end but that won’t let me count the number of active benefits for Q1 2009 only the number of benefits that start in Q1 2009.
FactBenefits
BenefitKey Int
StartDateKey Int
EndDateKey Int
*edit* ClientKey Int
StartDate Datetime
EndDate Datetime
Type Char(3)
Exists Bit
Is there some way to join the date dimension on my date range and roll up my counts into their various time buckets?
Thanks,
Michael Takos
Sound Mental Health
I work for a non-profit mental health org. Many of our clients get benefits from the county. These benefits last a year and then a new one is requested. They have different categories of benefit (2X1, 3A1, 3B1).
I need to count the number of active benefits broken out over time. Counting events that happen on a single date is very straight forward but counting things that have a start and an end date are proving to be very difficult in an OLAP cube in Analysis Services.
What I am looking for is something like this:
2009 | |||
Q1 | Q2 | Q3 | |
2X1 active benefits | 130 | 125 | 160 |
3A1 active benefits | 46 | 55 | 55 |
3B1 active benefits | 23 | 23 | 23 |
Sample benefits:
Benefit_ID | Client_ID | Type | Start_Date | End_Date |
1234 | 3432 | 2X1 | 02/23/2009 | 02/22/2010 |
1235 | 2328 | 3A1 | 07/13/2008 | 07/12/2009 |
So the 2X1 would be counted 09Q1 through 10Q1 and the 3A1 would be counted for 08Q3 through 09Q3. Ideally when rolled up to the year level the 2X1 from my example would count in 2009 and 2010 because it was active in both years.
Another example of how we want to use this is: we get a report from the county that compares active benefits from year to year for the same quarter. Management wants me to duplicate this “report card” in our own systems so we can compare our numbers with the county's.
2007 Q3 | 2008 Q3 | 2009 Q3 | |
Adult active benefits | 150 | 168 | 195 |
Child active benefits | 22 | 38 | 43 |
I have a factless degenerate fact table (FactBenefits) that I can link date (DimDate) on benefit start and/or end but that won’t let me count the number of active benefits for Q1 2009 only the number of benefits that start in Q1 2009.
FactBenefits
BenefitKey Int
StartDateKey Int
EndDateKey Int
*edit* ClientKey Int
StartDate Datetime
EndDate Datetime
Type Char(3)
Exists Bit
Is there some way to join the date dimension on my date range and roll up my counts into their various time buckets?
Thanks,
Michael Takos
Sound Mental Health
Last edited by Gehaus on Wed May 05, 2010 5:45 pm; edited 1 time in total (Reason for editing : added missing ClientKey to FactBenefits)
Gehaus- Posts : 4
Join date : 2010-05-05
Location : Seattle
Re: Counting coverages over time
What are the business rules for counting these events? If there is a time span associated with it, is it supposed to be counted in multiple periods or not? It's not clear what it is you are trying to accomplish.
Re: Counting coverages over time
Actually, I think these questions are answered in the post.
In other words, the example row with a 2X1 Type would be counted in every quarter in which it was active, from 2009 Q1 through 2010 Q1. The timespan is defined by the Start_Date and End_Date from the benefits table.
Michael, are you missing a Client_Key in the FactBenefits fact table at the end of your post?
I'm not sure how to do this in Analysis Services off hand, but I'm sure there's a way.
--Warren
So the 2X1 would be counted 09Q1 through 10Q1 and the 3A1 would be counted for 08Q3 through 09Q3. Ideally when rolled up to the year level the 2X1 from my example would count in 2009 and 2010 because it was active in both years.
In other words, the example row with a 2X1 Type would be counted in every quarter in which it was active, from 2009 Q1 through 2010 Q1. The timespan is defined by the Start_Date and End_Date from the benefits table.
Michael, are you missing a Client_Key in the FactBenefits fact table at the end of your post?
I'm not sure how to do this in Analysis Services off hand, but I'm sure there's a way.
--Warren
warrent- Posts : 41
Join date : 2008-08-18
Re: Counting coverages over time
You are right Warren, The FactBenefit table should have a ClientKey. I made the FactBenefit table example when I made the Benefit type(2X1/3B1/3A1) report and later made the Adult / Child example report and neglected to add ClientKey.
FactBenefits
BenefitKey Int
StartDateKey Int
EndDateKey Int
ClientKey Int
StartDate Datetime
EndDate Datetime
Type Char(3)
Exists Bit
ngalemmo:
The benefit should be counted in all time periods in which it was active. So for example Benefit_ID: 1234 02/23/2009 - 02/22/2010, it would count in 2009Q1, 2009Q2, 2009Q3, 2009Q4, and 2010Q1.
FactBenefits
BenefitKey Int
StartDateKey Int
EndDateKey Int
ClientKey Int
StartDate Datetime
EndDate Datetime
Type Char(3)
Exists Bit
ngalemmo:
The benefit should be counted in all time periods in which it was active. So for example Benefit_ID: 1234 02/23/2009 - 02/22/2010, it would count in 2009Q1, 2009Q2, 2009Q3, 2009Q4, and 2010Q1.
Last edited by Gehaus on Wed May 05, 2010 5:41 pm; edited 1 time in total (Reason for editing : Minor edit to phrasing)
Gehaus- Posts : 4
Join date : 2010-05-05
Location : Seattle
Re: Counting coverages over time
Getting counts by period using a date range against a date dimension is doable in SQL, but I am not sure about it in a cube using MDX.
What you basically have are multiple snapshots at different time grains (month, quarter, year, etc...). If the cube has client as a dimension, then it may be possible by counting unique client/benefit combinations by time period. If the cube does not contain client, there may be a way to have separate aggregations (i.e. unique counts) by time hierarchy (month, quarter, year) in the same cube. My analysis services knowledge is rusty, so I can't be more specific.
What you basically have are multiple snapshots at different time grains (month, quarter, year, etc...). If the cube has client as a dimension, then it may be possible by counting unique client/benefit combinations by time period. If the cube does not contain client, there may be a way to have separate aggregations (i.e. unique counts) by time hierarchy (month, quarter, year) in the same cube. My analysis services knowledge is rusty, so I can't be more specific.
Re: Counting coverages over time
I wonder if this might be handled using the many-to-many dimension support in SSAS 2005 and newer?
The typical example is bank accounts: bank acounts can have multiple owners, and owners can have multiple accounts. With the proper configuration, SSAS prevents an amount (in this case an account balance) from being counted twice. For example, if John and Mary are joint holders of an account with a balance of $100, SSAS would report the following:
Here is Microsoft's description of the feature:
http://msdn.microsoft.com/en-us/library/ms345139(SQL.90).aspx
It would take a little more thought to determine if this is a solution for the overlapping benefits.
SSAS can also be configured to not sum over time for periodic snapshot amounts (you change the aggregation over time to average, for example.)
The typical example is bank accounts: bank acounts can have multiple owners, and owners can have multiple accounts. With the proper configuration, SSAS prevents an amount (in this case an account balance) from being counted twice. For example, if John and Mary are joint holders of an account with a balance of $100, SSAS would report the following:
Account Holder | Balance |
John | $100 |
Mary | $100 |
Total | $100 |
Here is Microsoft's description of the feature:
http://msdn.microsoft.com/en-us/library/ms345139(SQL.90).aspx
It would take a little more thought to determine if this is a solution for the overlapping benefits.
SSAS can also be configured to not sum over time for periodic snapshot amounts (you change the aggregation over time to average, for example.)
Last edited by VHF on Wed May 05, 2010 7:02 pm; edited 4 times in total (Reason for editing : fix URL; added last sentence)
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Counting coverages over time
VHF: Thanks, that is an interesting idea.
So I would need to define a date dimension at the month or quarter granularity so that I don't have 365 rows per benefit but that is do-able.
Something like this?
DimBenefit
FactActiveBenefit
So that would result in:
But if I add the date dimension:
Is there anything flawed in my assumptions here? I have not worked with many to many dimensions at all.
~Mike
So I would need to define a date dimension at the month or quarter granularity so that I don't have 365 rows per benefit but that is do-able.
Something like this?
DimBenefit
BenefitKey | StartDate | EndDate | Type |
1234 | 02/23/2009 | 02/22/2010 | 2X1 |
FactActiveBenefit
FactKey | ClientKey | BenefitKey | DateKey | Exists |
4567 | 3432 | 1234 | 2009Q1 | 1 |
4568 | 3432 | 1234 | 2009Q2 | 1 |
4569 | 3432 | 1234 | 2009Q3 | 1 |
4570 | 3432 | 1234 | 2009Q4 | 1 |
4571 | 3432 | 1234 | 2010Q1 | 1 |
So that would result in:
Benefit | Count |
1234 | 1 |
But if I add the date dimension:
Benefit | 2009Q1 | 2009Q2 | 2009Q3 | 2009Q4 | 2010Q1 |
1234 | 1 | 1 | 1 | 1 | 1 |
Is there anything flawed in my assumptions here? I have not worked with many to many dimensions at all.
~Mike
Last edited by Gehaus on Wed May 05, 2010 7:47 pm; edited 1 time in total (Reason for editing : changed Benefit_ID to BenefitKey)
Gehaus- Posts : 4
Join date : 2010-05-05
Location : Seattle
Re: Counting coverages over time
Let's forget about many-to-many dimensions for now--I don't think it is the the solution to this problem.
The trick is going to be getting the correct aggregation for the count.
Your FactActiveBenefit table at the quarter grain looks good (and the design issues are all the same at the daily grain, you would just be creating 356 or 366 fact records per benefit per client instead of 4 or 5). The detail reporting (at the quarter level) is correct out of the box as in your example:
However, looking at it by year the default count or sum aggregation would give:
Changing the AggregateFunction property of the measure to AverageOfChildren might work OK in simple cases, but won't roll up correctly when a client has multiple benefits over time. Consider this example where a client ended one benefit in 2010Q1 and started a new benefit in 2010Q4:
You would get this when rolling up by year for the client (because SSAS is averaging the rows that exist in each year):
When what you really should see is this (becuase there were two different benefits in 2010):
I wonder if the DistinctCount aggregration in SSAS would do what you want? What you really want is the distinct count of benefits for a client, no matter what time span you look at, correct?
Create a new measure (on the Cube Structure tab in BIDS/Visual Studio, right-click your measure group and pick New Measure...), set the Usage to Distinct Count, click the option to Show all columns, and pick BenefitKey as the source column. That way SSAS should report the count of distinct benefits, no matter what the timespan. Try it out and let us know what you find!
The trick is going to be getting the correct aggregation for the count.
Your FactActiveBenefit table at the quarter grain looks good (and the design issues are all the same at the daily grain, you would just be creating 356 or 366 fact records per benefit per client instead of 4 or 5). The detail reporting (at the quarter level) is correct out of the box as in your example:
Benefit | 2009Q1 | 2009Q2 | 2009Q3 | 2009Q4 | 2010Q1 |
1234 | 1 | 1 | 1 | 1 | 1 |
However, looking at it by year the default count or sum aggregation would give:
Benefit | 2009 | 2010 |
1234 | 4 | 1 |
Changing the AggregateFunction property of the measure to AverageOfChildren might work OK in simple cases, but won't roll up correctly when a client has multiple benefits over time. Consider this example where a client ended one benefit in 2010Q1 and started a new benefit in 2010Q4:
Client | Benefit | 2009Q1 | 2009Q2 | 2009Q3 | 2009Q4 | 2010Q1 | 2010Q2 | 2010Q3 | 2010Q4 |
5555 | 1234 | 1 | 1 | 1 | 1 | 1 | |||
5555 | 2789 | 1 |
You would get this when rolling up by year for the client (because SSAS is averaging the rows that exist in each year):
Client | 2009 | 2010 |
5555 | 1 | 1 |
When what you really should see is this (becuase there were two different benefits in 2010):
Client | 2009 | 2010 |
5555 | 1 | 2 |
I wonder if the DistinctCount aggregration in SSAS would do what you want? What you really want is the distinct count of benefits for a client, no matter what time span you look at, correct?
Create a new measure (on the Cube Structure tab in BIDS/Visual Studio, right-click your measure group and pick New Measure...), set the Usage to Distinct Count, click the option to Show all columns, and pick BenefitKey as the source column. That way SSAS should report the count of distinct benefits, no matter what the timespan. Try it out and let us know what you find!
Last edited by VHF on Thu May 06, 2010 12:07 pm; edited 3 times in total (Reason for editing : modified examples)
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Counting coverages over time
Preliminary tests are looking good!
I made a new dimension: DimDateMonth
DimDateMonth
DateMonthKey int IDENTITY(1,1),
Year char(4),
Quarter char(1),
QuarterName char(6),
Month char(2) NULL,
MonthName varchar(10),
FullDate datetime,
FullDateName varchar(18)
Then I take my Benefit Dimension and use SQL to multiply it over the 12 months it is active:
Full date is not optimal but it was quick and easy. I will probably have to change it to composite keys YYYYQMM between YYYYQMM and YYYYQMM.
I loaded it up into my cube and Count Distinct seems to be working properly! I will mess with it more and see what I can get.
Thanks everyone for the help.
I made a new dimension: DimDateMonth
DimDateMonth
DateMonthKey int IDENTITY(1,1),
Year char(4),
Quarter char(1),
QuarterName char(6),
Month char(2) NULL,
MonthName varchar(10),
FullDate datetime,
FullDateName varchar(18)
DateMonthKey | Year | Quarter | QuarterName | Month | MonthName | FullDate | FullDateName |
469 | 2009 | 1 | 2009Q1 | 1 | January | 2009-01-01 00:00:00.000 | 2009 Q1 January |
470 | 2009 | 1 | 2009Q1 | 2 | February | 2009-02-01 00:00:00.000 | 2009 Q1 February |
471 | 2009 | 1 | 2009Q1 | 3 | March | 2009-03-01 00:00:00.000 | 2009 Q1 March |
472 | 2009 | 2 | 2009Q2 | 4 | April | 2009-04-01 00:00:00.000 | 2009 Q2 April |
Then I take my Benefit Dimension and use SQL to multiply it over the 12 months it is active:
- Code:
CREATE VIEW [dbo].[vFactBenefitCoverage]
AS
SELECT
br.KCRSNBenefitRequestKey as BenefitKey,
br.ClientKey,
dm.DateMonthKey,
1 as [Exists]
FROM dbo.FactKCRSNBenefitRequests br
INNER JOIN dbo.DimDateMonth dm
ON dm.FullDate between br.BenefitStartDate AND br.BenefitEndDate
Full date is not optimal but it was quick and easy. I will probably have to change it to composite keys YYYYQMM between YYYYQMM and YYYYQMM.
I loaded it up into my cube and Count Distinct seems to be working properly! I will mess with it more and see what I can get.
Thanks everyone for the help.
Gehaus- Posts : 4
Join date : 2010-05-05
Location : Seattle
Similar topics
» Model Policies and Coverages
» Bridge table and double counting issue
» Combined Date Time dimension vs. Separate Day and Time dimensions and timezones
» Many to many relationship question
» Time dimension design for cross time zone and custom calendar warehouse, 1min granularity
» Bridge table and double counting issue
» Combined Date Time dimension vs. Separate Day and Time dimensions and timezones
» Many to many relationship question
» Time dimension design for cross time zone and custom calendar warehouse, 1min granularity
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|