50/50 attributes - dimension or fact?

View previous topic View next topic Go down

50/50 attributes - dimension or fact?

Post  Chumpski on Sat Mar 24, 2012 4:44 pm

Hi, I have scenario where I need to hold a few textual attributes against the lowest level of granularity in a dimension. These attributes will be simple “either/or” values such as “true/false” or “in/out”. The problem is there is a requirement to sum up the number of each type as we move up through the hierarchy in the dimension so we can work out the percentages etc.. To compound the issue these values change from time to time so there is a requirement to record the state of each attribute for a given month.

The way I thought of implementing this was to store the values as measures in a fact table and convert them to two columns one for true and one for false (for example) storing either 1 or 0. I would then control the insert in to fact table so that the record would get replaced if it changes within a period, but a new record would get created if it changed in a subsequent period (control be the link in the fact table to a time dimension).
Is this a good way to do this? Is there a better way?

Chumpski

Posts : 6
Join date : 2012-03-24
Location : Cardiff UK

View user profile

Back to top Go down

Re:50/50 attributes - dimension or fact?

Post  hkandpal on Sat Mar 24, 2012 8:28 pm

Hi,

it looks like you could make it in to a dimension of type 2 and when ever the value changes create a new value.

thanks

hkandpal

Posts : 113
Join date : 2010-08-16

View user profile

Back to top Go down

Re: 50/50 attributes - dimension or fact?

Post  hang on Sat Mar 24, 2012 9:35 pm

The short answer is to leverage mini dimension, so that you can remove the fast changing, low cardinality but important attributes from the main dimension while being able to connect your fact with much smaller mini dimension to cater for analytics at higher levels (aggregate).

The mini dimension is similar to the famous junk dimension in terms of the way it's constructed, but bears subtle difference in terms of its inherent ties and SCD impact on another large main dimension. Heres are some relevant posts that may put you on right track:

http://forum.kimballgroup.com/t1243-modeling-question-from-dw-amateur

http://forum.kimballgroup.com/t1321-how-do-i-model-this-and-create-ssas-cube-from-it?highlight=ssas

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

OK

Post  Chumpski on Sun Mar 25, 2012 2:58 pm

So the mini dimension will abstract the high cardinality columns in to a new 'mini dimension', but I am still finding it hard to grasp how I would aggregate up these up as they are still attributes rather than measures in the fact table. Lets say in my dimension we have 4 columns all of which can either be true or false. I can see how I could add 4 new dimensions for each of these columns and link these to the fact table, for example one of these tables would have two rows:

1 true
2 false

the key column would the make up part of the fact table primary composite key, and I would be able to use the record count to aggregate up the dimension hierarchy. If I combine all 4 of these columns into a minidimension so that only on key represents all 4 values for the dimension record I can't quite see how I would then aggregate up the dimension for the individual values.




Chumpski

Posts : 6
Join date : 2012-03-24
Location : Cardiff UK

View user profile

Back to top Go down

Re: 50/50 attributes - dimension or fact?

Post  ngalemmo on Sun Mar 25, 2012 5:40 pm

Why would 1 table or 4 make a difference? You are counting fact rows, not foreign keys.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: 50/50 attributes - dimension or fact?

Post  hang on Sun Mar 25, 2012 8:43 pm

I guess I know what you are trying to achieve. You want to count the fact based on attribute values. I think all you need to do is to covert the count into sum as follows:

Select dim1.NK, dim2.NK …
,sum( case when miniDim.Attribute1='True' then 1 else 0 end) as TrueCount
,sum( case when miniDim.Attribute1='False' then 1 else 0 end) as FalseCount
,sum( case when miniDim.Attribute2='In' then 1 else 0 end) as InCount
,sum( case when miniDim.Attribute2='Out' then 1 else 0 end) as OutCount

From fact
Join dim1 on fact.dim1SK=Dim1.SK
Join dim2 on fact.dim1SK=Dim2.SK

Join miniDim on fact.miniDimSK=miniDim.SK
Group by dim1.NK, dim2.NK …


Last edited by hang on Mon Mar 26, 2012 4:14 am; edited 1 time in total (Reason for editing : Case end)

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: 50/50 attributes - dimension or fact?

Post  Chumpski on Mon Mar 26, 2012 2:28 am

hi, yes you are correct that is what I want to do, I was just not thinking allong the right lines. cheers for your help!

Chumpski

Posts : 6
Join date : 2012-03-24
Location : Cardiff UK

View user profile

Back to top Go down

Re: 50/50 attributes - dimension or fact?

Post  Vishy on Mon Mar 26, 2012 2:34 am




does it work ??

Vishy

Posts : 75
Join date : 2012-02-28
Location : India

View user profile

Back to top Go down

Re: 50/50 attributes - dimension or fact?

Post  Chumpski on Mon Mar 26, 2012 3:43 am

yes I think it should, but I will have to try it out to confirm.

Chumpski

Posts : 6
Join date : 2012-03-24
Location : Cardiff UK

View user profile

Back to top Go down

Re: 50/50 attributes - dimension or fact?

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum