Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Modeling help on dimension with many boolean value

5 posters

Go down

Modeling help on dimension with many boolean value Empty Modeling help on dimension with many boolean value

Post  marric01 Thu Aug 26, 2010 10:32 am

Hi

I'm really stuck on how to model a user need/business need

Basically, we are in the healthcare and we have a software that manage the work
of the paramedics in ambulances. There is a table that store all standard informations about the work
of the paramedic. When the paramedics are on the site of an event, they apply some "procedure"
(Major trauma, Deny of transportation, Seizure, Stroke, Respiratory difficulties, Heart attack, .... )

In one event for 1 patient, paramedic can apply one or many procedure.

Now, in the software (OLTP), they have created one table for each procedure. and inside each
procedure table, they have a bunch of boolean values. All those boolean (alot of them) are needed
by the user to basically Filter data when browsing in the OLAP cube. Some procedure have around 20 boolean
and some have only 5.

This is some example of what the user want to know (I have only used 1 boolean for each "need" but they could include more than 1 boolean to filter) :

I want to know the average time on site (Fact measure) for event with procedure "major trauma (procedure dimension)" and where paramedic didnt used "Stabilization mattress (boolean for procedure major trauma)" detailed by paramedic company (company dimension).

I want to know the "number of event (Fact measure)" with procedure "Deny of services (procedure dimension)" and where paramedic didnt "Informed patient about danger (boolean for procedure deny of service)" detailed by employee (paramedic dimension).

I really dont have a clue on how to model that in the DW.

All suggestion are welcome

Thanks !!!!!!!!

Richard

marric01

Posts : 23
Join date : 2010-08-18
Age : 48
Location : Montréal, Québec , Canada

Back to top Go down

Modeling help on dimension with many boolean value Empty Re: Modeling help on dimension with many boolean value

Post  marric01 Thu Aug 26, 2010 1:51 pm

BTW ... I dont need help on how to model everything ... its just how to handle the boolean values for each procedure.

I have already design the Fact table for the event (Wich the grain is the event itself) with all related dimensions

I need a Fact table now with the grain procedure. I'm not sure how to incoporate all the boolean values of the procedure (See first post) .... Junk Dimension ? Other way ???

Thanks

marric01

Posts : 23
Join date : 2010-08-18
Age : 48
Location : Montréal, Québec , Canada

Back to top Go down

Modeling help on dimension with many boolean value Empty Re: Modeling help on dimension with many boolean value

Post  ngalemmo Thu Aug 26, 2010 2:07 pm

Let me try to paraphrase (no pun intended) the issue:

There are set procedures that can be performed and each procedure has a checklist of actions or steps in doing the procedure. I assume the booleans indicate if that action was performed as part of the procedure.

The queries need to be able to look at these incidents and, for a given prodcedure, identify events where one or more actions were performed or not performed.

Correct?
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Modeling help on dimension with many boolean value Empty Re: Modeling help on dimension with many boolean value

Post  marric01 Thu Aug 26, 2010 2:15 pm

Hi

Your understanding is all correct !

Just to add more information, those checkbox are not sequencial steps in the procedure. And yes, the boolean indicate if an action was performed or not, a situation was confirmed or not confirmed (i.e : Patient dead at paramedic arrival on site).

Oh, and each procedure dont share the same boolean action/siutation. But some can be applied to all procedure (i.e. is Documentation completed).


marric01

Posts : 23
Join date : 2010-08-18
Age : 48
Location : Montréal, Québec , Canada

Back to top Go down

Modeling help on dimension with many boolean value Empty Re: Modeling help on dimension with many boolean value

Post  marric01 Fri Aug 27, 2010 1:18 pm

Anyone got any idea about that boolean thing ? After reading about Junk dimension, i'm not really sure its good for that scenario ? But maybe i'm wrong, i'm far from beeing an expert in DW modeling lol ;-)

marric01

Posts : 23
Join date : 2010-08-18
Age : 48
Location : Montréal, Québec , Canada

Back to top Go down

Modeling help on dimension with many boolean value Empty Re: Modeling help on dimension with many boolean value

Post  ngalemmo Fri Aug 27, 2010 1:34 pm

I'm at a loss...

I would like to say store the booleans as a bit map in a number, but most databases don't have functions that would allow you to do anything with them. To locate specific instances of doing something or not requires an XOR (exclusive or) comparison of the 'as is' and 'desired' map, which isn't commonly availible as a SQL function... ie: (as_is_map XOR desired_map) = desired_map.

Junk isn't good because each procedure has different actions.

The best I can come up with would be a fact table with incident, procedure, action dimensions (among others) and a 0/1 measure that indicates if the action was performed.

You may also consider a bridge which would be made up of unique combinations of proceudre and action states, but it would be logically equivalent to the fact table. It only make sense to do the extra work if you are dealing with very large data volumes (i.e. many 100,000's of incidents).

Querying would be a challenge, particularly if you are looking for incidents where action A, B & C, but not D or E were performed.

But, if your database support bit level logical functions, it would be fairly easy to do with a bit of coding...


ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Modeling help on dimension with many boolean value Empty Re: Modeling help on dimension with many boolean value

Post  marric01 Fri Aug 27, 2010 1:49 pm

Thanks for the answer ...

I'm using

Microsoft SQL 2005 for the database (Source and DW)
SSIS 2005 for ETL
SSAS 2005 for OLAP

And I'm using a web control to give access to the cube for the end users
They can also use (for advanced excel user) excel to browse in the olap cube and produce graphics

Oh and btw, we are dealing with maximum 10,000 incident per year ;-)

marric01

Posts : 23
Join date : 2010-08-18
Age : 48
Location : Montréal, Québec , Canada

Back to top Go down

Modeling help on dimension with many boolean value Empty Re: Modeling help on dimension with many boolean value

Post  marric01 Fri Aug 27, 2010 2:03 pm

I really like the option to create a fact table with ID_Event, ID_Procedure and ID_Actions (1/0)

(Just to make sure : all the possible actions are gone be added in a dimensionActions table)

The only thing is the work to be done in the ETL process to manage this. I'm trying to evaluate the impact and the how it can be done ;-)


marric01

Posts : 23
Join date : 2010-08-18
Age : 48
Location : Montréal, Québec , Canada

Back to top Go down

Modeling help on dimension with many boolean value Empty Re: Modeling help on dimension with many boolean value

Post  VHF Fri Aug 27, 2010 5:45 pm

In the Kimball Group "Dimensional Modeling in Depth" class there is an exercise involving a DW for cars that were sold with various combinations of options. The requirement was to be able to query how many cars were sold with any combination of options, i.e. to ask for cars that had options 1, 2, and 10 but not 3 or 17. In essence the presence or absence of each option was a Boolean value.

If there were only a few possible options (Booleans), one could create a dimension with a boolean/flag attribute for each option. Essentially a junk dimension. However, this approach becomes impractical as the number of possible options increases.

One design possibility discussed in the class was building a delimited string containing identifiers for all the options present on the car (i.e. all the ‘true’ Boolean values.) For example: ‘|1|2|4|10|11|25|127|’. This option list string could be stored simply as a DD in the fact table or as a dimension attribute (one row for each unique combination of options.)

The user can then do a SQL query along these lines:

SELECT COUNT(*)
FROM FactCarSales
WHERE OptionList LIKE ‘%|1|%’
AND OptionList LIKE ‘%|2|%’
AND OptionList LIKE ‘%|10|%’
AND OptionList NOT LIKE ‘%|3|%’
AND OptionLIst NOT LIKE ‘%|17|%’

Might not be the fastest query, but gets the job done and is a fairly straightforward way to express complex combinations of Boolean options, including some that are ‘false’ (i.e. not in the delimited string.)

Note that this approach might be used in conjuction with a bridge table (or modeling the individual options/procedures as individual facts) to better support reporting—after all, there are plenty of questions that can’t be easily answered by having identifiers buried in a delimited string!

Of course, you said SSAS OLAP, which opens a new can of worms. In fact, I did a brief test and wasn’t able to easily duplicate this logic in SSAS, but then again I’m not that strong in MDX. So this might not be the right approach at all for the SSAS world, but I thought I would toss it in the ring anyway!


Last edited by VHF on Fri Aug 27, 2010 5:50 pm; edited 2 times in total (Reason for editing : typo; clarity)

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

Back to top Go down

Modeling help on dimension with many boolean value Empty Re: Modeling help on dimension with many boolean value

Post  hang Fri Aug 27, 2010 8:03 pm

Immediately when I read the post, I thought of Junk dimension and then saw the point ngalemmo made. But after some further thoughts, I think you could still use Junk dimension, but with slight twist.

You don't junk procedures mixed with options altogether, but the combinations of options only. The junk dimension will have all the possible options across procedures as its attributes. Assuming each option has 3 values (yes, no, unknown/NA), 10 options would end up with 3 to the power of 10, 59049 possible combinations. If you have many more options, the total number of possible combinations will be out of control.

However you don't pre-build the junk dimension by cartesian join on all the options, you build the dimension with a sequence surrogate key from your fact incrementally by SELECT DISTINCT ... NOT EXISTS. In that way, the maximum size of the junk dimension can only be as big as your fact table, if not smaller. In reality, there would be a lot of common option combinations shared by many incidents. So the junk dimension should be much smaller, though very wide. If the size is still a concern, you can always split it up by some kind of categorisation. The point is we try to avoid having option dimension for every procedure, and will not end up with a Centipede fact table.

In your incident fact table you have two FKs, one for procedure and one for option junk dimension. The relationship between the procedures and actual options selected is established by fact, instead of being stored in a snowflaked dimension structure. I believe SSAS cube should happily treat the junk dimension just like a normal dimension so that you can use all the option attributes to slice, dice and filter the facts. You could also configure hierarchies based on attributes in both procedure and option dimension inside SSAS to open up many valuable analysis avenues.

Another afterthought, you may need to add a procedure FK and other grouping attributes to your option junk dimension to have predetermined hierarchies in a reasonably smaller dimension.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

Back to top Go down

Modeling help on dimension with many boolean value Empty Re: Modeling help on dimension with many boolean value

Post  BoxesAndLines Sun Aug 29, 2010 7:48 pm

I think you're getting bogged down in the fact that the information is stored in a boolean column. It would seem to me that you need to create a dimension with all of the procedures along with a bridge table to accomodate the many to many relationship. Each boolean column should be represented as a row in your dimension table.
BoxesAndLines
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

Back to top Go down

Modeling help on dimension with many boolean value Empty Re: Modeling help on dimension with many boolean value

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum