modelling questions & answers dimensions where answer can be multiple choice or freetext

View previous topic View next topic Go down

modelling questions & answers dimensions where answer can be multiple choice or freetext

Post  wilson_smyth on Wed Jun 03, 2015 7:31 am

Im building a star schema to allow reporting against usage of an application.
Application has a section where user answers questions.

A question can have the following types of answers:
- multiple choice (radio buttons), chose 1 out of 4.
- multiple choice (tick box), chose one or more of the following.
- free text (text box), what are your thoughts on...? or if "other" radio button is chosen.

I was considering just a questions dimension, and an answers dimension.
The problem with this is the free text will pollute the answers dimension with lots of answers that are specific to only one user on one question.
This model also means that there is a row in the fact table for each answer, which i dont think is ideal.

Note I will have to report on questions that have not been answered by users as well as what has been answered.

Id appreciate some suggestions on how best to approach this.

Thanks!

wilson_smyth

Posts : 1
Join date : 2015-06-03

View user profile

Back to top Go down

Re: modelling questions & answers dimensions where answer can be multiple choice or freetext

Post  ngalemmo on Wed Jun 03, 2015 6:33 pm

If you don't want answer to be the grain, then I assume you want question as the grain. In that case you have a bridge table that relates the instance of the question to multiple answers. There are a couple of ways you can build and maintain such a bridge.

The question about collecting the free-form text is a business one.  Do they want it, do the want to pay for it.  There are a number of ways this data is collected and consumed by analytics.  If they want it and don't know what to do with it, store it in a separate structure from the primary data so it doesn't slow most queries. It would be a dimension of the instance of the fact .  

Note that some databases do this internally.  My understanding is that SQL server stores large objects (text, blobs) in a separate storage structure that is referenced from the main columnar table.  Having such columns in the table does not hinder the performance of queries that do not use the column.  Whereas if the data is stored in the same physical row (as some databases do), queries that do not use the column must still deal with a very wide row, requiring significantly more I/O to get through the table.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum