Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Null values in facts, yes or no?

5 posters

Go down

Null values in facts, yes or no? Empty Null values in facts, yes or no?

Post  jochem_van_grondelle Tue Sep 22, 2009 3:50 pm

Dear all,

I'm wondering if anyone could help me out with the following (simplified) case. This is for my Business Intelligence study on a Dutch university where we do a project to apply BI to the student information systems, especially their results and grades.

Case:
For a whole study, students have to follow courses (about 10 each year) for which they (mostly) get grades. In Holland, grades range from 1 to 10, where 10 is best and 1 is worse. You have to get a 5,5 to pass a course otherwise you have to retake the classes for that course. Up untill this point our DWH design seems kind of easy, some dimensions (course, student, time, [...]) and results as facts. Then we can easily get everything we need to know about these grades and which students passed all their courses.
The big difference here is that there are some courses for which no grades are assigned. You could just get PASSED or NOT PASSED. Now we're wondering how to process this change in our fact tables.

We are thinking of:

Code:
RESULT FACTS
---------------------------
TimeDimension_key (int)
StudentDimension_key (int)
courseDimension_key (int)
Grade (float)
IsPassed (bit)

But in this case, how should we handle these courses without grades? It would be the easiest to pass a NULL value in this case for the grade, but most BI professionals do not advise this, because of both performance problems and unexpected results.
If we would aggregrate grades, it would be averages. By example top students are all those with all courses passed, an average grade of 7,5 and no grade lower than 7. For this grade measurement courses without grades are not taken in the account for the average. We would never total data. For this reason, we are unable to pass a '0' value.

Another solution would be to add a small 'grade'-dimension, with only grade and ispassed. But then calculations would be made from the dimensions and that shouldn't be ok, I guess.

So, I am very interested if anyone would know of any appropiate solution for this case. I appreciate your help very much.

With regards,

Jochem van Grondelle
student in Amsterdam, Netherlands
jochem@vangrondelle.net

jochem_van_grondelle

Posts : 11
Join date : 2009-09-22

Back to top Go down

Null values in facts, yes or no? Empty Re: Null values in facts, yes or no?

Post  ngalemmo Tue Sep 22, 2009 4:17 pm

"but most BI professionals do not advise this, because of both performance problems and unexpected results"...

This is an issue with foreign keys, not attributes or values. The thing is, if there is no grade, there is no grade... simple as that. There isn't any good reason to invent one. Keeping it null when there is no grade has advantages. If you are calculating an average grade for a student, null values will be ignored in the calculation using standard SQL, giving correct results. (the average of 5, null, 6, 7 is 6 not 4.5)

What you may want to consider is adding a grading method attribute to the course information (or if is it sometimes an elective, that is the student can choose to get a pass/fail grade, have a separate dimension indicating grading method) so that users can easily segregate what should and should not have a grade.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Null values in facts, yes or no? Empty Re: Null values in facts, yes or no?

Post  ngalemmo Tue Sep 22, 2009 4:23 pm

One other thing, I assume you would always set the IsPassed value regardless of how the course was graded.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Null values in facts, yes or no? Empty Re: Null values in facts, yes or no?

Post  jochem_van_grondelle Tue Sep 22, 2009 4:37 pm

ngalemmo wrote:"...The thing is, if there is no grade, there is no grade... simple as that. There isn't any good reason to invent one. Keeping it null when there is no grade has advantages...."

Thanks for your quick answer. It is a very clear explanation. This is what I thought, although could you confirm that this does not result in a performance degradation?


Last edited by jochem_van_grondelle on Tue Sep 22, 2009 4:38 pm; edited 1 time in total

jochem_van_grondelle

Posts : 11
Join date : 2009-09-22

Back to top Go down

Null values in facts, yes or no? Empty Re: Null values in facts, yes or no?

Post  jochem_van_grondelle Tue Sep 22, 2009 4:37 pm

ngalemmo wrote:One other thing, I assume you would always set the IsPassed value regardless of how the course was graded.
Definitely!

jochem_van_grondelle

Posts : 11
Join date : 2009-09-22

Back to top Go down

Null values in facts, yes or no? Empty Re: Null values in facts, yes or no?

Post  ngalemmo Tue Sep 22, 2009 9:46 pm

jochem_van_grondelle wrote:Thanks for your quick answer. It is a very clear explanation. This is what I thought, although could you confirm that this does not result in a performance degradation?

I have no explicit data one way or the other, but I have never noticed any impact based on weither an attribute was null or not. A null foreign key however would certainly impact results (but not performance) as all such rows would not be selected if an inner join is performed using that key.

By the way, Oracle has a non-standard definition of null. Any varchar field containing an empty string is considered 'null' even if the value of that field was explicitly set to an empty string. Null (i.e. blank) character fields are very common in Oracle DBs and it doesn't seem to affect how it performs.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Null values in facts, yes or no? Empty Re: Null values in facts, yes or no?

Post  beyeguru Tue Sep 29, 2009 1:22 am

I would probably implement this solution in two fields and both can be on the fact table. I would have a Grade Point field and a Pass/No Pass field. The Grade Point field for a couse that does not offer grades can be set to 0. I understand your explanation of why you do not want to set it to zero but you can filter out these records when you are computing a students average Grade Point very easily; just do not include courses where GP = 0 (assuming that this is the only way a zero is being used). One advantage of haviing the two fields is that a user can easily find out the courses a student has not passed by querying the Pass/No Pass field.

beyeguru

Posts : 5
Join date : 2009-08-03

Back to top Go down

Null values in facts, yes or no? Empty Re: Null values in facts, yes or no?

Post  jochem_van_grondelle Tue Sep 29, 2009 2:57 am

After some research this week , it appears that NULL valued facts are not a problem (anymore) with the current generation of datawarehouse systems. If we would need a column without it, we could always create an extra view.

Could you please explain why you would not use NULL values?

jochem_van_grondelle

Posts : 11
Join date : 2009-09-22

Back to top Go down

Null values in facts, yes or no? Empty Re: Null values in facts, yes or no?

Post  BoxesAndLines Tue Sep 29, 2009 6:40 pm

You can't join on a null foreign key column. This can cause all sorts of user errors when running reports. My FK's are always NOT NULL. I do occasionally have other data on the fact table that is nullable.
BoxesAndLines
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

Back to top Go down

Null values in facts, yes or no? Empty Re: Null values in facts, yes or no?

Post  rajsundar Thu Oct 01, 2009 5:35 pm

Perhaps an attribute "IsGraded" (Yes/No) on the Course Dimension and a Null value for the Grade.

rajsundar

Posts : 4
Join date : 2009-08-06

Back to top Go down

Null values in facts, yes or no? Empty Re: Null values in facts, yes or no?

Post  jochem_van_grondelle Thu Oct 15, 2009 10:28 am

BoxesAndLines wrote:You can't join on a null foreign key column. This can cause all sorts of user errors when running reports. My FK's are always NOT NULL. I do occasionally have other data on the fact table that is nullable.
Thanks for your reply. Maybe you understand my question wrong, but I'm talking about nullable facts, not about nullable foreign keys.

jochem_van_grondelle

Posts : 11
Join date : 2009-09-22

Back to top Go down

Null values in facts, yes or no? Empty Re: Null values in facts, yes or no?

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum