Did anybody read this article ???

View previous topic View next topic Go down

Did anybody read this article ???

Post  Vishy on Fri Mar 09, 2012 4:43 am


http://sqlblog.com/blogs/jamie_thomson/archive/2009/11/28/debunking-kimball-effective-dates.aspx

Vishy

Posts : 75
Join date : 2012-02-28
Location : India

View user profile

Back to top Go down

Re: Did anybody read this article ???

Post  John Simon on Fri Mar 09, 2012 5:10 am

I like Jamie's posts, but this is much ado about nothing.
He doesn't really present a case as to why removing End Dates is beneficial - other than to fix poor ETL. He could simply run an update statement each night to fix the end dates if required.

I don't like having an End Date be the same as the succeeding Start Date - I think it's easier to "end" the End Date one time period before the succeeding Start Date - generally that would be a day for a warehouse loaded on a daily basis.

John Simon

Posts : 186
Join date : 2009-05-29
Location : Melbourne, Australia

View user profile http://jsimonbi.wordpress.com

Back to top Go down

Can we logically delete dimension records by setting SCD dates

Post  hang on Fri Mar 09, 2012 8:13 am

I agree with John about focusing on a robust ETL process instead of trying to minimise the data redundancy at price of less clarity on physical model, although I found the debates around the topic are quite interesting.

However some of the posts did remind me of a relevant issue with using SCD dates to cater for deletes. On the surface, I thought expiring (end date) a dimension record can achieve the effect of soft deletion, but reality is never that simple. How can we effectively achieve the logical (soft) deletes in SCD dimensions so that we can report (query) on historical data without influence of the deleted dimension records.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Did anybody read this article ???

Post  ngalemmo on Fri Mar 09, 2012 1:49 pm

Not to keen on the idea, and you wind up with much more complex SQL. It's unnecessary. Besides, since when was data redundancy a problem in a dimensional model?

ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Did anybody read this article ???

Post  LAndrews on Fri Mar 09, 2012 5:47 pm


I've seen this article a few times - always gives me a chuckle.

The problem he appears to want to address is inaccurate data in the effective-end-date column. He even labels it a "data integrity issue".
I always ask myself, since the column is generated by the ETL process - if the data is inaccurate, why not fix the ETL code? That would also result in the desired data integrity, without the overhead/complication of creating views to calculate effective-end-date and latest-flags on the fly.


LAndrews

Posts : 132
Join date : 2010-05-13
Location : British Columbia, Canada

View user profile

Back to top Go down

Re: Did anybody read this article ???

Post  Vishy on Sat Mar 10, 2012 3:22 am

I agree with all you guys comments, when I was going through this article and comments posted there, nothing was matching with what I was thinking so I thought "am I missing something here ".

First thing that put me off was talking about "redundancy !!!" , how can you talk about redundancy in DWH and talk about writing complex queries and believe me I even tried the sql posted there but that updated all the previous rows enddate and not only the immediate previous row.

We have to always decide whether our decision is going to improve query performance or ETL performance. I always feel if any of our decisoin is going improve ETL but badly affect query performance then that decision is against the DWH spirit.

customer should always be put first as his experience with our solutoin is going to decide whether we have a delivered a good solution or bad solution.

Vishy

Posts : 75
Join date : 2012-02-28
Location : India

View user profile

Back to top Go down

Re: Did anybody read this article ???

Post  jangorecki on Thu Jan 14, 2016 12:14 pm

Hello,
Does this design even matter for performance if you are using it for your Historical schema? Data Mart schema being produced by ETL workflow.
Your queries are hitting Data Mart which can already have EndDate produces by ETL if needed.
Historical layer keeps data redundancy - useful when doing a lot in-memory analytics.
The approach taken by Jamie seems reasonable to me if used in historical layer.
Looking forward for your comments on that matter.

jangorecki

Posts : 2
Join date : 2016-01-14

View user profile

Back to top Go down

Re: Did anybody read this article ???

Post  Sponsored content Today at 11:49 pm


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum