SCDs - Do they belong in the EDW? Or do they belong only on dimensions in the data mart?

View previous topic View next topic Go down

SCDs - Do they belong in the EDW? Or do they belong only on dimensions in the data mart?

Post  kimball_fan on Thu Apr 03, 2014 12:42 pm

Our team could use some guidance regarding the appropriate place(s) to maintain SCDs. We are in the planning phase of our project and have thus far been tracking type II SCDs in our EDW structure. After identifying potential issues with a bridge type table containing “customer” keys and “street address” keys to represent “customer street address”. The main table, “customer” is tracking SCD2 changes as is the “customer street address” table. The thing that got us talking about this was the fact that changes to “customer” history would need to be synchronized to the fully normalized “customer street address” table. Now our EDW is starting to become complex. We acknowledge that this runs contrary to what we have read in Mr. Kimball’s book and is cause for concern.

At this point, one of team members mentioned that the D in SCD stands for dimension! It works somewhat magically with a dimension table referencing a fact table that has a date key. We began to discuss the possibility of not tracking history in our EDW, effectively transferring history tracking responsibility to the data mart.

As a developer, I become concerned when the complexity of the project is shifted further from the source into what I consider to be the presentation layer, or data mart. I also have concerns about late arriving dimensions and such. I have had to set aside conventional wisdom more than once while working in BI.

Finally, given my long-winded story, my question to you is – where is the best database to maintain history?

a. In the EDW and then copied to the data mart?
b. In the data mart only?

My apologies if this question has been answered many times before on these forums. I wasn't able to find a post that addressed this specifically. Thank you in advance for any guidance you might be able to offer.

kimball_fan

Posts : 4
Join date : 2014-04-03

View user profile

Back to top Go down

Re: SCDs - Do they belong in the EDW? Or do they belong only on dimensions in the data mart?

Post  ngalemmo on Thu Apr 03, 2014 1:20 pm

What do you mean by EDW? A 3NF data store?

The concept of a Type 2 SCD (slowly changing dimension) belongs to dimensional modeling. If you are designing a database using entity relational modeling there is no such thing. A logical ER model is made up of entities, attributes, and keys. All entities are peers, they do not have categorizations within the modeling methodology.

" It works somewhat magically with a dimension table referencing a fact table that has a date key. We began to discuss the possibility of not tracking history in our EDW, effectively transferring history tracking responsibility to the data mart."

It is not 'magical' and does not rely on a date key in the fact table. The dimension itself contains multiple rows for the same member, each row with a unique PK. The fact references the dimension using that PK which references the state of the member at the time of the transaction. This key is determined when loading the fact and does not require further interpretation when joining to the dimension in queries.

If you are not maintaining history in the EDW, why bother with it? Why not just implement a dimensional data warehouse?
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: SCDs - Do they belong in the EDW? Or do they belong only on dimensions in the data mart?

Post  kimball_fan on Thu Apr 03, 2014 1:46 pm

Thanks for your response! Yes it is (currently) a 3NF data store combining data from multiple source systems. We are still designing though.

It's my understanding that we use dimensional modeling to better present data to business users. I'm not yet sure if it's the most flexible approach as new requirements and/or source systems present themselves. I'll need to discuss it with our team.





kimball_fan

Posts : 4
Join date : 2014-04-03

View user profile

Back to top Go down

SCDs - Do they belong in the EDW? Or do they belong only on dimensions in the data mart?

Post  zoom on Thu Apr 03, 2014 6:20 pm

[quote="kimball_fan"]Our team could use some guidance regarding the appropriate place(s) to maintain SCDs. We are in the planning phase of our project and have thus far been tracking type II SCDs in our EDW structure. After identifying potential issues with a bridge type table containing “customer” keys and “street address” keys to represent “customer street address”. The main table, “customer” is tracking SCD2 changes as is the “customer street address” table. The thing that got us talking about this was the fact that changes to “customer” history would need to be synchronized to the fully normalized “customer street address” table.  Now our EDW is starting to become complex. We acknowledge that this runs contrary to what we have read in Mr. Kimball’s book and is cause for concern.

At this point, one of team members mentioned that the D in SCD stands for dimension! It works somewhat magically with a dimension table referencing a fact table that has a date key. We began to discuss the possibility of not tracking history in our EDW, effectively transferring history tracking responsibility to the data mart.

As a developer, I become concerned when the complexity of the project is shifted further from the source into what I consider to be the presentation layer, or data mart. I also have concerns about late arriving dimensions and such. I have had to set aside conventional wisdom more than once while working in BI.

In relational data modeling, when you want to resolve many to many relationships you use "associate" table not a bridge table.
If you decided not to keep customer address history in DW but keep customer's sales history, then all historical sales record point to the most current address.... would that be okay with your business user?

zoom

Posts : 97
Join date : 2010-08-23
Location : Chicago

View user profile

Back to top Go down

Reply with quote SCDs - Do they belong in the EDW? Or do they belong only on dimensions in the data mart?

Post  kimball_fan on Thu Apr 03, 2014 7:20 pm

No, I'm afraid that wouldn't work. We need to be able to report on customer geography at specific points in time.

kimball_fan

Posts : 4
Join date : 2014-04-03

View user profile

Back to top Go down

Re: SCDs - Do they belong in the EDW? Or do they belong only on dimensions in the data mart?

Post  nick_white on Fri Apr 04, 2014 8:01 am

Hi,
to sort of reiterate ngalemmo's comments, your project seems to be a bit confused about terminology and the use of various types of database designs. Using Kimball's definition, an EDW is a dimensional model and is used for reporting purposes. Its design is fundamentally different to a (3NF) transactional DB design and is designed the way it is for query performance reasons: running the sort of query an EDW is designed to support will be orders of magnitude faster against a properly designed dimensional model than the equivalent 3NF model.
What you have described as an EDW sounds more like an Operational Datastore (ODS) to me - though I don't want to open the can of worms about what the definition of an ODS is (there are plenty of posts on this forum that will do that!).

So your statement "It's my understanding that we use dimensional modeling to better present data to business users", while not wrong, is not the point of dimensional modelling - it is more the result of having designed a dimensional model correctly.

My suggestion is that you read Kimball's DM book (and/or read a book on the Inmon methodology or any other reporting design methodology you are aware of) and you will then have the knowledge to decide what is the appropriate solution for your particular requirements - which may or may not be a DM but at least it will be an informed decision.

Regards,

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: SCDs - Do they belong in the EDW? Or do they belong only on dimensions in the data mart?

Post  kimball_fan on Fri Apr 04, 2014 11:21 am

Thanks nick_white. Yes, terminology does seem to be one of the issues I'm facing. Thankfully, I won't be the main designer of this project. I'm too new to BI and I'm here trying to gain some insights from this community. It's paying off in a big way.

kimball_fan

Posts : 4
Join date : 2014-04-03

View user profile

Back to top Go down

Re: SCDs - Do they belong in the EDW? Or do they belong only on dimensions in the data mart?

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum