Enterprise Date Warehouse/Bus Architecture Physical Implementation

View previous topic View next topic Go down

Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  anandraj on Thu Jul 16, 2009 11:54 am

What is the best database design for EDW/Bus Architecture?
creating all the dimensions (confirmed) in one oracle schema/sql server database and
all the facts in another oracle schema/sql server database? OR
creating all dimensions and facts in one oracle schema/sql server database?
Please comment on these two different design approaches

Is there document or book which talks about EDW/Bus Architecture physical database implementation and best practices?

Thanks in advance.

anandraj

Posts : 1
Join date : 2009-07-16

View user profile

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  robber on Thu Jul 16, 2009 12:23 pm

I would keep dims and facts in one schema, I would separate dims and facts and data and indexes into separate tablespaces(oracle) or filegroups(sql server). You can then manage the physical file placement (separate disks) to optimise performance.

robber

Posts : 41
Join date : 2009-02-28
Location : Canada

View user profile

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  caderoux on Tue Jul 21, 2009 4:42 pm

In addition, dims and facts really need to be within the same database (in SQL Server terminology) if you want to enforce your referential integrity on the dimension IDs which join the facts to the dimensions with a foreign-key contraint.

caderoux

Posts : 8
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  ngalemmo on Wed Jul 22, 2009 4:38 pm

caderoux wrote:In addition, dims and facts really need to be within the same database (in SQL Server terminology) if you want to enforce your referential integrity on the dimension IDs which join the facts to the dimensions with a foreign-key contraint.

If you are assigning surrogate keys, there really isn't a need to enforce RI at the database level. Enforcing FK restraints is a waste of time.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  smckee6452 on Mon Aug 10, 2009 10:54 am

ngalemmo wrote:If you are assigning surrogate keys, there really isn't a need to enforce RI at the database level. Enforcing FK restraints is a waste of time.

I believe this statement to be true, however, I cannot find documentation anywhere stating that this really is true. Can you please provide some type of documentation/references/internet links that say this? My problem is that I have some DBAs that are insisting that FKs exist and be enforced and in my limited experience that makes managing conformed dimensions and fact tables that sit on top of them a PITA.

Please, anyone, provide references to this quoted assertion.

smckee6452

Posts : 3
Join date : 2009-08-10

View user profile

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  Omaha on Mon Aug 10, 2009 12:20 pm

smckee6452 wrote:
ngalemmo wrote:If you are assigning surrogate keys, there really isn't a need to enforce RI at the database level. Enforcing FK restraints is a waste of time.

I believe this statement to be true, however, I cannot find documentation anywhere stating that this really is true. Can you please provide some type of documentation/references/internet links that say this? My problem is that I have some DBAs that are insisting that FKs exist and be enforced and in my limited experience that makes managing conformed dimensions and fact tables that sit on top of them a PITA.

Please, anyone, provide references to this quoted assertion.

The impact is on the fact table loading process. I think it's more of a logical argument than something you'll find in a whitepaper. You've already used your ETL process to look up the surrogate keys and make sure you have a valid surrogate/business key relationship. If you have DRI between your fact table and your dimension tables you will slow down the loading process. With DRI, each row inserted into the fact table will need to do a lookup against each dimensional table. Why take that that performance hit twice (during ETL and then again during fact loading)? If you're loading a small number of facts each time, probably no big deal. With lots of fact rows to load and lots of dimensions to check, it can be a big deal.

Omaha

Posts : 6
Join date : 2009-08-07

View user profile

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  smckee6452 on Mon Aug 10, 2009 12:54 pm

Please bear with me as I'm not an ETL or BI expert...

For my situation, it's more than impact on load time. We have several conformed dimensions which are truncated and reloaded every night. We also have many fact tables that sit on these dimensions and we are creating new fact tables all the time. So my problem is from a design perspective (although, our load times are getting insane so this FK stuff may creep that way as well). Having enabled FK constraints on fact tables requires the packages that manage the conformed dimensions (loaded first in the ETL process) to know about and keep track of FKs that reside on fact tables that are keyed (surrogate) to these dimensions... That becomes hard to manage and gets very "spaghetti" very quickly.

I guess my problem is that my DBAs are still living in a transactional world, and this BI/ETL stuff is not that sort of beast to me. I could be way off base, I'm new to this stuff, but that's how it seems. I'm finding it very hard to convince them that enabled FK constraints are doing more harm than good: they are redundant, they decrease maintainability, they increase load time, etc. But they insist that FKs will exist and will be enabled until I can provide them evidence to the contrary... What they want is a BI/ETL best practices document that says "do not enable FK constraints and here's why" that usurps the generally held best practice in the transactional world of "make sure you have FKs and here's why".

Am I just missing something here?


Last edited by smckee6452 on Mon Aug 10, 2009 12:55 pm; edited 1 time in total (Reason for editing : typo)

smckee6452

Posts : 3
Join date : 2009-08-10

View user profile

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  ngalemmo on Mon Aug 10, 2009 12:59 pm

I've never used database enforced RI in any data warehouse I have built... and it has never been a problem. When a DBA balks, I sit down with him/her and explain how the ETL process works, how a surrogate key can only be assigned if there is a row in the dimension table, and how much overhead is reduced, particularly with large data loads.

I don't know if anyone has published a formal white paper (usually you see white papers from someone is trying to sell you something), but it has been a generally accepted practice for as long as I can remember. The key difference between a data warehouse and a transactional system is the update processes for a DW are controlled and structured, rather than random and uncontrolled. The rigor of the ETL process avoids the issues database RI is meant to prevent.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  Omaha on Mon Aug 10, 2009 1:39 pm

smckee6452 wrote:
...

We have several conformed dimensions which are truncated and reloaded every night. We also have many fact tables that sit on these dimensions and we are creating new fact tables all the time.

...

How do you do that with DRI without deleting all the facts also? Are you completely reloading the DW every day? Seems pretty processing intensive.

Omaha

Posts : 6
Join date : 2009-08-07

View user profile

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  ngalemmo on Mon Aug 10, 2009 1:48 pm

Omaha wrote:
smckee6452 wrote:
...

We have several conformed dimensions which are truncated and reloaded every night. We also have many fact tables that sit on these dimensions and we are creating new fact tables all the time.

...

How do you do that with DRI without deleting all the facts also? Are you completely reloading the DW every day? Seems pretty processing intensive.

Agreed... First off, you NEVER delete anything from a dimension table. It is critical to preserve the surrogate primary key so the facts reference something that exists. Second, you should be loading atomic level facts (Rule #1 in Kimball U's 10 essential rules - http://www.intelligententerprise.com/showArticle.jhtml?articleID=217700810), which implies incremental loads. RI is the least of your worries...
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  smckee6452 on Mon Aug 10, 2009 1:51 pm

Omaha wrote:
How do you do that with DRI without deleting all the facts also? Are you completely reloading the DW every day? Seems pretty processing intensive.

In the cases where the dimensions are being truncated, any fact tables joined to the dimensions are also being truncated 99% of the time. This is because our source data may not have audit dates which can be trusted... You're right, it's processing intensive.

smckee6452

Posts : 3
Join date : 2009-08-10

View user profile

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  ngalemmo on Wed Aug 19, 2009 11:58 am

smckee6452 wrote: I believe this statement to be true, however, I cannot find documentation anywhere stating that this really is true. Can you please provide some type of documentation/references/internet links that say this? My problem is that I have some DBAs that are insisting that FKs exist and be enforced and in my limited experience that makes managing conformed dimensions and fact tables that sit on top of them a PITA.

Please, anyone, provide references to this quoted assertion.

If you need something 'official' to give to the DBA's (and give me a chance to do a little self promotion - Heck, I get a $1 for every book sold), it is covered in pages 299-300 of "Mastering Data Warehouse Design - Relational and Dimensional Techniques" (Wiley). It goes into a little more depth (or maybe just more verbose) but essentially the same as my comments in the earlier posts.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  BoxesAndLines on Wed Aug 19, 2009 4:09 pm

$1?? That's a lot of work for $1 per book!
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  ngalemmo on Wed Aug 19, 2009 4:25 pm

BoxesAndLines wrote:$1?? That's a lot of work for $1 per book!

Tell me about it...

It's actually a little over $3, but it gets split 3 ways. From the royalties, I should be able to retire some time around 2298.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Enterprise Date Warehouse/Bus Architecture Physical Implementation

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum