Data redundancy in the staging area

View previous topic View next topic Go down

Data redundancy in the staging area

Post  figareg on Mon Sep 26, 2011 11:55 am

I have to design the ETL process and staging area for a new data warehouse and I am not sure whether it makes sense to keep a copy of all dimension and fact tables in the staging area.

The ETL process will be fed by a daily extract from SAP. Master data like material, customer etc. come as a complete download whereas transactional data come in daily buckets.

It seems like a waste of storage to keep the tables in the staging area and then copy them into the final data marts. Staging area and presentation area are within the same data base so I can easily access the tables in the presentation area if necessary.

Regards,
Fabio

figareg

Posts : 2
Join date : 2011-09-26
Location : Germany

View user profile

Back to top Go down

Re: Data redundancy in the staging area

Post  BoxesAndLines on Mon Sep 26, 2011 12:48 pm

I would design the process to load the facts and dimensions directly.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Data redundancy in the staging area

Post  ngalemmo on Mon Sep 26, 2011 11:08 pm

All you need to do is compare the incoming table with the dimension and update/insert if necessary. If needed you may have to reformat or transform the source before you compare.

An easy way to compare is to calculate a CRC code on the source, and when you insert or update, store the new CRC in the dimension. To check for change all you need to do is compare the new CRC with the CRC on the dimension row.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Data redundancy in the staging area

Post  figareg on Thu Sep 29, 2011 3:43 am

Thanks,

I thought it would make sense to compare directly with the final tables but was not sure.

Fabio

figareg

Posts : 2
Join date : 2011-09-26
Location : Germany

View user profile

Back to top Go down

Re: Data redundancy in the staging area

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum