How to consolidate numerous ETL processes?

Post  c on Thu May 20, 2010 9:26 am

Hi,

Within the department there are numerous users whose ETL processes are nearly identical to one another (e.g. differing only by an extra variable). This not only puts a burden on the server, but also creates problems where a certain measure is not identical across the department.

Therefore we're venturing down a new path to consolidate all ETL processes into one, which will eliminate the repetitive processes and, most importantly, achieve a single source of truth. We have 10 data sources and an ETL tool to use for this project.

Currently the ETL process for each user is a long piece of code that brings together the source datasets, with a lot of transposing and derived variables based on business rules.

What should we do as we venture down this new path?
Should we design a data model where all the data sources will fit neatly into the relevant dimensions?
Or should we just consolidate all the ETL code into a single process?
What are the pros and cons of each alternative?

Thanks.

c

Posts : 3
Join date : 2009-08-20

Re: How to consolidate numerous ETL processes?

Post  ngalemmo on Thu May 20, 2010 12:25 pm

It sounds like you have a data repository that was built piecemeal without any overall vision or structure. The problem you have isn't the fact that there is a lot of ETL code, but, more importantly, that there is no integrated view of the data, i.e. "certain measure is not identical across the department."

The best approach is to redo the model.
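
As a minimal sketch of what an integrated model buys you, assume each shared measure is defined exactly once and every user's output is built from that one definition, with user-specific extras layered on top. The names here (`net_revenue`, `gross_revenue`, `discounts`, `build_rows`) are hypothetical and only for illustration; they are not taken from the original poster's system.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, float]
Measure = Callable[[Row], float]


def net_revenue(row: Row) -> float:
    """Hypothetical business rule, defined once for the whole department."""
    return row["gross_revenue"] - row["discounts"]


# Central registry: the single place where shared measures are defined.
SHARED_MEASURES: Dict[str, Measure] = {"net_revenue": net_revenue}


def build_rows(
    source_rows: List[Row],
    extra_measures: Optional[Dict[str, Measure]] = None,
) -> List[Row]:
    """Apply the shared measures to every row, then any user-specific extras."""
    extras = extra_measures or {}
    for name in extras:
        # Users may add variables but may not redefine a shared measure.
        if name in SHARED_MEASURES:
            raise ValueError(f"{name!r} is a shared measure and cannot be redefined")

    result = []
    for row in source_rows:
        enriched = dict(row)
        for name, fn in SHARED_MEASURES.items():
            enriched[name] = fn(row)
        # Extras see the shared measures, so they can build on them.
        for name, fn in extras.items():
            enriched[name] = fn(enriched)
        result.append(enriched)
    return result


rows = [{"gross_revenue": 100.0, "discounts": 15.0}]
team_a = build_rows(rows)
team_b = build_rows(
    rows, {"discount_rate": lambda r: r["discounts"] / r["gross_revenue"]}
)
```

Both teams get the same `net_revenue` because neither computes it themselves; the "extra variable" that used to justify a separate ETL process becomes an additional column on top of the shared result.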
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Re: How to consolidate numerous ETL processes?

Post  c on Thu May 20, 2010 9:17 pm

ngalemmo wrote: It sounds like you have a data repository that was built piecemeal without any overall vision or structure. The problem you have isn't the fact that there is a lot of ETL code, but, more importantly, that there is no integrated view of the data, i.e. "certain measure is not identical across the department."

The best approach is to redo the model.

Thanks!

c

Posts : 3
Join date : 2009-08-20

Re: How to consolidate numerous ETL processes?

Post  sgudavalli on Thu Jun 10, 2010 10:28 am

First, we need to list a few things for each ETL process in the department:

1) At what time is the content available to the ETL process?
2) Does the ETL process depend on any external systems?
3) Are the schema changes, latency, etc. (the outcomes of the new path) acceptable to the downstream applications of these ETL processes?

If the answer is yes, then we can go ahead and reconstruct it.

If not, then we can fill in the gaps with a federated approach.
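
A rough sketch of that checklist as an inventory, assuming one record per ETL process with a field for each of the three questions. The class, field names, and sample processes are hypothetical, and the decision rule is deliberately simplistic: it only encodes "reconstruct if downstream accepts the changes, otherwise federate".

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EtlProcess:
    name: str
    content_available_at: str  # question 1, e.g. "02:00"
    external_dependencies: List[str] = field(default_factory=list)  # question 2
    downstream_accepts_changes: bool = False  # question 3


def plan(process: EtlProcess) -> str:
    """Reconstruct when downstream can absorb the changes, else federate."""
    return "reconstruct" if process.downstream_accepts_changes else "federate"


# Hypothetical inventory entries, for illustration only.
inventory = [
    EtlProcess("sales_daily", "02:00", ["crm_export"], True),
    EtlProcess("finance_monthly", "06:30", [], False),
]

plans: Dict[str, str] = {p.name: plan(p) for p in inventory}
for name, decision in plans.items():
    print(f"{name}: {decision}")
```

The availability time and dependency list are recorded but not used in the decision here; in practice they would drive scheduling and ordering of the consolidated load.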

Regards
Shiv

sgudavalli

Posts : 29
Join date : 2010-06-10
Age : 33
Location : Pune, India
