ETL from 250+ disparate customer host systems


Posted by bg4 on Thu May 20, 2010 1:32 pm

My task is to build an ETL process that pulls data daily from our customers' systems, which can run on almost any database/OS. The data will be loaded into our central system, an analysis will be run, and the aggregated results will be sent back to all customers.

Any recommendations for tools that fit this scenario? I'd like to be able to connect to each customer's host (over whatever protocol is available) and run the ETL from my central location. In most cases I will be able to install a client on the customer's system. I've had some frameworks/tools recommended to me, but I don't have enough experience yet to decide on the best course of action.

Here are my current ideas; I'm very open to suggestions:

  1. Communication/messaging with heterogeneous systems (Apache ServiceMix)
  2. ETL process & data quality/cleansing/reporting (Apache Camel, Pentaho, CloverETL, etc.) to pull data in
  3. Automated analysis and processing (Apache Ant, ETL tool, legacy apps) to generate solutions
  4. Packaging/distribution of results (Apache ServiceMix, FTP)
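
To make the flow concrete, here is a rough Python sketch of steps 2 and 3 as I picture them. It is only an illustration under assumptions of my own, not a proposed implementation: the `CustomerSource` class, the `(item, amount)` record shape, the `staging` table, and all function names are hypothetical, and in-memory SQLite stands in for both the customer databases and the central system. Each customer gets its own connection factory and its own query that maps their schema onto the common record shape. A failure on one host is recorded rather than allowed to stop the daily run. Step 4 (distribution) is left out.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Row = Tuple[str, float]  # (item, amount): a made-up common record shape


@dataclass
class CustomerSource:
    """One customer host (hypothetical): how to connect and what to run."""
    customer_id: str
    connect: Callable[[], sqlite3.Connection]  # in practice, one driver per database type
    query: str  # customer-specific SQL that maps their schema onto Row


def extract(source: CustomerSource) -> List[Row]:
    """Run the customer's query and normalise the result to Row tuples."""
    conn = source.connect()
    try:
        return [(str(item), float(amount)) for item, amount in conn.execute(source.query)]
    finally:
        conn.close()


def load(central: sqlite3.Connection, customer_id: str, rows: Iterable[Row]) -> None:
    """Replace this customer's rows in the central staging table."""
    central.execute(
        "CREATE TABLE IF NOT EXISTS staging (customer_id TEXT, item TEXT, amount REAL)"
    )
    # Delete first so that re-running the same day does not double-count.
    central.execute("DELETE FROM staging WHERE customer_id = ?", (customer_id,))
    central.executemany(
        "INSERT INTO staging VALUES (?, ?, ?)",
        [(customer_id, item, amount) for item, amount in rows],
    )
    central.commit()


def analyze(central: sqlite3.Connection) -> Dict[str, float]:
    """Placeholder analysis: total amount per item across all customers."""
    cur = central.execute("SELECT item, SUM(amount) FROM staging GROUP BY item ORDER BY item")
    return {item: total for item, total in cur}


def run_daily(
    sources: Iterable[CustomerSource], central: sqlite3.Connection
) -> Tuple[Dict[str, float], List[str]]:
    """Pull from every customer, then aggregate; return results and failed customer ids."""
    failed: List[str] = []
    for source in sources:
        try:
            load(central, source.customer_id, extract(source))
        except sqlite3.Error:
            failed.append(source.customer_id)  # keep going; retry or report later
    return analyze(central), failed


def demo_source(customer_id: str, table: str, item_col: str, amount_col: str,
                rows: List[Row]) -> CustomerSource:
    """Build a fake customer host backed by a throwaway in-memory database."""
    def connect() -> sqlite3.Connection:
        conn = sqlite3.connect(":memory:")
        conn.execute(f"CREATE TABLE {table} ({item_col} TEXT, {amount_col} REAL)")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
        return conn
    return CustomerSource(customer_id, connect, f"SELECT {item_col}, {amount_col} FROM {table}")
```

The idea would be that each of the 250+ customers is described by one such entry (connection details plus a mapping query), so adding a customer means adding configuration rather than code. Whether a tool like Camel or Pentaho handles that better than something hand-rolled is exactly what I'm unsure about.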

Any insight is greatly appreciated.


