how to handle unstructured data?

Post  robber on Mon Jun 25, 2012 9:27 am

What approaches have other people taken to process and store unstructured data? Inmon calls this "textual ETL" as part of DW 2.0.

If we want to process text sources such as documents, emails, Facebook, Google Analytics, etc., what ETL tools/functions would you use to transform the text?
Where would you store the results? Assuming we have an existing dimensionally modelled data warehouse, do we create a separate set of tables that sits independently of our dims and facts?

Curious what the rest of the Kimball community are doing...


Re: how to handle unstructured data?

Post  BoxesAndLines on Mon Jun 25, 2012 8:30 pm

I haven't ventured too far into the unstructured world. Unless you're doing big data, the best solution for unstructured data is to add structure. Informatica has screen scrapers, PDF scrapers, and just about any other type of report-scraping capability; I've used them to pull in externally produced soft-copy reports. While social media sentiment analysis gets lots of play in the media, I've not seen a whole lot of use of it in financial services, telecom, or even health care.
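
To make "add structure" a bit more concrete, here is a minimal sketch in plain Python of turning one raw email into a flat record that an ordinary ETL job could load. It is not how Informatica or any other tool does it. The function name `email_to_row`, the `KEYWORDS` list, the column names, and the sample message are all made up for illustration, and the keyword count is only a crude stand-in for real text mining.

```python
import re
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical keyword list, assumed purely for illustration.
KEYWORDS = ("refund", "complaint", "cancel")


def email_to_row(raw: str) -> dict:
    """Turn one raw, single-part email into a flat record with fixed columns."""
    msg = message_from_string(raw)
    body = msg.get_payload()  # a plain string for a non-multipart message
    words = re.findall(r"[a-z']+", body.lower())
    row = {
        "sender": msg["From"],
        "sent_date": parsedate_to_datetime(msg["Date"]).date().isoformat(),
        "subject": msg["Subject"],
        "word_count": len(words),
    }
    # One numeric column per keyword: free text becomes countable measures.
    for kw in KEYWORDS:
        row[f"mentions_{kw}"] = words.count(kw)
    return row


# Made-up sample message.
SAMPLE = (
    "From: customer@example.com\n"
    "Date: Mon, 25 Jun 2012 09:27:00 -0400\n"
    "Subject: Order problem\n"
    "\n"
    "I want a refund. Please cancel my order and process the refund.\n"
)

if __name__ == "__main__":
    print(email_to_row(SAMPLE))
```

Once the text is reduced to columns like these, the record looks like any other source row: the sender and date can be matched to existing dimensions and the counts treated as measures.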
