Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Poorly sturctured data at source system

4 posters

Go down

Poorly sturctured data at source system Empty Poorly sturctured data at source system

Post  JSchroeder Thu Aug 08, 2013 12:16 pm

If the data in the source system is poorly structured and not normalized in any way, what is the most typical way to handle this? Is it common to try and normalize the data as it moves through the staging area before pushing it out into a dimensional data mart? We're pulling data from one of our own websites that is stored in SQL server, but it isn't really structured in any way. Would the right approach be to try and normalize it first?

JSchroeder

Posts : 12
Join date : 2012-03-29

Back to top Go down

Poorly sturctured data at source system Empty Reverse the telescope?

Post  Mike Honey Thu Oct 03, 2013 4:11 am

Hi JSchroeder,

I wouldn't make any effort to normalize data for a Data Mart application - I prefer the Star Schema design which is heavily de-normalized.

I also prefer the "aggresive load" strategy, where source system data is accepted as valid and correct until proven otherwise. In reality its typically the most heavily scrutinized data available, flawed or not.

I'd see examples of "poor structure" as an opportunity for the Data Mart application to add value. You can clean data, add hierarchies, banding etc.

Good luck!
Mike
Mike Honey
Mike Honey

Posts : 185
Join date : 2010-08-04
Location : Melbourne, Australia

http://www.mangasolutions.com

Back to top Go down

Poorly sturctured data at source system Empty Re: Poorly sturctured data at source system

Post  BoxesAndLines Thu Oct 03, 2013 10:55 pm

I wouldn't normalize it either. Normalizing requires two extensive sets of ETL logic. One to extract and normalize to ETL staging area (ODS). Another set of ETL logic to extract and load to the dimensional warehouse. I try to minimize the amount of transformation logic that must be done to stage the data, usually just adding dates to store a few weeks of data in case I need to reload anything to the warehouse.
BoxesAndLines
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

Back to top Go down

Poorly sturctured data at source system Empty Re: Poorly sturctured data at source system

Post  harisrana Thu Feb 12, 2015 1:19 am

I wouldn't normalize it either. Normalizing requires two extensive sets of ETL logic. One to extract and normalize to ETL staging area (ODS). Another set of ETL logic to extract and load to the dimensional warehouse. I try to minimize the amount of transformation logic that must be done to stage the data, usually just adding dates to store a few weeks of data in case I need to reload anything to the warehouse.

harisrana

Posts : 1
Join date : 2015-02-12

Back to top Go down

Poorly sturctured data at source system Empty Re: Poorly sturctured data at source system

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum