Data types for staging area db tables

View previous topic View next topic Go down

Data types for staging area db tables

Post  at710 on Fri Nov 02, 2012 10:55 am

I am doing my first real world dwh project and have not found any valuable information or best practice guidelines regarding which datatypes to choose for stagring area tables.

I have some ssis packages which are reading data from flatfile sources and filling tables of my SQL Server database. Transformations and data integration is performed by a T-SQL stored procedure after the initial import. Now i have no idea wether it's ok to use char/varchar datatypes for all columns or if i should do the tedious work of choosing and assigning the expected datatype to each of my 200+ table columns.

Are there any benefits / drawbacks of loose vs. strong typing regarding dwh tables in general and regarding initial staging talbes for 1:1 imports? What is your approach? What would you suggest?

thanks 4 your help & kind regards

at710

Posts : 10
Join date : 2012-11-02

View user profile

Back to top Go down

Re: Data types for staging area db tables

Post  BoxesAndLines on Fri Nov 02, 2012 12:07 pm

Well, that's what data modeling is all about. If you don't care to let the database enforce anything whatsoever and do every validation through ETL then you're really doing the same thing. My take is if you're coding up data validation in your ETL then you are better off letting the database do it since the database never has coding errors.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Data types for staging area db tables

Post  at710 on Fri Nov 02, 2012 1:06 pm

That's a good point indeed. I'd have to write at least some data check routines/queries myself if i don't set appropriate datatypes. I think I will stick to the strong typing paradigm when designing my staging area.

Thanks for your advice!

at710

Posts : 10
Join date : 2012-11-02

View user profile

Back to top Go down

Re: Data types for staging area db tables

Post  Mike Honey on Sun Nov 04, 2012 11:16 pm

Hi ... at ?

I certainly wouldn't recommend modeling every column as char/varchar, but I do think it's worthwhile standardising to a handful of datatypes, e.g. nvarchar (250) for string data, datetime2 for dates etc.

If your ETL layer is SQL SPs it may not matter too much in that layer, but other ETL and downstream tools (I'm looking at you, SSIS & SSAS), can be hypersensitive to datatype changes.

You also want to minimize your exposure to trivial datatype changes upstream, e.g. columnX changes from varchar ( 30 ) to varchar ( 60 ).

Good luck!
Mike
avatar
Mike Honey

Posts : 185
Join date : 2010-08-04
Location : Melbourne, Australia

View user profile http://www.mangasolutions.com

Back to top Go down

Re: Data types for staging area db tables

Post  ngalemmo on Mon Nov 05, 2012 1:52 am

It all depends on the source of the data feeds. You must ultimately store the data in its native data type, so you must know what type it is.

If you are receiving data in text files or XML, the data needs to be converted. If the data is from a trusted source that you know correctly represents the data in well formed strings, you don't have to do much validation. On the other hand, if the data is from an unknown source, such as a web log of user entered information, you need to check to see if strings are in the proper form.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Data types for staging area db tables

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum