DW Architecture for a new setup

View previous topic View next topic Go down

DW Architecture for a new setup

Post  esh on Tue Feb 03, 2009 5:29 pm

Ok I'm a newby to DW. I am tasked with setting up a data warehouse this year. There are a couple of things which I feel make it unique, I'll outline the plan I have come up with to accomplish this as of right now.

We have 2 OLTP systems, one was from a company we acquired and is denormalized and uninterpretable for the most part... a cobol application runs off it. So they do strange things like storing 4 entries in one record, the result is having to loop through each record. There are no date fields or number fields... it is pretty much all string based. I found a product which 'decodes' (normalizes) the DB. The normalized version will be on a separate server and essentially is replicated from the production server.

The other system we have is very easy to get information out of. I am very familiar with its structure. So I am not worried about that.

Both systems are maintaining sales data, this is what we initially want to analyze. They both have the same items, however they are not the same Item Names or even descriptions. I plan to make a 'master style' listing and maintain this table so it will allow for "merging" of the two systems' sales/inventory data.


Non-Normalized System -> Normalized Version -> Merged Data (DW)
Normalized System -> Replicated version -> Merged Data (DW)

This will start out small, initial cubes will be for analyzing sales and inventory information, so we can have one view of both systems.

So some of the questions I have are related to the actual architecture of the database servers. I will have essentially 4 DB servers. 2 are the OLTP production. Those 2 will replicate to 1 DB server, which will be used to load the final DB server. I am doing this because i don't want to affect performance on the OLTP databases, is this a wise choice? I'm trying to avoid any locking on the production servers...

esh

Posts : 4
Join date : 2009-02-03

View user profile

Back to top Go down

Re: DW Architecture for a new setup

Post  BrianJarrett on Tue Feb 03, 2009 6:21 pm

I'm not familiar with all the details but here's what I'd do (at a high level):

To address the disparate source systems I'd create level 1 and level 2 fact tables. Source system A and source system B would both have their own fact tables (living at level 1). I'd then consolidate these up into a single table that contains all the ubiquitous fields between both level 1 tables (assuming each source system's fact data is not structured identically). Your cubes (or summary tables) could aggregate up from your level 2 fact tables.

The item disparity can be handled in a mapping table during the ETL process (which I think is what you're saying here).

If you can batch extract from your source systems I'd do that after hours (for minimal impact) and then dump the data into a staging area. Your lowest level fact tables can be populated from there. Your dimensions can be consolidated and conformed in the staging area and then loaded into your warehouse dimension tables from there as well. It looks like you already have a server at your disposal for this. A method for determining net change during this extraction is crucial if your tables are large.

So to answer your final question, I think your idea to hit your production systems only once per extraction period is a good one.

Hopefully this helps. If not, shoot me some more details over and we can review it again.
avatar
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 43
Location : St. Louis, MO

View user profile

Back to top Go down

Data Profile!

Post  tod mckenna on Wed Feb 04, 2009 7:12 am

Hi esh,

Personally, I think your approach is pretty good. You have an integration/staging server and a server dedicated for the data warehouse.

Before you get much further though, I think you can help yourself out a ton by doing the following:

(1) Perform a thorough data profiling of both systems. You'll spend more time on the "Non-Normalized System".
(2) From your data profiling, conform all data attributes and metrics. This means that you should jot down any datatype conversions, necessary decodings, numeric conversions, or mappings.
(3) Extract the raw data into your staging server and perform all necessary transformations to conform the data.
(4) From this conformed set of mostly relational data, load your dimensional model.

Let me know if you'd like some more details. That's where the devil lives!

-Tod
avatar
tod mckenna

Posts : 9
Join date : 2009-02-03

View user profile http://blog.todmeansfox.com

Back to top Go down

Re: DW Architecture for a new setup

Post  Edwin Kurian on Wed Feb 04, 2009 11:20 am

esh wrote:So some of the questions I have are related to the actual architecture of the database servers. I will have essentially 4 DB servers. 2 are the OLTP production. Those 2 will replicate to 1 DB server, which will be used to load the final DB server. I am doing this because i don't want to affect performance on the OLTP databases, is this a wise choice? I'm trying to avoid any locking on the production servers...
The architecture will depend on Performance SLA of the source systems. If it is a Global system, then batch processing at any time will affect someone. On the other hand if there is a quiet period, you could use it to perform extracts to a Staging area. Note that as your Data warehouse grows, and more subject areas are captured, the available time window for data extraction will shrink. You may want to explore the option of creating an ODS, which will be composed of replicated tables and batch populated tables.

Edwin
...who thinks every company is unique, and their solutions are unique.

Edwin Kurian

Posts : 13
Join date : 2009-02-03
Location : Milwaukee

View user profile http://www.valordevelopment.com

Back to top Go down

Re: DW Architecture for a new setup

Post  esh on Wed Feb 04, 2009 4:29 pm

Thanks for all the info/help!



  • There is a short window of only about 4 hours every night, however it shortens during holiday (retail) to only about 1 hour. So i really think we will have to do the staging server to pull data from. I will probably utilize the staging server as an adhoc analysis server as well, so it can serve 2 purposes, staging for the DW and Ad-hoc for all other types of reports people always want that may not be in the DW necessarily. Will look into the ODS concept as well.
  • What good data profiling tools are available? These are mostly all SQL 2005 servers except for one which is 2000, working on converting it to 2005.
  • Tod - 2,3,4 also seem to make good sense, will add this to my documentation for the project.
  • Brian, I'll have to read more about the level 1 & 2 fact table differences. Not too familiar with the differences between the two yet.
  • Brian, yes the mapping would occur during the ETL process. (that's my plan at least).

esh

Posts : 4
Join date : 2009-02-03

View user profile

Back to top Go down

Re: DW Architecture for a new setup

Post  esh on Wed Feb 04, 2009 4:35 pm

also the reporting/analysis tool I have been testing out is LogiXML's suite. Not sure if anyone is familiar with it? I tried some of the open source ones, and looked at 2 other commercial ones. This one seemed to me very easy to use and get into. The others I tried definitely were not as user friendly... and well that is important as I have many other things to do here (I do pretty much everything, dba/network eng/vmware admin/support & maintain multiple OLTP systems, as well as regular mgmt duties.). So I definitely want tools that are easy to use, so if anyone has any more suggestions please let me know.

esh

Posts : 4
Join date : 2009-02-03

View user profile

Back to top Go down

Re: DW Architecture for a new setup

Post  ngalemmo on Mon May 18, 2009 8:34 pm

"I have many other things to do here (I do pretty much everything, dba/network eng/vmware admin/support & maintain multiple OLTP systems, as well as regular mgmt duties.). "

And you are on the hook to build a data warehouse??? Dude... get some help! Last time I checked, a day is only 24 hours and they are not getting any longer.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: DW Architecture for a new setup

Post  esh on Thu Jul 09, 2009 10:32 am

ngalemmo wrote:
And you are on the hook to build a data warehouse??? Dude... get some help! Last time I checked, a day is only 24 hours and they are not getting any longer.
Budgets are tight. Luckily one of the products I purchased is almost out of the box in terms of being built specifically for that application and taking the data into a DW and also tying it in with a hyperion reporting solution. That project is 90% complete now, I will soon start working on merging that data with our other systems into a central DW to get the entire company wide data combined.

I'm hoping next year to hire a full time DBA to offload a lot of the db activities.

esh

Posts : 4
Join date : 2009-02-03

View user profile

Back to top Go down

Re: DW Architecture for a new setup

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum