Microbatch Performance Issues


Microbatch Performance Issues

Post  jwilkerson on Tue May 01, 2012 1:41 pm

First post!

My current DW project is making use of microbatches, as our customers demand "real time" data updates. We can process the data generated during period X in less time than the duration of period X itself - so in theory we can keep up.

We are working in SQL Server 2008 R2, with 8 cores and 16 GB of RAM. We process between 1 million and 10 million transactions per day on average.

We have set up about a dozen different microbatches to load our data into our DW. Everything runs - but we would like the run times to be faster. The microbatches currently run in somewhere between 10s and 120s; we would like all of them to finish in 10s. We continue to tune and tune - and things get better - but as we make DB changes performance suffers and we have to tune some more. Perhaps this is OK.

But the load on the system seems high. All the cores are very busy when the microbatches (MB in our speak) are running. We wonder if this is normal.

The purpose of this post is to elicit input from others who are using microbatches and learn from their experiences.

Thanks,

Joe

jwilkerson

Posts : 2
Join date : 2012-05-01
Location : San Jose, CA



Re: Microbatch Performance Issues

Post  Mike Honey on Tue May 01, 2012 7:45 pm

Hi Joe,

It's hard to achieve much practical ETL in 10 seconds - I think you may have set your bar too low. Anyway ...

What is the design of your "microbatches"? Are they a dozen executions of the same package? Different facts?

I'd consider high CPU a good sign (unless it is all for the SQL Server process) - it probably indicates you have good parallelism occurring and are working as fast as the non-CPU resources can go (e.g. disk, RAM, network).

You may be light on RAM - watch for SQL Server silently gobbling up all the RAM if it is on the same server. I usually reduce SQL's Max server memory setting to 1/4 - 1/2 of the physical RAM if SSIS is running on the same server. SSIS often needs a lot of RAM and tends to slow dramatically (I've seen 100x slower) when RAM runs out. I kinda miss SQL 2005, where SSIS would just crash in that scenario ...
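For reference, capping SQL Server's memory to leave headroom for SSIS is a one-time configuration change. A sketch (the 8192 MB figure is just an example for a 16 GB box, not a recommendation):

```sql
-- Cap SQL Server's buffer pool at 8 GB on a 16 GB server,
-- leaving the rest for SSIS and the OS.
-- 'max server memory' is an advanced option, so expose those first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 8192;
RECONFIGURE;
```

The change takes effect immediately; SQL Server will trim its memory usage down toward the new cap over time rather than releasing it all at once.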

One strategy I've used in a similar scenario is the "always running package". Basically a master package starts up and caches all the static Lookup data. It then goes into a perpetual For Loop to do the actual ETL tasks, which finish with a SQL Task using the WAITFOR command to pause. After the WAITFOR period, the actual ETL tasks run again. This avoids the overhead of spinning up the SSIS process and loading your Lookup caches for each "micro batch". For robust "fail over", just schedule it as a job to run every minute - if it fails it will restart within a minute.
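The pause step in that loop is just T-SQL's WAITFOR DELAY. A minimal sketch of the same loop-and-sleep pattern done entirely in T-SQL (the etl.RunMicrobatch procedure and etl.Control flag table are hypothetical names, not part of the design above):

```sql
-- Perpetual micro-batch loop: do the ETL work, then sleep 10 seconds.
-- etl.RunMicrobatch and etl.Control are illustrative names only.
DECLARE @keep_running bit = 1;
WHILE @keep_running = 1
BEGIN
    EXEC etl.RunMicrobatch;             -- the actual ETL tasks
    WAITFOR DELAY '00:00:10';           -- pause between micro-batches
    SELECT @keep_running = keep_running -- allow a clean external stop
    FROM etl.Control;
END;
```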

Good luck!
Mike
Mike Honey

Posts : 185
Join date : 2010-08-04
Location : Melbourne, Australia

http://www.mangasolutions.com


Microbatch Performance

Post  jwilkerson on Tue May 01, 2012 10:09 pm

Mike,
Thanks for the reply!

Your questions will take a bit of detail to answer.

It's hard to achieve much practical ETL in 10 seconds - I think you may have set your bar too low. Anyway ...

When my boss first asked me to do this – my reaction was similar – actually, my reaction was rather harsher.

Two points:
01 – Our customers are demanding “real time” reporting – and they bring cash to the table.
02 – Our application tracks actionable internet events – making such information available to our customers (internet retailers) does have value.

The times may be a-changing … DW used to be a "strategic" decision tool … now it seems to be evolving into an "execution" tool as well. Time will tell.


What is the design of your "microbatches"? Are they a dozen executions of the same package? Different facts?
I’ll back up and describe our entire ETL process first.

We have a traditional nightly full ETL load which runs under SSIS. This runs once a day, just after midnight, and loads ALL data for the previous day that has not already been loaded. This load runs about 90 packages and takes about 20 minutes on average (depending on front-end transaction volume). Our microbatches (MB) are all suspended during the run of the nightly ETL SSIS load (we use a semaphore mechanism to control which ETL processes are allowed to run concurrently with other ETL processes).

Once the nightly ETL is done, the suspended MB start up (they are SQL jobs). Right now there are about a dozen of these MB. They run all day until the next full ETL starts up. Some of our MB are allowed to run concurrently and some are not; this is controlled by our semaphore mechanism.

Our MB strategy is that we will update only those star schema elements which must be updated to satisfy customer "real time" requirements. This is a total of about 10 dimension and 2 fact tables (as of today). We have a total of seven star schemas with about 30 dimension tables. We also have another 30 staging tables that are the targets of the Extracts.

An MB will first announce to the semaphore process that it is running. Then the MB will check all currently running MB (and the ETL) to see if it is allowed to run. If not, it will loop (in a WAITFOR loop with a 10s delay) until the mix of currently running MB allows the waiting MB to start.

Once this happens, the MB will Extract the data IT needs into the staging tables and then call the dimension and fact table loads that this MB updates. Finally the MB will announce to the semaphore mechanism that it is done, and then terminate.
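For what it's worth, SQL Server's built-in application locks can express this announce/check/wait pattern without a custom semaphore table. A hedged sketch, not the mechanism described above (the 'MB_SalesFact' resource name is purely illustrative):

```sql
-- Serialize micro-batches that must not run concurrently,
-- using a named application lock as the semaphore.
DECLARE @rc int;
EXEC @rc = sp_getapplock
    @Resource    = 'MB_SalesFact',  -- one named resource per MB group
    @LockMode    = 'Exclusive',
    @LockOwner   = 'Session',
    @LockTimeout = 10000;           -- wait up to 10s, mirroring the 10s poll
IF @rc >= 0                         -- 0 = granted, 1 = granted after wait
BEGIN
    -- extract to staging, load dimensions, then facts ...
    EXEC sp_releaseapplock
        @Resource  = 'MB_SalesFact',
        @LockOwner = 'Session';
END;
```

One nicety of this approach is that the lock is released automatically if the session dies, so a crashed MB can never leave the semaphore permanently "taken".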


I'd consider high CPU a good sign (unless it is all for the SQL Server process) - it probably indicates you have good parallelism occurring and are working as fast as the non-CPU resources can go (e.g. disk, RAM, network).
I agree - high CPU should indicate we are not totally IO-bound. I've seen CPU drop to zero on heavily IO-bound systems. We are in the process of moving to another server with triple the number of cores. This should move the bottleneck.


You may be light on RAM - watch for SQL Server silently gobbling up all the RAM if it is on the same server. I usually reduce the SQL Maximum server memory to 1/4 - 1/2 of the physical RAM if SSIS is running on the same server. SSIS often needs a lot of RAM and tends to slow dramatically (I've seen 100x slower) when RAM runs out. I kinda miss SQL 2005 where SSIS would just crash in that scenario ...

We are also doubling our RAM. How can we tell how much RAM SQL is really using? We allocated 80% of the RAM to SQL – but we do not know how to see inside this chunk.
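One way to look inside that allocation on 2008 R2 is the memory DMVs; a sketch of two queries I believe answer the question, though the exact column names vary between SQL Server versions:

```sql
-- Total memory the SQL Server process has actually committed (SQL 2008+).
SELECT physical_memory_in_use_kb / 1024 AS sql_memory_in_use_mb
FROM sys.dm_os_process_memory;

-- Breakdown by memory clerk: buffer pool, plan cache, etc.
-- (On SQL 2012+ the two page columns are merged into a single pages_kb.)
SELECT TOP (10)
    type,
    SUM(single_pages_kb + multi_pages_kb) / 1024 AS mb
FROM sys.dm_os_memory_clerks
GROUP BY type
ORDER BY mb DESC;
```

The first number tells you whether SQL Server has grown to its configured cap; the second tells you what the memory is being spent on (MEMORYCLERK_SQLBUFFERPOOL is usually the biggest consumer).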


One strategy I've used in a similar scenario is the "always running package". Basically a master package starts up and caches all the static Lookup data. It then goes into a perpetual For Loop to do the actual ETL tasks, which finish with a SQL Task using the WAITFOR command to pause. After the WAITFOR period, the actual ETL tasks run again. This avoids the overhead of spinning up the SSIS process and loading your Lookup caches for each "micro batch". For robust "fail over", just schedule it as a job to run every minute - if it fails it will restart within a minute.

Our MB are SQL jobs - they do not run under SSIS.

I actually set up the MB to run in perpetual WAITFOR loops to begin with. For some reason my boss changed things to run each MB as a separate job and removed the outer WAITFOR. I forget why he did this - I will ask.

Again, thanks so much for your response! We need to get outside our "box" (thinking patterns) to find a way to make multitudinous, simultaneous, (semi-)asynchronous MB a viable strategy!

I may be wrong – but I think real time DW will become a new norm. At least that is how it seems from where I sit.

Thanks,

Joe


jwilkerson

Posts : 2
Join date : 2012-05-01
Location : San Jose, CA



