Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

updating a live database

3 posters

Go down

updating a live database Empty updating a live database

Post  scott.kan Wed May 02, 2012 4:39 pm

Hello,

We get the previous day transactions at 8 am in the morning. It takes about four hours to load the data into the dw. So, the load happens during the regular business hours while the users are actively querying the database. One of the questions I was asked today is how do we ensure the integrity of the data when the data is being consumed as it gets updated. Some records may still show the data that is two days old and some others may show the current as of yesterday.

Here is a tentative solution that was recommended by someone: We will have two databases/schemas, lets call db1 and db2. The reports will run off of db1. We will have an exact copy of it as db2. We will load the daily jobs into db2 while the reports are being run from db1. Once the load is complete for that day, get the users out of the system for a brief period of time, drop db1, rename db1 to db2, bring the users back, then copy db1 to db2. Lather, rinse and repeat.

First time when I heard this, all I said was 'crazeeee'. This is way overkill if you ask me. But what other options do we have?

Any ideas anyone?

scott.kan

Posts : 3
Join date : 2012-05-02

Back to top Go down

updating a live database Empty Re: updating a live database

Post  ngalemmo Wed May 02, 2012 4:59 pm

How about transactions? Any update process against a relational database needs to operate in a transaction state. Database integrity rules are such that no concurrent user will see any changes performed by another transaction. Only after the transaction has finished and a commit performed will the updates become visible to new queries.

There are other issues more pertinent to your situation. First is resource contention during the day, second is what does it mean to load in the middle of a business day? Reports run in the morning will be missing data of reports run in the afternoon. If consistency during the day is more important than the currency of the data (day old versus 2 days old), you may want to consider delaying the load until after business hours.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

updating a live database Empty Re: updating a live database

Post  scott.kan Wed May 02, 2012 8:02 pm

Very valid points. We do have the resource contention. Delaying until after the business hours is something I suggested which is on the table. However as you can expect, this option will not go well with some users. Hence we are looking for alternate solutions for this problem.

scott.kan

Posts : 3
Join date : 2012-05-02

Back to top Go down

updating a live database Empty Re: updating a live database

Post  ngalemmo Wed May 02, 2012 10:20 pm

If you decide to load during the day, you separate fact loading from dimension loading. Do dimensions first. If the dimension is a type 2 dimension, users will never see the update with the current fact tables, since no fact table would reference a newly added dimension row.

Load all the fact tables at once if possible. Preprocess the source data to produce a temporary table containing the new fact rows. When all the preprocessing (validation, key assignment, etc...) is done, insert the rows from the temp tables to the target fact table. This will minimize the time window when specific fact tables may be out of sync. Besides, even if they are, it only matters if a query being run at the time needs both fact tables.

The only issue is type 1 or type 3 dimensions. After you run a process that updates the dimension, new queries will see the updated dimension. This may affect some reports as they may be using an attribute that changed. However, dimension updates are relatively small compared to transaction volumes. This shortcoming may have no material effect on most reports. Other than this case, a user should never see dimensions out of sync with the facts.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

updating a live database Empty Re: updating a live database

Post  scott.kan Thu May 03, 2012 8:19 pm

Thank you for all the pointers. They all make sense to me. I understand that I have to try and re-architect the integration architecture, given the complex requirement.

scott.kan

Posts : 3
Join date : 2012-05-02

Back to top Go down

updating a live database Empty Re: updating a live database

Post  Blackcat Wed Dec 17, 2014 2:12 am

The part that confuses me is the join, the concept of loading the fact into a new table makes perfect sense, however I assumed the problem would be how to know which dimension record to use for the fact... since none of the fact attributes exist in the dimension and the PK in the dim no longer exists.






________________
watson

Blackcat

Posts : 1
Join date : 2014-12-17

Back to top Go down

updating a live database Empty Re: updating a live database

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum