Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Incremental Load for fact table help

2 posters

Go down

Incremental Load for fact table help Empty Incremental Load for fact table help

Post  bonosungho Fri Feb 03, 2012 1:32 am

Hi All,

I have a fact table which has 8 million rows now and currently gets updated on drop and rebuild basis.

I will briefly describe the ETL process from our source to this fact table first.
1. I firstly incrementally load data from 8 source table into the 8 data storage tables in the data warehouse. (This data storage table contains all the historical data.)
Each source table contains date created and modified column so I retrieve the data that is created or modified later than the MAX date created or modified field on our each data storage table in the data warehouse.
2. Then I load the every data from the 8 data storage tables into the staging tables for transformation.
3. Once transformation is done, those data from 8 tables goes into the fact table. (Fact table is firstly truncated and stores those 8 million rows from the staging table)

I tried to incrementally load data from data storage tables to the fact table. However, there is one concern I can't figure out.

Lets assume we have 3 dimensions and one measure in the fact table called (product, provider, country as dimensions and sales_count as a fact) .
On the first day, one sales transaction is made and inserted into the fact table. Lets called the row ("iPhone", "Apple", "USA", 5)
And on the next day, manager found that the data entered incorrectly and fixed it to ("iPhone","Apple","UK", 5)

My approach for the incremental load was , firstly delete every records from the fact table that has the same dimension keys as in the last staging table, and insert those records.
But I realized that this approach can only handle when dimension key hasn't been updated. If any of dimension key gets updated in the source system, it will insert as a new record and leave the old record in the fact table.

In the above case, Dimension value has been updated so there is no way I can compare updated data with the row already inserted into the fact table on the first day.
Can someone please help me how I should handle this incremental load case?





bonosungho

Posts : 2
Join date : 2012-02-03

Back to top Go down

Incremental Load for fact table help Empty Re: Incremental Load for fact table help

Post  hkandpal Sun Feb 05, 2012 9:00 am

Hi,

how much new data you will be loading on each run and how many will be a update ?
Do you have a Order number field in the fact which helps in identifying the transaction, if yes then may be you can use that to identify
any updates to your transaction.

thanks

hkandpal

Posts : 113
Join date : 2010-08-16

Back to top Go down

Incremental Load for fact table help Empty Re: Incremental Load for fact table help

Post  bonosungho Mon Feb 06, 2012 12:36 am

hkandpal wrote:Hi,

how much new data you will be loading on each run and how many will be a update ?
Do you have a Order number field in the fact which helps in identifying the transaction, if yes then may be you can use that to identify
any updates to your transaction.

thanks

I believe the volume of the rows doesn't need be concerned at this stage.
I want to find out the right methodology which I can apply to this case.

Is there no way to update the fact if one of business key changes? or should the business key never gets updated with the best practice?


Also, I don't have a order number field in the fact.


bonosungho

Posts : 2
Join date : 2012-02-03

Back to top Go down

Incremental Load for fact table help Empty Re: Incremental Load for fact table help

Post  hkandpal Tue Feb 07, 2012 10:12 am

Hi,

it depends upon what you want to capture in your DW, do you want to store the history of changes to your data or no it the requirement is to store the data than you should store it.
Now the question arises how to capture changes in your dimension, how are you identifying the data is it a order number / invoice number or PO number, as you may need something which you will use to idnetify the transaction.

thanks

hkandpal

Posts : 113
Join date : 2010-08-16

Back to top Go down

Incremental Load for fact table help Empty Re: Incremental Load for fact table help

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum