SCD Type2 - ETL Design

View previous topic View next topic Go down

SCD Type2 - ETL Design

Post  KK_ETL on Thu Nov 05, 2009 1:02 pm

Hi,

We are trying to build a new data warehouse. Planning to capture the data as SCD Type2 in the Data Warehouse. However, the source system doesn't has any date fields for extraction.
Let me explain the scenario:
Table Name : DWS_CUST
Scenario 1
Columns : Customer No (PK) Cust_Name Cust_address1 Cust_address2 Cust_address3 Cust_address4 Start_date End_date

Data 1 XYZ 34 04-11-2008 01-01-1999

After few months say on 08-12-2008 the customer changes address: Incremental Load - Ovenight process in Data Warehouse.
Scenario 2
Columns : Customer No (PK) Cust_Name Cust_address1 Cust_address2 Cust_address3 Cust_address4 Start_date End_date

Data 1 XYZ 34 04-11-2008 07-12-2008
1 XYZ 89 08-12-2008 01-01-1999

So the data load which takes place overnight which starts at 08-12-2009 and carries forward to 09-12-2009 should make sure that the Start_date is still 08-12-2009.

To achive this so far upon research and thinking i felt the solution to this is to create a Load audit table that would identify the load date which is the Start_date and supply the information to all the records that are potential for incremental load.
If the record is new then it would Insert as in table whose Customer address 34 Scenario 1 .
If the record is for update then it would be as in customer address 89 Scenario 2.

Am I doing correct? If yes, how exactly can this be achieved practically?
Is there anything I'm missing? If I'm doing anything wrong please correct me. If there is any other way please suggest.

Regards,
KK

KK_ETL

Posts : 3
Join date : 2009-11-05

View user profile

Back to top Go down

Re: SCD Type2 - ETL Design

Post  JacekA on Thu Nov 05, 2009 8:44 pm

KK_ETL wrote:
So the data load which takes place overnight which starts at 08-12-2009 and carries forward to 09-12-2009 should make sure that the Start_date is still 08-12-2009.

To achive this so far upon research and thinking i felt the solution to this is to create a Load audit table that would identify the load date which is the Start_date and supply the information to all the records that are potential for incremental load.
As far as I understand, your goal is to save the date from the previous day as a cut-off date, is that right?
If so, you can (actually you should for many other reasons, e.g. for monitoring) build the log table containing info about your load jobs.
The other solution, which we sometimes use, is an agreement, that a processing day is not the same as the calendar day (similarly to the difference between calendar year and financial year). You could for example agree that the processing day D starts at 21:00 of calendar day D and lasts till 20:59 of calendar day D+1. So in Oracle SQL scripts, instead of
trunc(SYSDATE)
you would then write
trunc(SYSDATE - INTERVAL '21' HOUR)


Jacek Adamowicz

JacekA

Posts : 3
Join date : 2009-10-22

View user profile

Back to top Go down

Re: SCD Type2 - ETL Design

Post  KK_ETL on Thu Nov 12, 2009 10:31 am

Thank you JacekA.
What I did here was to create a Parameter File/Function on a Server that holds the business date and is only changed when the triggering element instructs to change the business date to new one. This also helps because the data can be processed for any historical date just by controlling the business date in the system.

But I have an issue here. Suppose if the data load fails on a given date. The next nightly loads should be able to pick up the data for different business dates. Is there something that anyone can suggest?

KK_ETL

Posts : 3
Join date : 2009-11-05

View user profile

Back to top Go down

Re: SCD Type2 - ETL Design

Post  kapoor_dh on Tue Dec 08, 2009 2:51 am

I have the potential solution but before that I would like to know what is the incremental mechanism you are following,i.e. how do you pull the incremental data each day,it all depends on that

kapoor_dh

Posts : 24
Join date : 2009-12-08

View user profile

Back to top Go down

Re: SCD Type2 - ETL Design

Post  KK_ETL on Tue Dec 08, 2009 5:51 am

Yes,We are currently using the journalizing method for incremental extraction. We have Journal tables in the source tables where whenever there is a change in the record in a table the respective journal table picks up this change. Using these tables we load the incremental data in the ODI.

KK_ETL

Posts : 3
Join date : 2009-11-05

View user profile

Back to top Go down

Re: SCD Type2 - ETL Design

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum