Change Data Capture

View previous topic View next topic Go down

Change Data Capture

Post  ATran on Tue Aug 17, 2010 11:02 am

Hi,
We are trying to chose a way to capture change from the source system (we have to support MSSQL and Oracle).

- i have problem understanding the difference between the two method : Using Audit Column and Timed Extract.
Plus, if we choose this technique, we need a way to capture the deleted row.
Relying on trigger is not a very good option for us, because we don't want to add trigger in every table.
Maybe, having a staging table with only the source id for a given table could do the trick ? (we can then compare with the source table to know which id have been deleted)

- i have problem understanding the difference between the two method : Process of Elimination and Initial and Incremental Loads. For both method, you seemed to compare two table, the current and the previous.
And how do you techniqualy implemet these method

ATran

Posts : 11
Join date : 2010-03-25

View user profile

Back to top Go down

Re: Change Data Capture

Post  ngalemmo on Tue Aug 17, 2010 12:04 pm

One approach is to use the database's own change logs. Some ETL tools support this type of interface. The challenge is the nature of these logs vary by database system, so you need different code for each database.

The most common generic method is to extract based on an update timestamp stored on the source row. But not all systems provide such a column. If available, the extract process would keep track of the last time it extracted and pull all rows changed since then. However, this approach will not capture hard deletes.

The brute force approach would be to pull everything (within reason) from the source and compare it to what is in the DW and apply changes.

But what you need to do for a particular situation varies depending on the nature of the system and the data you are working with. There is no one method that suits everything and a lot of variations to the methods I mentioned.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Change Data Capture

Post  ATran on Tue Aug 17, 2010 1:35 pm


Reading the log is too specific, and required two big developement when having 2 differents database system.

I think the simplest and better solution for us will be to use the update timestamp column in the source system.
To track hard delete row i may use this solution if i don't find any other :

To track delete row from table EMPLOYEES, i will have a staging table STG_EMPLOYEES with only the existing id from the previous ETL process.
And i will compare the id from the current table EMPLOYEES and STG_EMPLOYEES to know which id was deleted


ATran

Posts : 11
Join date : 2010-03-25

View user profile

Back to top Go down

Re: Change Data Capture

Post  Steveo250k on Tue Mar 26, 2013 12:38 pm

I realize this is an old post, but I just came across it. I don't believe ATran's first question was answered, at least I still have the same question. In Ralph Kimball's Design Tip #63 he describes 4 ways to detect changes.

#1 is by looking at "Audit Columns", columns stored on the source record that indicate when the record was created or last updated.

#3 is "Timed extracts. With a timed extract you typically select all of the rows where the date in the Create or Modified date fields equal" ... to some relative date like TODAY -1.

What's the difference between these two options for detecting changed records? It appears to me #3 is just #1 looking at yesterday's records. Am I missing something?

Steveo250k

Posts : 6
Join date : 2012-08-10

View user profile

Back to top Go down

Re: Change Data Capture

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum