Extracting data when there is no timestamp in the source system

View previous topic View next topic Go down

Extracting data when there is no timestamp in the source system

Post  DWH on Sat Feb 20, 2010 11:00 am

Guys, what are your strategies to extract data when the source system does not have any timestamps. Is there any way to still get deltas ?

Thanks

DWH

Posts : 2
Join date : 2010-02-17

View user profile

Back to top Go down

Re: Extracting data when there is no timestamp in the source system

Post  beyeguru on Thu Feb 25, 2010 9:01 pm

Your question is too general. Are you trying to extract transaction or dimension records?

beyeguru

Posts : 5
Join date : 2009-08-03

View user profile

Back to top Go down

Re: Extracting data when there is no timestamp in the source system

Post  DWH on Sat Feb 27, 2010 10:37 am

You are right beyeguru, it is very general. But I wanted to see how people have solved this problem. It can be for transactions or dimensions. We can always use CRC to do this but than we would have to load all the data in the staging everyday and then compare CRC. Has anyone used other ways ?

Thanks

DWH

Posts : 2
Join date : 2010-02-17

View user profile

Back to top Go down

Re: Extracting data when there is no timestamp in the source system

Post  BoxesAndLines on Sun Feb 28, 2010 2:49 pm

I use a checksum if I cannot get a delta feed.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Extracting data when there is no timestamp in the source system

Post  mselway on Fri Mar 19, 2010 6:18 am

So long as the table has a primary key you can left join the destination table to get only the new records.
Example:

Select s.Attribute1, s.Attribute2, s.Attribute3......
From SourceTable s
Left JOIN TargetTable t
ON s.PrimaryKey = t.PrimaryKey
Where t.PrimaryKey IS NULL

this will retrieve the records not already in the target.
avatar
mselway

Posts : 6
Join date : 2009-10-14
Location : London, England

View user profile http://www.swiftcover.com

Back to top Go down

Re: Extracting data when there is no timestamp in the source system

Post  mugen_kanosei on Mon Apr 12, 2010 2:14 am

I don't know if other database systems have something similar, but for Oracle, I used the ora_rowscn psuedo column because I didnt trust the application controlled modify time field. Its a psuedo column that stores the system change number that gets generated for each transaction. Unfortunantly it applies the SCN to all the rows in the block unless you use rowdependencies. There is an article at OraFaq about it. I keep track of the last ora_rowscn transaction number in my warehouse, and then pull anything greater than that. That will pull all the modifications as well as any other rows that were in that database block. From that I find out the delta to only get the actual modified columns. Oracle provides a better way with Change Data Capture, but due to reasons outside my control, I'm unable to implement.

mugen_kanosei

Posts : 13
Join date : 2009-02-03
Age : 36
Location : Japan

View user profile

Back to top Go down

Re: Extracting data when there is no timestamp in the source system

Post  ngalemmo on Mon Apr 12, 2010 12:21 pm

Interesting... I wasn't aware of ora_rowscn. It is certainly useful when there is resistance to modify the OLTP system to support DW extracts... which is usually the case 99% of the time.

Unfortunately, you probably can't index it... so you wind up doing table scans. But, for lack of anything better, it will significantly reduce the number of rows you need to evaluate in the ETL process.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Extracting data when there is no timestamp in the source system

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum