Precision for the ETL Load/Audit-related Timestamp field.


Post by juz_b on Mon May 09, 2011 6:56 pm

Hi all,

For every target table, we always append ETL load-related Insert/Update/Delete timestamps for audit purposes and for time-based CDC used by downstream extractions.
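For context, here is a minimal sketch of what such audit columns might look like on a target table. The column names, the TIMESTAMP(0) whole-second type, and the helper function are illustrative assumptions, not a standard:

```python
# Minimal sketch (assumed column names/types): audit timestamp columns
# appended to every target table for load auditing and time-based CDC.
AUDIT_COLUMNS = {
    "etl_insert_ts": "TIMESTAMP(0)",  # whole-second precision
    "etl_update_ts": "TIMESTAMP(0)",
    "etl_delete_ts": "TIMESTAMP(0)",  # soft-delete marker, if used
}

def with_audit_columns(table_name: str, business_columns: dict) -> str:
    """Build a CREATE TABLE statement that appends the audit columns."""
    cols = {**business_columns, **AUDIT_COLUMNS}
    col_sql = ",\n    ".join(f"{name} {dtype}" for name, dtype in cols.items())
    return f"CREATE TABLE {table_name} (\n    {col_sql}\n);"

print(with_audit_columns(
    "dim_customer",
    {"customer_key": "INTEGER", "customer_name": "VARCHAR(100)"},
))
```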

Questions:

1. Is it best practice to implement a timestamp datatype that supports sub-second precision? And why?
2. What are the cons of having a timestamp datatype that only goes down to whole seconds? Granted, we lose sub-second precision, but for practical purposes, is this really a problem?

It seems like the key here is consistency: as long as the timestamp datatype is the same at every stage of the ETL stream (stage, dim, fact, etc.), we should be fine.

Comments?

Thanks!



Re: Precision for the ETL Load/Audit-related Timestamp field.

Post by ngalemmo on Mon May 09, 2011 9:28 pm

Considering these are batch processes, I don't think it is really anything to be concerned about. Whole seconds should be fine. The only concern is that when pulling from the source system, you should use the clock on that system to log the time you initiated the extract whenever possible.
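A hedged sketch of that idea, using Python's standard sqlite3 module as a stand-in for the source system's database connection (in practice you would use your source system's driver and its equivalent of CURRENT_TIMESTAMP):

```python
import sqlite3

def source_clock_now(conn) -> str:
    """Ask the *source* system for its current time instead of trusting
    the ETL server's clock; the two can drift apart."""
    # CURRENT_TIMESTAMP is evaluated on the source database (here, SQLite).
    return conn.execute("SELECT CURRENT_TIMESTAMP").fetchone()[0]

# Stand-in connection; in practice this would be the source system's connection.
conn = sqlite3.connect(":memory:")
extract_started_at = source_clock_now(conn)
print("Logging extract start (source clock):", extract_started_at)
```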

Re: Precision for the ETL Load/Audit-related Timestamp field.

Post by juz_b on Tue May 10, 2011 12:27 pm

Thanks!

So are there instances where sub-second precision would be appropriate in an ODS or data warehouse environment? Maybe a real-time ODS or data warehouse? Or is it more suited to OLTP systems?


Re: Precision for the ETL Load/Audit-related Timestamp field.

Post by ngalemmo on Tue May 10, 2011 1:20 pm

Fractional seconds don't matter in any of these situations, nor does a difference in precision between systems.

If you are pulling data from system A, you must use the clock on system A as the point of reference when comparing timestamps to determine if something is 'new'. If you do not use system A's clock, you have problems well beyond anything fractional-second precision would cause.

You can even reduce the precision of your benchmark timestamp and it still doesn't change anything. Say you call system A to get the time and it returns milliseconds; round it to the second, or even to the minute, and use that as the extract boundary. The extract SQL pulls new data that arrived between the time of the last pull and the current time. It doesn't matter how often you do this: each extract will always pick up the new data, provided there are no long-running active transactions on the source system.

In fact, it is common to back off the current time to avoid issues with active transactions: take the current time, subtract a few minutes (or whatever is appropriate), and use that as the upper limit.
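As a rough illustration of that windowing logic (the five-minute lag, the whole-second rounding, and the source table/column names are assumptions for the sketch, not prescribed values):

```python
import sqlite3
from datetime import datetime, timedelta

LAG = timedelta(minutes=5)  # assumed back-off to avoid in-flight transactions

def extraction_window(conn, last_pull: datetime) -> tuple[datetime, datetime]:
    """Compute [last_pull, upper) using the SOURCE system's clock,
    truncated to whole seconds and backed off by LAG."""
    raw = conn.execute("SELECT CURRENT_TIMESTAMP").fetchone()[0]
    source_now = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")  # whole-second precision
    upper = source_now - LAG
    return last_pull, upper

conn = sqlite3.connect(":memory:")  # stand-in for the source connection
low, high = extraction_window(conn, last_pull=datetime(2011, 5, 9, 21, 0, 0))

# Hypothetical extract query against a source table with a change-timestamp column:
extract_sql = "SELECT * FROM source_table WHERE change_ts >= ? AND change_ts < ?"
print(extract_sql, (low.isoformat(sep=" "), high.isoformat(sep=" ")))
```

The next run would then use the upper bound as its `last_pull`, so rows that land during the lag are simply picked up by a later extract rather than missed.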
