How to handle same day changes
2 posters
Page 1 of 1
How to handle same day changes
I am building the SCD2 and have a question about the end_date for the records for the same business key coming intra day.
Right now i am populating the end date = effstart_date -1 but for the same day changes, the end date is less than the effstart_Date, which is not right.
can any please suggest something on this?
Day at 11 AM ( Source) 7/31/2015
Natural Key drnumber name
A 123 xyz
B 456 abc
SCD 2 Type2 ( dimension table)
SKEY NK drnumber name Effstart_Date end_date Curr_version
1 A 123 xyz 7/31/2015 12/31/9999 Y
2 B 456 abc 7/31/2015 12/31/9999 Y
Day at 11:30 AM ( Source) 7/31/2015
Natural Key drnumber name
A 895 xyz
B 456 abc
SCD 2 Type2
SKEY NK drnumber name Effstart_Date end_date Curr_version
1 A 123 xyz 7/31/2015 7/30/2015 N
2 B 456 abc 7/31/2015 12/31/9999 Y
3 A 895 xyz 7/31/2015 12/31/9999 Y
Question-
1) Will Adding timestamp is good idea and use the effestart_date_TimeStamp - 5 min to get the end date
for the same records coming in intra day.
E.g
SKEY NK drnumber name Effstart_Datetime end_datetime Curr_version
1 A 123 xyz 7/31/2015 11:00:00 7/31/2015 11:25:00 N
2 B 456 abc 7/31/2015 11:00:00 12/31/9999 00:00:00 Y
3 A 895 xyz 7/31/2015 11:30:00 12/31/9999 00:00:00 Y
2) Any downside for the fact population with this?
2) Will Post process sql/code to group records where end_date
Thanks
Sharvan
Right now i am populating the end date = effstart_date -1 but for the same day changes, the end date is less than the effstart_Date, which is not right.
can any please suggest something on this?
Day at 11 AM ( Source) 7/31/2015
Natural Key drnumber name
A 123 xyz
B 456 abc
SCD 2 Type2 ( dimension table)
SKEY NK drnumber name Effstart_Date end_date Curr_version
1 A 123 xyz 7/31/2015 12/31/9999 Y
2 B 456 abc 7/31/2015 12/31/9999 Y
Day at 11:30 AM ( Source) 7/31/2015
Natural Key drnumber name
A 895 xyz
B 456 abc
SCD 2 Type2
SKEY NK drnumber name Effstart_Date end_date Curr_version
1 A 123 xyz 7/31/2015 7/30/2015 N
2 B 456 abc 7/31/2015 12/31/9999 Y
3 A 895 xyz 7/31/2015 12/31/9999 Y
Question-
1) Will Adding timestamp is good idea and use the effestart_date_TimeStamp - 5 min to get the end date
for the same records coming in intra day.
E.g
SKEY NK drnumber name Effstart_Datetime end_datetime Curr_version
1 A 123 xyz 7/31/2015 11:00:00 7/31/2015 11:25:00 N
2 B 456 abc 7/31/2015 11:00:00 12/31/9999 00:00:00 Y
3 A 895 xyz 7/31/2015 11:30:00 12/31/9999 00:00:00 Y
2) Any downside for the fact population with this?
2) Will Post process sql/code to group records where end_date
Thanks
Sharvan
sharvan.kumar.83@gmail.co- Posts : 10
Join date : 2014-11-17
Re: How to handle same day changes
Do you really need intraday changes? Are facts being loaded throughout the day?
If you are recording intraday changes, your source should provide the timestamp when the change occurred. The expiration timestamp for the previous row is calculated by subtracting the smallest unit of time (1 millisecond or 1 microsecond) from the effective timestamp.
If you are only loading facts once a day, the need for intraday change history is questionable. But if you do, it only makes sense if you use the transaction timestamp to locate the appropriate dimension version for assigning the foreign key.
If you are recording intraday changes, your source should provide the timestamp when the change occurred. The expiration timestamp for the previous row is calculated by subtracting the smallest unit of time (1 millisecond or 1 microsecond) from the effective timestamp.
If you are only loading facts once a day, the need for intraday change history is questionable. But if you do, it only makes sense if you use the transaction timestamp to locate the appropriate dimension version for assigning the foreign key.
Re: How to handle same day changes
we have source timestamp that provides the change records timestamp however it is not reliable as sometime backend data changes do not touch the change timestamp. Hence can not fully relay on that. So thinking to use the sysdate ( for EFFECTIVE_DATE ) and will use that to drive the expiration timestamp for the change record.) P.Experiation_date = Effective_Date -1/1440, ( subtracting 1 min)
On the fact side, we do not have the timestamp but we have transaction date ( no time factor ) , So while populating the fact how do i look up the date with dimension timestamp as there could be many intraday changes for some dimensions.
thanks
Sharvan
On the fact side, we do not have the timestamp but we have transaction date ( no time factor ) , So while populating the fact how do i look up the date with dimension timestamp as there could be many intraday changes for some dimensions.
thanks
Sharvan
sharvan.kumar.83@gmail.co- Posts : 10
Join date : 2014-11-17
Re: How to handle same day changes
Unless you have a reliable timestamp coming from the source, you cannot reliably record all intraday changes. You also cannot reliably apply updates if the source contains multiple rows for the same natural key.
When you have a source that contains multiple rows for the same NK, at the very least you need to know which is the last one. Without a proper timestamp you cannot do it reliably.
And, if you don't have a timestamp on the fact, what is the point of retaining intraday changes in dimensions?
When you have a source that contains multiple rows for the same NK, at the very least you need to know which is the last one. Without a proper timestamp you cannot do it reliably.
And, if you don't have a timestamp on the fact, what is the point of retaining intraday changes in dimensions?
Re: How to handle same day changes
Thanks a lot..
We have to accommodate the intra day for business requirement. I will see if i can use the source time stamp.
And while loading fact i will use the current dimension ( where current version = 'Y' ) for multiple intra day changes as there is no time stamp in transaction date.
Is this is strategy seems oK or Any other scenario that needs to take care?
Thanks
Sharvan
We have to accommodate the intra day for business requirement. I will see if i can use the source time stamp.
And while loading fact i will use the current dimension ( where current version = 'Y' ) for multiple intra day changes as there is no time stamp in transaction date.
Is this is strategy seems oK or Any other scenario that needs to take care?
Thanks
Sharvan
sharvan.kumar.83@gmail.co- Posts : 10
Join date : 2014-11-17
Re: How to handle same day changes
That will work, but facts will never reference some of the intraday dimension versions.
If the dimension timestamp is not reliable you may encounter multiple source rows with the same timestamp. You will not know which one to load. Even if you load all of them, the correct current one will be left to chance. The others will be 'invisible' if you filter by time as the effective timestamp will be later than the expiration timestamp.
If the dimension timestamp is not reliable you may encounter multiple source rows with the same timestamp. You will not know which one to load. Even if you load all of them, the correct current one will be left to chance. The others will be 'invisible' if you filter by time as the effective timestamp will be later than the expiration timestamp.
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|