Changes to Facts?
2 posters
Page 1 of 1
Changes to Facts?
I currently have a fact table houses some data around investments and their related performance (1 row per investment per day held). Daily P&L, quantity of shares, etc. are sourced from the accounting system as part of a nightly ETL job. Every so-often, someone mistypes a piece of information into the accounting system (i.e. quantity of shares entered as 50 instead of 500) and we have a cancel-correct entry booked in the accounting system when the error is caught (sometimes days later). Currently, our ETL job finds cancel-corrects, deletes the corresponding facts, then re-creates them with the corrected data. From what I can tell - deleting fact data is generally frowned upon, but I'm not too keen on the idea of keeping the "wrong" data in the fact table. Keeping both versions of the fact record (good and bad), then adding logic to reports to pull the most-up-do-date version, seems like a fair amount of added complexity for storing a piece of information nobody's ever going to want to see. I'm curious to know how others have handled this scenario. Are others just deleting the bad fact-data, are they dealing with the extra effective date on the fact record, or is there some elegant solution I'm just not seeing?
benb123- Posts : 3
Join date : 2011-02-17
Re: Changes to Facts?
I don't like changing facts. My feeling is that I want to be able to recreate a report. If a report goes out today, I want to be able to recreate the report next week or next year.
I deal with insurance claims. Sometimes a claim gets reworked which changes the amount paid. A claim can originally be paid for $50 and then it gets reworked and it might need to be paid for $75. When this happens, I clone the original version of the claim except that I change the amounts to negatives and change the Paid date to the new version of the claim. And then I load the new version of the claim. This gives me 3 versions of the claim - the Original, and adjusting claims (with the negatives), and the most recent claim.
Sometimes the time between the original and reworked claim can be months. I would get
Claim 1, Version 1, Paid Amount $50, transaction date 1/1/2010, paid date 1/15/2010
Claim 1, Version 2, Paid Amount -$50, transaction date 1/1/2010, paid date 4/20/2010
Claim 1, Version 3, Paid Amount $75, transaction date 1/1/2010, paid date 4/20/2010
This tells people that on 1/15/2010, we paid $50. We should have paid $75, so on 4/20/2010, we paid another $25.
I deal with insurance claims. Sometimes a claim gets reworked which changes the amount paid. A claim can originally be paid for $50 and then it gets reworked and it might need to be paid for $75. When this happens, I clone the original version of the claim except that I change the amounts to negatives and change the Paid date to the new version of the claim. And then I load the new version of the claim. This gives me 3 versions of the claim - the Original, and adjusting claims (with the negatives), and the most recent claim.
Sometimes the time between the original and reworked claim can be months. I would get
Claim 1, Version 1, Paid Amount $50, transaction date 1/1/2010, paid date 1/15/2010
Claim 1, Version 2, Paid Amount -$50, transaction date 1/1/2010, paid date 4/20/2010
Claim 1, Version 3, Paid Amount $75, transaction date 1/1/2010, paid date 4/20/2010
This tells people that on 1/15/2010, we paid $50. We should have paid $75, so on 4/20/2010, we paid another $25.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: Changes to Facts?
Ahh... That makes perfect sense. I didn't want to put extra logic in the reporting platform to use an "as-of" date to find the most-recent record, but with the canceling entry, I won't need to. I need to fully digest this, but I think it answers my question. Thanks a lot for the insight.
benb123- Posts : 3
Join date : 2011-02-17
Follow up question
Here's a followup. Every once in a while, a type 2 dimension record will "split" due to late arriving information. In the example below, I found recently that Company A was not always incorporated in USA and updated the dimension table to accurately reflect history. Without updating the fact records, all my records before 2005 are going to point to the wrong version of the dimension record. Would you use the same approach - essentially canceling out the old fact records with negative records and inserting new ones (pointing to the corrected dimension key)?
becomes
Company Key | Company ID | Company | Country | From | To |
1 | 1000 | Company A | USA | 1/1/2005 | 1/1/2010 |
becomes
Company Key | Company ID | Company | Country | From | To |
100 | 1000 | Company A | CANADA | 1/1/2005 | 12/31/2006 |
101 | 1000 | Company A | USA | 1/1/2007 | 1/1/2010 |
benb123- Posts : 3
Join date : 2011-02-17
Re: Changes to Facts?
I would use the same approach. To be honset with you, I'm not a trusting person of source data. I compare all of my new fact data to history to determine if the next fact row had been previously loaded and if so, I apply the adjustment period. In my situation, I can receive claims that were denied and then later paid, which would have a different set of dimension keys for certain attributes.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Similar topics
» Multiple Facts or Single Facts and Status Table?
» How best to model Timesheet facts against Sales Order facts
» Inventory facts and production facts
» Identify the facts and facts grain
» In which layer to relate facts to other facts?
» How best to model Timesheet facts against Sales Order facts
» Inventory facts and production facts
» Identify the facts and facts grain
» In which layer to relate facts to other facts?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|