Large number of late arriving facts
Page 1 of 1 • Share •
Large number of late arriving facts
I am loading a Star Schema from a data base that acts as the staging area. We are converting our source systems to a new transaction system. We are converting data to the new transaction systems in large chunks with conversion dates spread over 3 years. I am only loading data from the staging area only after it has been converted to the new transaction system. When converted, transaction system will contain 3 years of history. As a result, after each conversion, I get 3 years of data.
On the fact table, i want to create a clustered index on Paid Date (of the insurance claim). I am afraid that after a 3 year chunk of data gets converted, the load process will be brought to it's knees. I'd drop the index, load the 3 years worth of history and then rebuild the index. I'm afraid that rebuilding the index will bring the load process to it's knees.
I'm trying to figure out ways to facilitate the historical data. I was thinking about partitioning the table based on the Paid Date, which would limit the amount of data that had to be resorted at one time when the clustered index was rebuilt. I was also considering partitioning the table based on 2 factors - paid date on clients - the conversion schedule is based on clients and client data is a typical filter when querying the fact table.
Thoughts? How would I partition a fact table on 2 unrelated attributes?
On the fact table, i want to create a clustered index on Paid Date (of the insurance claim). I am afraid that after a 3 year chunk of data gets converted, the load process will be brought to it's knees. I'd drop the index, load the 3 years worth of history and then rebuild the index. I'm afraid that rebuilding the index will bring the load process to it's knees.
I'm trying to figure out ways to facilitate the historical data. I was thinking about partitioning the table based on the Paid Date, which would limit the amount of data that had to be resorted at one time when the clustered index was rebuilt. I was also considering partitioning the table based on 2 factors - paid date on clients - the conversion schedule is based on clients and client data is a typical filter when querying the fact table.
Thoughts? How would I partition a fact table on 2 unrelated attributes?
Jeff Smith- Posts: 406
Join date: 2009-02-03
Re: Large number of late arriving facts
I wouldn't focus on historical conversions. I would focus how the table is loaded and queried normally. Historical conversions are always painful which is why they are often done over the weekend.

BoxesAndLines- Posts: 868
Join date: 2009-02-03
Location: USA
Similar topics» locate VIN number on the frame
» Interesting little facts about our American revolution
» Hemroid Surgery Cost - Hemorrhoid Surgery Facts
» Late Arriving Facts
» Large number of late arriving facts
» Interesting little facts about our American revolution
» Hemroid Surgery Cost - Hemorrhoid Surgery Facts
» Late Arriving Facts
» Large number of late arriving facts
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum