Dimension Table - Primary Key Question SCD2

View previous topic View next topic Go down

Dimension Table - Primary Key Question SCD2

Post  mick on Tue Jul 22, 2014 8:05 pm

Hi There,

I am designing a dimension for the first time and have a question regarding primary keys with SCD2. Do all dimension tables simply use the surrogate key as the primary key? The reason why I ask is that I want to prevent data being loaded twice a day with the chance of having multiple SCD2 records with the same day stamp which could impact / complicate reporting. Should this restriction logic be handled solely at the ETL level with no logical prevention at the database level?

Thanks.

Michael

mick

Posts : 3
Join date : 2014-07-22

View user profile

Back to top Go down

Re: Dimension Table - Primary Key Question SCD2

Post  BoxesAndLines on Wed Jul 23, 2014 8:11 am

Yes, all dimensions have surrogate keys. Your ETL should be robust enough to identify a file that's already been loaded.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Reloading of data

Post  mick on Wed Jul 23, 2014 5:47 pm

Another quick question. If data needs to be reloaded for any particular day, what is the best way to do this? given SCD2 changes.

mick

Posts : 3
Join date : 2014-07-22

View user profile

Back to top Go down

Re: Dimension Table - Primary Key Question SCD2

Post  nick_white on Fri Jul 25, 2014 2:54 am

Probably not a quick answer - it depends on why you need to reload and what the current state of your system is.

Cleanest solution would be to restore your DW to a previously 'known good' state and re-load your data. But that pre-supposes
a) You have backups and your backup/restore process is known to work. You hear frequent stories of companies that think they are taking backups only to find out they aren't when they need them; or are taking backups but find out they can't restore from them - or it will take so long to restore that you'll never catch up you r backlog of unprocessed transactions
b) you have the source data available and in a format that you can re-build the DW from. This is the point when you understand the reason for not deleting anything from your staging area - when you realise you can't re-create history from your source system

The other approach is to reload your data on top of your DW as it currently stands. There is no generic solution for how to do this - it depends on what the state of your DW is (i.e. why you need to re-load data), how many tables are not SCD1 (so you can't just overwrite them), etc. This approach is almost certainly going to require a custom set of ETL routines.

Whatever solution you select, you need to have built any required routines and tested them during your development phase. Trying to hack some custom ETL together in a Prod environment to reload data while your users are shouting at you to get their data sorted is not a good place to be in!

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: Dimension Table - Primary Key Question SCD2

Post  mick on Sun Jul 27, 2014 5:50 pm

Thanks for your response Nick. The reason why I asked is that we have the occasion where queries may need to be changed from source system which could impact on fact calculations. I guess the easiest thing to do would be to package them up and deploy them to run the following day knowing that a point in time for reports, there were some logical changes.

Interesting note about storing archives for staging. At the moment the ETL job that I have written simply copies a days data to a staging table before moving it over to the final fact / dimension tables. I could also pump the results back to daily text files if I needed to re-import history.

mick

Posts : 3
Join date : 2014-07-22

View user profile

Back to top Go down

Re: Dimension Table - Primary Key Question SCD2

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum