Data Warehouse Purge Strategy - HELP

Post  AndyPainter on Mon Oct 19, 2009 3:54 pm

I'm looking for the best way to implement a purge strategy in a dimensional model. The reason for purging is that no personal data can be kept for more than 4 years, but the users still want to see counts/facts by non-identifying data. For example, I couldn't keep a name, but I could keep the gender or employer.

Is it best to do this by depersonalising the data, e.g. setting certain columns to NULL after a number of years, or by adopting an aggregate strategy? The data volumes are not huge, but if I go the aggregate route I'm going to be left with quite a lot of duplication, as opposed to just keeping all the detail data and NULLing certain columns after a set time. Keeping the detail also avoids any issues with tools that are not aggregate-aware.
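
To make the aggregate route concrete, here is a rough sketch using Python's built-in sqlite3 module. The table and column names (enrolment_fact, enrolment_agg, enrolled_on, enrolments) and the sample rows are made up for illustration, and a single denormalised table stands in for a real star schema. It rolls old detail rows up into a summary table and deletes them, then shows the extra work a report has to do when the tool is not aggregate-aware.

```python
import sqlite3

def roll_up_and_delete(conn, cutoff_date):
    """Summarise detail rows older than cutoff_date, then delete them."""
    with conn:  # single transaction: both statements commit or neither does
        conn.execute(
            "INSERT INTO enrolment_agg (year, gender, employer, enrolments) "
            "SELECT substr(enrolled_on, 1, 4), gender, employer, COUNT(*) "
            "FROM enrolment_fact WHERE enrolled_on < ? "
            "GROUP BY substr(enrolled_on, 1, 4), gender, employer",
            (cutoff_date,),
        )
        conn.execute(
            "DELETE FROM enrolment_fact WHERE enrolled_on < ?", (cutoff_date,)
        )

# Hypothetical schema and data, assumed for this sketch only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrolment_fact "
    "(name TEXT, gender TEXT, employer TEXT, enrolled_on TEXT)"
)
conn.execute(
    "CREATE TABLE enrolment_agg "
    "(year TEXT, gender TEXT, employer TEXT, enrolments INTEGER)"
)
conn.executemany(
    "INSERT INTO enrolment_fact VALUES (?, ?, ?, ?)",
    [
        ("Alice", "F", "Acme", "2004-03-01"),
        ("Bob", "M", "Acme", "2004-06-15"),
        ("Dana", "F", "Acme", "2004-09-09"),
        ("Carol", "F", "Globex", "2009-01-10"),
    ],
)

# ISO-8601 date strings compare correctly as plain text.
roll_up_and_delete(conn, "2005-10-19")

# Without an aggregate-aware tool, every report must combine both tables itself.
totals_by_gender = conn.execute(
    "SELECT gender, SUM(n) FROM ("
    "  SELECT gender, enrolments AS n FROM enrolment_agg"
    "  UNION ALL"
    "  SELECT gender, 1 AS n FROM enrolment_fact"
    ") GROUP BY gender ORDER BY gender"
).fetchall()
print(totals_by_gender)
```

The names are gone for the old rows, but every query now has to know about two tables, which is the duplication and tooling concern described above.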

TIA

AndyPainter

Posts : 7
Join date : 2009-10-19
Location : Cambridge, UK

View user profile http://enterpriseinformationmanagement.wordpress.com/

Re: Data Warehouse Purge Strategy - HELP

Post  ngalemmo on Mon Oct 19, 2009 4:29 pm

Keep it simple: null the columns you need to drop (or replace their values with a placeholder string, such as 'N/A' or 'Obsolete').
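
As a minimal sketch of that approach, assuming a hypothetical person_dim table with a created_on date column (neither name comes from the thread), the purge is a single UPDATE, and counts by non-identifying attributes are unaffected. Python's built-in sqlite3 module is used only to keep the example self-contained.

```python
import sqlite3

def depersonalise(conn, cutoff_date):
    """Null the identifying column on rows older than cutoff_date.

    Returns the number of rows changed.
    """
    cur = conn.execute(
        "UPDATE person_dim SET name = NULL "
        "WHERE created_on < ? AND name IS NOT NULL",
        (cutoff_date,),
    )
    conn.commit()
    return cur.rowcount

# Hypothetical schema and data, assumed for this sketch only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person_dim ("
    "person_key INTEGER PRIMARY KEY, name TEXT, gender TEXT, "
    "employer TEXT, created_on TEXT)"
)
conn.executemany(
    "INSERT INTO person_dim (name, gender, employer, created_on) "
    "VALUES (?, ?, ?, ?)",
    [
        ("Alice", "F", "Acme", "2004-03-01"),
        ("Bob", "M", "Acme", "2005-06-15"),
        ("Carol", "F", "Globex", "2009-01-10"),
    ],
)

# Purge anything older than 4 years; ISO date strings compare as text.
purged = depersonalise(conn, "2005-10-19")

# Counts by gender still cover every row, purged or not.
counts_by_gender = conn.execute(
    "SELECT gender, COUNT(*) FROM person_dim GROUP BY gender ORDER BY gender"
).fetchall()
print(purged, counts_by_gender)
```

Swapping NULL for a placeholder such as 'N/A' only changes the SET clause and the guard in the WHERE clause.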
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Re: Data Warehouse Purge Strategy - HELP

Post  AndyPainter on Wed Oct 21, 2009 9:24 am

Thanks, it does seem to make a lot of sense to keep it simple, so I will be looking at purging specific data from my selected columns after a certain period of time.

I'll only look at an aggregate strategy if storage space becomes an issue.

I think I will also add a flag to each record where a purge can occur, to make it clear that a column value has been purged rather than being missing data.

AndyPainter

Re: Data Warehouse Purge Strategy - HELP

Post  ngalemmo on Wed Oct 21, 2009 11:19 am

Yes, adding such a column is a good idea. You may also want to include a timestamp.
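
A rough sketch of the flag plus timestamp idea, again with made-up names (person_dim, is_purged, purged_at) and Python's built-in sqlite3 module. The point is that a purged NULL and a genuinely missing NULL can be told apart afterwards.

```python
import sqlite3
from datetime import datetime, timezone

def purge_with_audit(conn, cutoff_date, purged_at=None):
    """Null the name on old rows and record that, and when, it was purged."""
    if purged_at is None:
        purged_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
    cur = conn.execute(
        "UPDATE person_dim SET name = NULL, is_purged = 1, purged_at = ? "
        "WHERE created_on < ? AND is_purged = 0",
        (purged_at, cutoff_date),
    )
    conn.commit()
    return cur.rowcount

# Hypothetical schema and data, assumed for this sketch only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person_dim ("
    "person_key INTEGER PRIMARY KEY, name TEXT, gender TEXT, created_on TEXT, "
    "is_purged INTEGER NOT NULL DEFAULT 0, purged_at TEXT)"
)
conn.executemany(
    "INSERT INTO person_dim (name, gender, created_on) VALUES (?, ?, ?)",
    [
        ("Alice", "F", "2004-03-01"),  # old enough to purge
        (None, "M", "2009-02-02"),     # name was never supplied
        ("Carol", "F", "2009-01-10"),  # recent, keeps its name
    ],
)

# A fixed timestamp is passed here only to make the result reproducible.
changed = purge_with_audit(conn, "2005-10-19", "2009-10-21T11:19:00+00:00")

rows = conn.execute(
    "SELECT name, is_purged, purged_at FROM person_dim ORDER BY person_key"
).fetchall()
for row in rows:
    print(row)
```

The first two rows both have a NULL name, but only the first has is_purged = 1 and a purged_at value, so reports can distinguish the two cases.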
ngalemmo