Test Data Generation

Test Data Generation

Post  Udankar on Sat Mar 21, 2009 7:12 am

Hi,

I am part of a data warehousing project that has now entered the testing phase. Our client, like most companies, is reluctant to share production data with software vendors for testing, for security reasons.

To test a data warehouse, we need good test data that does not compromise data security or privacy. So how can we generate near-real test data that will improve test quality? The test data should resemble real production data and match the production system in proportions and volume.
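One possible starting point, sketched here under stated assumptions, is to generate rows from a seeded random generator so that every test run sees the same data. The table layout, the column names (`customer_id`, `name`, `segment`, `signup_date`) and the segment proportions are all hypothetical; in practice the proportions would come from aggregate statistics that the client is willing to share.

```python
import random
import string
from datetime import date, timedelta


def make_customers(n, seed=42):
    """Generate n synthetic customer rows; all field names are hypothetical."""
    rng = random.Random(seed)  # fixed seed makes test runs repeatable
    segments = ["retail", "corporate", "sme"]
    weights = [0.7, 0.1, 0.2]  # assumed proportions, not real production figures
    start = date(2000, 1, 1)
    rows = []
    for i in range(1, n + 1):
        rows.append({
            "customer_id": i,  # sequential surrogate key, unique by construction
            "name": "".join(rng.choices(string.ascii_uppercase, k=8)),
            "segment": rng.choices(segments, weights=weights, k=1)[0],
            "signup_date": start + timedelta(days=rng.randrange(3650)),
        })
    return rows


customers = make_customers(1000)
```

Raising `n` scales the volume toward production size, while the weights keep the category mix roughly in proportion.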

Has this been discussed already?

Udankar

Posts : 1
Join date : 2009-03-21


Re: Test Data Generation

Post  Mohsin on Thu Apr 02, 2009 5:17 am

Well, we faced the same problem. Cooking up data on your own is a time-consuming and resource-intensive task, so we asked our client to give us some fudged sample data, and they agreed.

For example, we were dealing with the client's consumer data, so we asked our client to obscure their customers' identification and contact details by eliminating 5-6 characters from the important columns. Hope that helps you get a handle on it.
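A rough sketch of that kind of masking, as the client might run it before handing data over, could look like the following. It overwrites the last few characters instead of deleting them so that column lengths stay the same, which is a variation on what we did; the function names and the sample columns (`customer_id`, `phone`, `city`) are made up for illustration.

```python
def mask_value(value, n_chars=5, fill="X"):
    """Overwrite the last n_chars characters of value with fill."""
    text = str(value)
    keep = max(len(text) - n_chars, 0)
    return text[:keep] + fill * (len(text) - keep)


def mask_rows(rows, columns, n_chars=5):
    """Return copies of rows with the named columns masked."""
    return [
        {key: mask_value(val, n_chars) if key in columns else val
         for key, val in row.items()}
        for row in rows
    ]


# Hypothetical sample row; only the sensitive columns are masked.
sample = [{"customer_id": "C000123456", "phone": "5551234567", "city": "Pune"}]
masked = mask_rows(sample, columns={"customer_id", "phone"})
```

One caveat: masked identifiers can collide with each other, so columns that are used as join keys may need a consistent one-to-one substitution instead of simple truncation.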

Mohsin

Posts : 4
Join date : 2009-03-03


Test Data

Post  pbestgen on Thu Apr 23, 2009 10:11 am

I think it is impossible to launch a data warehouse project without access to real production data. In the early stages of analysis, it is necessary to run at least some minimal data profiling on the source (file or DB) to get a clear picture of the formats, patterns, referential integrity, and business rules that apply to the real data. What we find in the specifications or in the metadata (if they exist) is often very different from reality. It would be dangerous to underestimate this step: it avoids many surprises that we have unfortunately discovered too late. Many data integration projects fail because this analysis phase was not done correctly.
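To make "minimal data profiling" concrete, here is a small sketch of what a per-column profile might compute: null count, distinct count, length range, and the most frequent character patterns. The function names and the sample phone values are invented for the example, and the column is assumed to hold strings (or `None`).

```python
from collections import Counter


def pattern_of(value):
    """Map digits to 9 and letters to A, keeping other characters as-is."""
    return "".join(
        "9" if c.isdigit() else "A" if c.isalpha() else c for c in value
    )


def profile_column(values):
    """Summarize one column: null count, distinct count, lengths, patterns."""
    present = [v for v in values if v not in (None, "")]
    lengths = [len(v) for v in present]
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "min_len": min(lengths, default=0),
        "max_len": max(lengths, default=0),
        "patterns": Counter(pattern_of(v) for v in present).most_common(5),
    }


# Hypothetical source column with mixed formats and missing values.
phones = ["555-1234", "555-9876", "", None, "5551234", "n/a"]
report = profile_column(phones)
```

Even on this tiny sample the pattern list exposes three different formats in one field, which is exactly the kind of thing the specifications tend not to mention.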

How do you determine the granularity of a fact table if the fields forming the logical key are not clearly identified? How do you denormalize amounts, for example, if the possible values of the "type of amount" are not fully listed? And so on.
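Both questions can be checked directly against real source data. The sketch below, using invented column names (`order_id`, `line_no`, `amount_type`), tests whether a candidate logical key is actually unique and lists the distinct amount types that a denormalized design would have to turn into columns.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return key tuples that occur more than once for the candidate key."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def distinct_values(rows, column):
    """Return the sorted set of values seen in one column."""
    return sorted({row[column] for row in rows})


# Hypothetical source rows: one amount per (order line, amount type).
rows = [
    {"order_id": 1, "line_no": 1, "amount_type": "NET", "amount": 100.0},
    {"order_id": 1, "line_no": 1, "amount_type": "TAX", "amount": 20.0},
    {"order_id": 1, "line_no": 2, "amount_type": "NET", "amount": 50.0},
]

dups_order_line = duplicate_keys(rows, ["order_id", "line_no"])
dups_with_type = duplicate_keys(rows, ["order_id", "line_no", "amount_type"])
amount_types = distinct_values(rows, "amount_type")
```

Here the pair (`order_id`, `line_no`) has duplicates, so it is not the grain of the source; adding `amount_type` makes the key unique. Without the real data, neither the duplicates nor the full list of amount types can be known.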

Do not forget that if your data warehouse must take historical data into account, the problem may become more serious. Legacy applications are the source of this data, and the rules that govern the data change over the years. There are many examples where the meaning of a field in the source DB has changed over time.

No, honestly, your job is not easy. Building a data warehouse without access to all the real source data is a bit like walking blindfolded into a minefield.

Good luck!

pbestgen

Posts : 4
Join date : 2009-02-04

