SCD type 2 approach.

View previous topic View next topic Go down

SCD type 2 approach.

Post  abacusdotcom on Thu Dec 08, 2011 7:57 am

Hi Guys,

Am just lost in coming up with an idea of using SCD type 2 on table structure below.

Code:

create table TablePanelists
(
  pDate date
  ,PType varchar(10)
  ,email varchar(255)
)

Note

- pDate and PType are the table identifiers.

- For every pDate and PType, there are at least 3 occurrences of email.

Code:

select '10/01/2011' as pDate,  'A' as PType, 'le@abc.com' email
union
select '10/01/2011' as pDate,  'A' as PType, 'ka@abc.com' email
union
select '10/01/2011' as pDate,  'A' as PType, 'pi@abc.com' email

- My question is how can I place SCD type 2 on email column?
- My approach of updating the table is truncate and load, what other best approach I can adopt to keep historical data.

Many thanks

abacusdotcom

Posts : 4
Join date : 2009-11-23

View user profile

Back to top Go down

You need another column in your key

Post  elmorejr on Thu Dec 08, 2011 10:22 am

You said...

- pDate and PType are the table identifiers.

- For every pDate and PType, there are at least 3 occurrences of email.

These two statements contridict each other. If pDate and pType make up the table's natural key, then you can not have 3 rows in the same table with the same key. You will need another column to be a part of the natural key to distinguish between the different email addresses:

pDate
pType
pEmailType

For example...

Code:
select '10/01/2011' as pDate,  'A' as PType, 'home' as pEmailType, 'le@abc.com' email
union
select '10/01/2011' as pDate,  'A' as PType, 'work' as pEmailType,  'ka@abc.com' email
union
select '10/01/2011' as pDate,  'A' as PType, 'school' as pEmailType,  'pi@abc.com' email

Once you have the additional column, type 2 becomes trivial...

elmorejr

Posts : 25
Join date : 2011-10-20
Location : Columbus, Ohio, United States of America, Earth, Sol

View user profile

Back to top Go down

Re: SCD type 2 approach.

Post  abacusdotcom on Thu Dec 08, 2011 11:15 am

Hi elmorejr,

Am sorry the misconstruction.

I meant to say they are record identifier.

So the scenario is like this for every pDate and PType, it requires 3 or less emails.

abacusdotcom

Posts : 4
Join date : 2009-11-23

View user profile

Back to top Go down

SCD Not possible

Post  elmorejr on Thu Dec 08, 2011 1:37 pm

Then I am afraid that SCD2 is not possible with your dataset. If you have three valid email addresses for the same date/type combo and there is no way to distinguish between them, then you have no way to compare the current records against the new records.

Or you may be misunderstanding the concept of SCD2.

I would suggest you reread the SCD section in the DW Toolkit...




elmorejr

Posts : 25
Join date : 2011-10-20
Location : Columbus, Ohio, United States of America, Earth, Sol

View user profile

Back to top Go down

Re: SCD type 2 approach.

Post  ngalemmo on Thu Dec 08, 2011 2:19 pm

It would help to know what it is you are trying to accomplish and what information you are getting from the source. As elmorejr has pointed out, you don't have the right information for a type 2 to work. Dimension maintenance is driven by unambiguous natural keys, which you do not have.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Thank You Boss

Post  abacusdotcom on Fri Dec 09, 2011 6:41 am

Thanks all,

Your contributions has re-sharpen how I see SCD type 2. Am only trying to base my scd type 2 on 3 columns table where two of the columns are row identifier (non discreet), while I intend using the last as scd. I now see it's not possible to scd on just one column. Will go back to my drawing board, will device a method of update and insert.

Many thanks

abacusdotcom

Posts : 4
Join date : 2009-11-23

View user profile

Back to top Go down

Re: SCD type 2 approach.

Post  hang on Fri Dec 09, 2011 7:27 pm

It looks like the email is multivalued attribute for the dimension. SCD is not used to store multivalued attribute as different records, although it may be used to track the change on the email when an old one expires and a new one becomes valid. But normally email would be treated as type 1.

If the number of emails is limited to 2 or 3, the simplest approach is to store them as separate attributes, ie. email1, email2, email3 and allow null or 'NA' for not applicables. However the most effective approach is to store the multivalued attributes in separate bridge table with NK repeated in multiple records for each attribute value (eg. email).

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: SCD type 2 approach.

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum