Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Dummy dimension values in the fact table

+2
elmorejr
noodleeater
6 posters

Go down

Dummy dimension values in the fact table Empty Dummy dimension values in the fact table

Post  noodleeater Mon Dec 12, 2011 7:04 am

Hi

I had a question around dummy values in the fact table.

Lets say:

Fact Sales
Dim Date
Dim Product
Dim Salesman

Also, lets say, my transaction system for sales does not have a unique transaction ID but instead has date, product and salesman that uniquely define a row.

If one of the dimension values (say salesman) in the source is unknown, it set the surrogate key in the fact table to -1

But what happens when this record is updated in the source? i.e. the salesman ID is now populated in the transaction system for the record?

I would have no option but to truncate and load my fact table to be able to reflect that update, right?

I wouldnt think there is any other way to remove the -1 surrogate key and update it with the correct one as in the source.

This is quite cumbersome if your transaction table has 20 million rows!

Can someone think of any other way?

noodleeater

Posts : 1
Join date : 2011-12-12

Back to top Go down

Dummy dimension values in the fact table Empty Regular reload

Post  elmorejr Mon Dec 12, 2011 9:58 am

One option is a regular reload. For example, if you know that the Salesperson ID is usually "filled in" within 30 days of the transaction, you can implement a rolling 30-day reload. On each load, remove the most recent 30 days and then reload data for those 30 days.

This will give the salesperson assignment time to catch up.

If you have appropriate partitioning, removing a range of data should not be an issue.

elmorejr

Posts : 25
Join date : 2011-10-20
Location : Columbus, Ohio, United States of America, Earth, Sol

Back to top Go down

Dummy dimension values in the fact table Empty Re: Dummy dimension values in the fact table

Post  umutiscan Mon Dec 12, 2011 2:18 pm

May be rejecting that record untill salesman id is populated would be the better way.

If there is an update date column in source system, you may use transaction date + update date to get delta data from source. You get the record when created, if it is rejected you give another chance when updated.

umutiscan

Posts : 51
Join date : 2010-11-26
Age : 43
Location : Istanbul, Turkey

Back to top Go down

Dummy dimension values in the fact table Empty Re: Dummy dimension values in the fact table

Post  ngalemmo Mon Dec 12, 2011 10:46 pm

This is a business rules issue. If there should always be a salesman, why would salesman be unspecified? Is it really 'officially' an order or just an indication from the customer? Umutiscan's comment that maybe you shouldn't load it yet may be the correct approach. When you are dealing with a sales force, typically an order isn't an order until it has been accepted. The other stuff is just "pipeline".

I also find it difficult to understand why there isn't an identifier for an order. The source system must have something that it uses to keep things straight. Is this a hypothetical question?

The idea of generating net change rows is probably the best option as it avoids having to update the fact table. It gives you the ability to state and order at any point in time, the downside is there are more rows in the table. How much of an impact the extra rows may have on query performance depends on how often the order changes.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Dummy dimension values in the fact table Empty Dummy dimension values in the fact table

Post  pcs Wed Dec 14, 2011 6:46 pm

We run into a similar situation often in my current role. We often get upc codes from our point of sale system, but don't know what the other attributes are for these items (no description, size, name, etc.). In this case, we create an item record with a new surrogate key, where the business key (upc) is set, but most of the other item attributes are "unknown".
Usually, we will eventually get the descriptive attributes for the item. When we do, we track these attribute changes as normal scd attribute changes.

pcs

Posts : 20
Join date : 2009-02-03

Back to top Go down

Dummy dimension values in the fact table Empty Re: Dummy dimension values in the fact table

Post  ngalemmo Wed Dec 14, 2011 7:16 pm

pcs wrote:We run into a similar situation often in my current role. We often get upc codes from our point of sale system, but don't know what the other attributes are for these items (no description, size, name, etc.). In this case, we create an item record with a new surrogate key, where the business key (upc) is set, but most of the other item attributes are "unknown".
Usually, we will eventually get the descriptive attributes for the item. When we do, we track these attribute changes as normal scd attribute changes.

It is a good way to do it if you are getting a business key (in this case UPC). My impression from the original post was that no business key was assigned when the information was extracted for loading into the DW. In such cases, you can't create a dimension row because you have nothing to identify it other than assigning it to a default 'unknown' row.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Dummy dimension values in the fact table Empty Re: Dummy dimension values in the fact table

Post  Jeff Smith Thu Dec 15, 2011 10:56 am

The source system has to be able to track the transaction. How else can it append the missing info later in the process?

Jeff Smith

Posts : 471
Join date : 2009-02-03

Back to top Go down

Dummy dimension values in the fact table Empty Re: Dummy dimension values in the fact table

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum