Dummy dimension values in the fact table

View previous topic View next topic Go down

Dummy dimension values in the fact table

Post  noodleeater on Mon Dec 12, 2011 7:04 am

Hi

I had a question around dummy values in the fact table.

Lets say:

Fact Sales
Dim Date
Dim Product
Dim Salesman

Also, lets say, my transaction system for sales does not have a unique transaction ID but instead has date, product and salesman that uniquely define a row.

If one of the dimension values (say salesman) in the source is unknown, it set the surrogate key in the fact table to -1

But what happens when this record is updated in the source? i.e. the salesman ID is now populated in the transaction system for the record?

I would have no option but to truncate and load my fact table to be able to reflect that update, right?

I wouldnt think there is any other way to remove the -1 surrogate key and update it with the correct one as in the source.

This is quite cumbersome if your transaction table has 20 million rows!

Can someone think of any other way?

noodleeater

Posts : 1
Join date : 2011-12-12

View user profile

Back to top Go down

Regular reload

Post  elmorejr on Mon Dec 12, 2011 9:58 am

One option is a regular reload. For example, if you know that the Salesperson ID is usually "filled in" within 30 days of the transaction, you can implement a rolling 30-day reload. On each load, remove the most recent 30 days and then reload data for those 30 days.

This will give the salesperson assignment time to catch up.

If you have appropriate partitioning, removing a range of data should not be an issue.

elmorejr

Posts : 25
Join date : 2011-10-20
Location : Columbus, Ohio, United States of America, Earth, Sol

View user profile

Back to top Go down

Re: Dummy dimension values in the fact table

Post  umutiscan on Mon Dec 12, 2011 2:18 pm

May be rejecting that record untill salesman id is populated would be the better way.

If there is an update date column in source system, you may use transaction date + update date to get delta data from source. You get the record when created, if it is rejected you give another chance when updated.

umutiscan

Posts : 51
Join date : 2010-11-26
Age : 36
Location : Istanbul, Turkey

View user profile

Back to top Go down

Re: Dummy dimension values in the fact table

Post  ngalemmo on Mon Dec 12, 2011 10:46 pm

This is a business rules issue. If there should always be a salesman, why would salesman be unspecified? Is it really 'officially' an order or just an indication from the customer? Umutiscan's comment that maybe you shouldn't load it yet may be the correct approach. When you are dealing with a sales force, typically an order isn't an order until it has been accepted. The other stuff is just "pipeline".

I also find it difficult to understand why there isn't an identifier for an order. The source system must have something that it uses to keep things straight. Is this a hypothetical question?

The idea of generating net change rows is probably the best option as it avoids having to update the fact table. It gives you the ability to state and order at any point in time, the downside is there are more rows in the table. How much of an impact the extra rows may have on query performance depends on how often the order changes.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Dummy dimension values in the fact table

Post  pcs on Wed Dec 14, 2011 6:46 pm

We run into a similar situation often in my current role. We often get upc codes from our point of sale system, but don't know what the other attributes are for these items (no description, size, name, etc.). In this case, we create an item record with a new surrogate key, where the business key (upc) is set, but most of the other item attributes are "unknown".
Usually, we will eventually get the descriptive attributes for the item. When we do, we track these attribute changes as normal scd attribute changes.

pcs

Posts : 20
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Dummy dimension values in the fact table

Post  ngalemmo on Wed Dec 14, 2011 7:16 pm

pcs wrote:We run into a similar situation often in my current role. We often get upc codes from our point of sale system, but don't know what the other attributes are for these items (no description, size, name, etc.). In this case, we create an item record with a new surrogate key, where the business key (upc) is set, but most of the other item attributes are "unknown".
Usually, we will eventually get the descriptive attributes for the item. When we do, we track these attribute changes as normal scd attribute changes.

It is a good way to do it if you are getting a business key (in this case UPC). My impression from the original post was that no business key was assigned when the information was extracted for loading into the DW. In such cases, you can't create a dimension row because you have nothing to identify it other than assigning it to a default 'unknown' row.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Dummy dimension values in the fact table

Post  Jeff Smith on Thu Dec 15, 2011 10:56 am

The source system has to be able to track the transaction. How else can it append the missing info later in the process?

Jeff Smith

Posts : 471
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Dummy dimension values in the fact table

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum