Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Determining the most atomic level of data and wondering if you've gone to far

2 posters

Go down

Determining the most atomic level of data and wondering if you've gone to far Empty Determining the most atomic level of data and wondering if you've gone to far

Post  thekeel Sat Mar 12, 2011 10:33 pm

I have been struggling with determining how detailed my fact table should be and would be very grateful to obtain some input.

Let me start by briefly explaining the process and what defines a single transaction. We recycle goods for customers and record each product we take from the location as a line item in the transactional database. If we arrive at a store and pick up 30 gallons of oil and 30 pounds of oil filters then that store has two line items recorded with the cost of each service reported. So far this is very similar to what you would find on a commonly modeled order table, you have the order (summary) and items that make up the order (detailed).

Determining the most atomic level of data and wondering if you've gone to far ServiceSummary
In the view of accounting, this is the definition of one transaction.

As a recycling company, we also need to know how the products were recycled. A given product may be recycled by one or many different methods. Looking at the above example, we have two line items that made up 1 service to a location but now we are introducing recycling method to the fold.

Determining the most atomic level of data and wondering if you've gone to far Sustainability
Within the Operations group, the recycling method becomes the definition of one transaction.

Used oil was recycled by four different methods and the filters by two different methods. So now those two rows have grown into six. To keep things simple I have pretended that each product is equally recycled by each method; however, in the real world it varies.

I have allocated cost down to the recycling method by simply dividing the cost by the number of recycling methods. Again, over simplifying but you get the idea. Weight and volume were also allocated down to the recycling method level.

Adding recycling method will add a lot of rows to the fact table, but its necessary to determine cost per recycling method and weight/volume per recycling method.

Since I can aggregate up to the original service level, is having two fact tables necessary? Should there be a fact table at the higher level just for the accounting group to query? Or is having one fact table with all the information at the most atomic level (and aggregating with queries) the way to go?

After reading Kimball's books - it seems that I should go with the latter and break this down to the most atomic level possible, but is there any reason NOT to do that?

Thanks for your help...


thekeel

Posts : 12
Join date : 2011-02-12

Back to top Go down

Determining the most atomic level of data and wondering if you've gone to far Empty Re: Determining the most atomic level of data and wondering if you've gone to far

Post  ngalemmo Sun Mar 13, 2011 6:19 pm

Since I can aggregate up to the original service level, is having two fact tables necessary?

Yes, absolutely, two fact tables are necessary, but not for the reasons you imply. You need two fact tables because you are trying to represent two different and independent business processes: picking up materials, and recycling materials. Sure, both are part of the business cycle, but they are separate events, just like taking an order and shipping the order.

You have common keys for analysis, including the manifest. So, analysis across the two tables is not difficult.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum