Pros and cons of consolidated dimension table Vs. many dimension table ?

View previous topic View next topic Go down

Pros and cons of consolidated dimension table Vs. many dimension table ?

Post  peace1aparna on Tue Mar 08, 2011 8:37 pm

Say to create sales star schema => with sale(IT), refund(RF), exchange(EX) … as few of its attributes (item codes).
Pros and cons of consolidated dimension table Vs. many dimension table ?
Should I create different dimension tables with each item code:
Or, Should I include "item Code" column with in the fact table.

Consider a receipt with a number of line item say, 3 IT and 1 EX on each of 3 different receipt. I need to report number of EX and IT with in that particular day. But in my fact table, for uniqueness of the receipt i have included transaction Date, location Id, register Id and receipt Id only. Do i need to include unique column for each line item within that receipt.

I am very new to data warehousing .. your help is much appreciated. Thank you .

peace1aparna

Posts : 7
Join date : 2011-02-21

View user profile

Back to top Go down

Re: Pros and cons of consolidated dimension table Vs. many dimension table ?

Post  ngalemmo on Tue Mar 08, 2011 10:12 pm

You would have an Item dimension containing all items. The table (as well as all dimension tables) would have a surrogate primary key which serves as the foreign key on the fact. You would have other dimensions as well, such as a transaction type dimension, location dimension, date dimension and so on. Things like receipt ID and line number can be stored on the fact as degenerate dimensions. You do not need a unique key on the fact (usually).

As far as measures go, it is common practice to store values as they relate to sales. So things like returns would have negative values in their measures as they reduce sales. Exchanges would typically be stored as two rows, one for the returned item and the other for the exchange item, with offsetting values as appropriate in the measures.

avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Pros and cons of consolidated dimension table Vs. many dimension table ?

Post  Jeff Smith on Wed Mar 09, 2011 12:08 pm

It would be a Junk Dimension, in case you wanted to read up on it.

Jeff Smith

Posts : 471
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Pros and cons of consolidated dimension table Vs. many dimension table ?

Post  peace1aparna on Fri Mar 11, 2011 8:11 pm

Still confused, as in my POS system the Sales and Return Receipt are different

Sale receipt have following information :
Header Information (employee, location, receipt number, date…)
Sales item
Price_hold_status (paid in half or full)
Item -1 sold
Item -2 sold
Item -3 sold
.
.
Discount (if applied)
Subtotal
Tax
Total
Payment Type

Return receipt have information like :
Header Information (employee, location, receipt number, date…)
Item-1 returned
Price_hold_status (paid in half or full)
Orignal purchase date
Original store
Original purchase receipt number :
Number of days since purchase

Item-2 returned
Price_hold_status (paid in half or full)
Orignal purchase date
Original store
Original purchase receipt number :
Number of days since purchase
.
.
.
Discount (if applied)
Subtotal
Tax
Total
Payment Type

So far I have created a single fact table with following measures
1. Employee_pk
2. Customer_pk
3. Product_pk
4. Date_pk
5. Location_pk
6. Price_modifier_pk (like promotion)
7. Transaction_type (Sale, Refund, Exchange) (where to include measure for Count of transaction_type? Like how many count of Refund per day, per employee, per location or per workstation)
8. Payment_type_pk
9. Workstation_pk (it is register number, should I make it DD or create dimension table?)
10. Receipt_id (DD) (Collection of all items in a single transaction)
11. Line_item_id(DD) (for each item sale /return item)
12. Quantity
13. Unique price
14. Tax
15. Total_before_tax
16. Total_after_tax

Again for all refund reporting, I need to report additional information like
Price_hold_status (paid in half or full)
Original purchase date
Original store
Original purchase receipt number:
Number of days since purchase

Report Criteria => Show all refunds with number of days since purchase is greater than 5 days ?
Also, should be feasible to track original purchase receipt from refund receipt ?

How and where do I include following information. Do I need to create new dimension/fact?

Help appreciated.
Thank you .

peace1aparna

Posts : 7
Join date : 2011-02-21

View user profile

Back to top Go down

Re: Pros and cons of consolidated dimension table Vs. many dimension table ?

Post  hang on Mon Mar 14, 2011 5:51 pm

It looks like you might need a separate fact table for refund as it has different dimensionality from sale fact. The purchase receipt number (DD) would be the connection between the two facts. if the fact for exchange is just a process of returns and sales, you may not need a separate fact table for exchange. I would combine Transaction type, Payment type and Workstation into a junk dimension, therefore the workstation should not be a DD as it is of low cardinality. You could add a physical measure and default it to 1 for count in the fact tables.

All other dimension keys look fine in the fact table. The additive count measure will give you the desired count by constraining on any relevant dimensions.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Pros and cons of consolidated dimension table Vs. many dimension table ?

Post  ngalemmo on Tue Mar 15, 2011 9:43 am

If you go with a separate fact table for refunds AND refunds need to be included in sales numbers, then I would record the refund in BOTH fact tables.

Otherwise, I would just add the extra FKs to the sales fact (ie Original purchase date, Original store, Original purchase receipt number (a degenerate probably)) and populate with the same values as date, store and receipt when it is a sale.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Pros and cons of consolidated dimension table Vs. many dimension table ?

Post  peace1aparna on Wed Mar 16, 2011 3:10 pm

A.Modelling in single fact table: (have included sample of refund/ sale receipt below)

SALEandRETURN_FACT ( 1.Employee_pk, 2. Customer_pk, 3. Product_pk, 4. Date_pk, 5. Location_pk, 6. Price_modifier_pk (like promotion), 7. Transaction_type (Sale, Refund, Exchange), 8. Receipt_id (DD), 9. Line_item_id(DD),10. Quantity, 11. Junk_key (transaction_type, payment_type, work_station, exchange_flag_if_any),12.Tax, 13. Total_before_tax, 14. Total_after_tax, 15. Original_date_key, 16. Original_store_key, 17. Original_purchase_receipt, 18. Days_since_purchase (how to compute this)

So when I do this will introduce 3 DD columns (15. Original_store_key, 16. Original_store_key, 17. Original_purchase_receipt).
Will this solve my problem of tracking original receipt from refund receipt. But again how will I compute Days_since_purchase.
Theoritically Days_since_purchase = (Original_purchase_receipt – refund date from date_pk). Still confused how to get there??

B.Modelling in different fact table:

SALE_FACT_TABLE(day_key, product_key, customer_key, sale_transaction_ID, sale_line_item_ID, Exchange_flag,quantity, amount, tax, total)

REFUND_FACT_TABLE(day_key_sale, day_key_return, refund_transaction_ID, return_junk_key, refund_line_item_ID, refund_quantity, refund_amount, tax,refund_ total, return_without_receipt_flag, sale_transaction_ID, sale_line_item_ID )

DAY_DIMENSION
(day_Key, day, month, year)

I came to this structure with the help of google but here in REFUND_FACT_TABLE why should the day_key_sale ,sale_transaction_ID, sale_line_item_ID be included in REFUND_FACT_TABLE and what data should be load ??

And what if i have to compute Return as % of item Sales ????
Which track should I Follow(A or B). Sorry Its long and lots of confusing, hope I explained my confusion well. Help appreciated.

Following is the sample of purchase receipt and Return Receipt.
Purchase Receipt :
Receipt # XYZ
Register # 1
Store# NY
Cashier Adam
47 - transaction duration (sec)
Friday 02/25 5:12:13 PM PDT

510 510022330 - LIC KIDDIE STEP STOOL 5.00
350 350048083 - BRNT THBCK PANEL NAVY 63 12.00
350 350048083 - BRNT THBCK PANEL NAVY 63 12.00
520 520015380 - 9.5in ENVELOPES 0.94

Subtotal 29.94
Tax 2.92
Total 32.86

DBIT 32.86

Return receipt :
Receipt # ABCr
Register # 2
Store# NY
Cashier Betty
11 - transaction duration (sec)
Monday 03/14 1:05:24 PM PDT

17 - days since purchase
350 350048083 - BRNT THBCK PANEL NAVY 63 -12.00
Original Date: 2011-02-25 00:00:00
Original Store: NY
Original Register: 1
Original Transaction: XYZ

17 - days since purchase
350 350048083 - BRNT THBCK PANEL NAVY 63 -12.00
Original Date: 2011-02-25 00:00:00
Original Store: NY
Original Reg: 1
Original Trans: XYZ

Subtotal -24.00
Tax -2.34
Total -26.34

CASH -26.34

peace1aparna

Posts : 7
Join date : 2011-02-21

View user profile

Back to top Go down

Re: Pros and cons of consolidated dimension table Vs. many dimension table ?

Post  Jeff Smith on Wed Mar 16, 2011 4:01 pm

I think the case can be made for 2 fact tables or even 3 fact tables.

Let's say you have attributes that only apply to a sale and others that only apply to a refund. In some cases, it might be possible to combine the attributes into a single dimension table with columns that only apply to the sale and others that only apply to the refund. In this case, you could put everything into 1 fact table.

There may be columns in the fact table that apply only to a refund - such as original store. But even then, it's possible to make the original store and refund store (or current store) the same on a sale. However, you may find that adding a bunch of columns to the sales fact and to the refund fact to make them fit into one fact table may dramatically increase the size of the fact table and it might be better to split the facts in to 2 fact tables.

But I think it's a total of 3 fact tables. The Sale and refund facts are at the item level. You could create a third fact table that is technically an aggregate of the refund and sale transactions. The measure might be total items in the transaction.

This third table would make it easy for users to see the number of transactions by store or by cashier. If the transactions had account level information, it would make it easy for people to see the geographic reach of the store.

Jeff Smith

Posts : 471
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Pros and cons of consolidated dimension table Vs. many dimension table ?

Post  peace1aparna on Thu Mar 17, 2011 2:14 pm

Jeff Smith wrote:.

But I think it's a total of 3 fact tables. The Sale and refund facts are at the item level. You could create a third fact table that is technically an aggregate of the refund and sale transactions. The measure might be total items in the transaction.

This third table would make it easy for users to see the number of transactions by store or by cashier. If the transactions had account level information, it would make it easy for people to see the geographic reach of the store.

Is it possiblt to create 1 aggregate table using 2 existing fact table (refund, sales) ?
AGGREGATE_FACT(weekly_dat_key,location_key, employee_key, product_key,sale_count, sale_amount, refund_count, refund_amount) .. can I do this but this table have columns that only apply to the sale AND columns that only apply to the refund. What are the pros and cons of this table with following columns ??

Also, How have create 2 models, how to attach in this forum ??
Thank you .

peace1aparna

Posts : 7
Join date : 2011-02-21

View user profile

Back to top Go down

Re: Pros and cons of consolidated dimension table Vs. many dimension table ?

Post  peace1aparna on Thu Mar 17, 2011 2:40 pm

ngalemmo wrote:If you go with a separate fact table for refunds AND refunds need to be included in sales numbers, then I would record the refund in BOTH fact tables.
New to dataware housing so can you please explain this further ..
Also I am confused with a similar sample model .. (extracted from The complete refrence to Star schema by - Christopher Adams, pg:70, fig 4-8)

ORDER_FACTS (day_key. product_key, customer_key, ordered_quantity, ordered_amount, order_id, order_line_num)

SHIPMENT_FACTS (day_key_order, day_key_shipment, product_key, customer_key, shiped_quantity, shipped_amount, revenue_dollar, order_id, order_line_num, shipment_id, shipment_line_number )

Following Confusion:
1. why is day_key , day_key_order, day_key_shipment all reference to day_key dimension table (shouldnot day_key_order, day_key_shipment be different)
2. whys is order_id, order_line_num attribute are also included in SHIPMENT_FACTS and what data goes in here, assuming order and shipment invoice have different IDs.

Thank you .

peace1aparna

Posts : 7
Join date : 2011-02-21

View user profile

Back to top Go down

Re: Pros and cons of consolidated dimension table Vs. many dimension table ?

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum