Measurement or Atrribute?

View previous topic View next topic Go down

Measurement or Atrribute?

Post  turbotortuga on Mon Nov 04, 2013 12:37 pm

Hello,

I am new to the Kimball Forum and have been learning how to build a DW/BI following the Kimball Methodology.

I am building my first DW/BI system!
My first question is if I'm identifying attributes (Dimension) and measurements (Facts) for invoice data coming from a source system correctly.

The grain of the fact is: One record every time an invoice is purchased from a client.

Dim_Factoring_Invoices:
Factoring_Invoices_Key (PK,SK)
Invoice_ID (BK)
Invoice_Number
Days_Due
Purchased_Order
Description
Approve_Comment
Approve_Status
Close_Code
Buy_Status
Last_Updated_by_ETL (datetime)

Fact_Factoring_Invoice:
Factoring_Invoices_Key (FK)
Factoring_Client_Key (FK)
Factoring_Debtor_Key (FK)
Factoring_Transactions_Key (FK)
Emp_Details_Key (FK)
Amount
Reserved_Escrow
Balance
Fee_Earned
At_Risk_Amount
Invoice_Date (FK to Date Dim - Date_Key)
Transaction_Posted_Date (FK to Date Dim - Date_Key)
Last_Payment_Date (FK to Date Dim - Date_Key)
Close_Date (FK to Date Dim - Date_Key)
Last_Updated_by_ETL (datetime)

Thank you much for your help,
Roberto G. Cedeņo
avatar
turbotortuga

Posts : 13
Join date : 2013-11-04

View user profile

Back to top Go down

Re: Measurement or Atrribute?

Post  BoxesAndLines on Mon Nov 04, 2013 6:15 pm

Looks pretty good to me. What is the relationship cardinality between your fact and dimension?
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Measurement or Atrribute?

Post  turbotortuga on Mon Nov 04, 2013 6:31 pm

One-to-One.
A record in the Dim_Factoring_Invoices belongs to one record in the Fact_Factoring_Invoice.

Roberto G. Cedeņo
avatar
turbotortuga

Posts : 13
Join date : 2013-11-04

View user profile

Back to top Go down

Re: Measurement or Atrribute?

Post  BoxesAndLines on Mon Nov 04, 2013 9:16 pm

A one-to-one relationship between facts and dimensions is generally not a good idea. Most databases start to choke when joining two large tables. The dimensional model design pattern is based on large fact tables and small (i.e. low row counts) dimensions. If at all possible, break the invoice dimension into smaller dimensions. Here are some ideas:

Invoice_Number - Degenerate dimension on fact table
Days_Due - metric on fact table
Purchased_Order - Degenerate dimension on fact table
Description - New dimension or add to junk dimension
Approve_Comment - New dimension or add to junk dimension
Approve_Status - New dimension or add to junk dimension
Close_Code - New dimension or add to junk dimension
Buy_Status - New dimension or add to junk dimension

avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Measurement or Atrribute?

Post  turbotortuga on Tue Nov 05, 2013 12:45 am

I got a feeling something was amiss with my approach, that is why I decided to register to the Kimball forum

I have read about junk dimensions but wasn't sure what to do with them. Your suggestion will help me with that.
Now that I will be breaking down the dim, is there a specific naming convention for junk dimensions, or can I still name it Dim_Factoring_Invoices?

Perhaps something like this:

Dim_Factoring_Invoices:
Factoring_Invoices_Key (PK,SK)
Invoice_ID (BK)
Description
Approve_Comment
Approve_Status
Close_Code
Buy_Status

Last_Updated_by_ETL (datetime)

Fact_Factoring_Invoice:
Factoring_Invoices_Key (FK)
Factoring_Client_Key (FK)
Factoring_Debtor_Key (FK)
Factoring_Transactions_Key (FK)
Emp_Details_Key (FK)
Amount
Reserved_Escrow
Balance
Fee_Earned
At_Risk_Amount
Invoice_Number
Days_Due
Purchased_Order

Invoice_Date (FK to Date Dim - Date_Key)
Transaction_Posted_Date (FK to Date Dim - Date_Key)
Last_Payment_Date (FK to Date Dim - Date_Key)
Close_Date (FK to Date Dim - Date_Key)
Last_Updated_by_ETL (datetime)

This will reduce the number of rows in the dim by 40%
But, what do I do with the Business Key (Invoice_ID)? This is the PK from the source system.
How do I map the data from the fact to the correct (PK,SK) in the dim?
Do I have to match all the columns from the dim with data coming from the source, in order to apply the FK to the dim in the fact table?

Example:
Description, Approve_Comment, Approve_Status, Close_Code, Buy_Status from source matched to
Description, Approve_Comment, Approve_Status, Close_Code, Buy_Status from target (dim) equals FK in fact table

Thank you for your time and help.
avatar
turbotortuga

Posts : 13
Join date : 2013-11-04

View user profile

Back to top Go down

Re: Measurement or Atrribute?

Post  BoxesAndLines on Tue Nov 05, 2013 9:58 am

Put Invoice ID on your fact table. It's a common access path to group fact rows. Kimball calls it a degenerate dimension. For the junk dimension, all columns makeup the lookup key.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Measurement or Atrribute?

Post  turbotortuga on Tue Nov 05, 2013 10:22 am

Alright, I will apply the changes to my model and let you know.

So just to make sure I understand these concepts.

    When designing Facts and Dims:
  • Have a one-to-many cardinality between Dims and Facts
  • Dims should have fewer row counts compared to Facts
  • If a one-to-one cardinality is present, break the data in the Dim so as to reduce the number of rows.
  • This can be done by creating a Junk dimension where all of its columns will become the lookup key.
  • Numeric, metric, and some attributes can the moved to the Fact as degenerated dimensions.
  • Move the business key to the Fact as a degenerated dimension and to serve as a common access path to group Fact rows.


I think my problem was that I thought you always want to separate attributes from facts. I was looking at my model for invoices and the data coming from the source system contains mixed data for invoices in one table (both numeric and textual). I was splitting the data into a dim containing the attributes and a fact containing the measurements, but because I am splitting the table my cardinality is one-to-one.

Would it be better to leave it as is, the data coming from the source system in a single denormalize table?

For example, if the data from source comes like this:
Invoice_Key
Client_Key
Debtor_Key
Transaction_Key
Emp_Details_Key
Invoice_Amount
Invoice_Reserved_Escrow
Invoice_Balance
Invoice_Date
Invoice_Last_Payment_Date
Invoice_Number
Invoice_Decription
Invoice_Approved_Comment
Invoice_Approve_Status


Could I just leave it as it is and let (Invoice_Number, Invoice_Decription, Invoice_Approved_Comment, Invoice_Approve_Status) become degenerated dimensions and call the table a fact, the Invoice_Key being the BK and common access path?

Thanks again,
Roberto G. Cedeņo
avatar
turbotortuga

Posts : 13
Join date : 2013-11-04

View user profile

Back to top Go down

Re: Measurement or Atrribute?

Post  BoxesAndLines on Tue Nov 05, 2013 2:29 pm

Sure you could, but you wouldn't have a dimensional model.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Measurement or Atrribute?

Post  turbotortuga on Fri Nov 08, 2013 3:46 pm

Things are looking good, thanks for the help.
avatar
turbotortuga

Posts : 13
Join date : 2013-11-04

View user profile

Back to top Go down

Re: Measurement or Atrribute?

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top


 
Permissions in this forum:
You cannot reply to topics in this forum