Fact/dimension table Setup
+2
BoxesAndLines
VT Hokie
6 posters
Page 1 of 1
Fact/dimension table Setup
Hi,
I am building a dimensional model based off of our production database. I have a question about the architecture for our model.
In our production database, we have a table,lets call it widgets. Widgets is a generic table for any product instance we make. There is a "Type" field that describes the Type of widget. Let's say we have Type A, B and C. All widgets of the three types are stored in the widget table. However, each of the types may have different properties from the other types. These properties are stored in a WidgetProperty table. Each WidgetProperty record has an identifier that describes the property and a value that holds the value of the that Property
Widget
WidgetProperty
This example is very simplistic; there are many types and many properties. The properties will end up being fact measures, or lookups to dimensions.
When I go to build a fact table out of this, I see two approaches:
1) create 1 fact table that will have many many fields to represent all of the properties. Many of the fields will have empty values since alot of the properties only apply to only 1 product type. This table will have at least 100 fields, possibly 200 or more.
2) Create many fact tables; one for each product type. This will create much narrorower tables that are much more densely populated.
I lean towards option 2 but others are leaning towards 1. I'm sure this is a common question coming from this type of production database, but I can't seem to find a best practice.
Thanks!
I am building a dimensional model based off of our production database. I have a question about the architecture for our model.
In our production database, we have a table,lets call it widgets. Widgets is a generic table for any product instance we make. There is a "Type" field that describes the Type of widget. Let's say we have Type A, B and C. All widgets of the three types are stored in the widget table. However, each of the types may have different properties from the other types. These properties are stored in a WidgetProperty table. Each WidgetProperty record has an identifier that describes the property and a value that holds the value of the that Property
Widget
WidgetID | WidgetType | created |
1 | A | 10/1/2010 |
2 | B | 10/1/2010 |
WidgetProperty
WidgetID | property | value |
1 | Property X | CDE |
1 | Property Y | 4 |
2 | Property X | ABC |
2 | Property Z | 8 |
This example is very simplistic; there are many types and many properties. The properties will end up being fact measures, or lookups to dimensions.
When I go to build a fact table out of this, I see two approaches:
1) create 1 fact table that will have many many fields to represent all of the properties. Many of the fields will have empty values since alot of the properties only apply to only 1 product type. This table will have at least 100 fields, possibly 200 or more.
2) Create many fact tables; one for each product type. This will create much narrorower tables that are much more densely populated.
I lean towards option 2 but others are leaning towards 1. I'm sure this is a common question coming from this type of production database, but I can't seem to find a best practice.
Thanks!
VT Hokie- Posts : 4
Join date : 2010-11-09
Re: Fact/dimension table Setup
Product is a dimension. What are you measuring in your product fact tables?
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Fact/dimension table Setup
The rows in that table represent the creation of a product unit, not the product description. Therefore, we might have measures such as "Weight", date created, etc.
VT Hokie- Posts : 4
Join date : 2010-11-09
Re: Fact/dimension table Setup
VT Hokie,
Why don't you treat WidgetProperty as a bridge table between your fact and Widget Table. Doing so will eliminate the need to create 100's of fileds ( i.e. option 1 which I don't agree, as 1. most tools reporting out of dimensional models doesn't support so many fields. 2. more memory which might hinder performance as well.)
One of the most important factors governing dimensional modeling is to reduce the query complexity, which opposes your option 2.
Thus research into using a surrogate key in the WidgetProperty table and referring it in the fact table.
Why don't you treat WidgetProperty as a bridge table between your fact and Widget Table. Doing so will eliminate the need to create 100's of fileds ( i.e. option 1 which I don't agree, as 1. most tools reporting out of dimensional models doesn't support so many fields. 2. more memory which might hinder performance as well.)
One of the most important factors governing dimensional modeling is to reduce the query complexity, which opposes your option 2.
Thus research into using a surrogate key in the WidgetProperty table and referring it in the fact table.
bifacts- Posts : 1
Join date : 2010-11-08
Location : YouYesYay
Re: Fact/dimension table Setup
You need fact containing measures such as sales amount and quantity to have a fuller picture of your dimensional model. Weight and date created are attributes of product, not measures in the fact. If you don’t have any true fact, then you are not building dimensionally modelled BI system, instead, it might just be a reporting requirement on an existing OLTP system.
However the design of WidgetProperty does not seem to be a good practice in data modeling in general and dimensional modelling in particular. Kimball has elaborated a modelling approach for heterogeneous products in his book, p211, http://www.kimballgroup.com/html/booksDWT2.html. I think the concept applies to both dimensional and relational modelling.
However the design of WidgetProperty does not seem to be a good practice in data modeling in general and dimensional modelling in particular. Kimball has elaborated a modelling approach for heterogeneous products in his book, p211, http://www.kimballgroup.com/html/booksDWT2.html. I think the concept applies to both dimensional and relational modelling.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Fact/dimension table Setup
Properties appear to be attributes of the product, correct? So we are talking about a dimension table.
Storing attributes as key/value pairs can be done, but requires some level of support (i.e. code) to make it usable in a typical ad-hoc query environment. Depending on the database system you are using, performance can be an issue.
A compromise solution is to identify a core set of attributes of primary interest to the business and to store them as columns in a traditional product dimension table, then leave the rest in an outrigger table (using the product dimension's PK as part of the key) as key/value pairs.
Storing attributes as key/value pairs can be done, but requires some level of support (i.e. code) to make it usable in a typical ad-hoc query environment. Depending on the database system you are using, performance can be an issue.
A compromise solution is to identify a core set of attributes of primary interest to the business and to store them as columns in a traditional product dimension table, then leave the rest in an outrigger table (using the product dimension's PK as part of the key) as key/value pairs.
Re: Fact/dimension table Setup
I probably wasn't clear enough. This happens not on a product definition but on an actual instance of a produced unit from a manufacturing line. In other words, a widget comes off a production line at 11:35. The core data about this particular widget is stored in the widget table. The properties that are not core are stored in the widget property table.
Based on this, I would thinK this should be a fact table. Hopefully that makes my original question more clear.
Based on this, I would thinK this should be a fact table. Hopefully that makes my original question more clear.
VT Hokie- Posts : 4
Join date : 2010-11-09
Re: Fact/dimension table Setup
In dimensional modeling, you don't store product definition or properties in fact table. You store FK to your product dimension alongside with other dimension keys, such as date key or a time series. So you could call the fact production or shipment which could also store the number of certain products comming out of production line during a certain period of time. That will form the base of your fact.
All other things about properties, core or non-core data are about designing dimensions as ngalemmo said. Querying against dimensions does not turn dimensions into facts.
All other things about properties, core or non-core data are about designing dimensions as ngalemmo said. Querying against dimensions does not turn dimensions into facts.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Fact/dimension table Setup
As I understand so far I don't see any fact. As per Kimball method it sounds like as dimension table with outtrigger model since product do not have fixed number of properties.
shaikh.afzal- Posts : 1
Join date : 2009-09-22
Re: Fact/dimension table Setup
Thanks for all of the input.
I struggle to see how this could be a dimension. Here is another example. We make a widget of type X with and ID of 100001. It weighs 50 lbs, took 25 minutes to produce and is blue. After 100001 is produced, we produce 100002 that weighs 40 lbs, took 30 minutes to produce and is green.
We have to measure those things for each widget. Therefore, I really can't see how this would be a dimension.
I struggle to see how this could be a dimension. Here is another example. We make a widget of type X with and ID of 100001. It weighs 50 lbs, took 25 minutes to produce and is blue. After 100001 is produced, we produce 100002 that weighs 40 lbs, took 30 minutes to produce and is green.
We have to measure those things for each widget. Therefore, I really can't see how this would be a dimension.
VT Hokie- Posts : 4
Join date : 2010-11-09
Re: Fact/dimension table Setup
I don't see any way around de-leveraging (I got that from the financial fiasco, :-) ) your source data driven data structure. A data driven data structure works nicely for applications since you can generally store anything without having to change the underlying data structure. It doesn't work so well for BI reporting. Whether you can stuff all of your products into one or more dimensions is a modeling exercise you'll need to go through. My approach would be to identify all the attributes hidden within the source tables, build a normalized model so you can understand what is what, and then formalize a dimensional model.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Fact/dimension table Setup
Are these custom items built to spec per order? Are there not predefined physical characteristics of what is being built?
If you built item X, would there be a time you built X again? OR is item X really a base product with options (color, size, etc)? If either of these answers are yes, the item and options (if applicable) are dimensions with attributes. The act of building the item is a fact, that references those dimensions. In the case of options, there would be multiple fact rows identified with a common build id if a particular configuration of an item does not have its own SKU.
If you built item X, would there be a time you built X again? OR is item X really a base product with options (color, size, etc)? If either of these answers are yes, the item and options (if applicable) are dimensions with attributes. The act of building the item is a fact, that references those dimensions. In the case of options, there would be multiple fact rows identified with a common build id if a particular configuration of an item does not have its own SKU.
Similar topics
» joining dimension table to dimension and again fact table
» attribute on fact table or dimension table?
» Large Dimension table compared to fact table?
» Conformed Dimension for Transaction Fact and Accumulating Snapshot Fact Table
» attribute on fact table or dimension table?
» Large Dimension table compared to fact table?
» Conformed Dimension for Transaction Fact and Accumulating Snapshot Fact Table
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum