Snapshot Dimension & Fact tables
2 posters
Page 1 of 1
Snapshot Dimension & Fact tables
We are currently designing a logical multidimensional model from an OLTP tables.Dimension tables have monthly snapshot because some of or all the attributes might change on monthly basis.The same situation for the fact table has monthly snapshot.
I know that according to Kimball's modeling, the attributes for dimensions should be implemented using mini-dimensions and put combination key as a foreign key in the fact table but this step needs an ETL job to handle mini-dimension and other fact table.However, in our situation and according to scope limitation,there is no time to design separate ETL to handle mini-dimension.
An example for our records:
Customer:
Cust_ID
Month_ID
Attr1
Attr2
Attr3
......
Attr20
Installments
Installment_ID
Cust_ID
Month_ID
Attr1
Attr2
Attr3
...
Attr5
Measure1
Measure2
Measure3
Measure4
So, Installments table contains attributes as well as measures so what is the suitable consideration and the suitable design ,should we consider Installments table as fact or should we divide it into dimension and fact tables ?
I know that according to Kimball's modeling, the attributes for dimensions should be implemented using mini-dimensions and put combination key as a foreign key in the fact table but this step needs an ETL job to handle mini-dimension and other fact table.However, in our situation and according to scope limitation,there is no time to design separate ETL to handle mini-dimension.
An example for our records:
Customer:
Cust_ID
Month_ID
Attr1
Attr2
Attr3
......
Attr20
Installments
Installment_ID
Cust_ID
Month_ID
Attr1
Attr2
Attr3
...
Attr5
Measure1
Measure2
Measure3
Measure4
So, Installments table contains attributes as well as measures so what is the suitable consideration and the suitable design ,should we consider Installments table as fact or should we divide it into dimension and fact tables ?
a_sherbeeny- Posts : 15
Join date : 2009-02-04
Re: Snapshot Dimension & Fact tables
I have similar experience converting snapshot based history OLTP system into a dimensional data store. The main purpose of the process is to have more efficient customer SCD2 dimension which you don't need to repeatedly store the whole copy of customer data each month even most of them have not changed. You would also have an efficient instalments fact table by storing surrogate keys without repeating much lengthier textual attributes in the fact table.
In your simplified two table scenario, performance gain on two smaller tables would be the main drive for the dimensional modeling. In long run, it paves the way for any other collection frequencies which will provide more dynamic picture of the business than the monthly snapshot.
Most of the work is in ETL once you get the model right. I would implement an ETL process to handle both history rebuild and on-going load by looping through each month snapshot data. Depending on the business requirement, you may use either first day or last day of the month as your SCD start date, and end date when you need to expire previous customer record in case of SCD changes.
In your Instalment fact table, you need to group the attributes in dimensional sense and store them away in dimension tables so that you may only have dimension keys and degenerate dimensions plus all the relevant numeric values in your fact tables. Look into utilising junk dimension for low cardinality attributes, degenerate dimension for high cardinality attributes like Instalment number, and role playing dimension for different dates. Hope this will help.
In your simplified two table scenario, performance gain on two smaller tables would be the main drive for the dimensional modeling. In long run, it paves the way for any other collection frequencies which will provide more dynamic picture of the business than the monthly snapshot.
Most of the work is in ETL once you get the model right. I would implement an ETL process to handle both history rebuild and on-going load by looping through each month snapshot data. Depending on the business requirement, you may use either first day or last day of the month as your SCD start date, and end date when you need to expire previous customer record in case of SCD changes.
In your Instalment fact table, you need to group the attributes in dimensional sense and store them away in dimension tables so that you may only have dimension keys and degenerate dimensions plus all the relevant numeric values in your fact tables. Look into utilising junk dimension for low cardinality attributes, degenerate dimension for high cardinality attributes like Instalment number, and role playing dimension for different dates. Hope this will help.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Snapshot Dimension & Fact tables
Thanks Hang for your reply and I really appreciate your help ,but the problem is that the project scope doesn't cover any ETL process.Therefore we need to rely on the OLTP DB as the main source.We are using OBIEE as BI tool and as you know, we can form a multidimensional model from the OLTP in the logical layer.So,this is our objective to design a logical multidimensional model in OBIEE with the mentioned OLTP source.
a_sherbeeny- Posts : 15
Join date : 2009-02-04
Re: Snapshot Dimension & Fact tables
I am not surprised. In average, ETL accounts for 70 percent of the data warehousing work and yet it is the most under estimated area in most BI projects. I have never heard of OBIEE. I guess it would be something like reporting tool used to build a semantic layer for report developers to create canned or OLAP reports. But that's about reporting instead of a proper design/remodelling for DW/BI purpose.
To address your initial question directly, the suitable design is to turn your instalments table into fact and its surrounding dimensions with its textural attributes by the processes in my comments. However without ETL, I can't see if there is any point to talk about turning your OLTP tables into facts and dimensions. ETL may be an intimidating buzzword, but in simplest case, it's just a load process to store the data in a proper format to meet your reporting requirements. You may carry it out in form of stored procedures using a bunch of INSERT INTO or UPDATE SQL statements. But remember, dimensional modeling is about creating redundant data in a format suitable for reporting in terms of performance, ease of use and efficient history tracking. it is important to physically separate your data in different areas to avoid any confusion on single point of truth.
To address your initial question directly, the suitable design is to turn your instalments table into fact and its surrounding dimensions with its textural attributes by the processes in my comments. However without ETL, I can't see if there is any point to talk about turning your OLTP tables into facts and dimensions. ETL may be an intimidating buzzword, but in simplest case, it's just a load process to store the data in a proper format to meet your reporting requirements. You may carry it out in form of stored procedures using a bunch of INSERT INTO or UPDATE SQL statements. But remember, dimensional modeling is about creating redundant data in a format suitable for reporting in terms of performance, ease of use and efficient history tracking. it is important to physically separate your data in different areas to avoid any confusion on single point of truth.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Similar topics
» Conformed Dimension for Transaction Fact and Accumulating Snapshot Fact Table
» Monthly Periodic Snapshot Table using SCD Type1 Dimension tables - Design Question.
» Periodic snapshot fact tables with sparse data
» Storing Date Keys in dimension tables versus fact tables
» Accumulating Snapshot and Transaction Fact tables : question to design and use them together
» Monthly Periodic Snapshot Table using SCD Type1 Dimension tables - Design Question.
» Periodic snapshot fact tables with sparse data
» Storing Date Keys in dimension tables versus fact tables
» Accumulating Snapshot and Transaction Fact tables : question to design and use them together
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum