Partial vs Complete Dimensions
4 posters
Page 1 of 1
Partial vs Complete Dimensions
What are the pros and cons of Partial Dimensions vs Complete Dimensions?
I want to record factless facts about say Customers. Should I create a Customer Dimension that containing ALL Customers (a Complete Dimension) or should I just add Customers to the Customer Dimension as I need to record a fact about them (a Partial Dimension) ?
Apart from space implicatoins - obviously a Complete Dimension would consume more space than a Partial Dimension - and perhaps processing time to maintain them, what are the pros and cons for both? Is there a strong case for Complete Dimensions over Partial Dimensions?
Many thanks for any thoughts on the subject!
I want to record factless facts about say Customers. Should I create a Customer Dimension that containing ALL Customers (a Complete Dimension) or should I just add Customers to the Customer Dimension as I need to record a fact about them (a Partial Dimension) ?
Apart from space implicatoins - obviously a Complete Dimension would consume more space than a Partial Dimension - and perhaps processing time to maintain them, what are the pros and cons for both? Is there a strong case for Complete Dimensions over Partial Dimensions?
Many thanks for any thoughts on the subject!
dstan- Posts : 2
Join date : 2015-09-18
Re:Partial vs Complete Dimensions
Hi,
it depends upon the requirement if you have a requirement to print Customers who have no transaction or any reports where you want to analyze customers with no transaction then you have to load all the customers.
I don't think space should be the driving factor.
thanks
it depends upon the requirement if you have a requirement to print Customers who have no transaction or any reports where you want to analyze customers with no transaction then you have to load all the customers.
I don't think space should be the driving factor.
thanks
hkandpal- Posts : 113
Join date : 2010-08-16
Re: Partial vs Complete Dimensions
I don't understand the question. You do not create dimension tables for specific facts. You create a dimension table that various facts can use. A customer dimension should contain all customers.
Besides, how are you supposed to maintain a 'partial' dimension? Check every incoming customer to see if they have something in a fact table, then decide if you want to load it? How does a fact get into a fact table if the customer isn't available? The idea adds unnecessary complexity to the load process.
Space issues are handled with data retention policies. It is a retrospective cleanup of obsolete data, not an active 'only if you need it' load process.
Besides, how are you supposed to maintain a 'partial' dimension? Check every incoming customer to see if they have something in a fact table, then decide if you want to load it? How does a fact get into a fact table if the customer isn't available? The idea adds unnecessary complexity to the load process.
Space issues are handled with data retention policies. It is a retrospective cleanup of obsolete data, not an active 'only if you need it' load process.
Clarification on Partial vs Complete Dimensions
I think a little bit more information might help to explain what seems like an strange way to try to build a Dimension.
I am buidling a data warehouse to measure and store the quality of the data held within our organisation. A data quality measure (DQM) will create a Fact for each entity affected by the DQM. Examples of a DQM are: completness of a record - Customers without an email address; referential integrity - contracts without an account; data alignment across systems - customers missing from a system; reference data checks - customers with a title not in the list of valid titles.
Generally I'm only interested in entities with a data quality issue and so a Partial Dimension is an option. However, I can see that I would need a Complete Dimension if I want to work out the percentage of Account, Contract, Customers with a particular data qaulity issue.
I am buidling a data warehouse to measure and store the quality of the data held within our organisation. A data quality measure (DQM) will create a Fact for each entity affected by the DQM. Examples of a DQM are: completness of a record - Customers without an email address; referential integrity - contracts without an account; data alignment across systems - customers missing from a system; reference data checks - customers with a title not in the list of valid titles.
Generally I'm only interested in entities with a data quality issue and so a Partial Dimension is an option. However, I can see that I would need a Complete Dimension if I want to work out the percentage of Account, Contract, Customers with a particular data qaulity issue.
dstan- Posts : 2
Join date : 2015-09-18
Re: Partial vs Complete Dimensions
Why not link the data quality fact with the existing (fully populated) customer dimension?
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Partial vs Complete Dimensions
Partial isn't an option. You have a fact that is recording a quality issue with a customer. Why would you not reference the customer in the primary customer dimension?
Similar topics
» A complete Understanding on the Data Modeling
» SCD2 and partial information
» Conformed Dimension Partial Set of Attributes
» Correlated - Separate Dimensions OR Single Dimensions ?
» Combination of SCD2, partial snowflake, Effective and end dates in fact table
» SCD2 and partial information
» Conformed Dimension Partial Set of Attributes
» Correlated - Separate Dimensions OR Single Dimensions ?
» Combination of SCD2, partial snowflake, Effective and end dates in fact table
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|