Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Fact table type

+2
ngalemmo
sarah_id1
6 posters

Go down

Fact table type Empty Fact table type

Post  sarah_id1 Thu Nov 18, 2010 7:10 am

Basic questions about fact table type in kimball methodologies. Would appreciate if someone could explain

1)Transactional fact table type
This is the transaction event captured from source OLTP system. Fact table in this case has all the keys instead of actual value.
I dont understand the advantage of fact table when the source OLTP table (e.g. sales table) anyways stores all the transaction records as does fact except that fact has in form of keys?

2)Periodic fact table type
The snapshot all the transaction at regualar interval. I understand in this case fact table has historic snapshot of data which OLTP source would not. Classical example would be account balance. It this the main advantage of this fact table type or more than that.

3)Accumulating fact table type
The snapshot sounds similar to transaction except that it increases horizontaly (columns are updated frequently). Again if source table already stores all the history data why need a fact table? The only advantage I see is query running a faster against fact table than OLTP source table

Any reporting example for these fact table type would be great.

Thanks,
Sarah

sarah_id1

Posts : 9
Join date : 2010-11-18

Back to top Go down

Fact table type Empty Re: Fact table type

Post  ngalemmo Thu Nov 18, 2010 1:40 pm

I dont understand the advantage of fact table when the source OLTP table (e.g. sales table) anyways stores all the transaction records as does fact except that fact has in form of keys?

This question is much too basic to address in the space available in this forum as it gets to the fundimental differences between dimensional data warehouses and OLTP systems. The representation of a sales order in a dimensional model is significantly different than that in a 3NF OLTP model and cannot be distilled down to the existance of foreign keys. Dimensional modeling involves looking at the data differently... not as relationships between keyed groups of attributes but rather assigning contexts to a collection of measures that are the result of a business process or event.

But to answer you questions:
Transaction facts are a collection of deltas (changes over time). You would get the current state of a particular transaction by summing all rows for that transaction. You can recreate a historical version by summing up to a particular point in time. Changes are records by adding new rows with difference values.

Snapshot facts are values at a particular point in time. The point in time may be a fixed point in the past (such as month end balances) or now (a fact table that always contains current values - rows are updated in place).

Accumulating snapshot facts are like snapshots, except it retains all versions of the fact over time. The table will usually contain effective and expiration dates. When a new version of a transaction is added, the previous version is expired. Point in time reporting is done by filtering on the time period a row is in effect.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Fact table type Empty Re: Fact table type

Post  sarah_id1 Mon Nov 22, 2010 8:45 am

Thanks ngalemmo. I appreciate your response.

[quote="ngalemmo"]

But to answer you questions:
Transaction facts are a collection of deltas (changes over time). You would get the current state of a particular transaction by summing all rows for that transaction. You can recreate a historical version by summing up to a particular point in time. Changes are records by adding new rows with difference values.
I have worked in datawarehouse projects in the past and this is how we implemented SCD type1/2 for dimensions to retain the history. I would imagine the facts in transactional fact table would be independent of each other as such unless we want to get a contextual summary (like total sale for product X). Current state of a particular transaction has confused me. For example for a retail industry like Tesco/Walmart I would just record atomic transaction and the facts would be independent of each other. So am still stuck with my original question i.e. the transactions are recorded in both Source OLTP and fact table.

[quote="ngalemmo"]

Snapshot facts are values at a particular point in time. The point in time may be a fixed point in the past (such as month end balances) or now (a fact table that always contains current values - rows are updated in place).
I think this is fairly clear to me.

[quote="ngalemmo"]

Accumulating snapshot facts are like snapshots, except it retains all versions of the fact over time. The table will usually contain effective and expiration dates. When a new version of a transaction is added, the previous version is expired. Point in time reporting is done by filtering on the time period a row is in effect.
I guess all the activity of a transaction is traced in the same row as in the example below as each activity completes the dates are updated horizontally. so where is the question of previous version expiry.
order_id, expected_delivery_date, payment_processed_date, transit_date, completed_date.

sarah_id1

Posts : 9
Join date : 2010-11-18

Back to top Go down

Fact table type Empty Re: Fact table type

Post  BoxesAndLines Mon Nov 22, 2010 9:25 pm

If your OLTP system contains all atomic level history, in an easily queryable (is that a word?) data structure, for all business lines across the organization, with good database performance, potentially including enriched 3rd party data, then you may not be a candidate for building a data warehouse (Kimball or Inmon). Most correctly modeled OLTP databases slow to a crawl when the user runs a YTD sales trend vs. last years sales numbers.
BoxesAndLines
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

Back to top Go down

Fact table type Empty Re: Fact table type

Post  ngalemmo Tue Nov 23, 2010 4:55 am

Current state of a particular transaction has confused me. For example for a retail industry like Tesco/Walmart I would just record atomic transaction and the facts would be independent of each other.

Sure, in retail as well as banking and accounting, you have transactions that are complete and final when received. However, in other scenarios, such as manufacturing, an order has a life of its own, with potentential changes and adjustments prior to being fulfilled. If a requirement is to maintain a running history of an order, a transactional or accumulating snapshot fact would be used. A transactional fact has the ability to calculate the magnitude of change over time, while an accumulating snapshot is easier to maintain and report from.

I guess all the activity of a transaction is traced in the same row as in the example below as each activity completes the dates are updated horizontally. so where is the question of previous version expiry.
order_id, expected_delivery_date, payment_processed_date, transit_date, completed_date.

Depending on your industry, this would not be a particularly good way to model an order. There are usually other processes, such as fulfillment, shipping and invoicing, that are typically represented by other fact tables. You may wish to model the entire lifecycle in a single fact, but that is usually done as an aggregate of the component facts.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Fact table type Empty Re: Fact table type

Post  hang Tue Nov 23, 2010 10:01 am

sarah_id1 wrote:So am still stuck with my original question i.e. the transactions are recorded in both Source OLTP and fact table.
True, data warehousing is about storing redundant data in different structure, ie. dimensionally modelled structure, for the purpose of good performance, ease of query and efficiency. The fact tables in dimensional data store are supposed to be highly normalised, containing only FKs and measures in general. By replacing textual dimension attributes with keys, you could easily make rich set of dimension attributes available by a single entry, and avoid duplicating lengthy values thousands times in a average size fact table, resulting in much better performance.

sarah_id1 wrote:I guess all the activity of a transaction is traced in the same row as in the example below as each activity completes the dates are updated horizontally.
That's Kimball's version of Accumulating Snapshot fact. I think the main advantage is to let ETL set the start and end dates and work out the lag times of each stage of the life cycle during the nightly load, so that you don't have to conduct time consuming self joins during reporting time.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

Back to top Go down

Fact table type Empty Accumulating snapshot

Post  Bing Thu Apr 28, 2011 12:03 pm

"Accumulating snapshot facts are like snapshots, except it retains all versions of the fact over time. The table will usually contain effective and expiration dates. When a new version of a transaction is added, the previous version is expired. Point in time reporting is done by filtering on the time period a row is in effect.."

Hi ngalemmo,

Could you please give an example of such an accumulating fact table with different versions of a transaction? It is definately not the defination of accumulating snapshot by Kimball, but I would like to know in which kind of bussness processes you would model a accumulating snapshot as such.

Thanks

Bing

Bing

Posts : 1
Join date : 2011-04-28

Back to top Go down

Fact table type Empty Re: Fact table type

Post  LAndrews Thu Apr 28, 2011 1:07 pm

A good example could be a fact table for helpdesk tickets. Fdor Simplicity sake, consider the ticket can be in 4 states (Open, InProgress, Resolved, Closed)

The grain of the fact would be one record for the ticket at a point in time. The fact record could include current status, dates for each status, in addition to effective start and end dates.

Each time the ticket changed status, another row would be created. By constraining on the effective dates, you can easily analyse your data for any point in time. (e.g. How many InProgress tickets did we have have on April 1?


LAndrews

Posts : 132
Join date : 2010-05-13
Location : British Columbia, Canada

Back to top Go down

Fact table type Empty Re: Fact table type

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum