Deriving Dimension attributes from Fact table

View previous topic View next topic Go down

Deriving Dimension attributes from Fact table

Post  dwbi_rb on Wed Oct 13, 2010 5:41 am

Hi,
I have a Customer_Activity fact table which stores the the Activity Start & End Date values. I need to derive the last time the customer had an activity before the current date (based on the Activity End Date - MAX(Activity End Date) WHERE Activity End Date < SYSDATE) and the next planned activity for the customer (based on the Activity Start Date - MIN(Activity Start Date) WHERE Activity Start Date >= SYSDATE) - and the corresponding Activity Types for those dates. Since these dates are stored in the fact table itself, I would have to do an in-line query to look up the same fact table to derive the Last/next Activity Dates grouped on the Customer ID on the Fact table.
If we do it at the reporting end, then it will be an expensive query and will take more time to bring in the resultsets; so the idea was to derive it at the ETL layer.

The above derivation would fetch the last/next Activity Dates and their tyopes along with the Customer ID. We could either store in a new table and join that table at the data mart layer to the main Customer dimension or actually add a further derivation to join it at the DW-to-Data Mart transformation layer and store these attributes in the main customer dimension itself.

Is it correct to derive these Last/Next Customer Activity DAtes and Types from the fact table and store THEM in the Customer dimension during the warehouse to data mart transformation stage?
ANy alternate solution for this?

Thanks & Regards

dwbi_rb

Posts : 17
Join date : 2009-02-19

View user profile

Back to top Go down

Re: Deriving Dimension attributes from Fact table

Post  Jeff Smith on Wed Oct 13, 2010 10:13 am

I think it's OK to store that information on the dimension table.

I recall one place I worked where the definition of an "active" account was based on the number of days since it was last used along with the balance on the account. We were loading 150 million transactions a month. It was easiest to assign the "Last Activity Date" on the account dimension.

Jeff Smith

Posts : 471
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Deriving Dimension attributes from Fact table

Post  ngalemmo on Wed Oct 13, 2010 12:12 pm

Its fine as long as the customer dimension is not type 2.... or if it is, you treat last activity date as a type 1 attribute in a type 2 dimension (i.e. update all versions of customer).
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Deriving Dimension attributes from Fact table

Post  dwbi_rb on Thu Oct 14, 2010 9:13 am

Thanks a lot for confirming on this!

The in-line query will be an expensive one as it would join the fact table itself. More so, since we would have to find the "Activity Type" based on the MAX/MIN date (logic as explained in my previous post) from the same fact table.

A discussion within the team was that the time taken to derive this would increase the ETL refresh time and whether this can be derived at the reporting layer. I feel that though at the reporting layer, it may return the intial partial results in a lesser time, it would actually be a very expensive query and would involve more response time for the end user. Based on my experience, I strongly believe such derivations should be done at the ETL layer so that the reporting response time is the least for the end user. It would be great if I could get your comments on this as well!

Secondly, the query to derive this is an expensive one - The structure of the query that would be required to derive this logic is given below. Please suggest if there is a better alternative. By concatenating the date and the activity type, I thouth of reducing the further joins to the fact table that would have been required if we wanted to derive the 4 values individually.

SELECT fact.customer_id
, (SELECT t1.activity_end_dt || '*' || t1.activity_type
FROM fact_table t1
WHERE t1.customer_id = fact.customer_id
AND t1. activity_end_dt = ( SELECT MAX(v1.activity_end_dt)
FROM fact_table v1
WHERE v1.activity_end_dt < sysdate
AND v1.customer_id = t1.customer_id
)
) last_con_dt_typ
, (SELECT t1.activity_start_dt || '*' || t1.activity_type
FROM fact_table t1
WHERE t1.customer_id = fact.customer_id
AND t1. activity_start_dt = ( SELECT MIN(v1.activity_start_dt)
FROM fact_table v1
WHERE v1.activity_end_dt >= sysdate
AND v1.customer_id = t1.customer_id
)
) next_con_dt_typ
from fact_table fact ;






Last edited by dwbi_rb on Thu Oct 14, 2010 9:17 am; edited 1 time in total (Reason for editing : trying to make the SQL query more readable with further indentations - but I doubt if it would work as the tabs did not either.)

dwbi_rb

Posts : 17
Join date : 2009-02-19

View user profile

Back to top Go down

Re: Deriving Dimension attributes from Fact table

Post  ngalemmo on Thu Oct 14, 2010 11:58 am

If its a choice between a little more complex ETL process versus a more complex and slower query, always choose changing the ETL. Its a one-time hit when nothing of importance is going on versus a degraded user experience.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Deriving Dimension attributes from Fact table

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum