Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

View previous topic View next topic Go down

Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  hernandezpaul on Wed Mar 19, 2014 4:03 am

Hi folks,

I have maybe a basic question for some of you.

Background: We have a highly normalized Enterprise data warehouse with fact tables that use row versioning (compliance-enabled), that means, the fact tables have validfrom and validto timestamps like a type 2 SCD.

I want to design a fact table in the DM to aggregate a denormalized some data, in order to simplify the cube design and report creation but I'm little bit lost about how to aggregate these facts with the valid periods.

Let's say I have orders with different status (open, close, delayed, canceled, etc.) and the users want to determine how many articles are in orders with a given status at a given date or the sum of the net sales of canceled order in a given date.

I know I can leave the orders table like in the Enterprise data warehouse and then try to fulfill the analysis requirement with MDX but I also know that is better to have the proper design at the data base level.

Any comment would be appreciated,

Kind Regards,


Paul Hernández
http://hernandezpaul.wordpress.com/
https://twitter.com/paul_eng

hernandezpaul

Posts : 9
Join date : 2013-08-26
Age : 36
Location : Herzogenaurach, Germany

View user profile http://hernandezpaul.wordpress.com/

Back to top Go down

Re: Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  hkandpal on Wed Mar 19, 2014 11:53 am

Hi,

do you have any field which idnetifis what is the currect stage of the order , like the order may be open yesterday but today it is closed which means when you print the status of the order today you will pick up the Closed record as it is the current status of the order.


thanks

Himanshu

hkandpal

Posts : 113
Join date : 2010-08-16

View user profile

Back to top Go down

Re: Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  hernandezpaul on Wed Mar 19, 2014 3:21 pm

Hi, thanks for your answer.

Yes, there is a status column to identify the status between the validfrom and validto dates. What I would like to do is preaggregate the data in order to improve the performance in the cubes or to have a table structure that better represents the event "order status change". Like there are more than one status I am thinking in something like timestamp the changes and backout the measures for the previous status, but I have not found the satisfactory solution yet.

Kind Regards,
Paul

hernandezpaul

Posts : 9
Join date : 2013-08-26
Age : 36
Location : Herzogenaurach, Germany

View user profile http://hernandezpaul.wordpress.com/

Back to top Go down

Re: Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  nick_white on Fri Mar 21, 2014 4:28 am

If you want to track the event "order status change" then I would build a fact table that does just this i.e. when you load your order dim you work out if the status has changed and if it has then create a record in the fact table. The fact table would have 2 keys to a Status Dim for the old an new values (plus all your other relevant keys, obviously).

If you want an aggregate table that would tell you what the statuses were on any particular date you would have to create aggregate records for every date and status combination - regardless of whether there had been any status changes on a particular date.
This assumes that a status can only change once in a day - if it can change more frequently then you either can't build this aggregate (because your 'date' grain would need to go down to the minute, or something similar, which would not be sensible to implement) or you would need a business rule to define what an order's day's status actually means

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  BoxesAndLines on Fri Mar 21, 2014 8:01 am

I would build an accumulating snapshot fact table to pivot the statuses. Include a date dimension FK for each status as well.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  hkandpal on Fri Mar 21, 2014 8:18 am

Hi,

from what you said it looks like you want the sum of net sales or quantity of orders by stages (this is on the assumption that the order can only be in one stage at a particular time).
If you have an active flag and a date range in the table then it would help you to get a status of the order at a praticular time.


How will you handle split orders or back orders ? will yousplit the main order ?


thanks

Himanshu

hkandpal

Posts : 113
Join date : 2010-08-16

View user profile

Back to top Go down

Re: Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  hernandezpaul on Fri Mar 21, 2014 9:38 am

Hi everyone,

thanks do much for your answers.
One of the major difficulties is several corrections are perform in the central data warehouse and is really a haedache to transfer these corrections to the data mart, for this reasons we are trying to find a solution in which we can keep the structure of the fact table in the data warehouse. That is a row for each order number with a validto and validfrom timestamp plus a current field. For aggregations, like the sum of the net sales this is not a problem, you just have to filter the valid records with the current flag. For semi additiv measures, like the sum of the net sales in during Januar 2014, I have to determine the valid records for this specific period.
I created 3 time dimensions: Validfrom, Validto and Time, then I created the following calculated measure using the idea that I found here:
MDX Between Start Date and End Date
this is my mdx:
Code:
CREATE MEMBER CURRENTCUBE.[Measures].[Snapshot]
 AS AGGREGATE( {NULL:LINKMEMBER([Time].[Date Id].CURRENTMEMBER,[Valid From].[Date Id])} * {NULL:LINKMEMBER([Time].[Date Id].CURRENTMEMBER,[Valid To].[Date Id])}, [Measures].[OOH Units]),
FORMAT_STRING = "Currency",
NON_EMPTY_BEHAVIOR = { [OOH Units] }
The problem, like the author said, is the performance.

Maybe a good idea is to truncate the table in the datamart and fill it again instead of trying to correct the values and have a better structure.

Kind Regards,

Paul

hernandezpaul

Posts : 9
Join date : 2013-08-26
Age : 36
Location : Herzogenaurach, Germany

View user profile http://hernandezpaul.wordpress.com/

Back to top Go down

Re: Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  nick_white on Fri Mar 21, 2014 10:31 am

I'm unclear what your ValidFrom and ValidTo columns are being used for as fact table rows record an event at a point in time (ignoring accumulating snapshots). So is your ValidFrom date = transaction date and ValidTo date = the transaction date for the next fact record for the same order?
Also, what is the ValidTo date for a Closed record - as presumably this status is valid until the end of time?

When you say you want to report on "the sum of the net sales in during Januar 2014" is a sale only 'registered' when the order has a status of 'Closed' (or whatever status value you are using)? Presumably you have an equivalent of transaction date in your model so the aggregate would just be:
SELECT * FROM SALES WHERE STATUS = 'CLOSED' AND TRANSACTION_DATE BETWEEN '01-JAN-2014' AND '31-JAN-2014'?

If, instead, you are trying to pick up any record that existed in Jan 2014, and use the most recent record for each order in that month, then you'll probably have to use some form of subquery, a bit like this (though this is psuedo-code so don't expect to run it):

SELECT * FROM SALES WHERE SALES.VALIDFROM =
(SELECT MAX(S1.VALIDFROM) FROM SALES S1
WHERE SALES.ORDER_ID = S1.ORDER_ID
AND
(
S1.VALIDFROM <= '01-JAN-2014' AND S1.VALIDTO >= '01-JAN-2014' //Row started before Jan and ended after
OR
S1.VALIDFROM BETWEEN '01-JAN-2014' AND '31-JAN-2014' //Row started in Jan
OR
S1.VALIDTO BETWEEN '01-JAN-2014' AND '31-JAN-2014' //Row ended in Jan
)
GROUP BY S1.ORDER_ID
)

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  hernandezpaul on Fri Mar 21, 2014 10:50 am

Hi Nick,

thanks for your dedicated answer.

What we have is a fact table like a type II SCD, when a record arrives for the first time, let´s say a line of an order, the validFrom date is the datetime of the current load and the validTo is assigned to a dummy datetime (9999-31-12). When new information for this line of this order arrives, then the validTo is marked with the current timestamp and another record is inserted:



What I would like to do, is find a way to calculate the sum of the Net and Gross Sales value at any given point in time using MDX or to store the data in a different structure that better represent what I want to measure

hernandezpaul

Posts : 9
Join date : 2013-08-26
Age : 36
Location : Herzogenaurach, Germany

View user profile http://hernandezpaul.wordpress.com/

Back to top Go down

Re: Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  nick_white on Mon Mar 24, 2014 8:30 am

OK - if you need to be able to answer the question "... at any given point in time..." then I don't think it is possible to improve on the design of the fact table as you currently have it, assuming by "point in time" you mean date/day? You would have to create a summary/aggregate record for every single day and while not knowing your data volumes and environment I doubt this would be a sensible way forward.
Assuming this fact table is indexed appropriately then queries like this should perform adequately:
SELECT SUM(GROSSSALESVALUE), SUM(NETSALESVALUE)
FROM FACT_TABLE
WHERE 'given_date' BETWEEN VALIDFROM AND VALIDTO

As you can only have one record per order line valid on any one date, you can't double count lines, as long as you are selecting information on a single 'given date'.

If, instead, you need to be able to query for a given time period that is greater than a day (e.g. show me the figures for January) then you will need to define with your business people what is meant by an "order line" in a month (i.e. what are all the scenarios that can affect an order line and for each of them how does that affect how the line's attributes for that month are defined). You can then use these rules to collapse each order line into a single record and then aggregate these single records into a monthly summary table. Bear in mind that implementing any process like this is likely to result in an aggregate table that answers a very specific question and won't be a generalised solution to many user requirements - though of course you could create multiple aggregates tables to answer multiple questions if necessary.

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  hernandezpaul on Mon Mar 24, 2014 8:46 am

Hi Nick,
thanks for your answer,

what I'm trying to do is actually translate this query to MDX,

nick_white wrote:
Assuming this fact table is indexed appropriately then queries like this should perform adequately:
SELECT SUM(GROSSSALESVALUE), SUM(NETSALESVALUE)
FROM FACT_TABLE
WHERE 'given_date' BETWEEN VALIDFROM AND VALIDTO

At the moment my equivalent MDX query is as follows:
Code:
with member [Measures].[OOH] as aggregate(
   {null:linkmember([Time].[Date Id].CurrentMember,[Valid From].[Date Id])}
   *
   {linkmember([Time].[Date Id].CurrentMember,[Valid To].[Date Id]):NULL}
   ,[Measures].[OOH Units]
)
select [Measures].[OOH] on 0,
[Time].[Date Id].&[2579]: [Time].[Date Id].&[2581] on 1

The Time dimension is a dummy dimension to let the user select a specific date.

The problem is due to the crossjoin the query takes about 2 minutes and that is not acceptable.

Kind Regards,

Paul

hernandezpaul

Posts : 9
Join date : 2013-08-26
Age : 36
Location : Herzogenaurach, Germany

View user profile http://hernandezpaul.wordpress.com/

Back to top Go down

Re: Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  nick_white on Mon Mar 24, 2014 8:56 am

Sorry, I'm afraid I can't help you with this. I'm from an Oracle background and have never used MDX

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum