data puzzle - looking for a query to solve this

View previous topic View next topic Go down

data puzzle - looking for a query to solve this

Post  topcat on Thu Jan 08, 2015 6:03 pm

i have rows in a table that i need to convert to a type 2 dimension, looking for a query to do this (in bulk, no cursors, looping).

simple thought is to group on value and take min/max of date, but the values can and will repeat.

current data is rows with a date and a value, assume each day has a value. ie
date, value
01/01/15, 10
01/02/15, 10
01/03/15, 10
01/04/15, 10
01/05/15, 12
01/06/15, 12
01/07/15, 12
01/08/15, 09
01/09/15, 10
01/10/15, 10

want to convert this to value and start/end dates, ie.
value, start date, end date
10, 01/01/15, 01/04/15
12, 01/05/15, 01/07/15
09, 01/08/15, 01/08/15
10, 01/09/15, 01/10/15

help me solve this puzzle!







topcat

Posts : 19
Join date : 2012-08-09

View user profile

Back to top Go down

Re: data puzzle - looking for a query to solve this

Post  nick_white on Fri Jan 09, 2015 3:32 am

Hi - got a couple of questions:

1. If you are trying to create an SCD2 dimension, and presumably these are the effective start and end dates, then what is the business key of the dimension? The only other field you've given is the value but this can't be the BK as in the example you've given the dates for the same value are not contiguous e.g. for value 10 you've got:
10, 01/01/15, 01/04/15
10, 01/09/15, 01/10/15

whereas the end date of the the 1st row would need to be 1 less that the start date of the 2nd row if this was an SCD2 with the value as the BK

2. Which DB are you using? There may well be solutions to this that use db-specific SQL and, if I can come up with a solution, I don't want to give you a solution that only works with Oracle if you are using SQL Server.

Regards

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: data puzzle - looking for a query to solve this

Post  ngalemmo on Fri Jan 09, 2015 4:12 am

You can use the LEAD and LAG functions.  They are categorized as 'windowing' functions that are in the SQL standard but are not implemented by everybody (or may have a variation). It allows you to look at another row based on a grouping and sequence.

With LEAD you can get the next value to determine if it is the last instance of the current value.  With LAG you get the previous value so you can determine the first instance of the current value.  You should then be able to figure out your date ranges.

Assuming your table had column D containing the date and column V containing the value of interest, the query would look like this:

SELECT D, V,
   LEAD(V) OVER (ORDER BY D) NEXT_V,
   LAG(V) OVER (ORDER BY D) PREV_V
FROM…

Then, if you filter out rows where V, NEXT_V and PREV_V are the same, use another LEAD function to combine the begin and end dates into a single row.  If you embed the first query you can do it in a single statement.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Its not really a type 2 dim, but it was easier to explain this way

Post  topcat on Mon Jan 12, 2015 11:15 am

this is for a retail site where we get price feeds on a daily basis. there are multiple type of prices, for example "standard", "sale", "one day sale", and each of these has a precedence. sale overrides standard, one day sale overrides both sale and standard. and i am simplifying this somewhat but essentially this is what we are provided.

product A, price type Standard, start date 1/1/15, end date, 3/31/15, price 20
product A, price type Sale, start date 1/5/15, end date, 1/10/15, price 15
product A, price type One day dale, start date 1/7/15, end date, 1/17/15, price 10

now i store all the data in a price fact, keys to the dates, product, and price type. so in this case there would be 3 fact records.

to make this data meaningful to our users, we create a rollup/agg that will convert the data into the following, kinda like a scd2.
product A, price type Standard, start date 1/1/15, end date, 1/4/15, price 20
product A, price type Sale, start date 1/5/15, end date, 1/6/15, price 15
product A, price type One day dale, start date 1/7/15, end date, 1/7/15, price 10
product A, price type Sale, start date 1/8/15, end date, 1/10/15, price 15
product A, price type Standard, start date 1/11/15, end date, 3/31/15, price 20

Note that these prices are sent to 3rd parties who want the data this way ...
And also note that there can be all type of overlaps, sale prices spanning multiple standard prices, and more than 3 tiers of pricing (standard, sale, one day sale is just my example).

what i do know, is that i can assign a priority to the prices, meaning that i know that "one day sale" is higher priority than "sale" which is a higher priority than "standard".

What i can do with the given data, is blow the data down the daily level using the priorities and end up with a single price per day. then i need to roll the data back up as shown above with start end dates. its the rollup that i have been struggling with. I have used lead/lag before, but couldn't figure out how to apply it here.

i will take a look at it again.

also the database is Netezza which works really well with bulk queries but is horrible with any kind of looping, in fact any kind of looping is not even an option. it does not have connect-by but does have most analytic functions.

thanks!

topcat

Posts : 19
Join date : 2012-08-09

View user profile

Back to top Go down

Lag/Lead solution is very close

Post  topcat on Mon Jan 12, 2015 11:31 am

there are cases where the lag/lead value will be the same but should not be filtered out.

if a product is put on a one day sale. then the lag (yesterday's price) and lead (tomorrow's price) will be the same.

topcat

Posts : 19
Join date : 2012-08-09

View user profile

Back to top Go down

whoops, my mistake

Post  topcat on Mon Jan 12, 2015 11:33 am

i didnt read the solution properly, says "V, lag V, and lead V" to be filtered out ...

this does seem to work!

thanks!

topcat

Posts : 19
Join date : 2012-08-09

View user profile

Back to top Go down

Re: data puzzle - looking for a query to solve this

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum