How to lookup a small group of records as one

View previous topic View next topic Go down

How to lookup a small group of records as one

Post  Al Wood on Thu Dec 16, 2010 8:48 am

Hi,

I have source data which contains [Customer], [EventKey] and [ProcedureType], and the grain of this table is Procedure. [EventKey] is sometimes repeated, and this means there were many Procedures at one time. I need to fill a fact table that has Procedure level grain, and I need to fill a cost field which is the overall cost for the [EventKey] divided by the number of Procedures.

The problem is that the cost is determined by the set of procedures that happened at one time. E.g. Procedure A has one cost, B has another, and A & B combined do not have the costs of A and B added, they have some arbitrary cost from a table that I was given.

So I need somehow to lookup the set of several ProcedureTypes to find the cost. How would you do this? I'm hoping to use SSIS, but its lookup gadget seems to use only one record at a time.

Al Wood

Al Wood

Posts : 46
Join date : 2010-12-08

View user profile

Back to top Go down

Re: How to lookup a small group of records as one

Post  hang on Fri Dec 17, 2010 3:14 am

I think it comes down to an SQL script that picks up only one record out of many records. I can think of a couple of options to achieve that depending on the business requirement.

1. Use max(PK) or min(PK) aggregate SQL function to pick up the record with highest or lowest PK.

select *
from tableA
where PK=(select max(PK) from tableA)

2. Use row_number function with window clause '[partition by (column list)] order by (sort column list)', to pick up Nth record in a specified sequence, as follows:

select * from
(
select row_number() over (order by fieldX desc/asc) as rec_order, *
from tableA
) as A
where rec_order=N

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

I don't understand

Post  Al Wood on Fri Dec 17, 2010 5:37 am

Hi,

I don't understand how your suggestion will help; maybe I didn't explain my problem very well. In the source data there are many sets of [ProcedureTypes]. e.g.

EventKey ProcedureType
1 TypeA
1 TypeB
1TypeC
5TypeA
2 TypeA
2 TypeD
3TypeD
4 TypeA
4 TypeB
4TypeC

I have cost information like this:
TypeA & TypeB and TypeC = 25
TypeA = 17
TypeA & TypeD = 18
TypeD = 11

I need to gather all the Procedure Types for each Event Key, whether it's one Procedure Type or many, and lookup the cost up for all of them as a bunch.

Many Thanks,
Al Wood

Al Wood

Posts : 46
Join date : 2010-12-08

View user profile

Back to top Go down

Re: How to lookup a small group of records as one

Post  hang on Fri Dec 17, 2010 6:35 am

Ok, obviously it's something else. But the solution is still relevant. So basically, you would have following lookup table:

ProcedureTypeStr - Cost
ABC - 25
A - 17
AD - 18
D - 11

You need to firstly convert your EventKey-ProedureType table into EventKey-ProcedureTypeStr using row_Number() as in my previous post, and perhaps using cursor if it's too hard by a SQL script. Then you should have one to one lookup.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

What we need..

Post  Al Wood on Fri Dec 17, 2010 10:31 am

Thanks for the reply.

I was afraid you'd say something like that! I was hoping to use a SSIS toolkit component, but now I think I'll have to start coding SQL.

What I need is someone to write a SSIS toolkit component like the KSCD!

Al

Al Wood

Posts : 46
Join date : 2010-12-08

View user profile

Back to top Go down

Re: How to lookup a small group of records as one

Post  hang on Fri Dec 17, 2010 5:13 pm

I would never count on a component in SSIS to completely resolve issues like what you described. Most of the time in real practice, you have to resort to some SQL scripts as up-stream or down-stream process around the component to complete the task. It's worth checking with Joy and Warren, two SQL Sever BI experts in the forum, to see whether my point is valid or not.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: How to lookup a small group of records as one

Post  gvarga on Tue Dec 21, 2010 6:35 pm

Hi,

Why not to create another fact table with event granularity ?
You will have a dimension Event Type

Key Name Cost No of procedures
1 TypeA 17 1
2 TypeB 18 1
3 Type D 19 1
4 TypeA & TypeB & TypeC 25 3
5 TypeA & TypeD 18 2

From the first 3 rows in your fact table you will load 1 event row with Type key=4 .
Of course you will need to gather all the Procedure Types for each Event Key during loading the new Event fact table, but later during reporting and analysis you will have the proper data. ( It is a little bit similar with summary management)


gvarga

Posts : 43
Join date : 2010-12-15

View user profile

Back to top Go down

The problem of how to perform the lookup remains

Post  Al Wood on Wed Dec 22, 2010 5:01 am

Hi,

I might consider the aggregate fact table that you suggest; in any case, the problem of how to perform the lookup remains.

http://www.projectdmx.com/tsql/rowconcatenate.aspx
This website has a number of methods to concatenate rows into strings. I hope to modify one of them to produce a sorted list of procedure IDs for each [EventRef], which I could then use in a normal SSIS lookup.

The result would be at the granularity of the aggregate fact table that you suggest. I could then use it to update the cost in my most granular fact table, even if I don't retain the aggregate fact table.

Al Wood

Al Wood

Posts : 46
Join date : 2010-12-08

View user profile

Back to top Go down

Re: How to lookup a small group of records as one

Post  VHF on Wed Jan 05, 2011 5:47 pm

Did you ever come up with a solution?

If you could get your data into a format something like this then you could easily look up your costs:

EventKey ProcedureType
1ABC
5A
2 AD
3D
4ABC

Transforming the data into this format isn't too hard, but probably requires a cursor (or better yet looping through a table variable if using SQL Server).

Here's another idea... How many unique procedures are there? If there aren't too many, you could use powers-of-two as numeric identifiers for each procedure:

Type A = 1
Type B = 2
Type C = 4
Type D = 8
etc.

The sum of the identifiers would represent combinations of procedures, i.e. A+B+C = 7, B+D = 10. Each possible combination would have a unique value you could use to look up the costing. As long as there are less than 32 individual procedures, the sum would always fit in a 32-bit integer.

The main advantage of this approach is that you could take your original data, easily subsititue the power-of-2 procedure identifiers, and then use a SQL GROUP BY query on EVENT and take the SUM of the procedure ids to get the identifier for the combination of procedures for each EVENT. No cursor or looping required!

In essense this approach is bit-mapping your procedures!

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

View user profile

Back to top Go down

Intended solution

Post  Al Wood on Thu Jan 06, 2011 6:31 am

Hi,

Nice idea, but there easily could be more than 32 codes.
I think I will use concatenation. The sql turns out to be not too complex and quite quick for small datasets.

I realised that there is possibly a bug in the idea of concatenating the procedure codes themselves. They are typed in by users, and what if they insert a comma into one of them? E.g. these procedure codes:
TypeA
TypeB
Type,E

These concatenate to "TypeA,TypeB,Type,E" which looks like four codes and can't be seperated from the case where "Type" or "E" are codes in themselves.

So I'm going to try to find a character that is forbidden in the source system, e.g. "¬" and concatenate with that, e.g. "TypeA¬TypeB¬Type,E". If I cant, I will use a small permanent table, in the data prepearation area, that auto-numbers any new codes:

1TypeA
2TypeB
3 Type,E

and I will concatenate the ID values, e.g. "1,2,3"
Then I create an SSIS lookup using the "1,2,3" string, remembering to sort them first, so I don't get "1,3,2"

The only slight doubt I have is what happens when the Data Steward has to maintain the cost lookup table with "1,2,3" = 872.33. Will they cope, or will I need to write some kind of Access Form or other frontend to make it plain as day for them. It should be fun!




Last edited by Al Wood on Thu Jan 06, 2011 6:36 am; edited 2 times in total (Reason for editing : Typos)

Al Wood

Posts : 46
Join date : 2010-12-08

View user profile

Back to top Go down

Re: How to lookup a small group of records as one

Post  VHF on Thu Jan 06, 2011 10:49 am

Good idea to use delimiters in your concatinated string! I frequently use a pipe character "|" as a delimiter; users seem to undertand a pipe-delimited list: TypeA|TypeB|TypeC. However, if your source system doesn't prevent them from typing that character it would still be possible to get into trouble.

You might also want to validate the user-entered values against a list of legal values during your ETL process. At the very least you could produce an exception report to let you know that you are loading some bad values.

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

View user profile

Back to top Go down

Re: How to lookup a small group of records as one

Post  hang on Thu Jan 06, 2011 5:22 pm

VHF wrote:users seem to undertand a pipe-delimited list: TypeA|TypeB|TypeC.
Why do you need to prefix the the string with 'Type' and delimit each character anyway? The type string is an open ended lookup code. The worst thing is to make it unnecessarily long, and in this case it's 5 times longer, unless you could have event type with more than 1 character.

I'm not sure if the users ever need to see the type string code. Even if they do, the column name should indicate word 'Type' and I can't see the code 'TypeA|TypeB|TypeC' is more descriptive than 'ABC' under the column name 'Event type string'.

If there is any data quality issue, it should be fixed up by a process called ETL and it's hard and dirty. Using cursor or whatever means to get data in order could consume a lot of time, but that's what ETL used for, and there is plenty time during nightly load.

I think the core issue in the problem is how to structure and load the lookup table and the tricky bit is to prepare the event type character list in proper order before concatenating them.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: How to lookup a small group of records as one

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum