Provider Dimension Modeling

View previous topic View next topic Go down

Provider Dimension Modeling

Post  beatrixkiddo on Tue Oct 22, 2013 3:03 pm

Hey, new here to the forum.  Been lurking and reading a bit.

We're building a data warehouse around healthcare data.  We are starting with providers (doctors).  The first source is the NPI provider file which can be downloaded at cms.gov.  It's 300+ columns of doctor attributes.  Some columns are multiples of addresses, licenses, and specialties.  The first objective is to expose a file of providers for searching.  This will be a monthly load.  Here's an example of the NPI file and some columns I'm concerned with.

NPI_Num
Name
Address

[Healthcare Provider Taxonomy Code_1]  - these next 4 columns go 15 deep
[Provider License Number_1]
[Provider License Number State Code_1]
[Healthcare Provider Primary Taxonomy Switch_1]
... all the way to _15

[Other Provider Identifier_1] - these next 4 columns go 50 deep
[Other Provider Identifier Type Code_1]
[Other Provider Identifier State_1]
[Other Provider Identifier Issuer_1]
... all the way to _50

My questions are:

1. Some here are suggesting to (sort of forcing me to) build an ODS, which, reluctantly, I've begun doing.  It's becoming a normalized mess.  Is an ODS a valid option?  What are some reasons to avoid doing it this way?  Again, just provider info for searching, monthly load, no measurements, exporting a flat file.  Seems like a done deal for a provider dim!

2. In a dim/fact setup, how would I handle those sets of 4 columns that go 15 and 50 rows deep?  Snowflake or keep a wide dim?

I'm building a claims fact next, so I'll be building a provider dimension anyway.  We just have some that are hell bent on having an ODS, which may very well be needed.  I'm just not sure it suits this data and objective very well.

Thanks for any help!

beatrixkiddo

Posts : 22
Join date : 2013-10-22

View user profile

Back to top Go down

Re: Provider Dimension Modeling

Post  beatrixkiddo on Tue Oct 22, 2013 11:02 pm

Ok, so reading through my data warehouse toolkit, it looks like a multivalued dimension is what I have and a bridge table is what I need. I was really over-thinking this one.

beatrixkiddo

Posts : 22
Join date : 2013-10-22

View user profile

Back to top Go down

Re: Provider Dimension Modeling

Post  scabral on Wed Oct 23, 2013 9:15 am

Another great source for this situation is the book "Star Schema: The complete reference" by Chris Adamson. Chapter 9 discusses multi-valued dimensions and multi-valued attributes with some great examples.

scabral

Posts : 58
Join date : 2012-05-02

View user profile

Back to top Go down

Re: Provider Dimension Modeling

Post  beatrixkiddo on Wed Oct 23, 2013 10:07 am

Awesome, thanks for the resource.

beatrixkiddo

Posts : 22
Join date : 2013-10-22

View user profile

Back to top Go down

Re: Provider Dimension Modeling

Post  BoxesAndLines on Wed Oct 23, 2013 10:39 am

Multi-valued dimension refers to relationships from a dimension to a fact table, as in there is more than one. What you have is something different which would require some sort of hierarchy to consolidate to a single dimension. I would start with identifying which provider attributes are important from a business perspective instead of I want all provider attributes from CMS. Most folks aren't interested in 300 attributes for a given provider.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Provider Dimension Modeling

Post  beatrixkiddo on Wed Oct 23, 2013 11:18 am

BoxesAndLines wrote:What you have is something different which would require some sort of hierarchy to consolidate to a single dimension.  I would start with identifying which provider attributes are important from a business perspective instead of I want all provider attributes from CMS.
Yea, I'm not interested in all of them. The ones I'm stuck on are like Specialty, up to 15 of them. The users do want to see all those. This is the hierarchy you're talking about?

beatrixkiddo

Posts : 22
Join date : 2013-10-22

View user profile

Back to top Go down

Re: Provider Dimension Modeling

Post  beatrixkiddo on Thu Nov 21, 2013 10:53 am

So I'm in the old Normalized vs Dimensional debate. These people don't know about dimensional modeling. I've mentioned simple queries, less joins, performance, etc. They don't know much about querying and aren't convinced the performance is better. I have handled the one-to-many provider address issue by creating an address dim and a factless fact to relate providers to their addresses. Maybe our approach is unorthodox, starting with building a dimension. This is the data they want first, and it's very much a "let's see what you can do" sort of project.

I've only used RDBMS for app development. I'm having a hard time explaining to them why we should avoid normalization in the DW. Can you help me formulate a better argument?

Thanks!

beatrixkiddo

Posts : 22
Join date : 2013-10-22

View user profile

Back to top Go down

Re: Provider Dimension Modeling

Post  BoxesAndLines on Thu Nov 21, 2013 12:06 pm

The solution is simple but time consuming. First build out the normalized version of the model and then build out the dimensional version. Then ask simple questions of the model. e.g. How many claims from provider Dr. Bob for condition xyz for the past (week, month, quarter, etc) at this provider location. Compare this provider cost to other claims at same provider location, etc. You get the idea. Then demonstrate how to navigate the different models to answer the questions.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Provider Dimension Modeling

Post  beatrixkiddo on Thu Nov 21, 2013 12:48 pm

Thanks. That sounds like a good strategy. Do you avoid normalization completely, or is there a good scenario for it? Like I said, I've only normalized for app development. The change tracking on a normalized system with audit tables has always seemed really messy, but it does seem to work for its intended purpose.

I'm looking for good points that explain why we shouldn't normalize. To me, the dimensional model really sells itself. Then again, they aren't writing queries or maintaining the system and have zero experience with dimensional modeling. They're actually thinking it's a "newer technology". I'm sure at some point they'll want users to be able to build reports. I've brought up that point as well.

I want to really hammer on why normalization isn't needed, and also why dimensional modeling is a perfect fit. Your advice should help with that, but if you can think of anything to add to the argument, please throw it out here. I've been on two teams supporting fairly large EDW's and neither had anything normalized. The fact that we're even debating this is so frustrating.

beatrixkiddo

Posts : 22
Join date : 2013-10-22

View user profile

Back to top Go down

Re: Provider Dimension Modeling

Post  BoxesAndLines on Thu Nov 21, 2013 2:54 pm

You're not arguing against normalization vs not normalized. You're arguing dimensional vs. normalized. You pick dimensional because it is easier to query historically due to the fact dimension design pattern. It's also highly performant. That's normally enough to close the deal.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Provider Dimension Modeling

Post  beatrixkiddo on Thu Nov 21, 2013 3:04 pm

Ok. I think the main hurdle here is people are set in their normalized ways. Thanks b.

beatrixkiddo

Posts : 22
Join date : 2013-10-22

View user profile

Back to top Go down

Re: Provider Dimension Modeling

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum