Advantages Disadvantages of Star Vs. Snowflake in your DW Env. Has anyone switched from one to another & Why ?

View previous topic View next topic Go down

Advantages Disadvantages of Star Vs. Snowflake in your DW Env. Has anyone switched from one to another & Why ?

Post  cube99 on Sun Sep 26, 2010 3:20 pm

Can anyone list the advantages and disadvantages of Star vs. Snowflake in your DW environement . And has anyone switched from one to another and why?

cube99

Posts : 2
Join date : 2010-09-26

View user profile

Back to top Go down

Re: Advantages Disadvantages of Star Vs. Snowflake in your DW Env. Has anyone switched from one to another & Why ?

Post  John Simon on Mon Nov 08, 2010 8:40 pm

In short, if your DW is supporting SSAS cubes in SQL Server, then snowflake is better for performance. If your DW is mostly supporting SQL queries then star schema is better.

John Simon

Posts : 186
Join date : 2009-05-29
Location : Melbourne, Australia

View user profile http://jsimonbi.wordpress.com

Back to top Go down

Re: Advantages Disadvantages of Star Vs. Snowflake in your DW Env. Has anyone switched from one to another & Why ?

Post  ngalemmo on Tue Nov 09, 2010 11:40 pm

Curious... why is a snowflake better when using Analytic Services? Are not the cubes built into a different data structure than the source relational data?
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Advantages Disadvantages of Star Vs. Snowflake in your DW Env. Has anyone switched from one to another & Why ?

Post  John Simon on Tue Nov 09, 2010 11:46 pm

Essentially each attribute in a dimension is done using a SELECT DISTINCT. So if I have a product dimension with an attribute of product category, a SELECT DISTINCT is done on Product Category, and this is then used internally within SSAS for aggregation purposes.

John Simon

Posts : 186
Join date : 2009-05-29
Location : Melbourne, Australia

View user profile http://jsimonbi.wordpress.com

Back to top Go down

Re: Advantages Disadvantages of Star Vs. Snowflake in your DW Env. Has anyone switched from one to another & Why ?

Post  hang on Mon Nov 15, 2010 3:04 am

I donít think it has anything to do with using which database system, OLAP cube or performance for that matter, it is about the modelling methodology. Star schema makes use of dimensions easier as the dimension users donít have to understand the intricate ER at 3rd normal form to query about the data.

However In a sense of avoiding extra joins and sub queries, the star schema does present obvious performance advantage over snowflaked schema. Dimensionally modelled star schema intentionally demormalise dimensions by materialising repeated attribute values at higher hierarchical levels, providing an inherent solution for better performance otherwise achieved by materialised views or intermediate query tables.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Re: Advantages Disadvantages of Star Vs. Snowflake in your DW Env. Has anyone switched from one to another & Why ?

Post  Jeff Smith on Fri Dec 03, 2010 4:04 pm

Stars are easier to use and maintain and potentially use less space.

Stars are harder to design.

Stars are easier on the users but harder on the developers and modelers.

Jeff Smith

Posts : 471
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Advantages Disadvantages of Star Vs. Snowflake in your DW Env. Has anyone switched from one to another & Why ?

Post  ngalemmo on Mon Dec 06, 2010 11:17 am

Stars are harder to design.

Nooooo... ok, a 3NF model is mechanical, while a star requires some business understanding and creativity. But dimensional modeling is a lot more fun.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Advantages Disadvantages of Star Vs. Snowflake in your DW Env. Has anyone switched from one to another & Why ?

Post  BoxesAndLines on Thu Dec 09, 2010 2:38 pm

I'd say stars are easier to model. You have 3 fact choices and 3 dimension choices. All PK's are SK's so no need to sweat over non-identifying, identifying relationships. Makes it really simple. There aren't many folks making a living just doing dimensional modeling. It's normally an ETL, BI, or DBA type that's responsible.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Advantages Disadvantages of Star Vs. Snowflake in your DW Env. Has anyone switched from one to another & Why ?

Post  srisk on Fri Mar 04, 2011 9:39 am

I agree this is primarily a design decision, although most BI tools these days will insulate the end user from knowing anything about the underlying table structures. My experience is mostly with the Microsoft BI stack, so I can vouch that a snowflake is superior for performance reasons. However, the gain is only at processing time. For smaller data marts it's largely irrelavent (in SSAS), especially since dimension tables tend to be smaller in size anyways. At runtime it makes no difference at all because the cube structure will appear to the end user as a star.

A ROLAP tool on the other hand would fare much better if you were to stick with star schemas.

srisk

Posts : 1
Join date : 2011-03-02
Location : Michigan

View user profile http://www.blue-granite.com

Back to top Go down

Just my opinion

Post  Jeff Smith on Fri Mar 04, 2011 12:32 pm

There is a lot of science when it comes to DW but there is also a lot of art to it. I am a firm believer in Star Schemas, but there is a valid place, in my opinion, for snow flaking. Snow flaking can be used for data that is occassionally needed (performance is less of an issue) and where it results in space savings that can improve overall performance of the database. It can also come in handy when you want to be able to maintain history as it was and to restate history based on the how the world in now.

We are a dental health insurance company. I need to know the address of the dentist and the address of the patient at the time of the procedure. But I also need to be able to aggregate procedures to a dentist and attach a single, current address. Same with the member. To accomplich this, I keep the address dimension attached to the claim fact, but I also have a Current Address Key on the Dentist Dimension. If I want to aggregate the claims to the providers current address, I need to use the snow flake design. But this is usually done during the load process and rarely done by an end user.

Also, my member dimension has multiple rows for each member. I pick up every change. But I need to aggregate the member's claims to a single member and load into an aggregate table. One way to do this is to create a hierarchy in the membership table with the Current member information, add in another surrogate key and create a mini-dimension to for member level aggregate tables. But the performance of queries that join the aggregate table to the member mini-dimension (a select distinct view of the relevant fields in the member dimension) would be very slow. Instead, I flag each member's most current row in the member dimension as the current record, add a Current member Key at the end of the member dimension, and populate it with the member key from the member's current record. This enables me to aggregate claims data through the member dimension to the Current Member Key and I can now join the member dimension to the aggregate table.

If I need to, I can do a self join on the member dimension, joining the Member Dimension Key to the Current member Dimension Key. Saves time, saves space, improves performance.

Another example involves dates. Each claim record has a ton of dates. Most are usually the exact same date. I counted the unique combinations of 5 date fields that spanned 1 year, I came up with 500 records. the 5 date fields are rarely used and if left in the fact table, would take up 20 bytes of space. I created a junk date dimension of the 5 fields, replacing the 5 date fields in the fact table with a single 4 byte dimension key. If I need to query any of the 5 dates, I join the fact table to the date junk dimension and to the date dimension.

This design saves a ton of space, improving overall performance. There is a potential performance hit on queries that hit the 5 dates, but those queries occur very, very rarely.

I think 99% of end user queries should be on the star schema aspect of the database. But I think a limited amount of snow flake design can improve the overall performance of the database and facilitate the loading of aggregate tables.

Jeff Smith

Posts : 471
Join date : 2009-02-03

View user profile

Back to top Go down

Stick to the fundamentals

Post  AndreLinssen on Tue May 03, 2011 8:46 pm

Whatever design model you're going to use. At some day, somebody will want to deviate from it.
Most common reasons are:
1. They think some new functionality doesn't fit in the current model
2. Or they want to add new functionality "quick and dirty" (meaning: pushing the maintenance costs to the future)
3. Combination of 1 and 2 (plus lot of willpower)
My advise: stick to the design. It pays of in the long run and it keeps your system straightforward and I am sure that will reduce your maintenance costs.
For a DWH I would strongly suggest a star schema, especially if it has a size of more then 500 gigabytes.
Snowflaking might be easier to start with, because most people are used to it. But I would never use that. (Actually, I'd rather stack my flat files in a "data vault" way then using a 3NF model)
avatar
AndreLinssen

Posts : 6
Join date : 2011-02-15
Age : 51
Location : Netherlands

View user profile http://www.linssen-it.nl

Back to top Go down

Re: Advantages Disadvantages of Star Vs. Snowflake in your DW Env. Has anyone switched from one to another & Why ?

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum