Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Why not snowflake if the aggregate dimension is needed for a different fact table?

4 posters

Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  arowshan Fri Oct 28, 2011 4:26 pm

I am in a bit of a dilemma in that if let's say we have a Product dimension needed for the Sales fact table but our forecast fact table is at the ProductCategory level. The Kimball toolkit suggests to still keep the Product Catergory attributes in the Product dimension for sales and create a new dimension ProductCategory. This means that we are duplicating the category info in both of these dimensions. The question is if we already know that we need an aggregate dimension, wouldn't it make more sense and also less work to snowflake that?

I understand that if we are querying the tables directly the snowflaked sales model might suffer on performance which might be reason a for not snowflaking but if we are building an OLAP cube on top of the model I don't think performance would suffer so in this case maybe snowflaked product and product catergory might make more sense. Any ideas?

arowshan

Posts : 23
Join date : 2011-10-18
Location : Vancouver, Canada

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  cjrinpdx Fri Oct 28, 2011 5:05 pm

arowshan wrote:This means that we are duplicating the category info in both of these dimensions.

These are good questions. The duplication isn't much of a concern if it allows you to avoid snow flaking. It sounds like you will need both dimensions anyway to support multiple fact tables at different grains. Including the product category in the product dimension is just a convenience to make things easier for your users.

cjrinpdx

Posts : 51
Join date : 2011-07-14
Location : Portland, OR

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  arowshan Fri Oct 28, 2011 5:29 pm

I understand the convenience factor of avoiding the snowlfaked dimension. However, if we are buidling a cube on top and the users use the cube to browse the data, I am pretty sure you can hide the complexity so that to the end-users product and product catergory would seem like one dimension. In addition, for Product Category attributes we need perform change tracking on both dimensions if we keep the product category attributes in the product table.

arowshan

Posts : 23
Join date : 2011-10-18
Location : Vancouver, Canada

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  hang Fri Oct 28, 2011 6:27 pm

Generally it's for ease of use rather than performance unless the product dimension is a million row monster dimension. Denormalised product dimension simplifies the hierarchical relationship between product and category when both are SCD 2 attributes where you only need to add a new row to a single product table in case of type 2 changes on any level of the hierarchies. Denormalised version also gives you the flexibility to form the hierarchies using any attributes in one dimension without complication of linking tables when the hierarchies are fixed by snowflake.


hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  cjrinpdx Fri Oct 28, 2011 6:39 pm

If it was me I would not snow flake the design, and I would create two dimensions. If you do snow flake, and the product category is a type 2 scd, be prepared to manage the complexity of the joins between the tables.

You are correct about building the dimensions in SSAS. You really only need one physical dimension table. One dimension (product category) could be built of the table, and the other one (product) could be built of a view.

cjrinpdx

Posts : 51
Join date : 2011-07-14
Location : Portland, OR

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  arowshan Fri Oct 28, 2011 7:59 pm

So you guys are saying that maintaining two snowflaked dimensions (product and prouduct category) is harder than maintaining two independent dimensions Product and Product catergory especially for SCD2.

arowshan

Posts : 23
Join date : 2011-10-18
Location : Vancouver, Canada

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  hang Fri Oct 28, 2011 11:24 pm

From dimensional modelling perspective, it is the best practice to not snowflake any dimension tables unless you have to in cases like monster dimension, multivalued attributes or multiple reuse of outriggers like location dimension etc.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  VHF Sun Oct 30, 2011 4:53 pm

For my first two years of dimensional modeling, I always followed the basic Kimball recommendation to stick to a star schema to facilitate ease of use and ensure good query performance. As my DW grew, I was faced with exactly the same situation—the need to handle forecasts at a different grain than sales facts—and the need to support drill-across for end-users.

My first approach was to use conformed mini-dimensions. But while I could easily write hand-coded SQL to drill-across fact tables at different grains using conformed attributes, end users weren’t able to seamlessly drill-across to compare actual-to-forecast.

The specific problem I had was with Business Objects (BObj) Web Intelligence (WebI) using a universe going against our SQL Sever 2008 DW. There wasn’t any way to tell the universe that the ProductCategory attribute in the Product dimension was the same thing as the ProductCategory attribute in the ProductCategory dimension—no way to tell it we were dealing with a conformed attribute. Therefore, users weren’t able to build ad-hoc queries comparing actual-to-forecast.

After doing some performance measurements, I (reluctantly at first) snowflaked both customer and product dimensions in our DW. Note that I did not snowflake everything possible—my objective wasn’t to normalize the dimensions, but I did want to have only one occurrence of each dimension attribute in the DW, and I wanted to have a dimension key at whatever level a fact table might want to record data (product vs. product category for example.)

Snowflaking has worked out well in my case. Users can now easily drag-and-drop to query across fact tables at different grains—such as actual-to-forecast—using WebI. In addition, the snowflake schemas make configuring the dimension usage in SSAS much easier! Even with my modest DW (11GB), I did measure a 20% to 30% performance hit with some queries against the relational DW (in SQL Server 2008.) This could be worse for a larger DW. However, I don’t think snowflaking has any adverse effects on an SSAS OLAP cube.

In the “Dimensional Modeling in Depth” class, Margy and Ralph did say it was OK to snowflake as long as (a.) your queries run fast and (b.) you present a “flat” view of the snowflaked dimensions to your users (i.e. hide the complexity of a hierarchical set of normalized dimension tables from the users.)

Even with a slight performance hit, queries still run fast enough, and I can organize/present dimension attributes any way I want using a Business Objects (BObj) universe, so I’m going to keep snowflaking for the purpose of supporting facts at different grains.

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  hang Sun Oct 30, 2011 7:07 pm

VHF wrote:In addition, the snowflake schemas make configuring the dimension usage in SSAS much easier!
Everything else might be true about snowflaking dimensions for BO and WebI, however I am curious to know how snowflake would help dimension usage in SSAS. I thought once the product dimension is configured to be used, any hierarchies you have configured for the flattened product dimension will be used automatically. Maybe you want more granular dimension usage control, on product and category separately.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  VHF Sun Oct 30, 2011 7:12 pm

Using a snowflake schema just makes setting the attribute relationships easier because SSAS can infer the hierarchies from the schema. The end result is the same. Not a big deal, but I do always seems to struggle with SSAS's "backwards" way of representing attribute relationships, so I'll jump on anything that makes it easier!

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  hang Sun Oct 30, 2011 7:20 pm

Yeah, it makes sense. No wonder why even MS sample DW database also snowflakes.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  VHF Sun Oct 30, 2011 7:28 pm

One of the cool things in SSAS is the dimension useage grid where you specify the granularity for each dimension for each measure group (fact table) when designing a cube. For example, being able to specify that your forecast facts uses the product dimension, but at the product category level. This works with either a snowflake or a star schema (as long as the attribute relationships are correctly set!), and I suspect that the resulting cube is the same for either case.

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  arowshan Mon Oct 31, 2011 4:41 pm

Hi VHF,

You mentioned that you is is OK to snowflake as long is performance is acceptable and

present a “flat” view of the snowflaked dimensions to your users

I was wondering on the relational side how you would hide the complexity? Also, wouldn't it be harder to do SCD2 on the snowflaked dimensions?

arowshan

Posts : 23
Join date : 2011-10-18
Location : Vancouver, Canada

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  VHF Mon Oct 31, 2011 5:08 pm

On the relational side you can create a view to flatten the snowflaked dimension (for example a Product view that joins the Product and Product Category tables).

The disclaimer I should have added to my post above is that I am using SCD1/3 dimensions, so I haven't had to deal with the complexity of SCD2. Let's think about it for a minute. If there is an SCD2 change in your snowflaked Product dimension, this would be handled the same way as an SCD2 change in your original star schema Product dimenson--i.e. create a new row.

Now the trick is what to do if there is an SCD2 change in one of the "roll-up" snowflake dimensions such as Product Category. Obviously, we need to create a new Product Category record (and expire the original record.) However, we also need to create a new Product record for each Product that belongs to that Category. The new Product records would point to the new Product Category record.

In other words, any SCD2 changes need to "trickle down" to the most detailed snowflake table in the dimension. This shouldn't be too hard if doing hand-coded ETL, but might be a challenge with an ETL tool that doesn't specificly support SCD2 with a snowflake schema.

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

Back to top Go down

Why not snowflake if the aggregate dimension is needed for a different fact table? Empty Re: Why not snowflake if the aggregate dimension is needed for a different fact table?

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum