Large number of snowflake code tables per dimension

View previous topic View next topic Go down

Large number of snowflake code tables per dimension

Post  dennisf on Thu Aug 13, 2009 12:26 pm

Ten(10) or so key dimensions(lets call them base dimensions) each has about 25 parent code tables in the 3rd normal operational source. Each code has a short and long description. In the multidimensional model If a fact table has say 3 base dimensions then I could end up with 3 * 25 extra code dimensions (minus the common code across the 3 base dimensions). This will be way too many dimensions. Any suggestions would be appreciated.

dennisf

Posts : 9
Join date : 2009-04-14

View user profile

Back to top Go down

Re: Large number of snowflake code tables per dimension

Post  ngalemmo on Thu Aug 13, 2009 12:37 pm

Why would you create copies of dimension tables for each fact table? But that's another topic...

What I do is have a 'list of values' table for each of the little code/description sets. On the dimension tables I store the code and description so that I do not have a snowflake (exposing the code to users is optional). I then put a trigger on the list of value tables that, when a description for a code changes, updates the dimension tables that hold the description with the new description based on the code. I also use the list of value tables during ETL as a lookup to get the descriptions when I am updating or inserting a dimension row.

The user never uses the list of value tables in queries. However, they do come in handy as a list source for most BI tool interfaces. It avoids the BI tool from doing a distinct query against the dimension when presenting a drop-down selection list.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Large number of code values

Post  dennisf on Mon Aug 24, 2009 11:36 am

I would have included the short and long descriptions in the base dimension table but for another feature we have to provide i.e. multi language descriptions. Currently we provide descriptions for 10 languages. If a base dimesnion table had 25 codes that would mean 25 * 2 * 10 description columns to cater for the short and long descriptions for all languages. Thats 500 additional columns just for descriptions, 25 for the codes and the other columns in the table.

dennisf

Posts : 9
Join date : 2009-04-14

View user profile

Back to top Go down

Re: Large number of snowflake code tables per dimension

Post  ngalemmo on Mon Aug 24, 2009 12:41 pm

When dealing with multiple languages, one approach is to have a simple star schema with multiple copies of a dimension table, each in a different language. Each copy would have identical structures and the same primary key, just the descriptive fields would be in a particular language. You would then use database synonyms or views to present the appropriate dimension based on a user's language.

For example, if you are using Oracle, you would have a common schem containing the fact tables and multiple language specific schema with the dimension tables. A user is logged into a language specific schema which contain synonyms to the fact tables. From a user/BI tool point of view, everthing looks the same regardless of the language. This approach does not work very well if you need to provide multiple languages in the same query.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Large number of code tables

Post  dennisf on Tue Aug 25, 2009 2:23 pm

In fact there is a requirement that the end user be allowed to select a language from a scroll down list at report execution time after he has loggen in.

dennisf

Posts : 9
Join date : 2009-04-14

View user profile

Back to top Go down

Re: Large number of snowflake code tables per dimension

Post  ngalemmo on Tue Aug 25, 2009 3:39 pm

Ok then... the main thing you want to avoid is have multiple columns for each language. It makes queries or using 3rd party tools very, very difficult.

You could always go with the snowflake you described at the outset, including a language code as part of the key and maintaining multiple rows for a code based on language, or you can denormalize into a single dimension table with a compound primary key made up of the surrogate key and a language code. For any given dimension entry, there would be multiple rows with the same surrogate key but different language codes. This way your fact table only carries a single foreign key, regardless of language. The language filter would need to be applied globally to all dimensions (which would be the case with the snowflake approach as well).

If your objective is reducing the number of tables, then the latter approach is the way to go. It not only allows users to choose a language, but it also allows you create the same report in multiple languages in a single query.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Large number of snowflake code tables per dimension

Post  dennisf on Fri Aug 28, 2009 11:40 am

I prefer the following: Say the surrogate key of the base dimension table say BD is SK. I would have another table BD_OUTRIGGER with a composite primary of LANGUAGE and SK list partioned on LANGUAGE. A single record in the base dimension table with surrogate key say SK1 would have many records in the BD_OUTRIGGER table with pKs of (English,SK1), (Chinese,SK1) etc ( approx. 10 languages are supported now). This means that once the language is selected at report execution time there would be a one to one relationship between table BD and BD_OUTRIGGER. The only problem with this solution is that if table BD is very large say 500 million rows then BD_OUTRIGGER will be500 million times the number of languages supported.

What do you think about the size problem the above solution poses?

dennisf

Posts : 9
Join date : 2009-04-14

View user profile

Back to top Go down

Re: Large number of snowflake code tables per dimension

Post  ngalemmo on Fri Aug 28, 2009 11:56 am

What dimension has 500M rows and needs reporting in 10 different languages?

It would be appreciated, if you are looking for advice, to be a little more forthcoming with the details. It's real difficult to propose a workable solution dealing with presumptions and hypotheticals. Obviously, if you are dealing with 500M rows, you would probably think about reworking the model a little bit...
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Large number of snowflake code tables per dimension

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum