How much is too much really ?

View previous topic View next topic Go down

How much is too much really ?

Post  karan_das on Mon Mar 11, 2013 5:49 am

Hi ,

I was wondering that for a FACT, how many columns is regarded as too much ?
Is there a best practice to keep the number of columns in a FACT to be limited to a specific range / number ?

The reason I ask is because I have seen FACT tables having > 150 columns. I have also come across designs where the number of columns were much , much greater than 100. Does having large number of columns (beyond a point) have adverse performance impact ?

Coming from a background of data modeling focussing on normalization, wanted your opinions on this to clear my knowledge of denormalized analytical models.

Thanks.
Karan
avatar
karan_das

Posts : 10
Join date : 2013-02-26
Age : 30

View user profile

Back to top Go down

Re: How much is too much really ?

Post  Jeff Smith on Mon Mar 11, 2013 3:13 pm

100 is way too many. Ideally, the number of dimensions is around 15 but they can be more. But I think you'll find general agreement that 100 is way too many.

Jeff Smith

Posts : 471
Join date : 2009-02-03

View user profile

Back to top Go down

Re: How much is too much really ?

Post  ngalemmo on Mon Mar 11, 2013 4:21 pm

There is no limit, but if you find it getting over 20 you may need to consider rethinking a design. But I wouldn't immediately dismiss a fact with a large number of columns. However, such wide tables will usually hinder performance.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: How much is too much really ?

Post  karan_das on Wed Mar 13, 2013 3:46 am

Hi,
Thanks for the replies.
Actually we are planning to build dimensional tables over which Views will be designed.
Only these views will be queried by the reporting tool.

The views that we build will be taking only a combination of few columns from the underlying fact tables (having 100 columns).

Is this a better approach ? Or is performance still an issue since the underlying FACT tables have large number of columns ?

Thanks,
Karan
avatar
karan_das

Posts : 10
Join date : 2013-02-26
Age : 30

View user profile

Back to top Go down

Re: How much is too much really ?

Post  hkandpal on Fri Mar 15, 2013 10:48 am

HI,

a view may or may not make your query run faster (as by defination a view is a compiled version of a SQL stored in the database).
You may need to look into the query and see the performance.

thanks

hkandpal

Posts : 113
Join date : 2010-08-16

View user profile

Back to top Go down

Re: How much is too much really ?

Post  ngalemmo on Fri Mar 15, 2013 4:46 pm

The performance issue is primarily a fact table issue. Very wide fact tables take up more space, the system retrieves fewer rows when it accesses a block of disk, requiring more disk access. Disk is the slowest thing in the system, so it can have a significant effect. Views don't help as it is a physical problem.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: How much is too much really ?

Post  rishbeck1988 on Sat Apr 06, 2013 10:03 am

Hi,

We have created fact tables with around 120 columns. We have created this design for an ecommerce firm and our single fact contains shipping /sales info clubbed together. The huge number of columns is because of varied details that we have pulled together in one table. As the shipment passes through different phases we have different columns to capture each phase of shipping. These column amount to 25 itself. The only problem that we are facing is to have a solid update strategy. We want to bring our DWH near real time. Updating theses 120 columns is painful

rishbeck1988

Posts : 2
Join date : 2013-04-06
Age : 29
Location : India

View user profile

Back to top Go down

Re: How much is too much really ?

Post  umutiscan on Mon Apr 08, 2013 1:06 am

You may need to report shipping and sales info together, but you don't have to put all details in one table.

umutiscan

Posts : 51
Join date : 2010-11-26
Age : 37
Location : Istanbul, Turkey

View user profile

Back to top Go down

Re: How much is too much really ?

Post  rishbeck1988 on Mon Apr 08, 2013 1:19 am

Hi,

Definitely we can have seperate tables for this purpose but the problem is that most of the end users are data analysts who fetch data from fact tables directly. Around 100 users use fact dimension tables directly to run queries to drive analytics/reports etc. That's why splitting of the sales and shipping information would cause huge performance load during data fetch.


rishbeck1988

Posts : 2
Join date : 2013-04-06
Age : 29
Location : India

View user profile

Back to top Go down

Re: How much is too much really ?

Post  Mike Honey on Mon Apr 08, 2013 7:44 pm

Hi rishbeck1998,

What you describe does not sound like a DWH to me - it sounds more like the classic "data dump" with a single, very wide "fact dimension" table , in an effort "make it easier" for analysts. Perhaps the only benefit from this approach is that you have offloaded the data preparation from your core systems.

I suggest you review and consider the Kimbal Dimensional method, which will drive you to properly separate your data according to it's use and granularity. Once that design is in place you will find a huge performance improvement in data fetch, as that is the specific purpose that Diemnsional DW are designed for.

Good luck!
Mike
avatar
Mike Honey

Posts : 185
Join date : 2010-08-04
Location : Melbourne, Australia

View user profile http://www.mangasolutions.com

Back to top Go down

Re: How much is too much really ?

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top


 
Permissions in this forum:
You cannot reply to topics in this forum