Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Partition Pruning issue

5 posters

Go down

Partition Pruning issue Empty Partition Pruning issue

Post  John Tesson Tue Dec 07, 2010 12:25 pm

We're looking into partitioning one of our star schema fact tables on Campus Name. The fact table consists of students enrolled by Campus. We're using Oracle 11g. The DBA is saying that we need to add Campus Name to the fact table in order for partition pruning to occur. That doesn't make sense to me because it basically is saying to not use the Campus Dimension. The DBA says they have tried to partition the fact table on the campus dimension foreign key but the pruning does not occur. This seems to fly in the face of using star schemas. The DBA also says they have tried reference partitioning also but that didn't help either. This doesn't make sense to me. Any experience based on Oracle database that others can offer?

John Tesson

Posts : 7
Join date : 2009-02-03

Back to top Go down

Partition Pruning issue Empty Re: Partition Pruning issue

Post  ngalemmo Tue Dec 07, 2010 3:22 pm

The table needs to contain the column by which the partition is defined.

With that said, quite frankly, campus is a crappy way to partition your tables.

The main point of partitioning is to improve query performance through parallelism. That is, that Oracle can break the query down and distribute the load across paritions to get the data you need by accessing smaller tables in parallel. Partitioning by campus forces a query for a particular campus to access a single partition. Furthermore, performance for large campuses will be slower than smaller campuses, simply because their partitions will be larger.

ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Partition Pruning issue Empty Re: Partition Pruning issue

Post  BoxesAndLines Tue Dec 07, 2010 7:54 pm

You should use the surrogate key for the campus dimension. If your DBA doesn't understand that this column represents column name then you have additional problems. :-O
BoxesAndLines
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

Back to top Go down

Partition Pruning issue Empty Re: Partition Pruning issue

Post  John Tesson Wed Dec 08, 2010 9:40 am

They have made the surrogate key the partitioning key column but seems like partition pruning doesn't occur. Possibly the optimizer decides the best plan doesn't require a partition pruning? Does the fact table and dimensional table need the parallel attribute? Guess this might be a technical issue but I am interested in some examples of partitioning on star schemas where partition pruning actually occurs.

John Tesson

Posts : 7
Join date : 2009-02-03

Back to top Go down

Partition Pruning issue Empty Re: Partition Pruning issue

Post  ngalemmo Wed Dec 08, 2010 3:59 pm

On Oracle you need partitioning, parallelism, star schema optimization and local indexes (bitmaps on fact FKs, btrees on dimension PKs) and you need a partitioning scheme where a query can take advantage of it.

The problem with the campus partition is if you query for a single campus, the only speed advantage against the fact is you are working with a smaller index. If you are using bitmaps, which, when properly maintained, are very small to begin with, it is a very marginal improvement and probably not noticable.

If, however, facts can be spread so that parallelism can occur across multiple partitions, not only do you gain the slight advantage of smaller indexes (due to paritioning) but also parallel access across parititions which can provide significant query improvement.

Usually fact tables are partitioned by date, although that may not be ideal in an Education environment.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Partition Pruning issue Empty Resurrecting this thread...

Post  djantzen Wed Feb 16, 2011 11:53 pm

Several things here:

First, my understanding is that the *principal* benefit of partitioning is that you're shrinking the dataset you're searching. In my experience this can have a truly dramatic impact on performance. Another benefit is the ability to expire data without performing costly DELETEs. The argument that it's for parallelism I don't get -- you'd have to map your partitions onto different physical disks for that to help. And then you'd need a query that hits multiple partitions, which in most of the documentation I've read and conversations I've had is the opposite of what you're trying to do, i.e., prune your partitions.

Second, the reason databases have difficulty with pruning based on joins or subqueries is that the query plan effectively has to be done in stages -- 1) perform the join or subquery; 2) go back to the planner with the result and ask what partition to go to. Tricky business and I know neither Postgres or MySQL do this. Postgres at least will honor the result of a function marked IMMUTABLE so you could have a function like 'get_date_key(timestamp)' or 'get_campus_key(varchar)'.

I think John's right here: there's a fundamental tension between star schema design and partitioning because of #2 above. For example, right now I'm exploring MySQL "range" partitioning to see how a fact table partitioned by month would get along with a standard dates dimension. Unfortunately, the query planner does indeed opt to scan all partitions whenever there is a join or subquery. So, to benefit from partition pruning I have to specify the partition column in the query like this:

select * from fact where fact.date_key > 20101001

However, this is what I *want* to do and cannot without hitting all the partitions:

select * from fact join dim_dates on fact.date_key = dim_dates.date_key where dim_dates.date = '2010-10-13'

To take advantage of the date dimension's flexibility AND partition pruning my query must be this:

select * from fact join dim_dates on fact.date_key = dim_dates.date_key where fact.date_key > 20101001 and dim_dates.day_of_month = 13

This will properly narrow the set of partitions scanned to those holding dates greater than 2010-10-01 and return facts occurring on the 13th. At least there's a workaround, but it's not pretty.

djantzen

Posts : 4
Join date : 2011-02-16

Back to top Go down

Partition Pruning issue Empty Re: Partition Pruning issue

Post  BrianJarrett Sat Apr 16, 2011 2:41 am

We have the same issue with Oracle 11g. Business Objects has a feature called "Index Aware" that will force the partition pruning to occur. It's still a pain though, because we have to build one filter per fact table, then teach the users how to use it.

If you're using a tool with a semantic layer it might be worth checking into whether or not they've addressed this issue as BO has.
BrianJarrett
BrianJarrett

Posts : 61
Join date : 2009-02-03
Age : 50
Location : St. Louis, MO

Back to top Go down

Partition Pruning issue Empty Re: Partition Pruning issue

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum