Wednesday, March 21, 2012

How many partitions

Hi,

in a cube a good partitioning schema is essential in my understanding. We have a requirement of a near realtime cube with quite some load in it (gigabytes of data added and changed every day throughout the day). To get around the ROLAP we were looking for having small partitions which can quickly be reloaded using proactive caching when they have changed data.

Is there any "practical" limit on partitions? Does having many partitions mean that you might have some critical overhead? We might think about having 7000 partitions or more...

Thanks,

7000 partitions is a bit much. Technically doable, we have no hard-coded limit on the number of partitions, but operationally you may have difficulty managing such a large number of partitions.

In Project REAL, we are currently using about 1500 and it is manageable, but you can feel the impact when using the tools, e.g. openning up the database in SSMS, or drilling down into the measure groups. The only performance impact we've seen so far is that having a large number of partitions has meant that we've had to increase some of the thread counts to be larger. If you think about it, it makes sense because AS parallelizes execution based on queries to partitions and then merges the results. Thus so long as your queries resolve to just a few partitions, then the normal configurations for thread counts is OK. However, if you query outside of the partitioning scheme, where you are issuing queries which are not constrained by the physical parititions then theoetically you could issue a partition which forces the system to scan 7000 partition before it can return results. Thus for a heavily parititoned system it is very important to partition based on your query patters.

For more information on Project REAL, see http://www.microsoft.com/sql/bi/ProjectREAL I don't think that we talk about this much in our current white papers (it is coming out as we conduct our performance studies), but it will be discussed in the performance findings white paper.

I think you will be a lot happier if you could target something closer to 700 partitions for your first system. Then if you wanted to go larger based on your observations, you could.

_-_-_ Dave

|||

Dave,

in our first project on SSAS 2005 we already experienced performance problems of the tools (BI Dev Studio) with very view partitions. In my case I think partitions could be managed by scripts like you did in Project REAL. But you might be right, the number of partitions will be much more and the boxes I have to run SSAS on are far not as "nice" as you had... The only thing was that I get the source data already partitioned in that way. So it would be very easy to setup these partitions. But if they are not manageable, what does that help...

Thanks

No comments:

Post a Comment