Wednesday, July 25, 2012

In-memory MongoDB concurrency testing on ObjectRocket

A few weeks ago the folks at MemSQL introduced a new benchmark, named ‘bench’, to show off the performance of their new product. They included a MongoDB option in the benchmark and published a blog post with some performance results.

Of course I had to play with the code and see what was up. Along the way I committed (and the MemSQL folks pulled) a patch for MongoDB auth support. I also went through and did a sanity check of all the queries to ensure they are reasonable and the plans are OK. One thing to ensure is that config.py has safe mode on, thus looks like:

After all, we want to be fair about this.

Before I go too far, I must explain that this benchmark was crafted to test MemSQL vs MySQL, and to do it mostly with a highly contentious, memory-centric workload. Sure, the benchmark could be expanded to use dataset sizes that exceed memory size and thus test disk performance, but the initial blog post and the defaults are not configured that way. It must be noted that this is probably the worst possible workload for MongoDB: it stresses all the bits of MongoDB that aren’t its strengths. The read vs write ratio is massively oriented to writes, especially updates. It heats up contention and smokes processor cross-talk, so you had better have a handle on your NUMA settings. Well, if you look at what MemSQL is trying to do with their product, it makes sense. But how would MongoDB do? Gulp.

I’ll spare you the suspense. It wasn’t great. You can see the results on the blog post for MemSQL here. I verified these results.

Well, there is one exception. They didn’t test on ObjectRocket. On ObjectRocket the results were drastically better than the MongoDB AWS test.

Yeah, that’s correct. We have taken great care to make MongoDB as fast as possible on our platform. Think Gordon Ramsay vs McDonald’s™. We are still in private beta, but we are rolling out new improvements every day, and as we do, our platform gets faster. Even in beta we smoke the AWS MongoDB benchmark. Remember, this is a write-intensive, concurrency-shredding workload. It’s not just I/O-bound queries.

This test was performed with the same configuration as the MemSQL test:

Now for some context. ObjectRocket is designed to be a sharded system. It’s designed to scale easily, on demand, and automatically. The above tests were performed with a single instance. This is somewhat unrealistic for real-world configurations; we would expect even better performance on a properly sharded system.

I’ll be honest, this is probably the smallest margin by which we plan to beat AWS performance. We aren’t releasing I/O-based performance numbers just yet; we will post some more benchmarks around that in later posts (yes, even against AWS SSD EBS vols). Stay tuned for some more bits later.

One more note. A keen reader will take a look at the source of bench and notice that the test is modelled relationally, not as documents. What if the model was redesigned in more of a document fashion? What if one uses MongoDB 2.2 and puts each collection in its own DB? More to come on these topics as well.
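
To make the relational vs document question a bit more concrete, here is a minimal sketch in plain Python dictionaries. The collection and field names (users, events, user_id, kind) are made up for illustration and are not taken from the bench source, so treat this as one possible shape of the idea rather than the redesign itself.

```python
# Hypothetical data: these names are illustrative, not bench's actual schema.

# Relational style: one record per row, linked by user_id across two collections.
users = [{"_id": 1, "name": "alice"}]
events = [
    {"_id": 10, "user_id": 1, "kind": "click"},
    {"_id": 11, "user_id": 1, "kind": "view"},
]


def to_document(user, all_events):
    """Embed a user's events inside the user record (document style)."""
    embedded = [
        {"kind": e["kind"]} for e in all_events if e["user_id"] == user["_id"]
    ]
    return {"_id": user["_id"], "name": user["name"], "events": embedded}


# Document style: a single record carries everything about the user.
docs = [to_document(u, events) for u in users]
```

In the relational shape, reading a user and their events means touching two collections. In the document shape it is a single record, which may change how much contention a write-heavy workload generates.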