Thursday, August 9, 2012

ObjectRocket Sneak Peek

We thought we would expose a bit of what we have been working on at ObjectRocket. Let’s take it from the top.

ObjectRocket is a Database-as-a-Service (DBaaS) based on the popular MongoDB database. What makes ObjectRocket completely different from other competitors in this space is that we have built the infrastructure from the ground up to ensure the best possible cloud database experience.

To that end, we are focusing on setting the bar in the following areas:

  1. Performance. Performance that elastically scales with your business. Not just performance, but consistent performance.
  2. Availability. The system is always available. Always up, always consistently fast.
  3. Ease of use. The system is very easy to use. Customers forget about the DB and can focus on their core business. Plus, amazing support.

We believe that in order to ensure our customers get a fast, smooth, predictable, and always-available database service, we must be in control of the entire stack. ObjectRocket isn’t sitting on top of some generic infrastructure provider. Instead, we designed and built our own infrastructure from the ground up. The obvious benefits of controlling the entire stack are that we have quite a bit of space to innovate, can provide greater value to our customers, and can be a single point of contact for customers if there is a problem.

The ObjectRocket architecture is designed around what we call a pod. These pods house all the bits required for hosting our platform and are inherently scalable and redundant. ObjectRocket pods encompass a set of compute resources required for massive performance and availability. Each pod has redundant networking and routing components, MongoS router components, the ObjectRocket API, GUI, and internal services, as well as core MongoD servers. Each pod has a number of bricks designed specifically as MongoDB servers. I/O duties are performed on an all-SSD disk platform. However, there is much more to performance than just hacking some SSDs into your infrastructure. Performance must be holistic, tuned as a stack, suited to purpose.

Access to ObjectRocket is via one of the numerous MongoDB drivers available in your favorite language, or via our REST-based API. Customers can come on the platform, create an instance, and in a few seconds have a fully scalable, super high performance, and fault tolerant MongoDB database. When a user creates a new instance on ObjectRocket, they immediately get their own MongoDB replica set with two secondaries: a primary and one secondary in the local datacenter (US-West or US-East), and another secondary in the remote datacenter. Yes, that’s right, you automatically have geo-diverse databases. We also support adding our platform as a replica to an existing set for geo-diversity or to simply try us out.
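As a minimal sketch of what connecting to a three-member replica set like this could look like from a driver, the snippet below builds a standard mongodb:// connection URI that lists every member. The hostnames, database name, credentials, and replica set name are all hypothetical placeholders, not real ObjectRocket endpoints, and the helper function is illustrative only.

```python
from urllib.parse import quote_plus, urlencode


def replica_set_uri(hosts, database, user, password, replica_set):
    """Build a mongodb:// URI that lists every replica set member."""
    # Credentials must be percent-encoded so characters like '@' and '/'
    # do not break the URI.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    options = urlencode({"replicaSet": replica_set})
    return f"mongodb://{credentials}@{','.join(hosts)}/{database}?{options}"


# Hypothetical members: two in the local datacenter, one in the remote one.
uri = replica_set_uri(
    hosts=[
        "db1.us-west.example.com:27017",
        "db2.us-west.example.com:27017",
        "db3.us-east.example.com:27017",
    ],
    database="app",
    user="appuser",
    password="p@ss/word",
    replica_set="rs0",
)
print(uri)
```

Listing all members in the URI lets a driver discover the current primary on its own, so the application does not need to change its configuration if the primary moves between datacenters.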

ObjectRocket provides much more than just a nice MongoDB instance in the cloud. We have numerous innovations that extend and enhance the MongoDB experience. These range from how we manage our hardware resources, how we expose data via our API, and how our network topology runs from coast to coast, to how we instrument and expose performance data with rocketstat and how we elastically auto-scale. All these components are above and beyond current solutions in the marketplace.

Everything discussed here is available to beta customers today, but we aren’t stopping there. In the coming releases we plan big additions for sharding enhancements, instrumentation, and a bunch of exciting stuff we aren’t yet ready to talk about. We plan on releasing more info over the coming months so follow @objectrocket.

Wednesday, July 25, 2012

In memory MongoDB concurrency testing on ObjectRocket

A few weeks ago the folks at MemSQL introduced a new benchmark named ‘bench’ to show off the performance of their new product. They included a MongoDB option in the benchmark, and a blog post with some performance results.

Of course I had to play with the code and see what was up. Along the way I committed (and the MemSQL folks pulled) a patch for MongoDB auth support. I also went through and did a sanity check of all the queries to ensure they are reasonable and the plans are OK. One thing to ensure is that config.py has safe mode on, so that it looks like:

After all, we want to be fair about this.

Before I go too far, I must explain that this benchmark was crafted to test MemSQL vs MySQL, and to do it mostly with a highly contentious and memory-centric workload. Sure, the benchmark could be expanded to use dataset sizes that exceed memory size and thus test disk performance, but the initial blog post and the defaults are not configured that way. It must be noted that this is probably the worst possible MongoDB workload: it stresses all the bits in MongoDB that aren’t its strengths. The read vs write ratio is massively oriented to writes, especially updates. It heats up contention and smokes processor cross-talk, so you had better have a handle on your NUMA settings. Well, if you look at what MemSQL is trying to do with their product, it makes sense. But how would MongoDB do? Gulp.

I’ll spare you the suspense. It wasn’t great. You can see the results on the blog post for MemSQL here. I verified these results.

Well, there is one exception. They didn’t test on ObjectRocket. On ObjectRocket the results were drastically better than the MongoDB AWS test.

Yeah, that’s correct. We have taken great care to make MongoDB as fast as possible on our platform. Think Gordon Ramsay vs McDonald’s™. We are still in private beta, but we are rolling out new improvements every day, and as we do, our platform gets faster. Even in beta we smoke the AWS MongoDB benchmark. Remember, this is a write-intensive, concurrency-shredding workload. It’s not just I/O bound queries.

This test was performed with the same configuration as the MemSQL test:

Now for some context. ObjectRocket is designed to be a sharded system. It’s designed to scale easily, on-demand, and automatically. The above tests were performed with a single instance. This is somewhat unrealistic for real-world configurations; we would expect even better performance on a properly sharded system.

I’ll be honest, this is probably the smallest margin by which we plan to beat AWS performance. We aren’t releasing I/O based performance numbers just yet, but we will post some more benchmarks around that in later posts (yes, even against AWS SSD EBS vols). Stay tuned for some more bits later.

One more note. A keen reader will take a look at the source of bench and notice that the test is modelled relationally, not as documents. What if the model was redesigned in more of a document fashion? What if one uses MongoDB 2.2 and puts each collection in its own DB? More to come on these topics as well.
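As a rough illustration of the difference between the two modelling styles, here is a small sketch using plain Python dictionaries in place of MongoDB collections. The players/scores data and every field name are hypothetical and are not taken from the bench source; the point is only to show the shape of each model, not to predict how either would perform.

```python
# Hypothetical data; none of these names come from the bench source.

# Relational-style model: two collections, joined in application code.
players = [{"_id": 1, "name": "alice"}, {"_id": 2, "name": "bob"}]
scores = [
    {"player_id": 1, "game": "chess", "points": 10},
    {"player_id": 1, "game": "go", "points": 7},
    {"player_id": 2, "game": "chess", "points": 4},
]


def total_points_relational(player_id):
    """Scan the separate scores collection for one player's rows."""
    return sum(s["points"] for s in scores if s["player_id"] == player_id)


# Document-style model: each player's scores are embedded in one document.
player_docs = {
    p["_id"]: {
        **p,
        "scores": [
            {"game": s["game"], "points": s["points"]}
            for s in scores
            if s["player_id"] == p["_id"]
        ],
    }
    for p in players
}


def total_points_document(player_id):
    """Read a single document; no second lookup is needed."""
    return sum(s["points"] for s in player_docs[player_id]["scores"])


print(total_points_relational(1), total_points_document(1))
```

In the document-style version, everything about a player lives in one record, so a read touches one document instead of two collections. Whether that helps on a write-heavy workload like this one is exactly the kind of question that would need its own benchmark run.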