HAYCORN — 20 June 2010

MongoDB's MongoUK conference

MongoUK, a small MongoDB conference in London, happened last Friday, and it was wonderful. I haven’t been to that many conferences, but it was probably the best tech conference I’ve been to, at least in terms of my enjoyment of the talks themselves. (DPC is over three days, in Amsterdam…)

The conference was run by 10gen (the company behind MongoDB) themselves, though it felt about as much like an independent conference as was possible under the circumstances. (I’m pretty sure the conference was subsidised: 10gen are just getting started, so at this point they want to get as many people as possible happily using MongoDB, and they have the VC money to do it.)

A few points that struck me about MongoDB:

  • Unlike most NoSQL databases, MongoDB supports SQL-like indexes, though it will only use one index per query. Curiously, if multiple indexes could potentially be used to resolve a particular query, the server picks the index to use via a heuristic that includes some information about how useful each index was on previous queries.
  • Sharding is complicated. Maybe even very complicated. Since production deployments are recommended to have two slaves per master, a deployment with sharding involves at least 6 servers for the data itself (two shards, each with a master and two slaves), plus several configuration servers, plus several servers to act as front-ends to the shards (to conceal the existence of the shards from clients). These servers more or less need to be manually configured; the cluster doesn’t organise itself. Perhaps some of these can be virtual machines (?), but a production setup with sharding involves many physical machines.
  • With map/reduce over shards, the maps are done separately within each shard, but the reduce is done on one server. This also means that the resulting collection is itself not sharded.
  • Most contributions to the MongoDB server come from 10gen itself; they haven’t had many outside contributions.
  • No-one asked about the AGPL license!
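
To make the map/reduce point above a bit more concrete, here is a toy sketch in plain Python. It is not MongoDB code and uses no MongoDB API: the shards are just lists of dicts, and every name in it (map_by_author, reduce_sum, sharded_map_reduce, the "author" field) is made up for illustration. It only shows the shape of what was described: the map step runs independently on each shard, and everything it emits is then gathered and reduced in a single place.

```python
from collections import defaultdict


def map_by_author(doc):
    # Hypothetical map function: emit one (key, value) pair per document.
    yield doc["author"], 1


def reduce_sum(key, values):
    # Hypothetical reduce function: add up all values emitted for a key.
    return sum(values)


def run_map_on_shard(shard_docs, map_fn):
    # Runs on one shard and only ever sees that shard's documents.
    emitted = defaultdict(list)
    for doc in shard_docs:
        for key, value in map_fn(doc):
            emitted[key].append(value)
    return emitted


def sharded_map_reduce(shards, map_fn, reduce_fn):
    # Map phase: done separately within each shard.
    per_shard = [run_map_on_shard(docs, map_fn) for docs in shards]

    # Reduce phase: one node gathers every shard's output and reduces it,
    # so the result lives in one place rather than spread across shards.
    gathered = defaultdict(list)
    for emitted in per_shard:
        for key, values in emitted.items():
            gathered[key].extend(values)
    return {key: reduce_fn(key, values) for key, values in gathered.items()}


# Three pretend shards, each holding a slice of the same collection.
shards = [
    [{"author": "ann"}, {"author": "bob"}],
    [{"author": "ann"}],
    [{"author": "cat"}, {"author": "ann"}],
]
result = sharded_map_reduce(shards, map_by_author, reduce_sum)
print(result)  # {'ann': 3, 'bob': 1, 'cat': 1}
```

The single `gathered` dictionary is the bottleneck in this sketch: however many shards do the mapping, the reduce and its output end up on one machine.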

Overall, my feelings toward MongoDB didn’t change that much. It’s a new product, to be used with a modicum of caution, and it shouldn’t be the automatic first choice. (Though perhaps there is no automatic first choice when it comes to NoSQL document-oriented databases.) It is used in production (many of the attendees I spoke to were doing so), though I think many of the very large and complicated deployments are done with a small amount of hand-holding from 10gen. I was reassured by the quality of the MongoDB people, and their commitment to their product and its users. 10gen and MongoDB are extremely developer-friendly in all respects: the product itself, the documentation, their people, their “marketing” (there was no hard sell), etc.