I’ve been pretty excited about Google’s LevelDB, even though there are some really old tanks already on the battlefield: BDB, Tokyo Cabinet (with Kyoto Cabinet as its successor), HamsterDB, etc. I’ve already worked with Kyoto Cabinet, and when I looked at the LevelDB benchmarks I was totally blown away. I admit I’m predisposed to trust the Google name, but the benchmark results were way too good to believe. I also had an objection: the LevelDB benchmarks compared it only to Kyoto’s TreeDB and not to its HashDB (the fairer match if you ignore key ordering and treat LevelDB as a pure key/value store). Some of you may start yelling right away that comparing a hash structure to an SSTable is not fair. Well, by the end of this post I will be comparing TreeDB with LevelDB as well, and my results differ from the officially posted benchmarks.
So to start off, I decided to do a simple insertion and iteration comparison (ignoring order) of a plain HashDB and LevelDB, without compression or any of the other sugar both stores ship with: a simple benchmark (no forced syncs) comparing how much time each takes. I have a pretty normal machine (a commodity machine, as they say) with an ordinary HDD, an Intel dual-core CPU, and 3 GB of RAM. So without wasting any more time, let’s dive into the code. Here is the code for Kyoto’s HashDB and for LevelDB. It’s pretty straightforward; I’ve carefully placed the benchmark points so they measure only the “real time” consumed by each store, excluding the key and value preparation. The idea is to insert and then iterate 100,000 key/value pairs. Running the benchmarks yields the numbers below. (Note: the first line of each run is the time to insert the 100,000 KV pairs, and the second line is the time it took to iterate over all of them; the DB was created from scratch every time, i.e. the previous copy was deleted, and I took 3 runs of each.)
Kyoto Cabinet 100,000 entries:
LevelDB 100,000 entries:
Iterated reads are pretty fast in LevelDB, but writes are almost twice as slow as Kyoto Cabinet’s. I didn’t stop here, because I hadn’t yet hit LevelDB’s weak spot (yes, larger values). So in my next test I fixed the value buffer length at 512 bytes instead of the short strings used before, and the results took the expected form:
Kyoto Cabinet HashDB, 512-byte values, 100,000 entries:
LevelDB, 512-byte values, 100,000 entries:
At this point I became suspicious about whether LevelDB’s TreeDB benchmarks would hold up on my machine; and guess what, I was right! Here are the benchmarks for Kyoto Cabinet’s TreeDB:
TreeDB with a fixed 512-byte value length:
TreeDB with small values (same as the first HashDB benchmark above):
LevelDB 1,000,000 entries (small values to give it an advantage):
TreeDB 1,000,000 entries (small values):
Surprised? Same here! Now the question is: am I missing something? I am planning to put up some more benchmarks with real-life scenarios and values (maybe tweet-sized stuff). Until then I would love to have feedback, and to know whether I’ve made any mistakes while benchmarking.