Yes I know you are really happy with your “persistent” Key Value store. But did anybody notice hstore that comes along Postgresql. I find Postgresql to be a really great RDBMS that has been ignored all the time. It even has some great publisher/subscriber system as well (or LISTEN/NOTIFY in terms of Postgresql) that a lot of people may have implement using Redis, RabbitMQ etc. For people who have not lived anything other than MySQL. I would simply ask them to try out Postgres.
Instead of looking at benchmarks, I will be focusing on a key value store that is ACID compliant for real! Postgres takes advantage of its storage engine and has an extension on top for key value storage. So plan is to have a table can have a column that has a datatype of hstore; which in turn has a structure free storage. Thinking of this model multiple analogies throw themselves in. It can be a Column Family Store just like Cassandra where row key can be PK of the table, and each column of hstore type in table can be imagined like a super column, and each key in the hstore entry can be a column name. Similarly you can imagine it some what like Hash structures in Redis (HSET, HDEL), or 2 or 3 level MongoDB store (few modifications required). Despite being similar (when little tricks are applied) to your NoSQL store structures, this gives me an opportunity to demonstrate you some really trivial examples.
Lets setup our system first. For my experiment I will be using Postgres 9.1 and I will compile it from source. Once in source directory you can: ./configure && make install to install your Postgres. Don’t forget to install the extensions in the contrib directory: cd ./contrib && make install. Once you have setup the database you can create your own database and start the server (Hints: use initdb and pg_ctl). Then launch your psql and make sure you install your hstore extension:
CREATE EXTENSION hstore;
SELECT 'foo=>bar'::hstore;
If everything goes well you should be able to see table output. Now we are ready to do some DDL. I created a table my_store as schema definition below:
CREATE TABLE my_store
(
id character varying(1024) NOT NULL,
doc hstore,
CONSTRAINT my_store_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
CREATE INDEX my_store_doc_idx_gist
ON my_store
USING gist
(doc);
As you can see I’ve created a table with hstore column type and one GiST index (for operators ? ?& ?| etc.). You can checkout of documentation to have a look on different type of operators you have.
Now that we have database and tables setup I wrote a simple script to populate it with about 115K rows from twitter stream. Now keep in mind that its a real life data and I was interested in querying few basic things from collected data. For example, how many people are putting hash tags, or doing mentions, or were posting links in the tweets? For doing this I wrote a simple python script using tweepy and psycopg2 and ran it for about few hours. For each tweet in my store I added a key value pair of ‘has_hashtags=>:t’ if there were any hash tags in the tweet, similarly I introduced has_urls and has_mentions if they were present in tweet, I will be using these keys along with my GiST index to query my table later on.
So after populating my data with 115,142 tweets the database grew to a size of 239691780 bytes (Just 228MB). Now comes the fun part. I was totally blown away by what I can achieve by combining the power of relational and key value style under 1 store. So for example I want to query all the tweets tweeted at unix timestamp of 1323446095 (since I stored the timestamps as a string here is what my query looks like):
SELECT doc -> ‘text’ as tweet, doc -> ‘created_at’ as created_at
FROM my_store
WHERE doc @> ‘created_at=>00001323446095’;
I can add simple count or any other SQL famous aggregate function without going into any complications of my data store specific map reduce or new language to learn hustle. Do note that I padded my timestamp value with zeros since I am only storing strings as values. Also I am utilizing @> operator, thats gonna use the GiST to really do a quick bitmap index scan instead of sequential scan. That was pretty good for starter. Lets try to fetch out all the tweets that had hash tags in them:
SELECT doc -> ‘text’ as tweet, doc -> ‘created_at’ as created_at
FROM my_store
WHERE doc @> ‘has_hashtags=>:t’;
Yes querying complete database pulling out complete data (That you won’t probably do because you page the data :) ) gives me 14689 rows just under 360ms on average. Since we have SQL at hand lets make a condition little more complicated, and use a different operator for same stuff and also sort the data by created_at:
SELECT doc -> ‘text’ as tweet, doc -> ‘created_at’ as created_at
FROM my_store
WHERE doc @> ‘has_hashtags=>:t’ AND doc ? ‘has_urls’
ORDER BY doc -> ‘created_at’ DESC;
It already sounds tasty! This is not it Postgres has more operators, so pulling out hash tagged tweets with urls or mentions is also possible,
SELECT doc -> ‘text’ as tweet, doc -> ‘created_at’ as created_at
FROM my_store
WHERE doc @> ‘has_hashtags=>:t’ AND doc ?| ARRAY[‘has_urls’, ‘has_mentions’]
This is not it! hstore comes with all sort of operators and index systems that you can ask for hash store. Check them out here. Now, despite the NoSQL boom I think we have some great examples and reasons of why RDBMS still remains core part of many market giants (Facebook being something everyone knows). Postgres just gives me one more reason to not ignore RDBMS systems, So If you have been moving around on some document stores just because the reason that RDBMS don’t provide them; think again! You can get the same rock solid durability with structure free systems.
I will be pretty soon revisiting the FriendFeed use case with MySQL to store structure free data with Postgresql approach. Stay tuned, leave your comments and thoughts.
Today I will be doing the long awaited initial benchmarks of Oracle’s NoSQL. Before I jump in I would like to mention that I had a previous post on Oracle’s NoSQL that examines the philosophy and design of Oracle NoSQL store. It’s been few days I’ve been playing around with system. I found it pretty simple due to its major and minor key system; some great multi-get, iterator constructs for read, basic delete, and put system with versions. I did find some short comings, and we will see them in details at the end of this post. So lets get started.
For sake of benchmarking I am going to make things really simple. I will be benchmarking the insert and read speeds of a single node (1 machine no replication, no distribution) and look how a single node performs, based on that we can make rough estimates of performance gains by adding more machines.
Machine used for benchmarks is a pretty normal laptop machine with Dual Core 2.1GHz processor, 3 GB RAM, and commodity HDD. As an example I implemented a twitter like user stream of 100,000 tweets for 100 users randomly (that make approx 1000 tweets per user) each tweet with size of a little more than 100 bytes. So what we will be benchmarking is a speed test for inserting 100,000 tweets (approx 100+bytes) for inserting and reading them all. They should be pretty good experiments (close to a real world scenario) to give us an idea of what Oracle NoSQL can do.
Here is the piece of code that inserts and reads the 100,000 million entries and benchmarks the total time consumed. Compiling and running this I get output as following:
Write Time consumed 143156
Iteration Total time taken 223
Please note the time consumed in above benchmark is number of milliseconds. It turns out that inserting 100,000 entries in random order (since user ids generated are random) take 143156 ms and iterating over each entry of each user take 223 ms. I am pretty satisfied with read speed, and for write speed I found and average time of 1.4 ms (which is slower than reads but almost 700 inserts per second). Its important to note that each insert here is disk synchronized (with durability of Durability.SyncPolicy.SYNC on master, and since we have only 1 node means we are doing the most disk write with no buffering). If I lower the durability value (Durability.COMMIT_WRITE_NO_SYNC write with buffering), the average time drops to 0.5 ms; which is almost double the performance of previous version (or Durability.COMMIT_SYNC). It’s worth noting that you can also customize durability per transaction, which is great for letting a programmer choose what he wants.
I didn’t stop here and continued to implement a example twitter class, which proves the simplicity of what such this simple structure can do. I think the core power lies in not being a monolithic (JSON like) tree like object, and allowing us to pull data on partial paths as well as from full major key path. The class is used in this example code to create a user, authenticate, tweet, and query time-line of user.
Somebody may wonder how it compares with other data stores (document free, column based, key value). I can definitely achieve same effect by other key value stores (some what close as well), but I must say this style was much breeze and easy to imagine (just look at the code for reading tweets and you will know what I am saying), Oracle must rename store type from Key Value store to a new title. Because it’s pretty clear that its not a key value store; it overlaps between the document and key value style (Again I achieve this same effect in key sorted KV stores like LevelDB or TokyoCabinet).
In closing I must also give out some negatives I found, plus the some extras. Currently the number of partitions on Oracle NoSQL are fixed! What does this mean? It implies that at storage can’t be scaled horizontally yet since they can’t rebalance; once partitions have been made they remain fixed. Although Oracle in it’s documentation mentions about the next version having the re-balancing feature. Second (may be most for people) hurting part is dependence on Java platform. Now this will hurt lot of Ruby, and Python etc. people but again, as I said in my previous post, with help of a Java programmer you can write a REST API for yourself (I am planning a Protobuff server to eat this plague). Third, I found administration of system to be a little tricky (could not be understood without reading complete administration docs), but this is no hurdle for a programmers since it provides launch and go script to run the server. Right now I am planning to write a Scala wrapper (syntactic awesomeness) for Oracle NoSQL, so once I am done I will put it on GitHub. You folks in the mean while have few more calories of this NoSQL deliciousness.
Today I finally downloaded a community version copy of Oracle NoSQL, I am pretty happy to see a market giant like Oracle in the NoSQL era (Now waiting for DB2 guys to come in too). I’ve used Oracle 9i way back in my history and have been using MySQL for like decades (almost daily basis) and I know how feature rich these platforms are. Now since Oracle has landed with their NoSQL machinery, I just went through documentation from developers perspective to have a glimpse of feature sets. Lets go through highlights really quickly:
Despite the good features, I feel it has some set backs or I may say a wish list that can actually be fulfilled:
I can pretty much see how basic and really powerful this tool can turn out to be. I’ve only considered the Pros and Cons from programming point of view (API) completely ignoring administration and other parts (Will be doing a detailed benchmarks next). I must say I am pretty satisfied until now, and my first impressions are pretty good. Stay tuned for a detailed benchmarks, and stress test.
Recently I posted some benchmarks on how can you tame MySQL to get the Key-Value behavior, (You can do almost same to get that column family behavior its all about your imagination *grinnn*) lets bullet down exactly what I meant when I used MySQL with HandlerSocket:
Given all this I did some benchmarks and as expected the NOSQL community was hurt and this is what one of them thinks:
- with a similar setup, a NoSQL solution like Redis might give you even more operations per second
- key-value only access might not be enough. Many NoSQL solutions are offering at least MapReduce support.
— Quoted from mynosql blog
And I think answers to what they say are pretty simple. Redis is not something to be compared to MySQL; why? Because, its not truly persistent. You have options to either do Append Only File fsync which can be timer based (not reliable) or on every write (try this option and your system will be dead doing writes all the time!), or save based on time interval in seconds (not reliable as quoted by Redis themselves), or the manual save command (a bomb in your head if you use save button frequently). In short Redis is truly memcached on steroids. Comparing MySQL to Redis is like comparing two different genres; where choosing one is matter of your requirements.
To answer MapReduce part; let me be very clear MapReduce was made by Google to meet its own needs, which is no where near the general business needs. Plus MapReduce is used as substitute to SQL; and thats how powerful SQL is ( Yep they have articles on it ); its like a runtime compiler + storage engine. So its not fair to compare a stripped down version of a (at times pre-compiled) routine to such a flexible thing (SQL).
Keeping things simple, here is what I want to say: “Keep NOSQL as Not Only SQL, rather than NoSQL due to some buzz word or marketing cliche”. I’ve not been paid by SUN to market MySQL, but I completely hate it when people make useless assumptions about the speeds, and put so much effort designing so much denormalized NOSQL structures which could have been achieved in an SQL database (without scattering data, avoiding additional headache different API, updates, bugs and maintainability issues).
Design by Simon Fletcher. Powered by Tumblr.
© Copyright 2010