Adventures in NoSQL, part 2

In Part 1 of this blog series, "Adventures in NoSQL", I deployed a single instance of MongoDB and used Python's tweetstream module to fill a collection with a data feed from Twitter.

In the real world you wouldn't ever use a single instance of MongoDB (or Twitter data :-) ), as there is no redundancy: if the instance fails, all your data is gone or you need to take some time to restore it from a backup.

However, we can harness the power of a private Eucalyptus IaaS Cloud as our infrastructure. This means we can quickly scale out resources using direct EC2 API calls, the euca2ools command line utilities or the Eucalyptus Web interface.

In this post, I'll explore using Replication to spread your data across multiple MongoDB servers for redundancy.
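As a rough sketch of what replication setup involves (not necessarily the exact steps the full post follows), a replica set is described by a configuration document listing its members. The helper name `build_replica_set_config`, the set name `rs0` and the hostnames below are all made up for illustration.

```python
def build_replica_set_config(set_name, hosts, port=27017):
    """Build a replica set configuration document for the given hosts.

    The set name and hostnames are placeholders; 27017 is MongoDB's
    default port.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    # Each member needs a unique integer _id and a "host:port" address.
    members = [
        {"_id": index, "host": f"{host}:{port}"}
        for index, host in enumerate(hosts)
    ]
    return {"_id": set_name, "members": members}


if __name__ == "__main__":
    # Hypothetical instance hostnames, e.g. three instances in the cloud.
    config = build_replica_set_config(
        "rs0", ["mongo-a.example", "mongo-b.example", "mongo-c.example"]
    )
    print(config)
```

With a driver such as pymongo, a document like this could then be handed to the `replSetInitiate` admin command on one of the members, assuming each `mongod` was started with a matching replica set name.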

more ...

Adventures in NoSQL, part 1

You've deployed and set up a private Cloud platform, but now what? You need an application!

I've been experimenting with a number of technologies to generate workloads and give some demos to prospective Eucalyptus customers. A NoSQL database seems like a great use case to demo, as the technology is designed for scale-out workloads, which happens to be exactly what an IaaS Cloud does best.

There is an abundance of NoSQL implementations (Cassandra, MongoDB, Couchbase, Neo4j...), written in different programming languages and with slightly different takes on which two of the three CAP theorem guarantees they choose to provide and on how they store and present data.

For this post I'm going to be using MongoDB, which is in the "CP" camp: it handles Consistency and Partition Tolerance whilst forgoing Availability (not every request is guaranteed a response), although MongoDB still provides some great availability options.

more ...