
Random musings on life & free software

Road Warrior 101

I’ve been fortunate enough to travel a lot over the past 6 years, both in my roles as a Consultant at Red Hat, Canonical and now Eucalyptus, and on holiday. The countries I’ve visited include: Philippines, India, Australia, USA (San Francisco, Santa Barbara, Boston, New York, Washington DC, Florida), Canada (Montreal), South Africa, Finland, Sweden, Netherlands, Spain, Germany, Italy, Belgium, Ireland, Poland… to name a few!

Suffice to say, when you spend a bit of time “on the road”, you learn a few tricks to help make life a bit easier. This blog post is a break from my technical posts to pass a few of those tricks on to you, the public.

Do you have any tips? I’d love to hear them, so stick them in the comment box below!

Suitcase

If you buy the cheapest case you find, you will need a new one within a few trips.

I tend to go through one suitcase every 1-2 years if I’m lucky, or perhaps sooner. That being said, do not go out and buy the most expensive one, like a Samsonite. Avoid airport shops, as they tend to be very overpriced. Samsonite cases are great, but the price isn’t representative of the quality: they will not necessarily last you any longer and, of course, they are a great target for thieves.

My recommendation is to shop around and buy something in the mid-price range. I like Delsey or Swiss Gear cases, as you can get a very good quality one for around $100 and it will last. Also, make sure you get one that fits in the overhead locker - even if you check your bags in, you want to have the option to carry it onto the plane if you are running late.

Lastly, avoid anything with huge zips (they will break off!) or any dangling bits - think about what could get trapped in machinery.

Airports

Love them or hate them, they will become somewhere you spend a lot of time!

I prefer to get there early and check baggage (it’s generally free in Europe unless you fly with a low-budget airline); some prefer to arrive late and take carry-on baggage only. I hate the fight for cabin space, so if I just take my laptop bag on-board I know I can get it under the seat and that my clothes are checked in. Generally, it depends on how long you are going away for. For me, consulting engagements are around 5 days or longer, so I need a bigger bag.

If you live in London, then you can avoid Heathrow by travelling via Gatwick or London City, both of which I find quicker to get to and far less crowded, which makes them much less hassle. However, Gatwick does have a lot of holidaymakers - beware of the school holidays! London City airport tends to be my favourite because the majority of people there are travelling for business: no families to avoid or screaming babies! :-)

If you haven’t seen Up in the Air then rent/buy/stream it - it has some great tips for airport queues.

Priority Pass

Priority Pass is a lounge access card to get you into the 3rd party (non-airline) lounges. I highly recommend it if you are travelling a lot. It’ll help to stop you spending money in the shops and it gives you a quiet, clean, comfortable place to sit and work in and includes free wifi access & alcohol - need I say more?

The UK lounges tend to be operated by “ServisAir”, which tends to mean they are a bit dated, however at Heathrow Terminal 3 and Gatwick you can enter the “No 1. Travellers Lounge” which in my opinion is much better. Heathrow Terminal 5 does not have a Priority Pass lounge though, so if you travel British Airways, you are stuffed. In airports outside of the UK it’s a bit hit and miss, but generally I find you get more access into airline specific lounges and that the overall quality is very good.

Coffee & Tea

“Life is too short for bad coffee” - Hasbean.co.uk

I love coffee, I love tea and I hate the fact that every hotel has poor coffee and only fruit teas (with some exceptions!)

The best way to ensure you get good coffee and tea is to bring it yourself!

I bring a bag of PG Tips everywhere I go. It gives me a taste of what I have at home and is 100x better than the Lipton tea which seems to be present in every European hotel (and which British people never drink, ever!), although finding fresh milk can sometimes be a challenge.

For Coffee, I highly recommend buying an Aerobie Aeropress - a fantastic coffee brewing gadget that is very compact and will fit nicely in your suitcase. Coupled with a Porlex ceramic grinder to grind your beans in the morning, it is the best solution I’ve come across and makes wonderful coffee. The bonus is that the grinder and Aeropress fit perfectly together, to take up even less space in your suitcase!

There are tons of videos on YouTube about brewing with the Aeropress; I recommend the HasBean video.

I buy my beans from HasBean and can’t recommend them enough. All beans are roasted the day before they send them to you and they have a large selection direct from the coffee plantations with lots of videos and tasting notes.

Fitness

Most high-end hotels have a pristine gym, mainly because nobody ever uses them! Take advantage of them when you can.

If I’m going somewhere for a few weeks I like to find a quieter hotel - they tend to have better Internet connections! This can mean the hotel does not have a gym, which is how I started running.

I like to run barefoot or minimalist and instead of taking up precious suitcase space with some trainers, I use these great minimalist running sandals or huaraches from XeroShoes, they are very light-weight, compact and the do-it-yourself kits are very cheap.

Membership Cards

Most hotels, airlines and other organisations offer loyalty cards to collect points. My advice is to sign up for every single one of them. Why? Well, you are travelling on the company dime… it costs you nothing to collect them. I’ve had free flights, free hotel rooms and upgrades from them without spending a penny of my own money!

I don’t have a preference on which airline to travel on, however Star Alliance will cover a wider range of airlines and it’s much easier to move up the membership levels than the British Airways / One World club.

My top tip is hotels.com if you book your own hotel rooms: you get a free night’s stay for every 10 nights and can spend it in any of the hotels.com hotels, of which there tend to be thousands.

That long airport wait…

What’s your favourite TV series? Load them up onto your tablet or laptop in case you get stuck in an airport or load up a set of books onto your Kindle.

I find that TV episodes, coupled with the Priority Pass card, help me while away those hours of delays.

Last but not least, are you having trouble finding your travel plug and swapping it between your gadgets? Worry no more and just take a 4-way extension lead: plug it into a single travel adapter and then plug in all your gadgets using your native socket type. Simple!

Adventures in NoSQL, Part 2

In Part 1 of this blog series, “Adventures in NoSQL”, I deployed a single instance of MongoDB and used Python’s tweetstream module to fill a collection with a data feed from Twitter.

In the real world you wouldn’t ever use a single instance of MongoDB (or twitter data :-) ) as there is no redundancy: if the instance fails, all your data is gone, or you need to take some time to restore it from a backup.

However, we can harness the power of a private Eucalyptus IaaS Cloud to use as our infrastructure. This means we can quickly scale out resources using direct EC2 API calls, the euca2ools command line utilities or the Eucalyptus Web interface.

In this post, I’ll explore using Replication to spread your data across multiple MongoDB servers for redundancy.

Start a new instance

When you are scaling out a service, you do not want to be concerned with configuring and installing each virtual instance manually.

You could use configuration management tools like Chef, Puppet or Ansible, but to keep things simple in this example I’ll use Cloud-init, a tool that has become the de facto initial instance run-time configuration tool in most public clouds.

Cloud-init runs on boot and can interpret and run scripts passed within the EC2 user-data available to an instance; with euca2ools this is passed to the instance with the -f option. It’s written by the awesome Scott Moser at Canonical and it can configure a number of services, install packages, run arbitrary commands and even bootstrap your favourite configuration management tool.

I’ve created a cloud-config file that Cloud-init can use to install and set up MongoDB and set the name of the replica set:

mongo-db.config
#cloud-config
# Update apt database on first boot
apt_update: true
# Upgrade the instance on first boot
apt_upgrade: true
# Add 10gen MongoDB repo
apt_sources:
 - source: "deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen"
   keyid: 7F0CEB10
   filename: 10gen.list
# Install packages
packages:
 - mongodb-10gen
 - ntp
runcmd:
 - [ sed, -i, "s/# replSet = setname/replSet = twitterdata/g", /etc/mongodb.conf ]
 - [ restart, mongodb ]

This cloud-config script automates the commands we ran in my last blog post and updates the system to ensure all updates have been applied. It leaves us with a system already running MongoDB and the replicaset name for our MongoDB cluster ready to join the first server.
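
To make the sed step in runcmd a little more concrete, here is a minimal Python sketch of the same substitution. The sample configuration text is an assumption about what the packaged /etc/mongodb.conf looks like (a commented-out replSet line), and enable_replset is a hypothetical helper name used only for illustration:

```python
# Sample fragment in the style of /etc/mongodb.conf. This is an assumption
# about the packaged default, which ships the replSet line commented out.
SAMPLE_CONF = (
    "# in replica set configuration, specify the name of the replica set\n"
    "# replSet = setname\n"
)


def enable_replset(conf_text, set_name):
    """Uncomment the replSet line and set its name, like the sed in runcmd."""
    return conf_text.replace("# replSet = setname", "replSet = " + set_name)


if __name__ == "__main__":
    print(enable_replset(SAMPLE_CONF, "twitterdata"))
```

Because the pattern only matches the commented-out default, running the substitution a second time changes nothing, which is handy if the instance is re-provisioned.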

Start two new instances running Ubuntu 12.04 LTS using the cloud-init script and the security group and keypair we created in the first post:

euca-run-instances -n 2 -k mongodb -g mongodb -t c1.xlarge -f ~/path/to/mongo-db.config emi-87F63CE5

After a few moments our instances should show as ‘running’:

$  euca-describe-instances i-E2DF4157 i-4D433A97
RESERVATION r-8FA84324  985725263417    mongodb
INSTANCE    i-4D433A97  emi-87F63CE5    78.152.43.15    172.30.66.28    running mongodb 1       c1.xlarge   2013-02-05T18:22:10.310Z    emea-testlab-cluster1   eki-222540D6    eri-A5753DBE        monitoring-disabled 78.152.43.15    172.30.66.28            instance-store
INSTANCE    i-E2DF4157  emi-87F63CE5    78.152.43.12    172.30.66.17    running mongodb 0       c1.xlarge   2013-02-05T18:22:10.297Z    emea-testlab-cluster1   eki-222540D6    eri-A5753DBE        monitoring-disabled 78.152.43.12    172.30.66.17            instance-store
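
If you want to script the "wait until running" check, a rough sketch might parse this output in Python. It assumes the whitespace-separated layout shown above and the usual EC2 state names; parse_instances and all_running are hypothetical helpers, and the sample lines are shortened copies of the output above:

```python
# Instance states as used by the EC2 API (assumed to match Eucalyptus output).
KNOWN_STATES = {"pending", "running", "shutting-down",
                "terminated", "stopping", "stopped"}

# Shortened copies of the INSTANCE lines shown above.
SAMPLE_OUTPUT = "\n".join([
    "RESERVATION r-8FA84324  985725263417    mongodb",
    "INSTANCE    i-4D433A97  emi-87F63CE5    78.152.43.15    172.30.66.28    running mongodb 1       c1.xlarge",
    "INSTANCE    i-E2DF4157  emi-87F63CE5    78.152.43.12    172.30.66.17    running mongodb 0       c1.xlarge",
])


def parse_instances(output):
    """Return {instance_id: state} from euca-describe-instances text."""
    states = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] != "INSTANCE":
            continue
        # Look for the first field that is a known state word, rather than
        # relying on a fixed column position.
        state = next((f for f in fields[2:] if f in KNOWN_STATES), "unknown")
        states[fields[1]] = state
    return states


def all_running(states):
    """True when there is at least one instance and all are running."""
    return bool(states) and all(s == "running" for s in states.values())


if __name__ == "__main__":
    print(parse_instances(SAMPLE_OUTPUT))
```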

Setting up MongoDB replication

Replication in MongoDB allows you to store multiple copies of your data across multiple mongod instances.

The MongoDB documentation clearly outlines how to convert a standalone system to a replica set; the commands I use below are adapted from the documentation for an Ubuntu package install.

On the standalone instance (Instance 1), where we already configured MongoDB and imported our twitter data, set up replication, install ntp and restart MongoDB:

Instance 1
# Stop MongoDB service
sudo stop mongodb
# Install NTP
sudo apt-get install ntp -y
# Setup a replica set with the name 'twitterdata'
sudo sed -i 's/# replSet = setname/replSet = twitterdata/g' /etc/mongodb.conf
# Start MongoDB service
sudo start mongodb

Next, connect to the Mongo shell and initiate the replica set (make sure you can ping your instance hostnames; if not, set up an /etc/hosts file):

Instance 1
$ mongo
MongoDB shell version: 2.2.3
connecting to: test
> rs.initiate()
{
    "info2" : "no configuration explicitly specified -- making one",
    "me" : "instance1:27017",
    "info" : "Config now saved locally.  Should come online in about a minute.",
    "ok" : 1
}
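
If the hostnames do not resolve, the /etc/hosts entries can be generated rather than typed by hand. The sketch below is only illustrative: hosts_entries is a hypothetical helper, and the mapping of instance2 and instance3 to the private IPs from the earlier output is an assumption, so substitute your own names and addresses:

```python
import ipaddress


def hosts_entries(name_to_ip):
    """Build /etc/hosts lines ("IP<TAB>hostname") sorted by hostname."""
    lines = []
    for name, ip in sorted(name_to_ip.items()):
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        lines.append("%s\t%s" % (ip, name))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed mapping, for illustration only.
    print(hosts_entries({
        "instance2": "172.30.66.17",
        "instance3": "172.30.66.28",
    }), end="")
```

The generated lines would then be appended to /etc/hosts on each member so that every instance can resolve the others.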

Again, in the mongo shell we can now tell our first instance which members should join the replica set:

Instance 1
twitterdata:PRIMARY> rs.add("instance2")
{ "ok" : 1 }
twitterdata:PRIMARY> rs.add("instance3")
{ "ok" : 1 }
twitterdata:PRIMARY>

That’s it! Replication is configured and our data is making its way across to our other systems.

Replication Status

On Instance 2, let’s check the status. Below you can see our prompt in the ‘mongo’ shell shows ‘SECONDARY’; back on Instance 1 this will show as ‘PRIMARY’. If it’s not fully synced across yet the prompt may show ‘STARTUP2’.

Instance 2
twitterdata:SECONDARY> show dbs
local   1.203125GB
twitterstream   3.9521484375GB

On any of the three replicas we can use the command ‘rs.conf()’ to return the replica set configuration:

Instance 1
twitterdata:PRIMARY> rs.conf()
{
        "_id" : "twitterdata",
        "version" : 3,
        "members" : [
                {
                        "_id" : 0,
                        "host" : "instance1:27017"
                },
                {
                        "_id" : 1,
                        "host" : "instance2:27017"
                },
                {
                        "_id" : 2,
                        "host" : "instance3:27017"
                }
        ]
}

Or we can see the status of the replica set with ‘rs.status()’:

Instance 1
twitterdata:PRIMARY> rs.status()
{
        "set" : "twitterdata",
        "date" : ISODate("2013-02-06T10:52:08Z"),
        "myState" : 1,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "instance1:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 58958,
                        "optime" : Timestamp(1360147928000, 4),
                        "optimeDate" : ISODate("2013-02-06T10:52:08Z"),
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "instance2:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 58667,
                        "optime" : Timestamp(1360147926000, 3),
                        "optimeDate" : ISODate("2013-02-06T10:52:06Z"),
                        "lastHeartbeat" : ISODate("2013-02-06T10:52:06Z"),
                        "pingMs" : 0
                },
                {
                        "_id" : 2,
                        "name" : "instance3:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 58667,
                        "optime" : Timestamp(1360147926000, 3),
                        "optimeDate" : ISODate("2013-02-06T10:52:06Z"),
                        "lastHeartbeat" : ISODate("2013-02-06T10:52:06Z"),
                        "pingMs" : 0
                }
        ],
        "ok" : 1
}
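
For monitoring, the interesting parts of this document are which member is primary and whether any member is unhealthy. Here is a small sketch that pulls those out of a status document held as a plain Python dict. The sample is trimmed from the output above, summarise is a hypothetical helper, and fetching the document from a live server (for example via the driver's replSetGetStatus admin command) is left out and assumed:

```python
# Trimmed copy of the rs.status() output above, as a plain dict.
SAMPLE_STATUS = {
    "set": "twitterdata",
    "members": [
        {"_id": 0, "name": "instance1:27017", "health": 1, "stateStr": "PRIMARY"},
        {"_id": 1, "name": "instance2:27017", "health": 1, "stateStr": "SECONDARY"},
        {"_id": 2, "name": "instance3:27017", "health": 1, "stateStr": "SECONDARY"},
    ],
    "ok": 1,
}


def summarise(status):
    """Return (primary_name_or_None, [names of unhealthy members])."""
    primary = None
    unhealthy = []
    for member in status.get("members", []):
        if member.get("stateStr") == "PRIMARY":
            primary = member["name"]
        if member.get("health") != 1:
            unhealthy.append(member["name"])
    return primary, unhealthy


if __name__ == "__main__":
    print(summarise(SAMPLE_STATUS))
```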

OK, so we are replicating our data, let’s start our twitter streaming script again to continue to put data into MongoDB:

Instance 1
python tweet2mongo.py

Failure of Instances in the replica set

If we now stop MongoDB on the primary (Instance 1) replica, we notice that our script stops printing out tweets as it’s lost the connection:

Instance 1
sudo stop mongodb

On our Secondary MongoDB replicas, you can check the status of replication:

Instance 2
twitterdata:SECONDARY> rs.status()
{
    "set" : "twitterdata",
    "date" : ISODate("2013-02-06T10:53:13Z"),
    "myState" : 2,
    "syncingTo" : "instance3:27017",
    "members" : [
        {
            "_id" : 0,
            "name" : "instance1:27017",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(1360147972000, 19),
            "optimeDate" : ISODate("2013-02-06T10:52:52Z"),
            "lastHeartbeat" : ISODate("2013-02-06T10:52:52Z"),
            "pingMs" : 0,
            "errmsg" : "socket exception [CONNECT_ERROR] for instance1:27017"
        },
...

We can see from the ‘errmsg’ field above that instance1 is down (no surprise, we stopped it!), but fortunately our data is safe, it’s on the other servers in the replicaset!

To bring it back up we can start the instance1 MongoDB service again:

Instance 1
sudo start mongodb

Now that we’ve started it, our script is again adding data into MongoDB, however on closer inspection it’s really just passing data through to Instance 3:

Instance 1
"errmsg" : "syncing to: instance3:27017"

That’s because our Instance 3 has been elected as the primary and Instance 1 needs to sync back any data it does not have. After it has done this it will become healthy again. The mongos routing process has forwarded our writes over to the primary node.

We can force Instance 1 to be primary again using a documented procedure.

In each of the mongo shells on Instance 2 and Instance 3 run:

Instance 2
rs.freeze(120)
Instance 3
rs.stepDown(120)

Make sure you run the rs.freeze command on the server that is marked as secondary and the rs.stepDown command on the server currently marked as primary.

This will force Instance 2 to not be elected as a primary for 120 seconds and cause Instance 3 to step down from being a primary and not be elected for 120 seconds, thus allowing Instance 1 to be elected as the primary in the replica set. Both Instance 2 and Instance 3 will show as syncing from Instance 1 in ‘rs.status()’ again and then return to a healthy state as they catch-up.

Conclusion

In the real world, instances die. Sometimes for a particular reason, perhaps they run out of resources (CPU, Memory, Disk) or sometimes because the hardware or availability zone fails. Sh*t happens, we should be prepared for it.

In part 3, I’ll look at using the mongos service so that when a replica dies our application reads and writes are sent to the correct server, and have a look at using Sharding to split our data into ranges and spread it out across systems to give us increased horizontal write capacity.

Any thoughts so far? Drop me a comment in the box below!

Backing Up a Eucalyptus Cloud Controller

Introduction

When you first start using Eucalyptus, you are concerned with the intricacies of building images, configuring virtual instances and using all the variety of API features and tools that Eucalyptus offers.

However, when Eucalyptus becomes a tool your business relies on, you need to make sure you can recover the system from a catastrophic failure. In the enterprise this usually means making adequate backups and being able to restore the whole system from those backups, either using an automated procedure (perhaps with tools such as Ansible, Puppet and Chef) or via a documented manual restore procedure.

Of course, like most things - it’s never as easy as it seems it should be! :-)

This article will cover backing up and manually restoring the Eucalyptus Cloud Controller (CLC) to a known state.

The Eucalyptus CLC is the “brain” of your whole cloud. It stores metadata for all images, user, account and policy settings and all state related to your cloud. It is one of the larger pieces you need to consider when backing up your cloud - however there are other components that need backing up too, such as the Walrus server (containing your S3 buckets) and the Storage Controller (SC), which contains your EBS volumes.

The Eucalyptus Enhancement Bug EUCA-2139 has some more details on the reasons to backup the CLC.

PostgreSQL

Eucalyptus 3.1 and above uses PostgreSQL as the database that stores all of the critical data I mentioned. PostgreSQL is a very well established database with thousands of users worldwide and has well defined backup and replication procedures. PostgreSQL dumps and Continuous Archiving are the two main options we can consider.

I’ll focus on dumps specifically as they are simple to understand and restore from and most sysadmins using PostgreSQL are familiar with them.

Backup

Taking a PostgreSQL dump is very simple via the pg_dumpall or pg_dump commands, however we also need to make sure we backup our cloud key files and certificates and then restore them in the correct order.

1. Run pg_dumpall to produce SQL dumps of all databases

CLC
mkdir -p /tmp/euca-backup
pg_dumpall -c -o -h /var/lib/eucalyptus/db/data -p 8777 -U root -f /tmp/euca-backup/eucalyptus-database-`date +%Y%m%d`.sql

2. Backup the keys directory

CLC
tar -czvf /tmp/euca-backup/eucalyptus-keysdir-`date +%Y%m%d`.tgz /var/lib/eucalyptus/keys

3. Backup the /etc/eucalyptus directory

CLC
tar -czvf /tmp/euca-backup/eucalyptus-etcdir-`date +%Y%m%d`.tgz /etc/eucalyptus

4. Store the sets of files together

CLC
tar -czvf /tmp/eucalyptus-backup-`date +%Y%m%d`.tgz /tmp/euca-backup/eucalyptus-database-`date +%Y%m%d`.sql /tmp/euca-backup/eucalyptus-keysdir-`date +%Y%m%d`.tgz /tmp/euca-backup/eucalyptus-etcdir-`date +%Y%m%d`.tgz

Now we have a backup file that contains the SQL dumps, the keys directory and the Eucalyptus settings directory in a compressed file called “/tmp/eucalyptus-backup-`date +%Y%m%d`.tgz” (where the date command is replaced with today’s date) - you can safely store this file off-site with your other backups.
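
One small thing to be aware of in the commands above: each one calls date separately, so a backup that straddles midnight could end up with mismatched file names. A hedged sketch of how a Python version might avoid that is to compute the date stamp once and derive every path from it. The helper name backup_paths is hypothetical; the paths themselves are the ones used above:

```python
import datetime

BACKUP_DIR = "/tmp/euca-backup"


def backup_paths(day=None):
    """Return the backup file paths for one run, sharing a single date stamp."""
    stamp = (day or datetime.date.today()).strftime("%Y%m%d")
    return {
        "database": "%s/eucalyptus-database-%s.sql" % (BACKUP_DIR, stamp),
        "keys": "%s/eucalyptus-keysdir-%s.tgz" % (BACKUP_DIR, stamp),
        "etc": "%s/eucalyptus-etcdir-%s.tgz" % (BACKUP_DIR, stamp),
        "bundle": "/tmp/eucalyptus-backup-%s.tgz" % stamp,
    }


if __name__ == "__main__":
    for label, path in sorted(backup_paths().items()):
        print(label, path)
```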

Restore

Restoring a Eucalyptus CLC involves several additional steps that aren’t completely obvious at first glance.

1. Install CentOS 6

CentOS 6 installation is beyond the scope of this guide; if you don’t already know how to do this then skip this whole article and head over to http://www.centos.org.

2. Follow the Eucalyptus Documentation for package installation

The Eucalyptus documentation covers everything you need to know regarding pre-configuring your system (network settings for example) and package installation.

You do not need to register any components, just install the packages and dependencies up to the point where your next step would be to initialise the Eucalyptus database.

3. Stop the CLC service

For good measure stop the service, it should already be in this state.

CLC
/etc/init.d/eucalyptus-cloud stop

4. Remove any old database directory (may not exist)

You may find that you have an old database on the system, if you want it then make sure you copy this directory to another location, otherwise remove it.

CLC
rm -rf /var/lib/eucalyptus/db

5. Initialise the new database structure

This command initialises the whole database structure including the db directory and the PostgreSQL configuration files.

CLC
euca_conf --initialize

6. Start the Eucalyptus PostgreSQL database manually

CLC
su eucalyptus -c "/usr/pgsql-9.1/bin/pg_ctl start -w -s -D/var/lib/eucalyptus/db/data -o '-h0.0.0.0/0 -p8777 -i'"

If we start the database via the eucalyptus-cloud init script, it populates the database with some content that will make restoring our backup difficult.

7. Prepare the backup file

Copy the backup file from your off-site backup facility back to your new Eucalyptus CLC and untar it.

CLC
tar -xvf /tmp/eucalyptus-backup-XXXX.tgz -C /

Where XXXX is the date the backup was taken on e.g. eucalyptus-backup-20130308.tgz

8. Restore the SQL backup

CLC
psql -U root -d postgres -p 8777 -h /var/lib/eucalyptus/db/data -f /tmp/euca-backup/eucalyptus-database-XXXX.sql

9. Restore keys and certificates

CLC
tar -xvf /tmp/euca-backup/eucalyptus-keysdir-XXXX.tgz -C /

10. Restore /etc/eucalyptus directory

CLC
tar -xvf /tmp/euca-backup/eucalyptus-etcdir-XXXX.tgz -C /

11. Stop Eucalyptus PostgreSQL database

CLC
su eucalyptus -c "/usr/pgsql-9.1/bin/pg_ctl stop -D/var/lib/eucalyptus/db/data"

12. Start the CLC

CLC
/etc/init.d/eucalyptus-cloud start
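
Since the order of steps 3 to 12 matters, it can help to write the sequence down as data before automating it. The sketch below only builds and prints the commands from this article for a given backup date; it does not run anything, and restore_plan is a hypothetical helper rather than part of Eucalyptus:

```python
BACKUP_DIR = "/tmp/euca-backup"
DATA_DIR = "/var/lib/eucalyptus/db/data"
PG_CTL = "/usr/pgsql-9.1/bin/pg_ctl"


def restore_plan(stamp):
    """Return the restore commands (steps 3 to 12 above) in order."""
    return [
        "/etc/init.d/eucalyptus-cloud stop",
        "rm -rf /var/lib/eucalyptus/db",
        "euca_conf --initialize",
        # Start PostgreSQL by hand so the init script does not populate it.
        "su eucalyptus -c \"%s start -w -s -D%s -o '-h0.0.0.0/0 -p8777 -i'\""
        % (PG_CTL, DATA_DIR),
        "tar -xvf /tmp/eucalyptus-backup-%s.tgz -C /" % stamp,
        "psql -U root -d postgres -p 8777 -h %s -f %s/eucalyptus-database-%s.sql"
        % (DATA_DIR, BACKUP_DIR, stamp),
        "tar -xvf %s/eucalyptus-keysdir-%s.tgz -C /" % (BACKUP_DIR, stamp),
        "tar -xvf %s/eucalyptus-etcdir-%s.tgz -C /" % (BACKUP_DIR, stamp),
        "su eucalyptus -c \"%s stop -D%s\"" % (PG_CTL, DATA_DIR),
        "/etc/init.d/eucalyptus-cloud start",
    ]


if __name__ == "__main__":
    for number, command in enumerate(restore_plan("20130308"), start=3):
        print("%2d. %s" % (number, command))
```

Printing the plan first is a cheap dry run: you can review the exact commands for a given backup date before executing any of them as root.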

Automation

I’ve written a rather rudimentary bash script to automate this (I’m working on a better Python version!), which you are welcome to download and modify under a BSD license. I use this script in cron to back up my test systems every Sunday night, however you may want to incorporate this into your nightly OS backup procedure.

You can get it here: https://github.com/tomellis/scripts/blob/master/euca/euca-clc-backup.sh

Now you can go to sleep safe in the knowledge that your CLC is backed up ready to restore in the event of any catastrophic failure!

FOSDEM 2013

This last weekend I was over at FOSDEM 2013, where 5,000 geeks descended onto the ULB university in Brussels, Belgium for a huge free and open source software conference.

It was my first time at FOSDEM after many years of nearly attending and the first thing that struck me was the size. 5,000 people in attendance over the two days, 390hrs of talks and every talk was absolutely packed. It reminded me of the Ubuntu Developer Summit (UDS), but on steroids.

This was two days packed full of developer talks on everything you can imagine from Java, Gnome, Xorg, MySQL, BSD, Cloud, Law etc and a large room full of stalls from all the distros and latest cloud platforms.

Cloud IaaS was represented very well with lots of talks on OpenNebula, CloudStack, Openstack, Ganeti and others including Synnefo, a Ganeti based IaaS stack with Openstack APIs.

Eucalyptus didn’t have a talk in the cloud dev room, but I did have a number of great discussions about the latest Eucalyptus version and features in Eucalyptus 3.3, such as ELB, Autoscaling and CloudWatch.

Saturday

On Saturday I spent the day getting my orientation and attended a number of talks.

Automated Openstack Testing

This was hosted by Canonical folk who are running extensive QA of Openstack above and beyond the QA on the gated upstream trunk of the Openstack projects.

They are doing a heck of a lot of testing across multiple test suites. I think we have something very comparable going on in QA at Eucalyptus now and it would be cool to see a QA presentation from us at FOSDEM next year. There is a dedicated QA/Jenkins developer room, which was cool.

oVirt Live Migration

Some interesting notes on live migration in the oVirt project. I think this goes beyond what KVM/Libvirt currently offer and adds some nice enterprise features to oVirt, which will eventually be in Red Hat’s subscription supported RHEV product.

Boxes - Virtualisation install automation

Boxes is a new virt-manager-esque UI, but extremely simplified, and it allows simple automated installations of Linux and Windows. I was fairly impressed with the way they’ve done this and it led me on to a library they are using: libosinfo

Libosinfo is pretty sweet, it provides you with a library that gives you access to metadata about different distributions, how to install them in an automated way, life-cycle, iso location etc. We could use this in Eucalyptus with an image library.

Also attended a Ganeti update talk and oVirt introduction which were both fairly detailed and it seems like there is a lot of progress in both projects.

Sunday

I sat on the CentOS booth (Thanks to the CentOS guys!) giving out Eucalyptus Faststart DVDs (Thanks to Andy Grimm for quickly creating a modified version that has a minimal CentOS install on it!), pens and stickers.

All 50 DVDs went very quickly and I had a number of conversations regarding our current status and the WebUI. I demo’ed the WebUI running in the Eucalyptus EMEA test lab to a few people, including a Eucalyptus fan at a large hosting company who hadn’t seen our latest feature set.

I also met the whole ComodIT team, who seem like a great group of developers and I can’t wait to see what features they release next.

After I ran out of freebies I headed over to the cloud dev room.

Security Priorities for Cloud Developers

This turned out to be a talk on Cloud Security in Openstack by an HP security guy (Robert Clark) who is on the Openstack security team.

The key takeaway for me was that, as Openstack is targeting service providers and public clouds, there is a heck of a lot more to think about in terms of security.

“HP Cloud is growing massively, our storage requirements are growing as fast as our data centre monkeys can add disk.”

Heat API

I’d had a look at Heat before but I didn’t realise that it already implemented CloudWatch and Auto-scaling on-top of CloudFormation. The presentation gave an overview of Heat and a demo of the functionality.

Now that Heat is part of Openstack (incubating), it also has its own Openstack REST API but will continue to keep CloudFormation API compatibility. I can imagine it will get more and more tied into Openstack components (such as Ceilometer) as it’s developed further.

A question at the end asked about other cloud support, apparently there is interest from CloudStack and the design allows pluggable support for other clouds. Deltacloud has a plugin. It would be cool to see if a Eucalyptus plugin was feasible.

All in all, a great conference… with great beer in the evening - FOSDEM, I’ll be back!

Adventures in NoSQL, Part 1

You’ve deployed and setup a private Cloud platform but now what? You need an application!

I’ve been experimenting with a number of technologies to generate workloads and give some demos to prospective Eucalyptus customers. A NoSQL database seems like a great use-case to demo as the technology benefits from being designed for scale-out workloads and this happens to be exactly what an IaaS Cloud does best.

There is an abundance of NoSQL implementations (Cassandra, MongoDB, Couchbase, Neo4j…), written in different programming languages and with slightly different takes on which two parts of the CAP theorem they choose to implement and which method they will use to store and display data.

For this post I’m going to be using MongoDB, which is in the “CP” camp: it handles Consistency and Partition Tolerance whilst forgoing Availability (not every request is guaranteed a response), although MongoDB still provides some great availability options.

MongoDB is supported by 10gen, seems fairly mature and has a large community of users with modules for a ton of different programming languages. Cassandra also interests me and I’ll tackle that in a later post.

We also need a bunch of data and whilst there are large datasets available on the internet, last week I read a post on using the Twitter streaming API with Ruby and storing that data in MongoDB and thought it would be cool to use it, albeit with Python instead of Ruby.

Creating an ssh keypair and application security group

To start, let’s setup a keypair and security group for MongoDB so that we can ensure it is not going to be accessed by anyone else:

# Ensure we have our Eucalyptus or Amazon credentials in the environment
source ~/eucarc
# Create an ssh keypair
euca-add-keypair mongodb > ~/mongodb.key
chmod 400 ~/mongodb.key
# Add SSH, MongoDB and MongoDB admin interface ports to mongodb security group
euca-create-group mongodb -d "MongoDB databases"
# Replace 0.0.0.0/0 with your IP e.g. 1.2.3.4/32 to restrict it to just your system
euca-authorize -P tcp -p 22 -s 0.0.0.0/0 mongodb
euca-authorize -P tcp -p 27017 -s 0.0.0.0/0 mongodb
euca-authorize -P tcp -p 28017 -s 0.0.0.0/0 mongodb
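
As the comment above says, you would normally restrict the source range to your own address. A minimal sketch of generating the three euca-authorize commands for a chosen CIDR, with a basic sanity check on the range, could look like this. authorize_commands is a hypothetical helper, and 1.2.3.4/32 is the placeholder address from the comment above:

```python
import ipaddress

# SSH, MongoDB and the MongoDB admin interface, as opened above.
MONGODB_PORTS = [22, 27017, 28017]


def authorize_commands(group, ports, source_cidr):
    """Build one euca-authorize command per TCP port for the given source range."""
    # Raises ValueError for a malformed range, or one with host bits set
    # (for example 1.2.3.4/24).
    ipaddress.ip_network(source_cidr)
    return [
        "euca-authorize -P tcp -p %d -s %s %s" % (port, source_cidr, group)
        for port in ports
    ]


if __name__ == "__main__":
    for command in authorize_commands("mongodb", MONGODB_PORTS, "1.2.3.4/32"):
        print(command)
```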

Run an instance

We can now spin up an instance running Ubuntu 12.04 LTS x86_64 and install MongoDB on our private cloud:

euca-run-instances -k mongodb -g mongodb -t c1.xlarge emi-87F63CE5

If you are using AWS or your own cloud you’ll need to substitute the EMI ID I’ve used with an AMI of Ubuntu or your own image ID. You will also need to use your own keypair.

After a few moments our instance should show as ‘running’:

$ euca-describe-instances
RESERVATION r-AB3F4645  985725263417    mongodb
INSTANCE    i-D89D40E2  emi-87F63CE5    1.2.3.4    4.3.2.1    running mongodb 0       c1.xlarge   2013-02-03T22:40:26.743Z    cluster1   eki-222540D6    eri-A5753DBE        monitoring-disabled 1.2.3.4    4.3.2.1            instance-store

Install MongoDB

Let’s connect to the instance and install MongoDB:

The MongoDB documentation goes into the installation of MongoDB in more detail.

Ubuntu 12.04 LTS has version 2.0.4 of MongoDB in its repositories; 2.2.3 is the current stable version upstream, so we’ll use the repository from 10gen to install the latest package.

ssh -i ~/mongodb.key ubuntu@1.2.3.4  # replace 1.2.3.4 with your instance IP!
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10
echo "deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen"| sudo tee -a /etc/apt/sources.list.d/10gen.list
sudo apt-get update
sudo apt-get install -y mongodb-10gen

At this point we have a running instance with MongoDB installed and started. You should be able to navigate to the MongoDB admin interface in your web browser:

http://1.2.3.4:28017

Installing Tweetstream

Now that we have MongoDB running, we need to import some Twitter data. Twitter has a streaming API that is publicly accessible (as long as you have a Twitter account!) and there are a number of modules for the programming language of your choice.

Tweetstream is a Python module that provides easy access to the streaming API and we can use it in combination with pymongo to store tweets into MongoDB.

Tweetstream isn’t packaged for Ubuntu, so I’ll use the source:

sudo apt-get install -y python-setuptools
wget -c http://pypi.python.org/packages/source/t/tweetstream/tweetstream-1.1.1.tar.gz
tar -zxvf tweetstream-1.1.1.tar.gz
cd tweetstream-1.1.1 && sudo python setup.py install

Installing pyMongo

pyMongo is the official MongoDB Python driver and is available from the Ubuntu archive.

sudo apt-get install -y python-pymongo

Writing a python script to save tweets into MongoDB

There are a number of articles detailing how to do this via curl or tweetstream, and it’s surprisingly simple to do.

The following script is based on some of those examples. It connects to MongoDB and stores tweets in the ‘tweets’ collection of a database called ‘twitterstream’. It stores the whole tweet, which includes a lot of metadata; this metadata might be useful later to sort tweets or to index particular fields we are interested in querying. It’s important to note that the streaming API does not give us all tweets on Twitter. It’s merely a small percentage, as the “Firehose” API that contains all tweets is not public.

tweet2mongo.py
import tweetstream
import pymongo

username = "TWITTER_USERNAME"
password = "TWITTER_PASSWORD"
mongohost = "localhost"

connection = pymongo.Connection(mongohost, 27017)
db = connection.twitterstream

with tweetstream.TweetStream(username, password) as stream:
    for tweet in stream:
        try:
            # Save the whole tweet but only show certain fields on screen
            db.tweets.save(tweet)
            print tweet['created_at'], tweet['id'], "Username: ", tweet['user']['screen_name'],':', tweet['text'].encode('utf-8')
        except Exception:
            # Skip messages that lack the fields printed above, without also swallowing Ctrl-C
            pass

When you run this, you should see a stream of tweets printed out and the whole tweets stored within MongoDB:

python tweet2mongo.py
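
The stream can also carry messages that are not tweets and so lack some of the fields the script prints. As a sketch of handling that explicitly rather than through the try/except, the summary line could be built by a small helper like the one below. It is Python 3, `summarise` is a name I made up, and the field names are the ones used in tweet2mongo.py.

```python
def summarise(tweet):
    """Return a one-line summary, or None for messages that are not tweets."""
    user = tweet.get("user") or {}
    # Anything without text or a screen name is treated as a non-tweet message
    if "text" not in tweet or "screen_name" not in user:
        return None
    return "{} {} Username: {} : {}".format(
        tweet.get("created_at", "?"),
        tweet.get("id", "?"),
        user["screen_name"],
        tweet["text"],
    )


if __name__ == "__main__":
    example = {
        "created_at": "Sun Feb 03 22:40:26 +0000 2013",
        "id": 1,
        "user": {"screen_name": "example_user"},
        "text": "hello",
    }
    print(summarise(example))
```

In the loop you would still save every message to MongoDB and only print when the helper returns a string.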

Use the mongo shell to see if there are entries in the database:

$ mongo
MongoDB shell version: 2.2.3
connecting to: test
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
        http://docs.mongodb.org/
Questions? Try the support group
        http://groups.google.com/group/mongodb-user
>
> show dbs
admin   (empty)
local   (empty)
twitterstream   0.203125GB
> use twitterstream
switched to db twitterstream
> show collections
system.indexes
tweets
> db.tweets.find()

The final command should output a portion of the tweets in the JSON document format that MongoDB uses to display query results.

That’s it, we’re now streaming tweets into MongoDB via Python tweetstream!

In part 2, I’ll investigate scaling out the MongoDB database by spinning up new Eucalyptus instances and configuring replication and sharding.

A Keepalived VIP for Eucalyptus Node Controllers

Background

In a Eucalyptus HA configuration there are two Cluster Controllers (CCs) which are in an active-passive state. One is in “ENABLED” mode and one is in “DISABLED” mode. If a failure occurs, the active CC service moves to the secondary CC system.

If you combine this with a Eucalyptus MANAGED or MANAGED-NOVLAN networking configuration and a private back-end network, your Node Controllers (NCs) will require a default gateway for access to external networks and to the Walrus service to download Eucalyptus Machine Images.

This gateway is not yet part of Eucalyptus but may be included in a later release. This article describes how to set up a virtual IP that is shared between the two CC systems to use as the default gateway on all NCs.

Further details can be found in the following bug:

https://eucalyptus.atlassian.net/browse/EUCA-2412

I’ve also posted this article on the Eucalyptus wiki for future reference.

Keepalived

As a workaround, you can use a tool that creates a virtual IP (VIP) and maintains membership of servers through healthchecks. There are a number of tools to do this under Linux; however, we’ll focus on Keepalived.

Keepalived is available in any modern Linux distribution such as RHEL/CentOS (located in the EPEL repo) and Ubuntu and provides a vast number of features, too many to cover here.

We will use it to create a VIP between our two Cluster Controllers and add a status check script to make sure it only runs on the system that is in ENABLED/active mode.

Installation

1. Install the package (EPEL required for CentOS/RHEL)

centos
$ yum -y install keepalived
ubuntu
$ sudo apt-get install -y keepalived

2. Set up the keepalived configuration file

Add the following file to both CC systems ensuring you update the fields below for each one.

See ‘man keepalived’ or the /usr/share/doc/keepalived examples for details of all the options.

  • SERVER_HOSTNAME - Hostname of the system e.g. cc1.cloud.mycompany.com
  • VIP_ADDRESS/32 - Virtual IP Address to use as a gateway address e.g. 192.168.1.50/32
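  • PRIV_DEV - This is usually VNET_PRIVINTERFACE from eucalyptus.conf on your CC. The private network used to connect the CC to NC systems e.g. bond1
  • RANDOM-PASSWORD - A shared password for VRRP authentication, which must be identical on both CC systems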
keepalived.conf
global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from keepalived@localhost

    smtp_server 127.0.0.1
    smtp_connect_timeout 10

    lvs_id <SERVER_HOSTNAME>
}

vrrp_script cc-enabled-check {
    script "/usr/local/bin/CCclient-enabled-check.sh"
    interval 5
    weight 2
    fall 1
    rise 1
}

vrrp_instance cloud_cluster1_gateway {
    # Both CCs start as BACKUP; the track_script weight decides which one holds the VIP
    state BACKUP
    interface <PRIV_DEV>
    virtual_router_id 45
    priority 100
    advert_int 1
    smtp_alert
    authentication {
        auth_type PASS
        auth_pass <RANDOM-PASSWORD>
    }

    virtual_ipaddress {
        <VIP_ADDRESS> dev <PRIV_DEV>
    }

    track_script {
        cc-enabled-check
    }
}
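
Since the same file goes onto both CCs with different values, one way to avoid leaving a placeholder behind is to fill them in with a few lines of Python 3. This is only a sketch: `render` is a hypothetical helper, and it assumes the placeholders are written exactly as `<NAME>` like in the file above.

```python
import re


def render(template, values):
    """Replace <NAME> placeholders; fail loudly if any are left unfilled."""
    for name, value in values.items():
        template = template.replace("<" + name + ">", value)
    # Anything still shaped like <UPPER_CASE> was not supplied
    leftover = re.findall(r"<[A-Z_-]+>", template)
    if leftover:
        raise ValueError("unfilled placeholders: " + ", ".join(sorted(set(leftover))))
    return template


if __name__ == "__main__":
    snippet = "interface <PRIV_DEV>\n<VIP_ADDRESS> dev <PRIV_DEV>"
    print(render(snippet, {"PRIV_DEV": "bond1", "VIP_ADDRESS": "192.168.1.50/32"}))
```

The Puppet module linked at the end of this article does the same job with proper templates, so treat this as a quick alternative for a two-node setup.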

3. Compile and Install CCclient_full

CCclient_full is a tool used for debugging Eucalyptus. It allows you to query the CC on the internal API as if it were the CLC and, importantly, poll state information. See LocalState in the output below:

Using CCclient_full
$ export AXIS2C_HOME=/usr/lib64/axis2c
$ export LD_LIBRARY_PATH=$AXIS2C_HOME/lib:$AXIS2C_HOME/modules/rampart/
$ /usr/local/bin/CCclient_full 127.0.0.1:8774 describeServices
LocalState=ENABLED localEpoch=0 details=ERRORS=0
type=cluster name=cc1_c1
uri=http://192.168.1.50:8774/axis2/services/EucalyptusCC

This state information shows if the component is ENABLED. That’s exactly what we need to feed to keepalived so that the VIP runs on the ENABLED CC component.
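
To make the idea concrete, here is a small Python 3 sketch that extracts the state from that output. It only assumes the `LocalState=...` token shown in the sample above; `local_state` is a hypothetical helper, not something shipped with Eucalyptus.

```python
import re


def local_state(output):
    """Return the LocalState value (e.g. 'ENABLED'), or None if it is absent."""
    match = re.search(r"\bLocalState=(\w+)", output, re.IGNORECASE)
    return match.group(1).upper() if match else None


sample = """LocalState=ENABLED localEpoch=0 details=ERRORS=0
type=cluster name=cc1_c1
uri=http://192.168.1.50:8774/axis2/services/EucalyptusCC"""

if __name__ == "__main__":
    print(local_state(sample))
```

The shell check script later in this article takes the simpler route of grepping for ENABLED.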

CCclient_full is part of the Eucalyptus source code but not distributed within any Eucalyptus distribution packages, so you will need to compile it yourself following the standard Eucalyptus compilation instructions on Github.

Alternatively, if you are running on CentOS 6 x86_64 you can download this pre-compiled binary:

wget -c https://github.com/tomellis/puppet-modules/raw/master/keepalived/scripts/CCclient_full -O /usr/local/bin/CCclient_full
chmod 750 /usr/local/bin/CCclient_full

Importantly, to get CCclient_full to run you will need to copy an additional key on to the CC from your Cloud Controller (CLC). That key, cloud-pk.pem, is the private key for your cloud and is required for CCclient_full to mimic the CLC.

Copy Certificate from CLC to CC
scp /var/lib/eucalyptus/keys/cloud-pk.pem root@cc:/var/lib/eucalyptus/keys/

4. Add check script

This script is used as a heuristic to determine if the CC component is in an ENABLED or DISABLED state by running the CCclient_full utility and checking to see if the local state is ENABLED. It returns a 0 if enabled, or a 1 if disabled.

Keepalived uses this script to determine which host should hold the VIP. While the script reports ENABLED, keepalived adds the weight of 2 to that system’s priority; when it reports DISABLED, the priority stays at its base value. This means the system that is running the enabled CC service becomes the system which runs the VIP.
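
As a rough model of that election (not keepalived’s actual code), the Python 3 sketch below computes the priority each CC would advertise with the base priority of 100 and script weight of 2 from the configuration above. It only models a positive weight and does not model how keepalived breaks ties; the function names are my own.

```python
def effective_priority(base, weight, check_ok):
    """Priority a node advertises with one tracked script and a positive weight."""
    # The weight is only added while the check script succeeds
    return base + weight if check_ok else base


def elect(nodes, weight=2):
    """Pick the node with the highest effective priority.

    nodes maps a node name to (base_priority, check_ok).
    """
    return max(
        nodes,
        key=lambda name: effective_priority(nodes[name][0], weight, nodes[name][1]),
    )


if __name__ == "__main__":
    # cc2 is the ENABLED Cluster Controller, so it should hold the VIP
    print(elect({"cc1": (100, False), "cc2": (100, True)}))
```

This is why both CCs can share the same base priority: the check script alone decides which one ends up higher.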

Add this script to both systems along with the CCclient_full binary above; /usr/local/bin is a good location for both.

/usr/local/bin/CCclient-enabled-check.sh
#!/bin/bash
#
# Check to see if a CC service is in ENABLED or DISABLED state using CCclient_full
# Author: Tom Ellis <tom.ellis@eucalyptus.com>
#
# Returns 1 if DISABLED or not installed
# Return 0 if ENABLED

export AXIS2C_HOME=/usr/lib64/axis2c
export LD_LIBRARY_PATH=$AXIS2C_HOME/lib:$AXIS2C_HOME/modules/rampart/

if [ -f /usr/local/bin/CCclient_full ]; then
    RETVAL=1
    /usr/local/bin/CCclient_full 127.0.0.1:8774 describeServices | grep -q ENABLED
    RETVAL=$?
    if [ $RETVAL = "0" ]; then
       echo "Local Eucalyptus services in ENABLED state"
       exit 0
    else 
       echo "Local Eucalyptus services in DISABLED state"
       exit 1
    fi
else
    echo "CCclient_full not present"
    exit 1
fi

You can also grab this directly:

wget -c https://raw.github.com/tomellis/puppet-modules/master/keepalived/scripts/CCclient-enabled-check.sh -O /usr/local/bin/CCclient-enabled-check.sh
chmod 750 /usr/local/bin/CCclient-enabled-check.sh

Test that the script works on your system; it should report the CC service state.

5. Enable keepalived and start it

chkconfig keepalived on
service keepalived start

You should see the logs from keepalived in your syslog as it comes up and creates the shared VIP on the ENABLED Cluster Controller.

Further Information

Puppet

I like to automate configuration. So, I’ve created a modified version of a keepalived puppet module to be used with the scripts above:

https://github.com/tomellis/puppet-modules/tree/master/keepalived

Links

Eucalyptus Bug: https://eucalyptus.atlassian.net/browse/EUCA-2412

CCclient_full: https://github.com/eucalyptus/eucalyptus/wiki/Debugging-Eucalyptus-C-language-components

Keepalived: http://www.keepalived.org/

Eucalyptus Sosreport Plugin

After some discussion on the Eucalyptus Community Mailing list, I’m happy to announce I’ve had a crack at a Python-based plugin for sosreport.

Sosreport is a cool support data collection utility written in Python and available for RHEL/CentOS, Fedora, Ubuntu & Debian. It can collect lots of useful system files and stats that you might use to debug a server or send to a support organisation to help you.

I’ve packaged up my plugin for review by the community before I submit upstream to the sosreport project.

You can install and run it from a RHEL/CentOS 6 system like so:

$ yum install http://www.trellisnet.co.uk/rpms/eucalyptus-sos-plugin-0.1-0.el6.noarch.rpm
$ source ~/path/to/your/eucarc
$ sosreport

This should produce a tarball in /tmp that also contains useful Eucalyptus command output.

Please test it and let me know what you think, suggestions & patches welcome!

Monitoring Eucalyptus IaaS Availability With Nagios

This article describes one way to monitor your Eucalyptus cloud with Eutester, a framework for testing clouds, and Nagios, a popular open source infrastructure monitoring tool. I’ve also posted this over on the Eucalyptus wiki for future reference.

We use a simple instance testcase from Eutester to spin up an instance in Eucalyptus with a custom ssh key and security group, ping the instance, ssh into it and then terminate the instance. We then report to Nagios on the overall success of the test.

This test simulates a simple Cloud User’s interaction with Eucalyptus to ensure the service is working for users.

System Requirements

  • CentOS 6
  • NTP running and syncing time correctly
  • EPEL repository

Install some helpful tools

yum -y install git unzip euca2ools

Install Nagios

Nagios-core and lots of Nagios plugins are part of the EPEL repository.

Install packages

yum install -y nagios nagios-plugins nagios-plugins-nrpe nagios-plugins-all

Set nagiosadmin htaccess access password

htpasswd /etc/nagios/passwd nagiosadmin

Turn on services at boot and start them

chkconfig nagios on
chkconfig httpd on
service nagios start
service httpd start

Test Nagios

Open your browser to test the webui is working, enter the user ‘nagiosadmin’ and the password you set above.

http://<your-nagios-server>/nagios/

More information on configuring Nagios can be found on the Nagios website:

http://go.nagios.com/nagioscore/docs

Create a Eucalyptus account for Eutester

This is useful for tracking eutester usage with Eucalyptus’ reporting and makes it possible to restrict its resources using IAM policies.

Run the following commands from an environment where your Eucalyptus admin credentials are sourced.

Create Account

euare-accountcreate -a eutester

Create key

euare-useraddkey --delegate=eutester -u admin

Set password

euare-useraddloginprofile -u admin -p [PASSWORD] --delegate=eutester

Generate Certificate

euare-usercreatecert -u admin --delegate=eutester

Download credentials for admin user in eutester account

euca-get-credentials -a eutester -u admin eutester-admin.zip

Copy the zipfile to your Nagios server

scp eutester-admin.zip root@<your-nagios-server>:~

Login to your nagios server and source the credentials for use with Eucalyptus

ssh root@<your-nagios-server>
unzip eutester-admin.zip -d ~/.eucarc
source ~/.eucarc/eucarc

To ensure your credentials are working, try a euca-describe-images command

euca-describe-images

This command should list the available images in your cloud

Install Eutester

Clone the latest eutester code from Github

cd /root && git clone git://github.com/eucalyptus/eutester.git && cd eutester
python setup.py install

Install some Python dependencies

yum -y install python-argparse python-paramiko

Download Custom testcase for Eutester

This testcase is a modified version of instancetest.py and is in the process of being submitted to Eutester upstream. In the meantime, let’s just remove the upstream version and replace it with the custom version, which can output in the format we require for Nagios.

rm -f /root/eutester/testcases/cloud_user/instances/instancetest.py
wget -c https://raw.github.com/tomellis/eutester/master/testcases/cloud_user/instances/instancetest.py -O /root/eutester/testcases/cloud_user/instances/instancetest.py

Install Nagios plugin and helper scripts

Download Nagios eutester check scripts

cd /root && git clone git://github.com/monolive/nagios-eucalyptus.git

Copy Nagios plugin to plugins dir

cp nagios-eucalyptus/nagios/check_eutester_test.sh /usr/lib64/nagios/plugins/

Add script to run Eutester every hour to cron

This script requires a Eucalyptus Machine Image ID of a usable Linux instance in your cloud that it can spin up, ping and ssh to. It also requires the path to your Eucalyptus credentials. If you followed the example from above, you shouldn’t need to modify anything except for supplying a valid Linux based EMI.

Edit the script and update the EMI to a valid EMI

vim /root/nagios-eucalyptus/nagios/run_eutester_testcase.sh

On the line with EMI=, add a valid EMI from your cloud. See euca-describe-images for a list of EMIs.

Link the script to /etc/cron.hourly - This will run the script once an hour

ln -s /root/nagios-eucalyptus/nagios/run_eutester_testcase.sh /etc/cron.hourly/

Test the script

This should pass the test and the results file should show PASS.

/root/nagios-eucalyptus/nagios/run_eutester_testcase.sh
cat /tmp/results/test-BasicInstanceChecks-result.txt

Add the Nagios Check

Finally, we need to add our Nagios check, which looks at the Eutester result and raises a critical alert if the test failed, or a warning if the results file does not exist.
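
The real plugin is the shell script check_eutester_test.sh copied earlier. Purely to illustrate the logic, a Python 3 sketch of the same decision might look like the following. It assumes a passing run leaves the word PASS in the results file (as in the test step above) and treats anything else as a failure; the exit codes are the standard Nagios plugin ones.

```python
import os

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def check_result(path):
    """Return (exit_code, message) for a Eutester results file."""
    if not os.path.isfile(path):
        return WARNING, "WARNING: results file %s not found" % path
    with open(path) as handle:
        content = handle.read()
    # Assumption: a successful run writes PASS somewhere in the file
    if "PASS" in content:
        return OK, "OK: Eutester test passed"
    return CRITICAL, "CRITICAL: Eutester test did not pass"


if __name__ == "__main__":
    code, message = check_result("/tmp/results/test-BasicInstanceChecks-result.txt")
    print(message)
    # A real plugin would finish with sys.exit(code) so Nagios sees the status
```

Nagios reads the first line of output as the status text and the exit code as the service state.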

This nagios check has a simple command and service check that is performed from the local nagios server.

Append the following to /etc/nagios/objects/commands.cfg:

# Eutester check command
define command {
        command_name    check_eutester_test
        command_line    $USER1$/check_eutester_test.sh
        }

Append the following to /etc/nagios/objects/localhost.cfg:

# Define Eutester service check
define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             Eutester Basic Instance Test
        check_command                   check_eutester_test
        notification_interval           0
        }

Restart Nagios

/etc/init.d/nagios restart

Done!

Now go check out the Nagios WebUI to see the status of the check:

http://<your-nagios-server>/nagios/

It’s Alive!

I’ve decided to bring back my blog from near death.

I’ve had a blog for the past 10 years with very little content, so time to publish some things I’ve been working on.

I have a ton of small articles written as notes, I’ll try to write some of these up as I go along. For some reason I’ve never managed to get around to posting them online. I blame actually working with customers to solve problems.

This blog is powered by Octopress and hosted on Amazon S3 storage as I like cloudy services. I’ve even moved my DNS over to Route 53.

There are plenty of docs out there if you are interested…

Lucid Cluster Testing

In the past I’ve worked quite a lot with Red Hat Cluster Suite and other Linux based active/passive failover clustering technologies. As technology has moved on, so have the available options for cluster software. Pacemaker + Corosync are two of these fantastic technologies.

Ante Karamatić recently sent a mail to the Ubuntu server list, to ask for help testing the latest packages… If you are interested in cluster technologies in Ubuntu then this is a great opportunity to get involved, all instructions are provided on the wiki page so get testing!

See the UDS Lucid blueprint for the cluster stack for more info on the progress: https://blueprints.edge.launchpad.net/ubuntu/+spec/server-lucid-cluster-stack