Installing Elasticsearch Plugins on Graylog2

Thought I’d share this since it was something I unfortunately spent a good portion of my afternoon wrestling with. So you want to use an elasticsearch plugin within graylog2-server? I don’t care about your reasons, but this will help you do it. I’m going to go out on a limb and assume you want to use the kopf plugin to view cluster state, but this will work for any plugin.

1. Download the Plugins

This can be slightly tricky… I’ve found that the best option is to install ES 0.90.10 (or whatever version is compatible with your version of graylog2) and use it to install the plugins. You’ll then move the one you want from that install’s /plugins directory to the /plugins directory graylog2 will use. But if you are familiar with the plugin structure that will be created, you can manually download and unzip the plugins into the graylog2 plugins directory you define.

So for example, I’d do the following for installing kopf (a site plugin) and cloud-aws (a java based plugin).
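
A minimal sketch of the copy step, assuming a scratch Elasticsearch install has already fetched the plugins. The function name and both directory arguments are hypothetical; substitute your own paths.

```shell
#!/bin/sh
# Copy one installed plugin from a scratch Elasticsearch install into the
# plugin directory graylog2-server will read.
# Both paths are placeholders passed in by the caller, not fixed locations.
copy_es_plugin() {
    es_plugins="$1"       # plugins directory of the scratch ES install
    graylog_plugins="$2"  # plugins directory you will point graylog2 at
    name="$3"             # plugin directory name, e.g. kopf or cloud-aws

    if [ ! -d "$es_plugins/$name" ]; then
        echo "plugin '$name' not found in $es_plugins" >&2
        return 1
    fi
    mkdir -p "$graylog_plugins" &&
        cp -R "$es_plugins/$name" "$graylog_plugins/"
}
```

Call it once per plugin, for example `copy_es_plugin /path/to/elasticsearch/plugins /path/to/graylog2/plugins kopf`.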

2. Specify an Elasticsearch Config File For Graylog2

This is easy: just make sure that elasticsearch_config_file = /etc/graylog2-elasticsearch.yml is set in your graylog2.conf. You can also just run a quick sed against the stock config file.
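
One way the sed could look, as a sketch. It assumes the stock graylog2.conf already contains an elasticsearch_config_file line (possibly commented out); if the line is missing entirely, nothing is changed. The function name is made up for illustration.

```shell
#!/bin/sh
# Point graylog2-server at a custom Elasticsearch config file.
# Rewrites the elasticsearch_config_file line whether or not it is commented
# out, and keeps a .bak copy of the original file next to it.
set_es_config_file() {
    conf="$1"   # e.g. /etc/graylog2.conf
    sed -i.bak -E \
        's|^#?[[:space:]]*elasticsearch_config_file[[:space:]]*=.*|elasticsearch_config_file = /etc/graylog2-elasticsearch.yml|' \
        "$conf"
}
```

Run it as `set_es_config_file /etc/graylog2.conf` (with sudo if needed).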

3. Specify a Plugin Dir

You’ll need to tell elasticsearch where to actually look for the plugins, so set the plugin path in /etc/graylog2-elasticsearch.yml.
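
A small sketch of that step, assuming the setting in question is Elasticsearch's path.plugins. The plugin directory is whatever you chose in step 1; the function name is hypothetical.

```shell
#!/bin/sh
# Append a plugin path to the Elasticsearch config that graylog2 loads.
# Skips the write if a path.plugins line is already present.
add_plugin_path() {
    yml="$1"          # e.g. /etc/graylog2-elasticsearch.yml
    plugin_dir="$2"   # placeholder: the directory holding your plugins
    if grep -q '^path\.plugins:' "$yml" 2>/dev/null; then
        return 0
    fi
    printf 'path.plugins: %s\n' "$plugin_dir" >> "$yml"
}
```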


4. Put Any Plugin Specific Configuration in graylog2-elasticsearch.yml

This is pretty much plugin specific, but you’ll do this following the plugin’s installation instructions.

I’m currently using this method to make my graylog2-server instance autojoin a specific cluster based on security group and EC2 tag and it works pretty well so far. :-)

An Even Better Way to Use Puppet Modules in a Vagrant Project

I previously blogged about what I thought was a good way to tie librarian-puppet and vagrant together in a way that allowed one to use librarian-puppet without dealing with rubygems on their system (which, despite the excellent tooling, can be a pain for non-rubyists).

Today I discovered there is a handy vagrant plugin for this and found that using it is a much better approach.

For the quick and dirty on how to use it:

Suffice it to say, you should just use this instead.

Before You Start Your Day

Before you start your day, think of one thing you can do to enrich someone’s life, whether it’s your family, schoolteachers, co-workers or even a random person on the street. If you could find just one thing to do that will enrich someone’s life and then multiply that by each and every day, could you imagine what kind of impact that could have?

Effective Puppet Module Management in Vagrant

I still remember my first early forays into using vagrant and puppet together to provision local development environments. Everything was easy except figuring out a proper way to bundle puppet modules with a project. Basically it was a three-step process of discovery.

1. Run “puppet module install” and add the modules to the git repo (not the brightest idea, but simple).
2. Add puppet modules as git submodules in the project. This turned out to be even more troublesome as adding/removing/updating modules became a real pain.
3. Use librarian-puppet to manage puppet modules as the dependencies they are.

The third option was the best… we could now just simply add, remove or upgrade puppet module versions in a Puppetfile and just run “librarian-puppet install” to install the modules. But a final caveat wound up being that users had to install rubygems on their host machine which can bring other troubles. So why not just install the modules within the vagrant box when it comes up and be done with it?
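
A rough sketch of what such a guest-side bootstrap could look like, meant to be run by Vagrant's shell provisioner before puppet apply. The function name and the two directory arguments are assumptions for illustration; it also assumes rubygems is already available inside the box.

```shell
#!/bin/sh
# Install puppet modules inside the guest so the host needs no rubygems.
install_puppet_modules() {
    project_dir="$1"   # where the project is mounted in the guest, e.g. /vagrant
    puppet_dir="$2"    # where puppet apply will look, e.g. /etc/puppet

    # Install librarian-puppet only if the box does not already have it.
    command -v librarian-puppet >/dev/null 2>&1 ||
        gem install librarian-puppet || return 1

    # Copy the Puppetfile over and resolve modules next to it; librarian-puppet
    # writes them into a modules/ directory beside the Puppetfile.
    mkdir -p "$puppet_dir" &&
        cp "$project_dir/Puppetfile" "$puppet_dir/Puppetfile" &&
        (cd "$puppet_dir" && librarian-puppet install)
}
```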

This effectively adds the Puppetfile in the root of the project to the guest machine and installs the modules, referencing the modules directory when running puppet apply. This works great as you can guarantee the same install across multiple environments where developers may or may not be familiar with rubygems. ;)

Securing Docker’s Remote API

One piece of docker that is AMAZING is the Remote API, which can be used to programmatically interact with docker. I recently had a situation where I wanted to run many containers on a host with a single container managing the other containers through the API. But the problem I soon discovered is that, at the moment, turning networking on is an all-or-nothing type of thing… you can’t turn networking off selectively on a container-by-container basis. You can disable IPv4 forwarding, but you can still reach the docker remote API on the machine if you can guess its IP address.

One solution I came up with for this is to use nginx to expose the unix socket for docker over HTTPS and utilize client-side ssl certificates to only allow trusted containers to have access. I liked this setup a lot so I thought I would share how it’s done. Disclaimer: assumes some knowledge of docker!

Generate The SSL Certificates

We’ll use openssl to generate and self-sign the certs. Since this is for an internal service we’ll just sign it ourselves. We also remove the password from the keys so that we aren’t prompted for it each time we start nginx.
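
A sketch of the certificate steps with openssl. It generates the keys without a passphrase from the start, which ends in the same place as creating protected keys and stripping the password afterwards. The subject names (docker-ca, localhost, docker-client), key size and validity period are placeholders.

```shell
#!/bin/sh
# Create a throwaway CA, then a server and a client certificate signed by it.
# The function body runs in a subshell so the caller's working directory is
# left alone.
make_certs() (
    set -e
    mkdir -p "$1"
    cd "$1"

    # Certificate authority: private key plus self-signed certificate.
    openssl genrsa -out ca.key 2048
    openssl req -new -x509 -days 365 -key ca.key -subj "/CN=docker-ca" -out ca.crt

    # Server and client: key, signing request, then a cert signed by the CA.
    for name in server client; do
        if [ "$name" = server ]; then cn=localhost; else cn=docker-client; fi
        openssl genrsa -out "$name.key" 2048
        openssl req -new -key "$name.key" -subj "/CN=$cn" -out "$name.csr"
        openssl x509 -req -days 365 -in "$name.csr" \
            -CA ca.crt -CAkey ca.key -CAcreateserial -out "$name.crt"
    done
)
```

Running `make_certs ./certs` leaves ca.crt, server.key, server.crt, client.key and client.crt in that directory.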

Another option may be to leave the passphrase in and provide it as an environment variable when running a docker container or through some other means as an extra layer of security.

We’ll move ca.crt, server.key and server.crt to /etc/nginx/certs.

Setup Nginx

The nginx setup for this is pretty straightforward. We just listen for traffic on localhost on port 4242. We require client-side ssl certificate validation and reference the certificates we generated in the previous step. And most important of all, set up an upstream proxy to the docker unix socket. I simply overwrote what was already in /etc/nginx/sites-enabled/default.
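
A sketch of such a site config, written out by a small shell helper. The certificate paths follow the previous step; the upstream name, the socket path /var/run/docker.sock and the exact bind address are assumptions, so adjust the listen line if containers need to reach the proxy over the docker bridge.

```shell
#!/bin/sh
# Write an nginx site that proxies HTTPS requests carrying a trusted client
# certificate to the docker unix socket.
write_docker_proxy_site() {
    cat > "$1" <<'EOF'
upstream docker {
    server unix:/var/run/docker.sock;
}

server {
    listen 127.0.0.1:4242 ssl;

    ssl_certificate        /etc/nginx/certs/server.crt;
    ssl_certificate_key    /etc/nginx/certs/server.key;

    # Only accept clients whose certificate was signed by our CA.
    ssl_client_certificate /etc/nginx/certs/ca.crt;
    ssl_verify_client      on;

    location / {
        proxy_pass http://docker;
    }
}
EOF
}
```

For example, `write_docker_proxy_site /etc/nginx/sites-enabled/default` followed by an nginx restart.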

One important piece to make this work is you should add the user nginx runs as to the docker group so that it can read from the socket. This could be www-data, nginx, or something else!

Hack It Up!

With this setup and nginx restarted, let’s first run a few curl commands to make sure that this is set up correctly. First we’ll make a call without the client cert to double check that we get denied access, then a proper one.
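
A sketch of the rejected and accepted cases as two curl helpers. It assumes the proxy answers on https://localhost:4242 and uses the Remote API's /containers/json endpoint; -k skips server verification because the certificate is self-signed (swap in --cacert ca.crt to verify it instead).

```shell
#!/bin/sh
# Base URL of the nginx proxy in front of the docker socket (assumed).
API_URL="${API_URL:-https://localhost:4242}"

# No client certificate: nginx should refuse the request.
api_get_anonymous() {
    curl -sk "$API_URL$1"
}

# With the client certificate and key: the docker API should answer.
api_get_trusted() {
    curl -sk --cert "$2" --key "$3" "$API_URL$1"
}
```

Try `api_get_anonymous /containers/json` first, then `api_get_trusted /containers/json client.crt client.key`.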

For the first two we should get some run of the mill 400 http response codes before we get a proper JSON response from the final command! Woot!

But wait there’s more… let’s build a container that can call the service to launch other containers!

For this example we’ll simply build two containers: one that has the client certificate and key and one that doesn’t. The code for these examples is pretty straightforward and to save space I’ll leave the untrusted container out. You can view the untrusted container on github (although it is nothing exciting).

First, the node.js application that will connect and display information:

And the Dockerfile used to build the container. Notice we add the client.crt and client.key as part of building it!

That’s about it. Run docker build . and docker run -n <IMAGE ID> and we should see a JSON dump to the console of the actively running containers. Doing the same in the untrusted directory should present us with some 400 error about not providing a client ssl certificate. :)

I’ve shared a project with all this code plus a Vagrantfile on github for your own perusal. Enjoy!

Parameterized Docker Containers

I’ve been hacking a lot on docker at Zapier lately and one of the things I found to be somewhat cumbersome with docker containers is that it seemed a little difficult to customize published containers without extending them and modifying files within them, or some other mechanism. What I have come to discover is that you can publish containers that are customizable without modification by the end user by utilizing one of the most important concepts from 12 factor application development: store configuration in the environment.

Let’s use a really good example of this, the docker-registry application used to host docker images internally. When docker first came out I whipped up a puppet manifest to configure this bad boy but then realized that the right way would be to run this as a container (which was published). Unfortunately the Dockerfile as it was didn’t fit my needs.

The gunicorn setup was hardcoded and to make matters more complicated the configuration defaulted strictly to the development based configuration that stored images in /tmp vs. the recommended production setting that stored images in S3 (where I wanted them).

The solution was easy: create a couple of bash scripts that utilize environment variables that can be set when calling `docker run`.

First we generate the configuration file:
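
A minimal sketch of such a generator. The environment variable names and the YAML keys are illustrative only and are not taken from the registry's documented schema.

```shell
#!/bin/sh
# Render a config file from environment variables at container start.
# Unset variables fall back to a default (or to empty) instead of failing.
generate_config() {
    cat > "$1" <<EOF
prod:
    storage: ${REGISTRY_STORAGE:-s3}
    s3_bucket: ${S3_BUCKET:-}
    s3_access_key: ${AWS_ACCESS_KEY_ID:-}
    s3_secret_key: ${AWS_SECRET_ACCESS_KEY:-}
EOF
}
```

The unquoted heredoc delimiter is what lets the shell expand the variables while writing the file.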

And wrap the gunicorn run call:
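
A sketch of the wrapper idea. GUNICORN_WORKERS and REGISTRY_PORT are hypothetical variable names, and wsgi:application is assumed to be the app's entry point.

```shell
#!/bin/sh
# Start gunicorn with settings taken from the environment instead of being
# hardcoded. exec replaces the shell so gunicorn receives signals directly.
run_registry() {
    exec gunicorn \
        -b "0.0.0.0:${REGISTRY_PORT:-5000}" \
        -w "${GUNICORN_WORKERS:-4}" \
        wsgi:application
}
```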

Finally the Dockerfile is modified to call these scripts with CMD, meaning that they are called when the container starts.

Since we use puppet-docker, the manifest for our dockerregistry server role simply sets these environment variables when it runs the container to configure it to our liking.

I’m really a big fan of this concept. This means people can publish docker containers that can be used as standalone application appliances with users tweaking to their liking via environment variables.

EDIT: Although I used puppet in this example to run docker, you don’t need to. You can easily do the following as well.
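
Something along these lines, as a sketch: plain docker run with -e flags. The image name example/docker-registry and the variable names are placeholders.

```shell
#!/bin/sh
# Run a parameterized container directly, passing configuration with -e.
run_registry_container() {
    docker run -d -p 5000:5000 \
        -e S3_BUCKET="$1" \
        -e GUNICORN_WORKERS="${2:-4}" \
        example/docker-registry
}
```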

Handy Hub Alias

I’ve recently become a big fan of Hub and use a lot of the commands to interact with github from the comfort of my commandline. One of my personal favorites is pull-request as we use PRs often as a form of both code reviews and code promotion. Here’s a handy alias I have for the common task of issuing a PR for promotion.
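
A sketch of the idea, written as a shell function rather than an alias so the branches can be overridden. The branch names develop and master are assumptions; -b and -h are hub's base and head options.

```shell
#!/bin/sh
# Open a pull request that promotes one branch into another.
# Usage: promote [head] [base]
promote() {
    hub pull-request -b "${2:-master}" -h "${1:-develop}"
}
```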

Now I just need to figure out how to make it open the URL for the pull request that it dumps to the console. :)

Immutable Servers With Packer and Puppet

Lately I’ve been becoming more and more of a fan of the concept of Immutable Servers while automating our infrastructure at Zapier. The concept is simple: never do server upgrades or changes on live servers, instead just build out new servers with applied updates and throw away the old ones. You basically get all the benefits of immutability in programming at the infrastructure level plus you never have to worry about configuration drift. And even better, I no longer have to fear that, despite extensive tests, someone might push a puppet manifest change that out of the blue breaks our front web servers (sure we can roll back the changes and recover, but there is still a small potential outage to worry about).

Obviously you need some good tooling to make this happen. Some recent fooling around with packer has allowed me to put together a setup that I’ve been a little pleased with so far.

The Nodes

In our infrastructure project we have a nodes.yaml that defines node names and the AWS security groups they belong to. This is pretty straightforward and used for a variety of other tools (for example, vagrant).

The Rakefile

We use this nodes.yaml file with rake to produce packer templates to build out new AMIs. This keeps me from having to manage a ton of packer templates as they mostly have the same features.

This is used in conjunction with a simple erb template that simply injects the nodename into it.

This will generate a packer template for each node that will

  • Create an AMI in us-east-1
  • Use an Ubuntu Server 13.04 AMI to start with
  • Set the security group to packer in EC2. We create this and allow it access to puppetmaster’s security group. Otherwise packer will create a random temporary security group that won’t have access to any other groups (if you follow best practices at least)!
  • Install puppet
  • Run puppet once to configure the system

We also never enable puppet agent (it defaults to not starting) so that it never polls for updates. We could also remove puppet from the server after it completes so the AMI doesn’t have it baked in.

The Script

Packer has a nice feature of enabling the user to specify shell commands and shell files to run. This is fine for bootstrapping but not so fine for doing the level of configuration management that puppet is more suited for. So our packer templates call a shell script that makes sure we don’t use the age old version of ruby linux distros love to default to and installs puppet. As part of the installation it also specifies the puppet master server name (if you’re using VPC instead of EC2 classic, you don’t need this as you can just assign the internal dns “puppet” to puppetmaster).

Building It

Now all we need to do to build out a new AMI for redis is run packer build packs/redis.json and boom! A server is created, configured, imaged and terminated. Now just set up a few jobs in jenkins to generate these based on certain triggers and you’re one step closer to automating your immutable infrastructure.

Cleaning Up

Of course, each AMI you generate is going to cost you a penny a day or some such. This might seem small, but once you have 100 revisions of each AMI it’s going to cost you! So as a final step I whipped up a simple fabfile script to clean up the old images. This proved to be a simple task because we include a unix timestamp in the AMI name.
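
The selection logic can be sketched in a few lines of shell (this is not the fabfile itself, and it leaves the actual deregistration to your AWS tooling). It assumes names of the form <node>-<unix timestamp> and that the input lists the AMIs of a single node.

```shell
#!/bin/sh
# Read AMI names on stdin and print the stale ones: everything except the
# newest $1 images (default 1).
stale_amis() {
    keep="${1:-1}"
    # Prefix each name with its trailing timestamp, sort newest first,
    # skip the images we keep, then strip the prefix again.
    awk -F- '{ print $NF " " $0 }' |
        sort -rn |
        tail -n +"$((keep + 1))" |
        cut -d' ' -f2-
}
```

Passing a larger number, for example `stale_amis 5`, keeps that many recent images around.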

Set this up as a post-build job to the jenkins job that generates the AMI and you always ensure you have only the latest one. You could probably also tweak this to keep the last 5 AMIs around too for archiving purposes.

What’s Next?

I admit I’m still a little fresh with this concept. Ideally I’d be happy as hell to get our infrastructure to the point where each month (or week!) servers get recycled with fresh copies. For servers that are more transient, like web servers or queue workers, this is easy. With data stores this can be a little trickier, as you need an effective strategy to boot up replicas of primary instances, promote replicas to primaries and retire the old primaries.

A final challenge is deciding what level of mutability is allowed. Deployments are obviously fine as they don’t tweak the server configuration but what about adding / removing users? Do we take an all or nothing approach or allow tiny details like SSH public keys to be updated without complete server rebuilds?