Playing with Repose

At work we use a proxy called repose in front of most services in order to make common tasks such as auth and rate limiting consistent. In Python, this kind of functionality might also be accomplished via WSGI middleware (sketched below), but by using a separate proxy you get two benefits.

  1. The service can be written in any language that understands HTTP.
  2. The service gets to avoid many orthogonal concerns.
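
For comparison, here is a minimal sketch of the WSGI-middleware approach mentioned above. It is only an illustration, not anything we run: the header name, the token check, and the wrapped app are all made up.

class RequireAuthToken(object):
    # A toy auth middleware: reject requests that lack a known
    # X-Auth-Token header. The header name and token set are made up.

    def __init__(self, app, valid_tokens):
        self.app = app
        self.valid_tokens = valid_tokens

    def __call__(self, environ, start_response):
        token = environ.get('HTTP_X_AUTH_TOKEN')
        if token not in self.valid_tokens:
            start_response('401 Unauthorized',
                           [('Content-Type', 'text/plain')])
            return [b'unauthorized']
        return self.app(environ, start_response)

# wrapped_app = RequireAuthToken(my_wsgi_app, valid_tokens={'secret'})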

While the reasoning for repose makes a lot of sense, for someone not familiar with Java, it can be a little daunting to play with. Fortunately, the repose folks have provided some packages to make playing with repose pretty easy.

We’ll start with a docker container to run repose. The repose docs have an example we can use as a template. But first, let’s make a directory to play in.

$ mkdir repose-playground
$ cd repose-playground

Now let’s create our Dockerfile:

FROM ubuntu

RUN apt-get update && apt-get install -y wget

RUN wget -O - http://repo.openrepose.org/debian/pubkey.gpg | apt-key add - && echo "deb http://repo.openrepose.org/debian stable main" > /etc/apt/sources.list.d/openrepose.list

RUN apt-get update && apt-get install -y \
  repose-valve \
  repose-filter-bundle \
  repose-extensions-filter-bundle

CMD ["java", "-jar", "/usr/share/repose/repose-valve.jar"]

The next step will be to start up our container and grab the default config files. This makes it much easier to experiment since we have decent defaults.

$ docker build -t repose-playground .
$ mkdir etc
$ docker run -it -v `pwd`/etc:/code repose-playground cp -r /etc/repose /code

Now that we have our config in ./etc/repose, we can try something out. Let’s change our default endpoint to point to a different website.

<?xml version="1.0" encoding="UTF-8"?>

<!-- To configure Repose see: http://wiki.openrepose.org/display/REPOSE/Configuration -->
<system-model xmlns="http://docs.openrepose.org/repose/system-model/v2.0">
    <repose-cluster id="repose">
        <nodes>
            <node id="repose_node1" hostname="localhost" http-port="8080"/>
        </nodes>
        <filters></filters>
        <services></services>
        <destinations>
            <endpoint id="open_repose" protocol="http"
                      <!-- redirect to ionrock.org! -->
                      hostname="ionrock.org"
                      root-path="/" port="80"
                      default="true"/>
        </destinations>
    </repose-cluster>
</system-model>

Now we’ll run repose from our container, using our local config instead of the config in the container.

$ docker run -it -v `pwd`/etc/repose:/etc/repose -p 8080:8080 repose-playground

If you’re using boot2docker, you can use boot2docker ip to find the IP of your VM.

$ export REPOSE_HOST=`boot2docker ip`
$ curl "http://$REPOSE_HOST:8080"

You should see the homepage HTML from ionrock.org!

Once you have repose running, you can leave it up and change the config as needed. Repose will periodically pick up any changes without restarting.

I’ve gone ahead and automated the steps in this repose-playground repo. While it can be tricky to get started with repose, especially if you’re not familiar with Java, it is worth taking a look at it for implementing orthogonal requirements that would otherwise make the essential application code more complex. This is especially true if you’re using a microservices model, where the less code the better. Just run repose on the same node, proxying requests to your service, which only listens on localhost, and you’re good to go.

Docker vs. Provisioning

Lately, I’ve been playing around with Docker as I’ve moved back to OS X for development. At the same time, I’ve been getting acquainted with Chef in a reasonably complex production environment. As both systems have a decent level of overlap, it might be helpful to compare and contrast the different methodologies of these two deployment tactics.

What does Docker actually do?

Docker wraps up the container functionality built into the Linux kernel. Basically, it lets a process use the machine’s hardware in a very specific manner, using a predefined filesystem. When you use Docker, it feels like starting up a tiny VM to run a process. What really happens, though, is that the container’s filesystem is used along with the hardware provided by the kernel to run the process in an isolated environment.

When you use Docker, you typically start from an “image”. The image is just an initial filesystem you’ll be starting from. From there, you might install some packages and add some files in order to run some process. When you are ready to run the process, you use docker run and it will get the filesystem ready and run the process using the computer’s hardware.

Where this differs from a VM is that you only start one process. While you might create a container that has Postgres, RabbitMQ and your own app installed, when you run docker run myimage myapp, no other processes are running. The container only provides the filesystem; it is up to the caller how the underlying hardware is accessed and utilized. This includes everything from the disk to the network.

What does a Provisioner do?

A provisioner, like Chef, configures a machine into a certain state. Like Docker, this means getting the filesystem sorted out: installing packages, adding configuration, adding users, etc. A provisioner can also start processes on the machine as part of the provisioning process.

A provisioner usually starts from a known image. In this case, I’m using “image” in the more common VM context, where it is a snapshot of the OS. With that in mind, a provisioner doesn’t require a specific image, but rather a set of resources that must be in place to consider the provisioned machine complete. For example, there is no reason you couldn’t use a provisioner to create user directories across a variety of unices, including OS X and the BSDs.

Different Deployment Strategies

The key difference when using Docker or a provisioner is the strategy used for deployment. How do you take your infrastructure and configure it to run your applications consistently?

Docker takes the approach of deploying containers. The concept of a container is that it is self-contained: the OS doesn’t matter, as long as it supports docker. Your deployment then involves getting the container image and running the processes supported by the container.

From a development perspective, the deliverable artifact of the build process would be a container (or set of containers) to run your different processes. From there, you would configure your infrastructure accordingly, defining the resources the processes can use at run time.

A provisioner takes a more generalized route. The provisioner configures the machine, therefore, it can use any number of deliverables to get your processes running. You can create system packages, programming language environments or even containers to get your system up and running.

The key difference from the devops perspective (the intersection of development and sysops) is that development within the constraints of the system must be coordinated with the provisioner. In other words, developers can’t simply choose some platform or application; all dependencies must be integrated into the provisioning system. A docker container, on the other hand, can be based on any image and use any resource available within the image’s filesystem.

What do you want to do?

The question of whether to use Docker or a provisioning system is not an either-or proposition. If you choose to use Docker containers as your deployment artifact, the actual machines may still need to be configured. There are options that avoid the need for a provisioning system, but generally, you may still use something like Chef to maintain and provision the servers that will be running your containers.

One vector for deciding on a strategy is the level of consistency across your infrastructure. If you are fine with developers creating containers that may use different operating systems and tooling, docker is an excellent choice. If you have hard requirements as to how your OS should be configured, a provisioning system might be better suited for you.

Another thing to consider is development resources. It can be a blessing and a curse to provision a development system, no matter what system you use. Your team might be more than happy to take on managing containers efficiently, while other teams would be better off leaving most system decisions to the provisioning system. The ecosystem surrounding each platform is another consideration.

Conclusions

I don’t imagine that docker (and containers generally) will completely supplant provisioning services. But I do believe the model aids in producing more consistent deployment artifacts over time. Testing a container locally is a reasonably reliable means of ensuring it will run in production. That said, containers require that many resources (network, disk, etc.) be configured in order to work correctly. This is a non-trivial step, and making it work in development, especially when you consider devs using tools like boot2docker, can be a difficult process. It can be much easier to simply spin up a Vagrant VM with the necessary processes and be done with it. Fortunately, there are tools like docker compose and docker machine that seem to be addressing this shortcoming.

Some Thoughts on Agile

I’ve recently started working in an “agile” environment. There are stories, story points, a board, etc. I couldn’t tell you whether it is pure Scrum or some other flavor of “agile”, but I can say it is definitely meant to be an “agile” system of software development. It is not perfect, but that is sort of the point: roll with the punches of the real world and do your best with a reasonable amount of data.

Some folks argue agile is nonsense, but having used agile techniques myself, I find the detractors typically treat agile as a concrete set of rules rather than as a tool. No project management technique perfectly compartmentalizes all problems into easily solvable units. The best we can do is use these techniques to improve our chances of success writing software.

There are two benefits I’ve noticed when using agile techniques.

  1. You must communicate and record what is happening
  2. You may change things according to your needs

The requirement to communicate and record what’s happening is important because it forces developers to make information public. The process of writing a good story and updating the story with comments (I’m assuming some bug tracking software is being used) helps guard against problems going unnoticed. It also provides history that can be learned from. It holds people accountable.

Allowing change is also critical. Something like Scrum is extremely specific and detailed, yet as an organization, you have the option and privilege to adapt and change vanilla Scrum to your own requirements. For example, some organizations should use story points for estimating work and establishing velocity, while others would be better suited to more specific time estimates. Both estimation methods have their place, and you can choose the method that best meets your needs.

When adopting an agile practice it is a good idea to try out the vanilla version. But, just like your software, you should iterate and try things to optimize the process for your needs. It is OK to create stories that establish specifications. It is OK to use 1-10 for estimating work. It is OK to write “stories” more like traditional bug reports.

It is not OK to skip the communication and recording of what is going on, and it is not OK to ignore the needs of your organization in order to adhere to the tenets of your chosen agile methodology.

Announcing rdo

At work I’ve been using Vagrant for development. The thing that bothers me most about using Vagrant or any remote development machine is the disconnect it presents with local tools. You are forced to either log into the machine and run commands or jump through hoops to run the commands from the local machine, often losing the filesystem context that makes the local tools possible.

Local Tools

What I mean by local tools are things like IDEs or build code that performs tasks on your repository. IDEs assume you are developing locally and expect a local executable for certain tasks in order to work. Build code can be platform specific, which is likely why you are using Vagrant in the first place.

My answer to this is rdo.

Why rdo

I have a similar project called xe that you can configure to sort out your path when in a specific project. For example, if I have a virtualenv venv in my cloned repo, I can use xe python to run the correct python without having to activate the virtualenv or manually include the path to the python executable.
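
To make that concrete, here is a rough sketch of the idea, assuming a venv directory in the repo; it is an approximation of the behavior, not xe’s actual implementation.

import os

# A rough approximation of the idea: prepend the project's virtualenv bin
# directory to PATH, then exec the requested command. Not xe's actual code.
def run_in_venv(argv, venv_dir='venv'):
    bin_dir = os.path.join(os.getcwd(), venv_dir, 'bin')
    os.environ['PATH'] = bin_dir + os.pathsep + os.environ.get('PATH', '')
    os.execvp(argv[0], argv)

# run_in_venv(['python', 'setup.py', 'test'])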

rdo works in a similar way, the difference being that instead of adjusting the path, it configures the command to run on a remote machine, such as a Vagrant VM.

Using rdo

For example, let’s assume you have a Makefile in your project repo. You’ve written a bootstrap task that will download any system dependencies for your app.

bootstrap:
     sudo apt-get install -y python-pip python-lxml

Obviously, if you are on OS X or RHEL you don’t use apt for package management, which is why you’re using a Vagrant VM in the first place. Rather than having to ssh into the VM, you can use rdo.

The first step is to create a config file in your repo.

[default]
driver = vagrant
directory = /vagrant

That assumes your Vagrantfile is mounting your repo at /vagrant. You can change it as needed.

From there you can use rdo to run commands.

$ rdo make bootstrap

That will compose a command to connect to the vagrant VM, cd to the correct directory and run your command.
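
Roughly speaking, the composed command ends up looking like vagrant ssh -c 'cd /vagrant && make bootstrap'. Here is a sketch of that composition based on the config above; it approximates the behavior, not rdo’s actual code, and does no shell escaping.

import subprocess

# An approximation of the behavior described above, not rdo's actual code.
# Note there is no shell escaping here.
def run_remote(args, directory='/vagrant'):
    remote_cmd = 'cd %s && %s' % (directory, ' '.join(args))
    return subprocess.call(['vagrant', 'ssh', '-c', remote_cmd])

# run_remote(['make', 'bootstrap'])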

Conclusion

I hope you give it a try and report back any issues. At the moment it is extremely basic in that it doesn’t do anything terribly smart as far as escaping goes. I hope to remedy that, as well as to support generic ssh connections.

Virtual Machine Development

I’ve recently started developing on OS X again, on software that will be run on Linux. My solution has been to use a Vagrant VM, but I’m not entirely happy with it. Here are a few other things I’ve tried.

Docker / Fig

On OS X, boot2docker makes it possible to use docker for running processes in containers. Fig lets you orchestrate and connect containers.

Note

Fig is deprecated and will be replaced with Docker Compose, but I found that Docker Compose didn’t work for me on OS X.

The idea is that you’d run MySQL, RabbitMQ, etc. in containers and expose those processes’ ports and hosts to your app container. Here is an example:

mysql:
  image: mysql:5.5

app:
  build: path/to/myapp/  # A Dockerfile must be here
  links:
    - mysql

The app container can then use mysql as a hostname in order to reach the container running MySQL.
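
For example, from inside the app container the database is reachable at the hostname mysql. A sketch of what the app’s connection code might look like, assuming the pymysql client and placeholder credentials:

import pymysql

# Inside the app container, the fig link makes the database reachable at
# the hostname "mysql". The credentials and database name are placeholders.
conn = pymysql.connect(host='mysql', user='root', password='', db='myapp')
with conn.cursor() as cursor:
    cursor.execute('SELECT VERSION()')
    print(cursor.fetchone())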

While I think this pattern could work, I found that it needs a bit too much hand holding. For example, you explicitly need to make sure volumes are set up for each service that needs persistence. Doing the typical database sync ended up being problematic because it wasn’t trivial to connect. I’m sure I was doing it wrong along the way, but it seems that you have to constantly tweak any tutorial because you have to use boot2docker.

Docker Machine

Another tactic I used was docker-machine. This is basically how boot2docker works. It can start a machine, configure it for docker, and provide you with commands so you can run things on that machine via the normal docker command line. This seemed promising, but in the end it was pretty much the same as using Vagrant, only a lot less convenient.

I also considered using it with my Rackspace account, but, for whatever reason, the client couldn’t destroy machines, which made it much less enticing.

Vagrant

One thing that was frustrating with Vagrant is that if you use a virtualenv that lives on a part of the filesystem mounted from the host (i.e. OS X), any sort of package loading is really slow. I have no clue why this is. I ended up just installing things as root, but I think a better tactic might be to use virtualenvwrapper, which installs to the home directory, while the code still lives in /vagrant/*.

One thing that I did do initially was to write a Makefile for working with Vagrant. Here is a snippet:

SRC=/vagrant/designate
VENV=/usr/local
SPHINX=$(VENV)/bin/sphinx-build
VCMD=vagrant ssh -c

bootstrap:
     $(VCMD) 'virtualenv $(VENV)'
     $(VCMD) 'cd $(SRC) && $(VENV)/bin/pip install -r requirements.txt -r test-requirements.txt'
     $(VCMD) 'cd $(SRC) && $(VENV)/bin/python setup.py develop'

tests:
     $(VCMD) 'cd $(SRC) && $(VENV)/bin/tox'

It is kind of ugly, but it more or less works. I also tried some other options, such as using my xe package with paramiko or fabric, but both tactics made it too hard to simply do things like:

$ xe tox -e py27

And have xe figure out what needs to happen to run the commands correctly on the remote host. What is frustrating is that docker essentially managed to do this rather well.

Conclusions

OS X is not Linux. There are more than enough differences that make developing locally really difficult. Also, most code is not meant to be portable. I’m on the fence as to whether this is a real problem or just a fact of life with more work being done on servers in the cloud. Finally, virtualization and containers still need tons of work. It feels a little like the wild west in that there are really no rules and very few best practices. The potential is obvious, but the path is far from paved. Of the two, virtualization definitely feels like a better tactic for development. With that in mind, it would be even better if you could simply do with Vagrant what you can do with docker. Time will tell!

Even though I didn’t manage to make major strides into a better dev story for OS X, I did learn quite a bit about the different options out there. Did it make me miss using Linux? Yes. But I haven’t given up yet!

tsf and randstr

I wrote a couple really small tools the other day that I packaged up. I hope someone else finds them useful!

tsf

tsf directs stdin to a timestamped file or directory. For example:

$ curl http://ionrock.org | tsf homepage.html

The output from the cURL request goes into a file 20150326123412-homepage.html. You can also specify that a directory should be used.

$ curl http://ionrock.org | tsf -d homepage.html

That will create a homepage.html directory containing the timestamped files.
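
A minimal sketch of the naming logic, as a guess at roughly what tsf does rather than its actual source:

import sys
import time

# A guess at roughly what tsf does: copy stdin into a file whose name is
# prefixed with a timestamp. Not the actual implementation.
def tsf(name):
    stamped = time.strftime('%Y%m%d%H%M%S') + '-' + name
    with open(stamped, 'w') as fh:
        fh.write(sys.stdin.read())
    return stamped

if __name__ == '__main__':
    print(tsf(sys.argv[1]))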

Why is this helpful?

I often debug things by writing small scripts that automate repetitive actions. It is common that I’ll keep around output for this sort of thing so I can examine it. Using tsf, it is a little less likely that I’ll overwrite a version that I needed to keep.

Another place it can be helpful is if you periodically run a script and you need to keep the result in a time stamped file. It does that too.

randstr

randstr is a function that creates a random string.

from randstr import randstr

print(randstr())

randstr provides some globals containing different sets of characters you can pass in to the call in order to get different varieties of random strings. You can also set the length.
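
Here is a minimal sketch of a randstr-like helper; the real package’s character-set globals and default length may differ.

import random
import string

# A sketch of a randstr-like helper; the real package's globals and
# default length may differ.
def randstr(length=16, chars=string.ascii_letters + string.digits):
    return ''.join(random.choice(chars) for _ in range(length))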

Why is this helpful?

I’ve written this function a ton of times, so I figured it was worth making a package out of it. It is helpful for testing because you can easily create random identifiers. For example:

from randstr import randstr

batch_records = [{'name': randstr()} for i in range(1000)]

I’m sure there are other tools out there that do similar or exactly the same thing, but these are mine and I like them. I hope you do too.

Automate Everything

I’ve found that it has become increasingly difficult to simply hack something together without formally automating the process. Something inside me just can’t handle the idea of repeating steps that could be automated. My solution has been to become faster at the process of formal automation. Most of the steps are small, easy to do and don’t take much time. Rather than feeling guilty that I’m wasting time by writing a small library or script, I work to make the process faster and am able to re-use these scripts and snippets in the future.

A nice side effect is that writing code has become much more fluid. I get more practice using essential libraries and tools, and over time they’ve become second nature. It also helps with getting into the flow, because taking the extra steps of writing to files and setting up a small package feels like a warm-up of sorts.

One thing that has been difficult is navigating the wealth of options. For example, I’ve gone back to OS X for development. I’ve had to use VMs for running processes and tests. I’ve been playing with Vagrant and Docker. These can be configured with chef, ansible, or puppet in addition to writing a Vagrantfile or Dockerfile. Does chef set up a docker container? Do you call chef-apply in your Dockerfile? On OS X you have to use boot2docker, which seems to be a wrapper around docker machine. Even though I know the process can be configured to be completely automated, it is tough to feel as though you’re doing it right.

Obviously, there is a balance. It can be easy to get caught in a quagmire of automation, especially when you’re trying to automate something that was never intended to be driven programmatically. At some point, even though it hurts, I have to just bear down and type the commands or click the mouse over and over again.

That is, until I break down and start writing some elisp to do it for me ;)

Server Buffer Names in Circe

Circe is an IRC client for Emacs. If you are dying to try out Emacs for your IRC-ing needs, it already comes with two other clients, ERC and rcirc. Both work just fine. Personally, I’ve found circe to be a great mix of helpful features alongside simple configuration.

One thing that always bugged me was the server buffer names. I use an IRC bouncer that keeps me connected to the different IRC networks I use. At work, I connect to each network with a different username over a port forwarded by ssh. The result is that I get three buffers with extremely descriptive names such as localhost:6668<2>. I’d love to have names like *irc-freenode* instead, so here is what I came up with.

First off, I wrote a small function to connect to each network that looks like this:

(defun my-start-ircs ()
  (interactive)
  (start-freenode-irc)
  (start-oftc-irc)
  (start-work-irc)
  ;; Tell circe not to show mode changes as they are pretty noisy
  (circe-set-display-handler "MODE" (lambda (&rest ignored) nil)))

Then for each IRC server I make the normal circe call, which returns the server buffer. In order to rename the buffer, I can do the following:

(defun start-freenode-irc ()
  (interactive)
  (with-current-buffer (circe "locahost"
                              :port 6689
                              :nick "elarson"
                              :user "eric_on_freenode"
                              :password (my-irc-pw))
    (rename-buffer "*irc-freenode*"))

Bingo! I get a nice server buffer name. I suspect this could work with ERC and rcirc, but I haven’t tried it. Hope it helps someone else out!

Hello Rackspace!

After quite a while at YouGov, I’ve started a new position at Rackspace working on the Cloud DNS team! Specifically, I’m working on Designate, a DNS as a Service project that is in incubation for OpenStack. I’ve had an interest in OpenStack for a while now, so I feel extremely lucky I have the opportunity to join the community with one of the founding organizations.

One thing that has been interesting is the idea of the Managed Cloud. AWS focuses on Infrastructure as a Service (IaaS). Rackspace also offers IaaS, but takes it a step further by providing support. For example, if you need a DB as a Service, AWS provides services like Redshift or SimpleDB. It is up to the users to figure out ways to optimize queries and tune the database for their specific needs. In the Managed Cloud, you can ask for help and know that an experienced expert will understand what you need to do and help make it happen, even at the application level.

While this support feels expensive, it can be much cheaper than you think when you consider the amount of time developers spend becoming pseudo experts at a huge breadth of technology that doesn’t offer any actual value to a business. Just think of the time spent on sharding a database, maintaining a CI service, learning the latest / greatest container technology, building your own VM images, maintaining your own configuration management systems, etc. Then imagine having a company that will help you set it up and maintain it over time. That is what the managed cloud is all about.

It doesn’t stop at software either. You can mix and match physical hardware with traditional cloud infrastructure as needed. If your database server needs real hardware that is integrated with your cloud storage and cloud servers, Rackspace can do it. If you have strict security compliance requirements that prevent you from using traditional IaaS providers, Rackspace can help there too. If you need to use VMware or Windows as well as OpenStack cloud technologies, Rackspace has you covered.

I just got back from orientation, so I’m still full of the Kool-Aid.

That said, Fanatical Support truly is ingrained in the culture here. It started when the founders were challenged: they hired someone to get a handle on support, and he proposed Fanatical Support. His argument was simple. If we offer a product and don’t support it, we are lying to our customers. The service they are buying is not what they are getting, so don’t be a liar and give users Fanatical Support.

I’m extremely excited to work on great technology at a very large scale, but more importantly, I’m ecstatic to be working at a company that ingrains integrity and treats its customers and employees with the utmost respect.

Setting up magit-gerrit in Emacs

I recently started working on OpenStack and, being an avid Emacs user, I hoped to find a more integrated workflow with my editor of choice. Of the options out there, I settled on magit-gerrit.

OpenStack uses git for source control and gerrit for code review, and code gets merged into OpenStack through that review process. In a nutshell, you create a branch, write some code, submit a code review, and after the code is reviewed and approved, it is merged upstream. The key is ensuring the code review process is thorough and convenient.

For developers with specific environments, it is crucial to be able to quickly download a patch and play around with the code. For example, running the tests locally or playing around with a new endpoint is important when approving a review. Fortunately, magit-gerrit makes this process really easy.

First off, you need to install the git-review tool. This is available via pip.

$ pip install git-review

Next up, you can check out a repo. We’ll use the Designate repo because that is what I’m working on!

$ git clone https://github.com/openstack/designate.git
$ cd designate

With a repo in place, we can start setting up magit-gerrit. Assuming you’ve set up MELPA, you can install it via M-x package-install RET magit-gerrit. Then add this to your Emacs init file:

(require 'magit-gerrit)

The magit-gerrit docs suggest setting two variables.

;; if remote url is not using the default gerrit port and
;; ssh scheme, need to manually set this variable
(setq-default magit-gerrit-ssh-creds "myid@gerrithost.org")

;; if necessary, use an alternative remote instead of 'origin'
(setq-default magit-gerrit-remote "gerrit")

The magit-gerrit package can infer the magit-gerrit-ssh-creds from the magit-gerrit-remote. This makes it easy to configure your repo via a .dir-locals.el file.

((magit-mode
  (magit-gerrit-remote . "ssh://eric@review.openstack.org:29418/openstack/designate")))

Once you have your repo configured, you can open it in magit via M-x magit-status. You should also see a message saying “Detected magit-gerrit-ssh-creds” that shows the credentials used to log into the gerrit server. These are simple ssh credentials, so if you can’t ssh into the gerrit server using them, you need to adjust your settings accordingly.

If everything is configured correctly, there should be an entry in the status page that lists any reviews for the project. The listing shows the summary of the review. You can navigate to the review and press T to get a list of options. From there, you can download the patchset as a branch or simply view the diff. You can also browse to the review in gerrit. From what I can tell, you can’t comment on a review, but you can give a +/- for a review.

I’ve just started using gerrit and magit-gerrit, so I’m sure there are some features that I don’t fully understand. For example, I’ve yet to understand how to re-run git review in order to update a patch after getting feedback. Assuming that isn’t supported, I’m sure it shouldn’t be too hard to add.

Feel free to ping me if you try it and have questions or tips!