Someone mentioned to me a little while back a disinterest in going to PyCon because it felt directed towards operators more than programmers. Basically, there are now more talks about integrations using Python than discussions of language features, libraries or development techniques. I think this trend is natural because Python has proven itself as a mainstream language that has solved many common programming problems. Therefore, when people talk about it, the subject is how Python was used rather than how to apply some programming technique using the language.
With that in mind, it got me thinking about “Operators” and what that means.
Where I work there are two types of operators. The first is the somewhat traditional system administrator. This role is focused on knowledge about the particular system being administered. There is still a good deal of automation work that happens at this level, but it is typically focused on administering a particular suite of applications. For example, managing apache httpd or bind9 via config files and rolling out updates using the specific package manager. There is typically more nuance to this sort of role than can be expressed in a paragraph, so needless to say, these are domain experts that understand the common and extreme corner cases for the applications and systems they administer.
The second type of operator is closer to the operations side of DevOps. These operators are responsible for designing and building the systems and infrastructure that run the custom applications. While more traditional sysadmins use configuration management, these operators master it. Ops must have a huge breadth of knowledge that spans everything. File systems, networking, databases, services, *nix, shell, version control and everything in between are all topics that Ops are familiar with.
As software developers, we think about abstract designs, while Ops makes the abstract concrete.
After working with Ops for a while, I have a huge amount of respect for them due to the complexity that must be managed. There is no way to simply import cloud and call cloud.start(). The tools available to Ops for encapsulating concepts are rudimentary by necessity. The configuration management tools are still very new and the terminology hasn’t coalesced towards design patterns because everyone’s starting point is different. Ops is where linux distros, databases, load balancers, firewalls, user management and apps come together to actually have working products.
It is this complexity that makes DevOps such an interesting place for software development. Amidst the myriad of programs and systems, there needs to be established concepts that can be reused as best practices, and eventually, as programs. Just as C revolutionized programming by allowing a way to build for different architectures, DevOps is creating the language, frameworks, and concepts to deploy large scale systems.
The current state of the art is using configuration management / orchestration tools to configure a system. While in many ways this is very high level, I’d argue that it is closer to assembly in the grand scheme of things. There is still room to encapsulate these tools and provide higher level abstractions that simplify and make safe the processes of working with systems.
Chef is considered a “configuration management” tool, but really is an environment automation tool. Chef makes an effort to perform operations on your system according to a series of recipes. In theory, these recipes provide a declarative means of:
- Defining the process of performing some operations
- Defining the different paths to complete an operation
- The completed state on the system when the recipe has finished
An obvious, configuration-specific example would be a chef recipe to add a new httpd config file in /etc/httpd/sites.enabled.d/ or somewhere similar. You can use tactics similar to those you see in make to check whether you have a newer file and decide how to apply the change.
Defining the operations that need to happen, along with handling valid error cases, is non-trivial. When you add to that also defining what the final state should look like between processes running, file changes or even database updates, you have a ton of work to do with an incredible amount of room for error.
Docker, while it is not a configuration management tool, allows you to bundle your build with your configuration, thus separating some of the responsibility. This doesn’t preclude using chef as much as it limits it to configuring the system in which you will run the containers.
Putting this into more concrete terms, what we want is a cascading system that allows each level to encapsulate its responsibilities. By doing so, a declaration that some requirement has been met can allow the lower layer to report back a simple true/false.
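As a rough illustration of that cascading idea, here is a minimal Python sketch. The Requirement class and the three example layers are hypothetical names invented for this example; they are not part of Chef or Docker. Each layer only reports true or false, and a layer counts as met only when everything beneath it is met.

```python
class Requirement:
    """A named requirement that may depend on lower-level requirements."""

    def __init__(self, name, check, depends_on=()):
        self.name = name
        self.check = check              # callable returning True or False
        self.depends_on = list(depends_on)

    def is_met(self):
        # A layer is met only if everything beneath it is met too.
        return all(dep.is_met() for dep in self.depends_on) and self.check()


# Hypothetical layers; the lambdas stand in for real checks.
host = Requirement("host configured", lambda: True)
container = Requirement("app container running", lambda: True, [host])
service = Requirement("service healthy", lambda: False, [container])

print(container.is_met())  # True
print(service.is_met())    # False
```

The upper layers never look inside the lower ones; they only ask whether the requirement has been met.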
In a nutshell, use chef to configure the host that will run your processes. Use docker containers to run your process with the production configuration bundled in the container. By doing so, you take advantage of Chef and its community cookbooks while making configuration of your application encapsulated in your build step and the resulting container.
While this should work, there are still questions to consider. Chef can dynamically find configuration values when converging while a docker container’s filesystem is read only. While I don’t have a clear answer for this, my gut says it shouldn’t be that difficult to sort out in a reliable pattern. For example, chef could check out some tagged configuration from a git repo that gets mounted at /etc/$appname when running the container. Another option would be to use etcd to update the filesystem mounted in a container. In either case, the application uses the filesystem normally, while chef provides the dynamism when converging.
Another concern is that in order to use docker containers, it is important you have access to a docker registry. Fortunately, this is a relatively simple process. One downside is that there is not an OpenStack Swift backed v2 registry. The other option is to use docker hub and pay for more private containers. The containers should be registered as private because they include the production configuration.
It seems clear that a declarative system is valuable when configuring a host. Unfortunately, the reality is that the resources that are typically “declared” with Chef are too complex to maintain a completely declarative pattern. Using docker, a container can be tested reliably such that a running container is enough to consider its dependency met in the declared state.
If you’ve ever programmed any elisp (emacs lisp) you might have been frustrated and surprised by the lack of string handling functions. In Python, it is trivial to do things like:
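As an illustration, here are a few one-liners that use only built-in str methods. The specific strings are made up for this example.

```python
# Everyday string handling with built-in str methods.
title = "  hello, emacs lisp  ".strip().title()
words = "foo,bar,baz".split(",")
joined = "-".join(words)
padded = "42".zfill(5)

print(title)   # Hello, Emacs Lisp
print(joined)  # foo-bar-baz
print(padded)  # 00042
```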
The lack of string functions in elisp has been improved greatly by s.el, but why haven’t these sorts of functions existed in Emacs in the first place? Obviously, I don’t know the answer, but I do have a theory.
Elisp is (obviously) a LISP and LISPs are functional! One tenet of functional languages is the use of immutable data. While many would argue immutability is not something elisp is known for, when acting on a buffer, it is effectively immutable. So, rather than load some string into memory, mutate it and use it somewhere, my hunch is early emacs authors saw things differently. Instead, they considered the buffer the place to act on strings. When you call an elisp function it acts like a monad or a transaction where the underlying text is effectively locked. Rather than loading it into some data structure, you instead are given access to the editor primitives to literally “edit” the text as necessary. When the function exits, the buffer is then returned to the UI and user in its new state.
The benefits here are:
- You use the same actions the user uses to manipulate text
- You re-use the same memory and content the editor is using
While it feels confusing coming from other languages, if you think of all the tools available to edit text in Emacs, one could argue that string manipulation is not necessary.
Of course, my theory could be totally wrong, so who knows. Fortunately, there is s.el to help bridge the gap between editing buffers and manipulating text.
I wrote a tool to help sanely manage environment variables. Environment variables (env vars) are a great way to pass data to programs because they work practically everywhere with no setup. They are a lowest common denominator that almost all systems support all the way from dev to production.
The problem with env vars is that they can be sticky. You are in a shell (zsh, bash, fish, etc...) and you set an environment variable. It exists and is available to every command from then on. If an env var contains an important secret such as a cloud account key, you could silently delete production nodes by mistake. Someone else could use your computer and do the same thing, with or without malicious intent.
Another difficulty with env vars is that they are a global key value store. Writing small shell scripts to export environment variables can be error prone. Copying and pasting or commenting out env vars in order to configure a script is easy to screw up. The fact these env vars are long lasting only makes it more difficult to automate reliably.
Withenv tries to improve this situation by providing some helpful features:
- Set up the environment for each command without it leaking into your shell
- Organization of your environment via YAML files
- Cascading of your environment files in order to override specific values
- Debugging the environment variables
Here is how it works.
Let’s say we have a script that starts up some servers. It uses some environment variables to choose how many servers to spin up, what cloud account to use and what role to configure them with (via Chef or Ansible or Salt, etc.). The script isn’t important, so we’ll just assume make create does all the work.
Let’s organize our environment YAML files. We’ll create an envs folder that we can use to populate our environment. It will have some directories to help build up an environment.
    envs
    ├─ env
    │  ├─ dev
    │  └─ prod
    └─ roles
       ├─ app-foo
       └─ app-bar
Now we’ll add some YAML files. For example, let’s create a YAML file in envs/env/dev that connects to a development account.

    # envs/env/dev/rax_creds.yml
    ---
    - RACKSPACE:
      - USERNAME: eric
      - API_KEY: 02aloksjdfp;aoidjf;aosdijf
You’ll notice that we used a nested data structure as well as lists. Using lists ensures we get an explicit ordering. We could have used a normal dictionary as well if the order doesn’t matter. The nesting ensures that each child entry will use the correct prefix. For example, the YAML above is equivalent to the following bash script (the key is quoted because it contains semicolons).

    export RACKSPACE_USERNAME=eric
    export RACKSPACE_API_KEY='02aloksjdfp;aoidjf;aosdijf'
Now, let’s create another file for defining some object storage info.

    # envs/env/dev/cloud_storage.yml
    ---
    - STORAGE_BUCKET: devstore
    - STORAGE_PREFIX: $STORAGE_BUCKET/dev
You’ll notice that the STORAGE_PREFIX uses the value of the STORAGE_BUCKET. You can do dollar-prefixed replacements like you would in a shell. This includes any variables currently defined in your environment, such as $HOME or $USER, that are typically set. Also, by using a list (as defined by the -), we ensure that we apply the variables in order and the STORAGE_BUCKET exists for use within the STORAGE_PREFIX value.
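To make the prefixing and ordering rules concrete, here is a small Python sketch of the idea. It is not withenv’s actual implementation: the flatten function is a hypothetical helper, the YAML is written out as plain Python lists and dicts to keep the example dependency free, and the API key is a placeholder.

```python
import os
from string import Template


def flatten(node, prefix="", env=None):
    """Flatten nested lists/dicts into PREFIX_KEY=value pairs, in order."""
    env = dict(os.environ) if env is None else env
    items = node if isinstance(node, list) else [node]
    for item in items:
        for key, value in item.items():
            name = f"{prefix}_{key}" if prefix else key
            if isinstance(value, (list, dict)):
                flatten(value, name, env)
            else:
                # Expand $VAR references against everything defined so far.
                env[name] = Template(str(value)).safe_substitute(env)
    return env


# The two YAML files from above, as plain Python data.
creds = [{"RACKSPACE": [{"USERNAME": "eric"}, {"API_KEY": "example-key"}]}]
storage = [{"STORAGE_BUCKET": "devstore"},
           {"STORAGE_PREFIX": "$STORAGE_BUCKET/dev"}]

env = flatten(creds, env={})
env = flatten(storage, env=env)
print(env["RACKSPACE_USERNAME"])  # eric
print(env["STORAGE_PREFIX"])      # devstore/dev
```

Because the entries are processed in list order, STORAGE_BUCKET is already defined by the time STORAGE_PREFIX is expanded.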
With our environment YAML in place, we can now use the we command withenv provides in order to set up the environment before calling a command.
    $ we -e envs/common.yml -d envs/env/dev -d envs/roles/app-foo make create
The -e flag lets you point to a specific YAML file, while the -d flag points to a directory of YAML files. The ordering of the flags is important because the last entry will take precedence. In the command above, we might have configured common.yml with a personal dev account along with our defaults. The envs/env/dev/ folder contains a rax_creds.yml file that overrides the default cloud account with shared development account, leaving the other defaults alone.
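The precedence rule and the “no leaking into your shell” behaviour can be sketched in a few lines of Python. This is only an illustration under assumptions: run_with_env is a hypothetical helper, the variable names and values are invented, and withenv itself may work differently.

```python
import os
import subprocess
import sys


def run_with_env(layers, command):
    """Run command with layered variables; later layers take precedence."""
    env = dict(os.environ)
    for layer in layers:
        env.update(layer)
    # The variables exist only for the child process, not for the caller.
    return subprocess.run(command, env=env, capture_output=True, text=True)


# Stand-ins for common.yml and the envs/env/dev directory.
common = {"CLOUD_ACCOUNT": "personal-dev", "SERVER_COUNT": "2"}
dev = {"CLOUD_ACCOUNT": "shared-dev"}

child = [sys.executable, "-c",
         "import os; print(os.environ['CLOUD_ACCOUNT'], os.environ['SERVER_COUNT'])"]
result = run_with_env([common, dev], child)
print(result.stdout.strip())  # shared-dev 2
```

The dev layer overrides the cloud account while the server count from the common layer is left alone.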
The one limitation is that you cannot use the output from commands as a value to an env var. For example, the following wouldn’t work to set a directory path.
This might be fixed in the future, but at the moment it is not supported.
If you don’t pass any argument to the we command it will output the environment as a bash script using export to set variables.
Have you ever had code that needed to do some logging, but your logging configuration hadn’t been loaded? While it is a best practice to set up logging as early as possible, logging is still code that needs to be executed. The Python runtime will still do some setup (i.e. import everything) that MUST come before ANY code is executed, including your logging code.
One solution would be to jump through some hoops to make that code evaluated more lazily. For example, say you wanted to apply a decorator from some other package if it is installed. The first time the function is called, you could apply the decorator. This would get pretty complex pretty quickly.
    class LazyDecorator(object):

        def __init__(self, entry_point):
            self.entry_point = entry_point
            self.func = None

        def find_decorator(self):
            # find our decorator...
            raise NotImplementedError

        def __call__(self, f):
            self.original_func = f

            def lazy_wrapper(*args, **kw):
                # Apply the real decorator the first time the function is called.
                if not self.func:
                    self.func = self.find_decorator()(f)
                return self.func(*args, **kw)

            return lazy_wrapper

I haven’t tried the code above, but it does rub me the wrong way. The reason is that we’re jumping through hoops just to do some logging. Function calls are expensive in Python, which means if you decorated a ton of functions, the result could end up as a lot of overhead for a feature that only affects startup.
Instead, we can just buffer the log output until after we’ve loaded our logging config.
    import functools
    import logging


    class LazyLogger(object):

        LVLS = dict(
            debug=logging.DEBUG,
            info=logging.INFO,
            warning=logging.WARNING,
            error=logging.ERROR,
            critical=logging.CRITICAL,
            exception=logging.ERROR,
        )

        def __init__(self):
            self.messages = []

        def replay(self, logger=None):
            # Fall back to this module's logger if none is given.
            logger = logger or logging.getLogger(__name__)
            for level, msg, args, kw in self.messages:
                logger.log(level, msg, *args, **kw)

        __call__ = replay

        def capture(self, lvl, msg, *args, **kw):
            self.messages.append((lvl, msg, args, kw))

        def __getattr__(self, name):
            if name in self.LVLS:
                return functools.partial(self.capture, self.LVLS[name])
            raise AttributeError(name)

We can use this as our logging object in code that needs to log before logging has been configured. Then we can replay our log when it is appropriate by importing the logger and calling the replay method. We could even keep a registry of lazy loggers and call them all after configuring logging.
The benefit of this tactic is that you avoid adding runtime complexity, while supporting the same logging patterns at startup / import time.
One thing I’ve found when looking at DevOps is the adherence to specific tools. For example, if an organization uses chef, then it is expected that chef be responsible for all tasks. It is understandable to reuse knowledge gained in a system, but at the same time, all systems have pros and cons.
More importantly, each tool adheres to its own philosophies for how a system should be defined. Some are declarative while others are iterative and almost all systems define their own (clever at times) verbiage for what the different elements of a system should be.
What the DevOps ecosystem really needs is a low level suite of common primitives we can build off of. A set of DevOps System Calls, if you will, we can use to build higher order systems. The goal is to gain some guarantees that we can start to assume will hold.
For example, in Python, when I write tests, I assume the standard library functions such as open or the socket module work as expected. You don’t see tests such as:
    def test_open():
        with open('test_file.txt', 'w') as fh:
            fh.write('foo')
        assert open('test_file.txt').read() == 'foo'

We have similar expectations regarding much of the TCP/IP stack. We assume the bits are read correctly on the network hardware and passed to the OS, eventually landing in our program correctly. We take it for granted that the HTTP request becomes something like request.headers['Content-Type'] in our language of choice.
These assumptions let us consider our program in higher level terms that are portable across languages and systems. Every programmer understands what it means to open a file, connect to a database or make an HTTP request within our programs because our level of abstraction is reasonably high.
DevOps could use a similar standard and the implementation doesn’t matter. A machine might be created with Ansible, but configured via Chef. That part doesn’t matter. What matters is we can write simple code that manages our operations.
For example, let’s say I want to spin up a machine to run an app and a DB. Here is some pseudo code that might get the job done.

    machine = cloud.create(flavor=provider.FLAVOR_COMPUTE)
    machine.bootstrap()
    app = packages.find('my-app')
    machine.deploy(app)

This would compile to a suite of commands that trigger some DevOps tools to do the work necessary to build the machines. The configuration of what provider, available flavors, and repository locations would all live in OS level config like you see for your OS networking, auth and everything else in /etc.
The key is that we can assume the calls will work or throw an error. The process is encapsulated in such a way that we don’t need to think about the provider, setting API keys in an environment, bootstrapping the node for our configuration management and every other tiny detail that needs to be performed and validated in order to consider the “recipe” or “playbook” as done.
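To show what “work or throw an error” could look like from the caller’s side, here is a toy Python sketch. Everything in it is hypothetical: Cloud, Machine and ProvisionError are invented names, and the methods only record state where a real backend would drive tools such as Chef or Ansible.

```python
class ProvisionError(Exception):
    """Raised when a step cannot be completed."""


class Machine:
    def __init__(self, flavor):
        self.flavor = flavor
        self.bootstrapped = False
        self.deployed = []

    def bootstrap(self):
        # A real backend would run the configuration management tool here.
        self.bootstrapped = True

    def deploy(self, package):
        if not self.bootstrapped:
            raise ProvisionError("machine must be bootstrapped before deploy")
        self.deployed.append(package)


class Cloud:
    FLAVOR_COMPUTE = "compute"

    def create(self, flavor):
        if flavor != self.FLAVOR_COMPUTE:
            raise ProvisionError(f"unknown flavor: {flavor}")
        return Machine(flavor)


cloud = Cloud()
machine = cloud.create(flavor=Cloud.FLAVOR_COMPUTE)
machine.bootstrap()
machine.deploy("my-app")
print(machine.deployed)  # ['my-app']
```

The caller never sees providers, API keys or bootstrap details; each call either succeeds or raises ProvisionError.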
Obviously, this is not trivial. But, if we consider where our tools excel and begin the process of encapsulating the tools behind some higher order concepts, we can begin to create a glossary and shared expectations. The result is a true Cloud OS.
At work we use a proxy called repose in front of most services in order to make common tasks such as auth, rate limiting, etc. consistent. In Python, this type of functionality might also be accomplished via WSGI middleware, but by using a separate proxy, you get two benefits.
- The service can be written in any language that understands HTTP.
- The service gets to avoid many orthogonal concerns.
While the reasoning for repose makes a lot of sense, for someone not familiar with Java, it can be a little daunting to play with. Fortunately, the repose folks have provided some packages to make playing with repose pretty easy.
    $ mkdir repose-playground
    $ cd repose-playground

Now let’s create our Dockerfile:

    FROM ubuntu

    # Refresh the package lists first; the base image ships without them.
    RUN apt-get update && apt-get install -y wget
    RUN wget -O - http://repo.openrepose.org/debian/pubkey.gpg | apt-key add - && \
        echo "deb http://repo.openrepose.org/debian stable main" > /etc/apt/sources.list.d/openrepose.list
    RUN apt-get update && apt-get install -y \
        repose-valve \
        repose-filter-bundle \
        repose-extensions-filter-bundle

    CMD ["java", "-jar", "/usr/share/repose/repose-valve.jar"]
The next step will be to start up our container and grab the default config files. This makes it much easier to experiment since we have decent defaults.
    $ docker build -t repose-playground .
    $ mkdir etc
    $ docker run -it -v `pwd`/etc:/code repose-playground cp -r /etc/repose /code

Now that we have our config in ./etc/repose, we can try something out. Let’s change our default endpoint to point to a different website.

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- To configure Repose see: http://wiki.openrepose.org/display/REPOSE/Configuration -->
    <system-model xmlns="http://docs.openrepose.org/repose/system-model/v2.0">
      <repose-cluster id="repose">
        <nodes>
          <node id="repose_node1" hostname="localhost" http-port="8080"/>
        </nodes>
        <filters></filters>
        <services></services>
        <destinations>
          <!-- redirect to ionrock.org! -->
          <endpoint id="open_repose" protocol="http" hostname="ionrock.org" root-path="/" port="80" default="true"/>
        </destinations>
      </repose-cluster>
    </system-model>
Now we’ll run repose from our container, using our local config instead of the config in the container.
    $ docker run -it -v `pwd`/etc/repose:/etc/repose -p 8080:8080 repose-playground
If you’re using boot2docker, you can use boot2docker ip to find the IP of your VM.
    $ export REPOSE_HOST=`boot2docker ip`
    $ curl "http://$REPOSE_HOST:8080"
You should see the homepage HTML from ionrock.org!
Once you have repose running, you can leave it up and change the config as needed. Repose will periodically pick up any changes without restarting.
I’ve gone ahead and automated the steps in this repose-playground repo. While it can be tricky to get started with repose, especially if you’re not familiar with Java, it is worth taking a look at repose for implementing orthogonal requirements that make the essential application code more complex. This is especially true if you’re using a microservices model where the less code the better. Just run repose on the same node, proxying requests to your service, which only listens on localhost, and you’re good to go.
Lately, I’ve been playing around with Docker as I’ve moved back to OS X for development. At the same time, I’ve been getting acquainted with Chef in a reasonably complex production environment. As both systems have a decent level of overlap, it might be helpful to compare and contrast the different methodologies of these two deployment tactics.
What does Docker actually do?
Docker wraps up the container functionality built into the Linux kernel. Basically, it lets a process use the machine’s hardware in a very specific manner, using a predefined filesystem. When you use docker, it feels like starting up a tiny VM to run a process. But what really happens is that the container’s filesystem is used along with the hardware provided by the kernel in order to run the process in an isolated environment.
When you use Docker, you typically start from an “image”. The image is just an initial filesystem you’ll be starting from. From there, you might install some packages and add some files in order to run some process. When you are ready to run the process, you use docker run and it will get the filesystem ready and run the process using the computer’s hardware.
Where this differs from a VM is that you only start one process. While you might create a container that has installed Postgres, RabbitMQ and your own app, when you run docker run myimage myapp, no other processes are running. The container only provides the filesystem. It is up to the caller how the underlying hardware is accessed and utilized. This includes everything from the disk to the network.
What does a Provisioner do?
A provisioner, like Chef, configures a machine in a certain state. Like Docker, this means getting the file system sorted out, including installing packages, adding configuration, adding users, etc. A provisioner also can start processes on the machine as part of the provisioning process.
A provisioner usually starts from a known image. In this case, I’m using “image” in the more common VM context, where it is a snapshot of the OS. With that in mind, a provisioner doesn’t require a specific image, but rather, the set of required resources necessary to consider the provisioned machine as complete. For example, there is no reason you couldn’t use a provisioner to create user directories across a variety of unices, including OS X and the BSDs.
Different Deployment Strategies
The key difference when using Docker or a provisioner is the strategy used for deployment. How do you take your infrastructure and configure it to run your applications consistently?
Docker takes the approach of deploying containers. The concept of a container is that it is self contained. The OS doesn’t matter, assuming it supports docker. Your deployment then involves getting the container image and running the processes supported by the container.
From a development perspective, the deliverable artifact of the build process would be a container (or set of containers) to run your different processes. From there, you would configure your infrastructure accordingly, configuring the resources the processes can use at run time.
A provisioner takes a more generalized route. The provisioner configures the machine, therefore, it can use any number of deliverables to get your processes running. You can create system packages, programming language environments or even containers to get your system up and running.
The key difference from the devops perspective (the intersection of development and sysops) is that development within the constraints of the system must be coordinated with the provisioner. In other words, developers can’t simply choose some platform or application. All dependencies must be integrated into the provisioning system. A docker container, on the other hand, can be based on any image and use any resource available within the image’s filesystem.
What do you want to do?
The question of whether to use Docker or a provisioning system is not an either or proposition. If you choose to use Docker containers as your deployment artifact, the actual machines may still need to be configured. There are options that avoid the need to use a provisioning system, but generally, you may still use something like Chef to maintain and provision the servers that will be running your containers.
One vector to make a decision on what strategy to use is the level of consistency across your infrastructure. If you are fine with developers creating containers that may use different operating systems and tooling, docker is an excellent choice. If you have hard requirements as to how your OS should be configured, using a provisioning system might be better suited for you.
Another thing to consider is development resources. It can be a blessing and a curse to provision a development system, no matter what system you use. Your team might be more than happy to take on managing containers efficiently, while other teams would be better off leaving most system decisions to the provisioning system. The ecosystem surrounding each platform is another consideration.
I don’t imagine that docker (and containers generally) will completely supplant provisioning services. But, I do believe the model does aid in producing more consistent deployment artifacts over time. Testing a container locally is a reasonably reliable means of ensuring it should run in production. That said, containers require that many resources be configured (network, disk, etc.) in order to work correctly. This is a non-trivial step and making it work in development, especially when you consider devs using tools like boot2docker, can be a difficult process. It can be much easier to simply spin up a Vagrant VM with the necessary processes and be done with it. Fortunately, there are tools like docker compose and docker machine that seem to be addressing this shortcoming.
I’ve recently started working in an “agile” environment. There are stories, story points, a board, etc. I couldn’t tell you whether it was pure scrum or some other flavor of “agile”, but I can say it is definitely meant to be an “agile” system of software development. It is not perfect, but that is sort of the point, to roll with the punches of the real world and do your best with a reasonable amount of data.
Some folks argue agile is nonsense, but after using agile techniques, I’ve found that the detractors typically treat agile as a concrete set of rules rather than as a tool. No project management technique perfectly compartmentalizes all problems into easily solvable units. The best we can do is utilize techniques in order to improve our chances of success writing software.
There are two benefits that I’ve noticed of using an agile technique.
- You must communicate and record what is happening
- You may change things according to your needs
The requirement to communicate and record what’s happening is important because it forces developers to make information public. The process of writing a good story and updating the story with comments (I’m assuming some bug tracking software is being used) helps guard against problems going unnoticed. It also provides history that can be learned from. It holds people accountable.
Allowing change is also critical. Something like Scrum is extremely specific and detailed, yet as an organization, you have the option and privilege to adapt and change vanilla Scrum to your own requirements. For example, some organizations should use story points for estimating work and establishing velocity, while others would be better suited to more specific time estimates. Both estimation methods have their place and you can choose the method that best meets your needs.
When adopting an agile practice it is a good idea to try out the vanilla version. But, just like your software, you should iterate and try things to optimize the process for your needs. It is OK to create stories that establish specifications. It is OK to use 1-10 for estimating work. It is OK to write “stories” more like traditional bug reports.
It is not OK to skip the communication and recording of what is going on and it is not OK to ignore the needs of your organization in order to adhere to the tenets of your chosen agile methodology.
At work I’ve been using Vagrant for development. The thing that bothers me most about using Vagrant or any remote development machine is the disconnect it presents with local tools. You are forced to either log into the machine and run commands or jump through hoops to run the commands from the local machine, most often losing the file system context that makes the local tools possible.
What I mean by local tools are things like IDEs or build code that performs tasks on your repository. IDEs assume you are developing locally and expect a local executable for certain tasks in order to work. Build code can be platform specific, which is likely why you are using Vagrant in the first place.
My answer to this is rdo.
I have a similar project called xe that you can configure to sort out your path when in a specific project. For example, if I have a virtualenv, venv, in my cloned repo, I can use xe python to run the correct python without having to activate the virtualenv or manually include the path to the python executable.
rdo works in a similar way, the difference being that instead of adjusting the path, it configures the command to run on a remote machine, such as a Vagrant VM.
For example, let’s assume you have a Makefile in your project repo. You’ve written a bootstrap task that will download any system dependencies for your app.

    bootstrap:
    	sudo apt-get install -y python-pip python-lxml
Obviously if you are on OS X or RHEL, you don’t use apt for package management, and therefore use a Vagrant VM. Rather than having to ssh into the VM, you can use rdo.
The first step is to create a config file in your repo.
    [default]
    driver = vagrant
    directory = /vagrant

That assumes your Vagrantfile is mounting your repo at /vagrant. You can change it as needed.
From there you can use rdo to run commands.
$ rdo make bootstrap
That will compose a command to connect to the vagrant VM, cd to the correct directory and run your command.
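As a rough idea of what composing such a command involves, here is a Python sketch. It is not rdo’s actual code: the compose function is a hypothetical helper, the config is inlined as a string, and it assumes the vagrant driver maps onto vagrant ssh -c.

```python
import configparser
import shlex

CONFIG = """
[default]
driver = vagrant
directory = /vagrant
"""


def compose(argv, config_text=CONFIG):
    """Build the local command that runs argv inside the Vagrant VM."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    directory = parser["default"]["directory"]
    # Quote every piece so spaces and shell metacharacters survive the trip.
    remote = "cd {} && {}".format(
        shlex.quote(directory), " ".join(shlex.quote(a) for a in argv))
    return ["vagrant", "ssh", "-c", remote]


print(compose(["make", "bootstrap"]))
# ['vagrant', 'ssh', '-c', 'cd /vagrant && make bootstrap']
```

Using shlex.quote on each argument is one simple way to approach the escaping problem.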
I hope you give it a try and report back any issues. At the moment it is extremely basic in that it doesn’t do anything terribly smart as far as escaping goes. I hope to remedy that and to support generic ssh connections as well.