Ionrock Dot Org

by Eric Larson

My Weblog

Code Virtues

I read this article about 7 virtues of good code. A part of me agrees, especially because the suggestions seem very similar to the Zen of Python. At the same time, the virtues are a bit unclear in terms of what the real goals are.

The first two feel like no brainers to me. The first is that code should work. This is true on both practical and meta counts. On a practical note, broken code breaks builds. So, yes, working code is important. The second is unique as opposed to duplicated. Avoiding copy/paste is another good thing. Writing small functions to avoid side effects and the like is another +1 in my book.

The third is simplicity. I think this is generally a good thing, but one must be conscious of the fact that constantly moving code out can have some side effects as well. In languages like Java or C# you get an editor that lets you go to the class or method definition. In this case you have almost nothing constraining you from finding a piece of code. In Python though, the ability to immediately know where code is coming from drops considerably. This is especially true when you are talking about passing around functions/callables or objects that implement core types. Duck typing can really bite you in the end, so while moving code around to keep things simple is often a good idea, you really need to be aware that context is important. Moving code can take it out of context, which has dangers of its own.

The fourth is “Clear, as opposed to puzzling”. I’m going to call bs on this one. It is definitely true good code is clear. How you communicate what code is doing is a whole other story. The article points out naming variables and classes is the main aspect of clarity. Wrong! Just because you name some class effectively or have a relatively easy to read bit of code, you’re not done. Here is an example:

class GettextParser(object):
    def __init__(self, ...):

    def parse(self, ...):

# and later

gt = GettextParser()
catalogs = gt.parse(some_gettext_file)

organized_messages = dict([
    (catalog, [msg for msg in msgs if msg != ''])
    for catalog, msgs in catalogs
])

The thing that is fishy about the code is the why and how. First off, how do you know what sort of data structure you’re going to get back from the GettextParser? How do you even know it returns something? Secondly, why are you stripping out the empties in the msgs list? Hopefully some context would help provide the answers, but the fact is the naming doesn’t tell the whole story. The intersection of code, helpful comments and context all need to work together to gain the clarity that I’d describe as a virtue.
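As a rough sketch of what that extra context could look like, here is the same idea with a docstring and a single "why" comment. The return shape (a dict of catalog name to message list), the tab-separated toy input, and the reason for dropping empty strings are all assumptions invented for illustration; nothing above says what the real parser does.

```python
class GettextParser(object):
    """Toy stand-in for the parser above (hypothetical behavior)."""

    def parse(self, lines):
        """Return a dict mapping catalog name -> list of message strings.

        Assumed input: an iterable of "catalog<TAB>message" lines. A line
        with nothing after the tab yields an empty message.
        """
        catalogs = {}
        for line in lines:
            catalog, _, message = line.rstrip("\n").partition("\t")
            catalogs.setdefault(catalog, []).append(message)
        return catalogs


def organize_messages(catalogs):
    """Drop empty messages from every catalog."""
    # Assumed reason: an empty string marks an untranslated entry, and
    # callers only want messages that have real text.
    return {
        catalog: [msg for msg in msgs if msg != ""]
        for catalog, msgs in catalogs.items()
    }


sample = ["errors\tFile not found", "errors\t", "menus\tOpen"]
organized_messages = organize_messages(GettextParser().parse(sample))
print(organized_messages)
```

The names are the same as before, but now the docstring answers "what do I get back?" and the comment answers "why strip the empties?".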

And as for comments being unnecessary, if you’ve never gotten the lyric wrong to a song you enjoy without ever looking, then I’d say you’re a prime candidate for not using comments.

The fifth virtue deals with how easy it is to update some bit of code. Personally, I believe this is one of the most important metrics for how well code has been written. This is what makes writing code really hard. The sixth virtue deals with code being developed and I think that ends up being a technique used for achieving ease. The better your internal APIs and models are, the better adding and removing code is going to be.

The last virtue is being brief. I’m all for brevity, but it really is a side effect. If your code is communicative and there is helpful context supplemented by meaningful comments, then chances are the code will be a reasonable size. Huge functions definitely smell, but at the same time helpful comments might make the context and meaning clear. A long but clear list is better than black magic and a poorly organized set of methods.

Overall I think the virtues in the article are more or less agreeable, but the simplicity bothers me. I think most coders who deal with large code bases have their personal gripes, but they are probably just that, personal viewpoints on why something is wrong. The real issue is not so much the quality of the code. I would argue that should be measured by how effectively it succeeds solving the user’s problem. The real issue is communication. You need to communicate what the code does if you want good code. At the same time, if the code works perfectly or close to it, then it doesn’t really matter because no one needs to read it.


Posted Wed Mar 10 19:15:03 2010 by Eric Larson

Music vs. Programming

Last night I realized the biggest difference between programming and writing music. It is pretty obvious when you think about it. The difference is the audience. In both instances you are trying to communicate something. It is never easy and rarely does it work perfectly the first time. There is a ton of practice involved alongside a whole set of tools. Both have artistic aspects that always seem to be based on some mathematical concepts. But the biggest difference is really the audience.

On the programming side of things you have an emotionless set of components that epitomizes stupidity. Sure, you can make a computer seem smart, but really that hunk of silicon and connections is about as dumb as it gets. It is either on or off. As a programmer, your goal is to make that big hunk of simplicity do something interesting and/or useful. It is a huge challenge because you have to use language that is going to be read by other programmers. This means there is a secondary audience (think other folks in the band) who need to understand what the heck is going on. The biggest problem on all fronts is communication thanks to the horrible medium of 1s and 0s.

Music on the other hand is another extreme when it comes to audiences. The listener is the person you write for. Your goal is not to tell them what to do, but rather to relate to their emotions. In a way, you’re communicating how to actually feel! Likewise, you have a set of musicians or a band that you need to communicate how to communicate to the listener. If you’ve ever been in a studio where they are trying to get sounds you’ll quickly see the deconstruction of genre devolve into hand gestures accompanied by dancing and random vocal noises. Again, the problem is the medium of emotion that makes music such a horrible communication method.

Programming and music are at two edges of the spectrum of communication. When you are writing a song what you really are doing is communicating. Sometimes that communication is focused on those around you and other times it is meant for the masses. Likewise, when you program you write for millions of x86 processors as well as the other developers on your team. In both of these cases the language you are given provides a huge challenge.

Working in a team can be challenging enough, but it becomes even more difficult when the output is a really hard communication medium. What is interesting is that you do see similar arguments even though the mediums of music and code are so different. One would assume that in the coding world data and facts would reign supreme, but programmers can often become emotional about implementation details. Likewise, in a band there are tons of times where emotions run high, but the vast majority of the time it’s simple repetition and counting parts or measures.

The thing to take away though is that at the core is communication. If you are not communicating something to the audience, whether that is your laptop or a few hundred fans, you are not really doing anything. Effective communication is what spawns action. In fact action is really one of the best ways to communicate and is what leading by example really means. When you consider what you do in terms of communication I think it helps to gain a valuable perspective on what is really important. The focus shifts to others and that is always a good thing.


Posted Tue Mar 9 19:13:09 2010 by Eric Larson

Someone Please Get Rid of Paper

Printing has never been a large part of my life with computers. I remember when I was at Novell and we needed to print something in color, there was a puzzling amount of communication regarding how to get it done that usually ended up in asking a Mac user to print it. Sometimes eating your own dog food is painful.

Recently at work I’ve had to turn in some hard copy materials. The first step was to try and scan them. No dice there. Our printer / scanner wasn’t scanning (with OS X no less!), so I had to find another way. My next option was sending a fax. Yet another troublesome practice that forces me to go to Kinkos and spend money on a silly expense. Did I mention that these hard copy materials are partially an expense report? One might suggest that I simply “expense” the fax. The irony here is that I’d have to send another fax with the fax receipt, but hold for a second. That would be another fax!!! A truly vicious and far from worth it cycle.

I hope that someday we won’t need so much paper. I’m totally fine dealing with real deal hard copy for things like contracts or other documents that are valuable outside of the digital realm. But for things like receipts and expense reports it just seems silly. Credit card companies should let you pick a set of charges and send a link to a certified statement. They would be vouching that the expenses are real and the SSL cert makes it verified. Expense reports could be a thing of the past if there was a good way to verify and agree on digital documents.

There has always been the idea of the digital signature floating around. It never seems to catch on though. It might be a chicken/egg kind of problem where there aren’t enough users to validate a service, yet no one wants to jump to a service that is not really trusted. Some might argue the government should step in but they shouldn’t. Paper will continue to stick around until we can find a way to put our mark on digital documents that we can all agree on.

I think the real reason there hasn’t been a digital signature success story is no one knows how to do it the right way. If I knew what the right way was, rest assured I wouldn’t be blogging about it right now. No, I would be looking for funding and starting my search for the perfect island to buy. My guess is that the real answer is going to be a simple fiat system built around existing web services. Something like a seal that folks like PayPal, eBay, Amazon and the credit industry agree on for no other reason than to set the standard. On a technical level, there won’t be much beyond some libraries that handle the conversation with web entities. OAuth is probably a good example of transferring trust from one entity to another.

My hope is that we do start looking for ways to phase out our use of paper. Eliminating it is not a good idea because bits are too easy to flip. But if we can migrate some everyday requirements to something digital it would be helpful. It is pretty shameful that we still use so much paper and that it is made from trees. I had a teacher as a kid mention that part of the reason marijuana is illegal is because the logging industry sees a major threat in paper produced via marijuana plants. This argument seems fishy since we have hemp, but in either case if there is a renewable process to get paper, it seems like a good idea to go ahead and use it instead of using trees.

No matter what happens with our digital signatures and paper usage, I’ll most likely hear I should be a doctor thanks to the messiness of my signature (digital or otherwise).


Posted Mon Mar 8 16:49:58 2010 by Eric Larson

Responsible Musicians

Last night I had an epiphany. I was reading an article on Courtney Love in the latest Spin and it occurred to me that there is not the same sort of “trash the hotel room” mystique bands used to have. I’m sure plenty of bands are still living hard and doing their share of trashing rooms, but I’d say generally bands are more responsible, at least in terms of their image.

What changed was the web. The web opened the door to real time updates for everything and that includes the day to day lives of musicians. Back in the day (whenever that actually was), a band could make a stink and it would last for a year or more. It became an oral legend passed around between friends and most likely blown slightly out of proportion. Now, when an artist trashes a hotel room, there was probably a cleaning person at the hotel that got pictures on their cell phone and can tweet that they really just left out their uneaten dinner on the table.

In some ways it is too bad we’ve lost the mystique. The silent tortured artist is gone forever. Fans know it is all an act and they don’t really care. The fact the fans have changed is what interests me.

When I was a kid there was always a concern about selling out. As I’ve gotten older and had more responsibility, it became clear that selling out was extremely relative. Unless we bring in money, we can’t make as much music. Something has to be sold. As a kid, if I heard a band on a commercial or even on the radio, that band had sold out, period. Now, no one cares. Part of me is glad since we could use the money and most of the time licensing music is a great way to bring in income. At the same time, I wonder why kids don’t care anymore that the musicians they love are so easily selling their music to commercial entities.

My theory is that fans are smarter now. In addition to having constant up to the minute updates on their favorite artists, they also have a constant stream of music industry content. It is difficult not to see articles and blogs discussing the state of the music industry and what is happening. When an artist says to their fans they don’t make any money, fans believe them and give them money. They may still pirate the music, but then they make sure to buy a t-shirt and donate on Kickstarter. Others have pointed out the importance of super fans, but I think it is quickly becoming the norm. Fans see the real struggles of musicians such that accepting money is not selling out, but simply part of the job. A band can be respected for touring like crazy, putting out a ton of music, and creating relationships with fans. Selling out is no longer a function of money, but of time.

If you can’t tell, this is really exciting. The prospects of doing everything yourself never looked so good. Fans on a large scale are realizing they need to support artists and that doesn’t mean buying a record. Support is different from the economics of selling records. It costs more for all the parties involved, but it pays more dividends. In the DIY punk scene bands used to tour based on letters they received from kids who said they could put a show on. Now, instead of the 50 kids putting on shows, you have a website with a forum, Twitter and a blog to talk to thousands of people and create similar relationships.

None of this is easy, but it really never was. I’m really excited because it is apparent that there is a new generation approaching that will appreciate integrity. Not only that, integrity won’t be defined by a lack of money, but relationships. Ian MacKaye always did this for me. I saw him play in The Evens and bought a record afterward to support him. He said thank you in the most honest of ways and shook my hand, staring me straight in the eye and smiled. That guy has and probably always will define punk for me. It is about community and relationships. It is about working hard and having control over your life. There is a new era of punk as an ideal on the horizon.


Posted Fri Mar 5 16:14:00 2010 by Eric Larson

Getting Ready for SXSW

Once again SXSW is coming up and we’re getting ready. We’re in full on practice mode focusing on getting our songs extremely solid. In addition to playing, we managed to get ourselves roped into helping out with a day show. This was a first. The first step was getting sponsors. That managed to work itself out at the eleventh hour. Our next task is to put together a backline for the bands as many are from out of town and won’t be touring down.

Outside of the obvious logistics and preparation, we’re getting ready mentally. It is funny because last year we really had high expectations. We didn’t really think that we’d be walking out with mountains of record deals or massive hype, but it did feel as though some opportunities might come to fruition. This year, we had hoped to have solidified our plans for immediately after southby (as we say in the ATX). There was an assumption that we’d know what we were doing next in terms of a full length, which meant that SXSW was not a critical event for us. Life is never what you expect and things didn’t happen as we expected. The result is southby is once again an important event for us.

There is one thing in life that seems to be true. When you’re planning and expecting great things to happen, they probably won’t. The best example I have of this is how I met my wife. I was in high school and had gotten sick of wanting to be in a heavy relationship. The truth is I hadn’t been in any sort of serious relationship or anything close, but in my own childish high school way, my mind was made up to stop trying to find a soul mate and just get to know people. Not a week or so after making this internal decision did I meet my wife and the rest is history! I’ve been happily married for 10 years, have had a blast and learned more about life than I could ever imagine.

I think the lesson to take away when preparing for something important like SXSW is that it is really just a few days and it doesn’t define what or who you are. Believe it or not great opportunities happen all the time. It is not realistic to be maxed out all the time in terms of readiness, so be realistic. This is the cliche “it’s a marathon, not a sprint” idea. The thing about cliches is that they often are painfully true, even though you don’t really want them to be. This year for SXSW we’re expecting to play some shows and have a good time. Things have already been more exciting, so simply trying to be ready is the mindset going forward. I have a feeling it will all work out in the end.


Posted Wed Mar 3 17:48:18 2010 by Eric Larson

Slowly Succumbing to TDD

You always hear things about great athletes just doing the right thing. Golfers couldn’t really tell you how they manage to sync up their entire body in order to slap a ball down the fairway. They’ve just practiced like crazy and let their body do the execution. This kind of unknown benefit is something I’ve seen others bring up regarding TDD. At this point, I’m apt to believe them.

Writing tests makes you feel good. Tests help improve your confidence because there is something to point to that says, “see, this works”. What is interesting though is that the more you test, the more you depend on your tests. It is no longer a side story for the code, it is what makes the code valid. I made this transition recently and if I hadn’t noticed it on a personal level and acknowledged it, I’d continue to have a slim argument for TDD.

One facet that is key to TDD is that you write tests first. Up until this point, my assumption was that it was due to the benefits of taking the time. You preload your app with a client library that you need to make work. You have to think about the design from the user’s perspective, which is a Good Thing. The thing is, once you get started writing more tests and you’ve reached that point where the tests are what validate the code, it becomes apparent it is easier to change the tests before changing the code.

Changing the tests first means that you have something to fix instead of something to write. This is a subtle constraint that helps to improve focus. It also keeps you designing when you may very well be doing maintenance. Ironically, it is also a little faster. “TDD must be really slow” is a really popular argument, but I’m inclined to suggest it is wrong. It actually should make you much faster.

Writing code quickly requires complete ideas. If you know exactly what the library does and what you want to do with it along with a rough idea of the algorithm, code can really fly into the editor. Writing tests first almost forces complete ideas. Since it forces design as well, you often don’t need to rewrite or refactor as much code. With practice, the speed only increases.
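To make the test-first idea concrete, here is a minimal sketch using the standard library's unittest module. The slugify function and its expected behavior are hypothetical, invented purely for the example; the point is the order of work, where the test class is written first and the function is filled in only to make it pass.

```python
import re
import unittest


class SlugifyTest(unittest.TestCase):
    # Written first: these describe how a caller wants to use slugify().
    def test_lowercases_and_joins_words_with_dashes(self):
        self.assertEqual(slugify("Code Virtues"), "code-virtues")

    def test_drops_punctuation(self):
        self.assertEqual(slugify("Hello, World!"), "hello-world")


def slugify(title):
    """Written second: the least code that makes the tests above pass."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

When the behavior needs to change later, the same order applies: edit the expectation in SlugifyTest, watch it fail, then fix slugify.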

I’m really glad that I’ve taken the time to find an interest in testing. It is a hassle, but like exercise, you start to need it. Also, like exercise, it really improves the entire experience. It is a lot of fun to write code really fast. You feel confident and accomplished. Sooner or later you’ll find yourself churning out tons of great code and people might wonder how you got to be such a great coder. When they ask you can have a real answer beyond some buzz words.


Posted Tue Mar 2 17:30:49 2010 by Eric Larson

Coding Comments

I’ve heard plenty of times from coders that comments are pointless. Some folks are even avidly against them. In the past my own thoughts most definitely tended towards code telling the algorithmic story. Lately though, my perceptions have changed. My recent goals have been to understand what really makes code maintainable in the real world. There are plenty of blogs out there that can offer small examples of how something could be made more readable and assume that it negates the need for comments, but I don’t believe them at this point. It is not that the refactoring and naming isn’t important, but rather it is not enough in the real world. A blog post will get a few paragraphs to explain the context, give you the small snippet and walk you through the changes. At the end you feel like the code is so obvious it hurts. Unfortunately, the difference is that in the real world, that snippet is wrapped by a few hundred lines on either side. And, in Python, you have no clue what some argument’s type might be, so looking at the original object just doesn’t cut it. It just isn’t that easy.

What is ironic then is that those folks trying to make code more readable are also making a strong case for commenting. The blog usually does a really good job of explaining the context and then how that context applies to the actual code. Unfortunately we’ve all seen the massively commented code complete with paragraphs for getters and setters and found it extremely difficult to gain anything from the extreme verbosity that is probably incorrect anyway. Still, there is a need to provide context.

Lately my code has drawn me into the CherryPy internals a bit to see how things work. CherryPy has always been a framework (or library really) that just gets out of the way. It has done such a good job at this that I’ve been trying to use more of it! In looking at the code, there are some really consistent practices that have made understanding what is going on much easier. Here are my summations.

1. All code should be in chunks.

If you look at the CherryPy code you quickly notice that each piece of structure is actually created from smaller bits of structure defined by spaces and short comments. Most I would say are between 4 and 8 lines long. They also have a really short one line comment describing what is happening. This leads me to my next point.

2. Comment length should be relative to the size of the code.

If you have most of your code in 4 to 8 line segments, a one line comment is perfect. If you have something that takes up most of the page in your editor, the comments should either be a single larger comment before the algorithm/section or should follow the logic as it goes through the code.

I should mention that when I say “segments” these are not language constructs. This is simply putting a blank line between things.
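Here is a rough sketch of what that layout looks like in practice. The function and the log format are made up for illustration and are not taken from the CherryPy source; the only thing to notice is the shape: short segments, a blank line between them, and a one line comment on top of each.

```python
def summarize_requests(log_lines):
    """Return (total, errors, slowest_path) for simple access-log lines.

    Each line is assumed to look like "PATH STATUS SECONDS".
    """
    # Parse each line into (path, status, seconds).
    records = []
    for line in log_lines:
        path, status, seconds = line.split()
        records.append((path, int(status), float(seconds)))

    # Count responses in the 5xx range as errors.
    errors = 0
    for _, status, _ in records:
        if 500 <= status < 600:
            errors += 1

    # Find the path of the slowest request, if there were any.
    slowest_path = None
    if records:
        slowest_path = max(records, key=lambda record: record[2])[0]

    return len(records), errors, slowest_path


print(summarize_requests(["/ 200 0.12", "/feed 500 1.30", "/about 200 0.08"]))
```

Skimming only the three comments gives you the outline of the function before you read a single statement.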

Commenting in general is one of those things that is tough to do right. It is easy to skirt the issue, but I’m going to go ahead and say if you’re not commenting your code, you’re really doing it wrong. Comments share the context of your well written clean code. It is not your fault that code is meant for machines, not people, so don’t feel bad that your code is easier to read because you added a little comment. Likewise, when you commit your code, it is really publishing it. Someone else will read it at some point, so consider your audience.

I’d encourage anyone to take a look at the CherryPy source and try to understand it. It is surprisingly simple to follow along. It doesn’t make you fluent in the code, but it does let you see what is happening in a way that diving deeper is easier.

I should also mention these suggestions are somewhat limited to languages like Python and Ruby where the types are dynamic. In C#/Java for example you usually have a very clear picture of what is happening and I’d say only in rare cases do comments really help. Python doesn’t have this so adding a comment or two here and there allows the reader a context for digging deeper.


Posted Fri Feb 26 17:14:28 2010 by Eric Larson

Making Code More Testable

Lately I’ve been really making efforts to become a better tester. I’ve heard that tests make code better just about as much as I’ve heard that there isn’t really a difference. After working on a large code base for a while, my conclusion is that tests do help isolate code, which is a definite positive for maintainability.

The issue though is a catch 22 when you have a code base that managed to get a little out of control. This tendency is entirely normal of course. If you can stop a user from hurting with a few lines of code then that is a big win. Users are important and in the end the code is not.

That said, at some point there becomes the requirement to start really nailing down corner cases. This is where bugs occur in really sneaky parts of the code that you may never have considered or because of slow changes in use cases. It challenges you to reconsider seemingly stable code, in which case your tests are really the only way to have some confidence things are working correctly. If you originally wrote the code you probably feel more assured things can work, but when the code is foreign, you need the tests to verify in real terms what is happening.

The question then is how the heck to tear apart the old code to improve the testability? To be perfectly frank, I have no idea. Part of the problem is simply finding an easy way to iterate on the problem. It is easy to bite off more than you can chew, so there needs to be ways to roll things back. Likewise, it is also difficult to know you’re making improvements since you don’t have tests there to reassure yourself things didn’t get way worse in some unknown manner.

There is obviously not a single answer to this, but I would like to find some resources that define some techniques or at the very least ideas. For example, DVCS does help because in theory breaking off a “cleanup” branch is trivial. Unfortunately, that is only the beginning since you also have to consider any sort of environment that needs to be set up.

No matter what techniques are out there it can’t hurt to dive in. If you fail, remove the branch and try again. Eventually you’ll hit on something. There have been many times where I started learning some language or library only to become extremely frustrated. Eventually when I revisit it, things are clearer and it all makes sense. This is the same kind of thing, so step one is just doing it even if you fail.


Posted Wed Feb 24 16:40:02 2010 by Eric Larson

Disliking the GIL in Python

One theme that I took away from PyCon this year was how truly bad the GIL is. For those that don’t know, the GIL is the Global Interpreter Lock. When you are programming and you want to do two things at the same time, you usually use something like a thread. Threads can access resources, so often times it is important to lock them while you use them. This prevents the situation where things get overwritten or corrupted due to two different threads accessing the same resources.
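As a small sketch of the locking described above, using only the standard threading module: several threads bump a shared counter, and a lock makes sure only one of them touches it at a time. The Counter class and count_with_threads helper are hypothetical names made up for the example.

```python
import threading


class Counter(object):
    """A shared counter that is safe to increment from many threads."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1


def count_with_threads(thread_count, increments_per_thread):
    counter = Counter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    # Start every thread, then wait for all of them to finish.
    threads = [threading.Thread(target=work) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    return counter.value


print(count_with_threads(4, 10000))
```

With the lock in place the total always equals thread_count times increments_per_thread. This is a lock you take yourself around your own data; the GIL is a separate lock the interpreter takes on your behalf.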

When you have one processor the result is that while things are presumably threaded, the operating system is actually just acting as though it can do two things at one time. So, in Python, when you are using threads on a one processor or single core machine, your threads are effectively just as good as any other language that uses the operating system’s threads. The problem is that when you move to more than one core, the operating system allows you to use more than one core, while Python’s GIL continues to mimic parallel operations.

The reality is that most of the time it really doesn’t matter. I’ve avoided using threads in my design for the majority of my programming career and my findings are that it was a good idea to do so. But, now that machines are getting more cores, avoiding parallel operations is in fact more limiting.

Again, most people won’t need it. Python is pretty quick and there are instances where Python’s threads really can help. The problem is not where we are now, but where we want to be in the future. There was definitely an asynchronous theme at PyCon as well that effectively addresses some of the problems with threads, but it considers it in terms of connections. When programs communicate there is a limit to how many connections can be made. Threads are one option where each thread is a connection, but threads have a cost in terms of memory and operating system resources that make using 10k (for example) threads impossible. The async model instead changes the concepts around, so instead of having a thread constantly listening, you utilize the operating system to notify you of events. This means having 10k connections is trivial because each connection is truly efficient in that it won’t do any work without being notified.
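To give a feel for the async model, here is a minimal sketch using asyncio, a standard library module that was added to Python well after this post was written, so treat it as a stand-in for the event loops of the time. The connections are simulated: asyncio.sleep plays the role of waiting on a socket, and the function names are made up for the example.

```python
import asyncio


async def handle_connection(connection_id):
    # asyncio.sleep stands in for waiting on a socket; while this
    # coroutine waits, it holds no thread and does no work.
    await asyncio.sleep(0.01)
    return connection_id


async def serve(connection_count):
    # One event loop in one thread drives every "connection".
    tasks = [handle_connection(i) for i in range(connection_count)]
    return await asyncio.gather(*tasks)


results = asyncio.run(serve(10000))
print(len(results))
```

Ten thousand waiting coroutines are cheap here in a way ten thousand threads would not be, because each one is only resumed when the event loop is told its wait is over.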

While async helps with the aspect of connections, it doesn’t help at all in terms of utilization of the hardware. The current option then is to simply use more than one process and make sure your state is not held within your applications. This is a good practice generally with both threads and async. Unfortunately, not all applications are focused on connections like web applications are.

The whole issue that I personally have with the GIL is not so much the actual implications but rather that we need new models and that even if we get them, without the GIL issues fixed, they won’t help as much as they could. The async model is pretty tough. The libraries don’t usually work with it because they can block the main loop. This makes them rather impractical then. Threading is becoming pretty well understood but it is still difficult because threading implies a shared state that is a bad idea. There are still some required changes in how we program that are needed to make threading more scalable from a programmer perspective. The design patterns are coming along, but again, with the GIL, it just doesn’t make that much sense when you’re talking about something like a 16 core machine.

Even though I personally don’t have a specific use case that is a problem today, it shouldn’t really matter. Python is a great language with a ton of “batteries” included along with a battery store close by. If we can’t figure out a way to make the GIL go away or work across cores, then we really limit the gains we can get from the history in the Python community. It limits rather deeply where Python can be used.

One solution I see is taking something like the multiprocessing module and using that as the basis for implementing higher level concurrency models. This already works pretty well on Linux but there are some definite corner cases that cause problems. That said, it seems reasonable that a higher level library could restrict the use to a certain model in such a way that the corner cases are never hit.
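A very small sketch of that idea, under some loud assumptions: parallel_map is a hypothetical helper, not an existing library function, and it pins the "fork" start method, which only exists on Unix-like systems. The restriction it imposes is the whole point: workers share no state, and the function you hand it must be picklable, which in practice means defined at the top level of a module.

```python
import math
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def parallel_map(func, items, workers=2):
    """Apply func to each item in separate processes, keeping order.

    The restricted model: func must be a top-level, picklable function,
    and no state is shared between workers. Results are simply returned.
    """
    # Assumption: a Unix-like system where the "fork" start method exists.
    context = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=workers, mp_context=context) as pool:
        return list(pool.map(func, items))


if __name__ == "__main__":
    # math.factorial stands in for real CPU-bound work.
    print(parallel_map(math.factorial, [5, 10, 15]))
```

Because each worker is its own process with its own interpreter, the GIL in one does not hold up the others, at the cost of pickling arguments and results across the process boundary.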


Posted Tue Feb 23 19:18:03 2010 by Eric Larson

A Few PyCon Thoughts

I’m almost done with the first day of PyCon and it has been a really good time. One thing that is kind of nice is that I don’t have a really strong topical focus. It has been allowing me to attend talks that fall on the edges of my interests, which means topics that don’t seem to reflect directly on my day to day work. Fortunately, these kinds of talks often have an impact through exposure to different ideas or technology. A great example was an open space on MongoDB. They were talking about some pretty serious low level stuff, but at the same time, it gave some great overviews of how things work.

Culturally it is a pretty interesting experience. This past year has pushed me to split my focus on music in addition to technology. It has been a good thing to get a better balanced life and appreciate the time I do spend on technology. The converse is that the overwhelming geekiness of the crowd is rather stark. This is far from a bad thing. It is really inspiring to see people who are entirely engulfed in not just technology but the community surrounding it. While the initial social interactions can be tough for all involved (myself definitely included) once the commonality of Python comes to the forefront, lasting friendships quickly develop.

It is definitely a good time to spend the days with such an interesting and intelligent group of people. There are immensely smart people all in one place and while there are plenty of quirks, there is a mass of kindness that is rather refreshing. I’m really looking forward to tomorrow!


Posted Fri Feb 19 23:46:13 2010 by Eric Larson
Created using Python, jQuery and Emacs