Ionrock Dot Org

by Eric Larson

My Weblog

More Fun with Mercurial

The other day I had something of an issue at work. I was working on retooling our testing environment when there was a need to provide a fix for something in production. I couldn't reproduce the issue, so I decided to add some extra logging to help try and gather some data on the issue. With the code in place, it became clear that I didn't know how I was going to move those changes to the production repo while keeping my other work safe.

After looking into the issue further, I thought rebase might be helpful. Rebase is a great extension, but it wasn't going to provide a fix (that I know of). The rebase extension allows you to choose the order of two existing heads. The classic example is when you are working on a feature, you pull to get the most recent changes and you want to upgrade to the latest from the remote repo, while keeping your changes "in front" or after the pulled code in the history. My description might be a little off, but it was how I understood the process.

In my situation the scenario was as if I already rebased and did so incorrectly. Fortunately, the transplant extension came to the rescue. What I wanted to do was effectively recreate my local repo and correct the order of commits so my unfinished work was "in front of" my production fix. To put things plainly, I had a sequence of commits 'ACB' where 'C' was unfinished, so I wanted to move it to the front and have 'ABC'. What I gathered is it is not really possible to reorder the commits since the time is always attached to the changeset. But, I was able to push my production fixes without having to push my working changes, which was good enough for me.

I started by cloning the remote repo. Then I transplanted the production changesets I needed from my local repo and pushed them back to the remote repo. After that I transplanted the rest of my changes to the new clone. Just for good measure I pulled my new remote changes into my local repo and merged to see what would happen in terms of history. It actually made it clear that things had been transplanted at different points in time and reordered. Here is what it looked like:

> ls
local
> hg clone ssh://user@remote/hg/repo remote
> ls
local remote
> cd remote
> hg transplant -s ../local
... interactively choose changesets to apply ...
> hg transplant --continue # if any merges failed
> hg push
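To make the reordering concrete, here is a toy model of the steps above. It uses plain Python lists, not Mercurial's API, and the `transplant` function and the "base" changeset are made up for illustration. It only shows why picking changesets onto a fresh clone turns 'ACB' into 'ABC'.

```python
def transplant(target, source, wanted):
    """Copy the wanted changesets from source onto the end of target.

    Changesets are applied in the order they appear in source, and
    anything target already contains is skipped.
    """
    for changeset in source:
        if changeset in wanted and changeset not in target:
            target.append(changeset)
    return target


# Hypothetical histories: "base" stands for what the remote already has.
local = ["base", "A", "C", "B"]       # C is the unfinished work
remote = ["base"]

clone = list(remote)                  # like `hg clone`
transplant(clone, local, {"A", "B"})  # pick only the production fixes
remote = list(clone)                  # like `hg push`
transplant(clone, local, {"C"})       # bring the unfinished work along

print(remote)  # ['base', 'A', 'B']
print(clone)   # ['base', 'A', 'B', 'C']
```

The remote only ever sees the production fixes, while the clone ends up with the unfinished work after them.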

Being able to push my production changes without having to also push my working changes means the person doing the release can merge with default without having to exclude my working changesets. This doesn't seem like a huge win, but I think it is a pretty helpful way to avoid someone working with changesets they didn't write themselves. It seems like a decent workflow as well. Keeping your own "production" or "pusher" repo as an intermediary for a remote production repo can be a helpful way of making sure you introduce atomicity while still keeping your changes in VC. I've found the more commit points you create, the easier it is to see where things might have gone wrong. The downside is that your changes might become interspersed with other changes. Rebase definitely helps in this case, and I believe using a local production repo for pushing also provides another means of keeping merges simple and obvious.

Posted Fri Jun 5 18:10:16 2009 by Eric Larson

TwitrView: A Small Announcement

When I first started using twitter for Ume, I thought it might be kind of cool if people did interviews over twitter. The character limit is an interesting constraint, there is a real time aspect, and the person getting interviewed can review an answer. For myself, I enjoy handling email interviews because you have a chance to edit your thoughts, which means readers get a more interesting and informative response. With that in mind I set out to write a crawler of sorts to monitor a set of twitter accounts and compile an interview between some people.

The idea for a crawler type of application also partly stemmed from my desire to get more comfortable with threads. In addition to my own perceived concurrency issues, I started playing with Berkeley DB, and later, Tokyo Cabinet. In the end though, simplicity won and I dropped the persistence and just do everything on demand.

The result is TwtrView! All it does is pull the last 50 tweets from the users you enter and see what messages include @replies to any of those users. This makes it possible to see an interview as well as find conversations folks might be having. It actually has been somewhat useful because when I see a conversation someone might be having, I enter the two usernames and I can see where things came from.
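As a rough sketch of that filtering step (not the actual TwtrView code), the idea can be written in a few lines of Python. The tweet dictionaries, the usernames, and the `conversation` function are all made up for illustration, and no Twitter API call is involved.

```python
import re

# Matches @username mentions inside a tweet's text.
MENTION = re.compile(r"@(\w+)")


def conversation(tweets, users):
    """Return tweets by one of `users` that mention any of `users`, oldest first."""
    wanted = {u.lower() for u in users}
    picked = []
    for tweet in tweets:
        if tweet["user"].lower() not in wanted:
            continue
        mentioned = {m.lower() for m in MENTION.findall(tweet["text"])}
        if mentioned & wanted:
            picked.append(tweet)
    return sorted(picked, key=lambda tweet: tweet["id"])


# Hypothetical data standing in for the last tweets of each user.
tweets = [
    {"id": 3, "user": "bob", "text": "@alice mostly Python these days"},
    {"id": 1, "user": "alice", "text": "@bob what are you hacking on?"},
    {"id": 2, "user": "bob", "text": "lunch time"},
    {"id": 4, "user": "carol", "text": "@alice hi"},
]

thread = conversation(tweets, ["alice", "bob"])
print([tweet["id"] for tweet in thread])  # [1, 3]
```

Sorting by id assumes ids increase over time, which is an assumption of this sketch.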

There are definitely some limits. I'm using one twitter account for the API calls, so if something doesn't work, it might have had too many requests. If people use this, then I'm sure OAuth will become much higher on my list of things to do. Also, the requests are made from my server, so it is very much network bound in terms of processing.

I hope someone else might find it useful. If you have any issues feel free to send me an email or comment below.

Posted Tue Jun 2 16:41:22 2009 by Eric Larson

Seriously Mutable

I ran into this today and wanted to write it down somewhere both for myself and posterity (ok, mostly for myself).

As you may know, python dictionaries are mutable. This means you can do things like this:

x = {'foo':'bar'}
def change_it(d, new_value):
    d['foo'] = new_value
change_it(x, 'baz')
print x # should print {'foo': 'baz'}

As you can see, the dictionary x refers to can be changed by any code that has a reference to it. With that in mind what I was doing was basically this:

x = {'foo': 'bar'}
def change_it(new_value):
    y = x['foo']
    y = new_value
change_it('baz')
print x # we get {'foo': 'bar'} as expected

So, this might give you the impression that when you access a value in a dictionary that value is not a reference to the original. But, what about this?

x = {'foo': {1: 'one'}}
def change_it(new_value):
    y = x['foo']
    y.update(new_value)
change_it({2: 'two'})
print x     # do you expect {'foo': {1: 'one'}} ?

You will actually get:

{'foo': {1: 'one', 2: 'two'}}

This happens because the reference you get back points to a mutable object, so in the above code, the variable "y" points to the same mutable dict that "x['foo']" points to. We can test this with this bit of code:

x = {'foo': {1: 'one'}}
y = x['foo']
print y is x['foo'] # True
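The difference between the two earlier examples comes down to rebinding a name versus mutating the object it points to. Here is a small sketch of both in one place, written with print() so it also runs on Python 3:

```python
x = {"foo": {1: "one"}}
y = x["foo"]

# Mutating through y changes the one dict that both names point to.
y[2] = "two"
print(x)  # {'foo': {1: 'one', 2: 'two'}}

# Assigning to y only rebinds the name y; x["foo"] is left alone.
y = {3: "three"}
print(y is x["foo"])  # False
print(x)  # {'foo': {1: 'one', 2: 'two'}}
```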

When you consider that Python never copies an object on assignment or when passing it to a function, and only hands around references to it, this makes sense. At the same time, it can be a little sneaky if you're not thinking about it since it is usually pretty simple to just use code like "y = x['foo']" and expect it to act like a copy.
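If a copy is what you actually want, you have to ask for one. A minimal sketch using the standard library's copy module follows; the "tags" key is just made-up data to show where a shallow copy stops.

```python
import copy

x = {"foo": {1: "one", "tags": ["a"]}}

# A shallow copy is a new dict, so adding keys to it leaves x alone...
shallow = copy.copy(x["foo"])       # dict(x["foo"]) does the same thing
shallow[2] = "two"

# ...but the values inside it are still shared with the original.
shallow["tags"].append("b")

# A deep copy duplicates the nested objects as well.
deep = copy.deepcopy(x["foo"])
deep["tags"].append("c")

print(x)  # {'foo': {1: 'one', 'tags': ['a', 'b']}}
```

The shallow copy is usually enough for a flat dictionary; the deep copy is the safer choice once the values are themselves mutable.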

Thanks to a few folks in #cherrypy on oftc.net for clearing up my explanation on this.

Posted Thu Jun 11 07:58:00 2009 by Eric Larson

When It's Time To Start Over

I recently read about a campaign to fix Outlook and it got me thinking. Why hasn't Microsoft considered starting fresh on any of their software? My personal choice would be a reset on Internet Explorer. The browser wars are over and Microsoft won. It's too bad the web lost. At work we've been considering trying out Flex for some more intensive UI elements and it occurred to me that the consistency alone would probably be worth it if it meant losing the hacky tests we have to keep up in order to feel confident releasing JavaScript-intensive code on multiple platforms.

Random web developer wish lists aside, it is tough to recognize when some piece of software should be rewritten from scratch. Weighing the pros and cons rarely provides a true measure of whether you'll be successful or not. Yet, even without a good means of measuring needs, it is clear that a rewrite can be very helpful. I think Apple provides an excellent model for rewrites. They have rewritten their operating system, a web browser, and an office suite, to great success (in my opinion at least).

One theme in all these rewrites has been the inclusion of other pieces of software. WebKit (derived from KHTML) has been critical to providing a new suite of tools simply by making it possible to pay attention to other aspects of the applications. Likewise, OS X utilizing FreeBSD was not necessarily innovative, but rather an effective means of raising the level of abstraction.

Like programming languages, abstracting away low level detail enables thinking at a level closer to how people think. In programming, this allows programmers to reduce the complexity. When you can safely say "x = 5" without having to think about memory management or the scope of the variable, you create space in your available mind for other details. In the same way, introducing a well established library or piece of code can help take an application to the next level by moving the innovations closer to the user.

Going back to Microsoft, I wonder why they have not really made efforts to capitalise on these kinds of changes in the software landscape. I'm sure lawyers would be involved in the equation, but as a participant in the software landscape, it is clear it hurts users. It is always difficult to standardise on things as competition and differences are where innovation occurs, but just like programming languages, when there is a low level standard to build on, using something else just adds complexity.

I think the time to start over is never clear. One sign that it might be worth it though is a low level library or system that might free up your resources for features closer to the user. At the end of the day software is there for people to use, so anything you can do to improve the experience for the person using it is worth more than keeping a code base out of convenience.

Posted Thu Jun 25 16:34:30 2009 by Eric Larson
Created using Python, jQuery and Emacs