Ionrock Dot Org

by Eric Larson

My Weblog

Mercurial Feeds

I started using Mercurial for projects and I am really excited about it. One feature that I think is rather addicting is the RSS feed for commits. This simply rocks! I find that my commit messages keep something of a log of the project. I place TODO items and questions I might have on my decisions and generally give an overview of what I worked on. It really feels like a blog.

I am sure there are some downsides to doing this. I suppose in a huge project, if tons of contributors were double dipping by using a Mercurial repository as a blog, things could get sketchy. At this point though, it really makes communication about what is going on in a project much simpler and easier to manage.

The next step is to hack together an Atom feed instead of using RSS…

Posted Thu Aug 30 06:58:51 2007 by Eric Larson

Writing XML in Code

I have been working with Sylvain recently on Amplee and it has been interesting to see how each of us uses Amara. I have always tried to use it like an object as much as possible. What you end up seeing is things like this:

if hasattr(doc.entry, 'title'):
    title = str(doc.entry.title)

Amplee takes a slightly different approach that doesn’t feel as obvious, but is a little safer. I actually like this approach because it feels closer to using a Python dictionary rather than an object.

title = doc.entry.xml_child_elements.get('title')


This is one line, safer for when you don’t actually have the element and doesn’t use the ugly hasattr function. Chalk one up for Sylvain!
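The same trade-off shows up with plain Python objects. Here is a minimal sketch of the two styles side by side. The `Entry` class is a hypothetical stand-in I made up for illustration, not an Amara node, and `vars(...).get(...)` only plays the role that `xml_child_elements.get` plays above:

```python
class Entry(object):
    """Hypothetical stand-in for a bound XML element; this is not Amara's API."""

    def __init__(self, **children):
        # Store each child as a plain attribute, e.g. entry.title
        self.__dict__.update(children)


def title_with_hasattr(entry):
    # Object style: test for the attribute first, then read it.
    if hasattr(entry, 'title'):
        return str(entry.title)
    return None


def title_with_lookup(entry):
    # Dictionary style: one lookup, None when the child is missing.
    return vars(entry).get('title')
```

Both functions return None for an entry without a title, but the second one never has to ask the question twice.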

Another difference I have noticed is when we write an XML document with Amara. These differences have no actual impact on the code, but I think it is interesting nonetheless.

col.xml_append(doc.xml_create_element(qname(u"title", ATOM10_PREFIX),
                                                       ns=ATOM10_NS, attributes={u'type': u'text'},
                                                       content=collection.title))


The qname function is different to me and the lines are a bit long for my taste.

I would do something like this:

col.xml_append(
    doc.xml_create_element(
        u'title', ATOM10_PREFIX,
        attributes={ u'type': u'text'},
        content=collection.title
    )
)


I am not sure this is clearer, but it does give the code a recognizable pattern. In the places where I add elements you can see the pattern, which is somewhat helpful. The downside is that I think it kind of looks like a C-based language, which isn’t as cool ;)

There are other differences as well. I think some of it stems from our backgrounds with different XML technologies. I never really wrapped my head around the DOM in a way that made me effective with it. My path would usually lead to an object wrapper. Sylvain, though, has spent a good deal of time working with the DOM and understands how to use it more fluently. I also have been making a huge effort to keep lines under 80 columns, which is probably why some of my code tends to be longer vertically. It is always interesting to contrast your ideas and code with someone else’s, so this has been a fun exercise and a good learning experience.

Posted Thu Aug 23 15:38:49 2007 by Eric Larson

ampleeblog - Added an index for the @link[rel='alternate']

Added an index for the @link[rel='alternate']

Started using a shelve as well since there seems to be some reload issue with paste or something where, if the process reloads, the lookups from the indexes fail and provide incorrect member_ids. This in turn makes creating a feed impossible since it tries to sort on a None member.

Posted Tue Aug 21 00:55:26 2007 by Eric Larson

ampleeblog - Verified the index is working and added a permalink method to the

Verified the index is working and added a permalink method to the members

In BC we made a method in the store to take a date and slug and get the entry via a @link[rel=alternate]. This worked well and I am trying to get a similar functionality with amplee.

Posted Sat Aug 18 22:03:25 2007 by Eric Larson

URLs Rule the World

So, Bill de hÓra has been blogging quite a bit on URLs and their design. What is interesting to me is that, in the context of today’s entry, he reveals that JSPs don’t have something like Routes. This interests me simply because I have written and used quite a few dispatchers over the past few months, and the common theme has been that it is a simple problem to solve.

You can easily dispatch on the typical:

/{controller}/{method}/{*args}

without much trouble. In Bright Content, we use regex to get dates and create essentially an ID. I have also done slightly more complex patterns such as

/{model}/{id}/{action}/{args}
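To show how little it takes, here is a rough sketch of the first pattern, /{controller}/{method}/{*args}. It is not the dispatcher from Bright Content or any framework; `dispatch` and the `Blog` controller are names I made up for the example:

```python
def dispatch(path, controllers):
    """Split /{controller}/{method}/{*args} and call the matching method."""
    tokens = [token for token in path.split('/') if token]
    if len(tokens) < 2:
        raise LookupError('expected /controller/method, got %r' % path)
    name, method, args = tokens[0], tokens[1], tokens[2:]
    controller = controllers.get(name)
    # Never expose private or special methods through the URL.
    handler = None if method.startswith('_') else getattr(controller, method, None)
    if not callable(handler):
        raise LookupError('no handler for %r' % path)
    return handler(*args)


class Blog(object):
    """Hypothetical controller, used only for illustration."""

    def view(self, year, slug):
        return 'entry %s from %s' % (slug, year)
```

Calling `dispatch('/blog/view/2007/hello', {'blog': Blog()})` ends up in `Blog.view('2007', 'hello')`. The second pattern is the same idea with the tokens read in a different order.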

What is interesting is that this method of viewing URLs is really different from something like mod_rewrite, because there is an assumption made that the URL contains important keys to the resource. I am not sure if this is conventional wisdom regarding cool URIs, but it seems like it should be. From the Java/C# perspective, I can see why someone could essentially disregard the content of the URL in favor of GET parameters. The history behind the tools and frameworks has been to use GET parameters instead of utilizing the URL tokens, which makes sense because there is less ambiguity. For example, if I have a URL:

/blog/2007/3/14/Some_Slugged_Up_Title/

The question arises as to what each token (that which is split by the “/” character) represents. In the above example, the pattern is essentially

/{controller}/{year}/{month}/{day}/{slug}

But the issue is that the year, month and day could be considered one parameter. The slug then could also be different based on an extension, meaning that in the case above, I have a trailing slash, yet if I added a “.atom”, it might serve a different content type.
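One way to pin those decisions down is a single regular expression with named groups. This is only a sketch of the idea, not the regex Bright Content uses. The names `ENTRY_URL` and `parse_entry_url` are mine, and falling back to 'html' when there is no extension is an assumption made for the example:

```python
import re

ENTRY_URL = re.compile(
    r'^/(?P<controller>[^/]+)'
    r'/(?P<year>\d{4})/(?P<month>\d{1,2})/(?P<day>\d{1,2})'
    r'/(?P<slug>[^/.]+)'
    r'(?:\.(?P<ext>[A-Za-z0-9]+)|/)?$'  # either ".atom" style or a trailing slash
)


def parse_entry_url(path):
    """Return the parts of an entry URL, with the date as one parameter."""
    match = ENTRY_URL.match(path)
    if match is None:
        return None
    parts = match.groupdict()
    return {
        'controller': parts['controller'],
        # Year, month and day are collapsed into a single value.
        'date': (int(parts['year']), int(parts['month']), int(parts['day'])),
        'slug': parts['slug'],
        # Assumption: no extension means the default HTML view.
        'format': parts['ext'] or 'html',
    }
```

With this, `/blog/2007/3/14/Some_Slugged_Up_Title/` and `/blog/2007/3/14/Some_Slugged_Up_Title.atom` resolve to the same date and slug and differ only in the format.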

These really aren’t hard problems to deal with, but I can imagine that for some, it is a new way of thinking. The more I have thought about the problem, the more I have found that the issues run very deep and there is little to stop you from making bad decisions that can really hurt an application’s design. It is always interesting to see issues that seem simple become complicated by going down one path and then finding a new direction that starts the process all over again.

Posted Thu Aug 16 16:47:50 2007 by Eric Larson

Creating a Resolver for Transformations

The other day I spent a bit of time getting a resolver working for 4Suite and my XSLTemplates library. I am pretty sure the concept of a resolver is in the XSLT spec, but it might not be, as the implementations and examples rarely mention them. The idea is that you can resolve links yourself in XSLT. This alludes to the XInclude spec but really is a bit more magic. The resolver can look up where the requested resource is and grab it. Really it is a pretty simple concept.

The problem is that implementing a resolver is less than intuitive. In .NET my understanding is there is a specific XmlResolver class that must be extended in order to create a custom resolver. I haven’t looked into what it takes for something like libxslt (which would impact lxml) or Saxon. 4Suite handles the resolution when creating an InputSource, which is essentially just a wrapper for objects getting passed into different 4Suite objects and methods.

Part of the reason I think resolvers are not very widespread (from what I have seen at least) is because most folks just don’t think about the functionality. I wanted a resolver in order to allow an external tool to transparently find packages via web requests, local template directories and using data files in eggs via pkg_resources. We use a resolver at my job to do a cascade of lookups so you can override different XSLT files at runtime.
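The cascade itself is the easy part. Here is a minimal sketch of the lookup in plain Python, deliberately left out of any 4Suite class since I don't want to misquote that API. The function name `resolve_template` and the directory list are assumptions for the example, and it only covers local directories, not web requests or eggs:

```python
import os


def resolve_template(uri, search_dirs):
    """Return the first existing file for `uri`, trying each directory in order.

    Directories listed earlier override the ones listed later.
    """
    for directory in search_dirs:
        candidate = os.path.join(directory, uri)
        if os.path.isfile(candidate):
            return candidate
    raise IOError('could not resolve %r in %r' % (uri, search_dirs))
```

Putting a site-specific directory ahead of the packaged defaults is what lets a stylesheet be overridden at runtime without touching the XSLT that imports it.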

The one caveat regarding a resolver is that it adds a bit of magic to your XSLTs in that a simple URI could be referencing a file anywhere. This could make debugging an issue if someone didn’t understand there was a custom resolver in play. With that said, it is a pretty slick tool to have around. I’ll be adding my new template resolver in my XSLTemplates module, which might help to provide a good example.

Posted Wed Aug 15 15:42:44 2007 by Eric Larson

AtomPub Hacking

Recently the Atom Publishing Protocol was blessed as an IETF standard. I have been personally working on an AtomPub implementation for Bright Content while playing around with Amplee here and there. One thing I realize regarding Atom and generally working with XML is that an understanding of all the different XML technologies is critical in order to get the most value.

Take for example updating an Atom entry. One thing that I constantly run into is recursively copying nodes. This is a breeze with XSLT, which is why I started processing updates through a stylesheet. The issue I found there was simply organizing my stylesheets within the context of my application. In the end I think I changed the design to simply just use Amara, but there was a definite path for building Python and XSLT based apps using distutils, which is really exciting.
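For comparison, here is roughly what the recursive copy looks like without XSLT. This sketch uses the standard library's ElementTree instead of Amara, simply so it stands alone. The `update_entry` function and the rule that `id` and `published` always come from the stored entry are assumptions for the example, not how Bright Content or Amplee handle updates:

```python
import copy
import xml.etree.ElementTree as ET

ATOM_NS = 'http://www.w3.org/2005/Atom'


def update_entry(stored, submitted, keep=('id', 'published')):
    """Build a new entry from `submitted`, keeping some elements from `stored`."""
    kept_tags = set('{%s}%s' % (ATOM_NS, name) for name in keep)
    result = ET.Element(stored.tag, dict(stored.attrib))
    # Elements the server owns are taken from the stored entry...
    for child in stored:
        if child.tag in kept_tags:
            result.append(copy.deepcopy(child))
    # ...and everything else is copied, whole subtree included, from the update.
    for child in submitted:
        if child.tag not in kept_tags:
            result.append(copy.deepcopy(child))
    return result
```

The `copy.deepcopy` calls do the recursive part that an identity template would do in a stylesheet.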

Where this connection between XSLT and Python can get tricky is in the case where external sources are referenced. In my AtomPub client, I was getting service documents directly within the XSLT by wrapping httplib2 and creating a node-set from the GET results. I ended up changing this because it was more forgiving to get the service document in Python and grab any collections. This revealed a slight disjoint, but in the end it worked out since most of the code could still use the same match templates.

I think navigating these differences between models and taking advantage of each language’s strength simply takes time and practice. It can be hard to see where you might want to do something with XSLT when you have to think about resolving stylesheet links and everything else, so possibly the answer to making the whole thing work is finding a good way to easily apply an XSLT to XML. The Amara xml_xslt method is a good start, but something more robust might be beneficial. Something to think about…

Posted Tue Aug 7 17:45:00 2007 by Eric Larson
using python, jquery and emacs ;)