All in the <head> – Ponderings and code by Drew McLellan –

hAtom and Last.fm Shoutboxes

Late last week I received an email from a user of the hAtom to Atom service I maintain at tools.microformatic.com, asking if I could update to the latest version of the hAtom2Atom XSLT that the service implements. Ever happy to oblige, this weekend I set about doing just that, and after the upgrade began to tail -f the httpd log so that I could check a few requests to see if the results looked correct.

I’d never really promoted the service in any particular way, and knew that a few people used it for testing their hAtom implementations, as well as using it to subscribe to the odd hAtom-enabled page – mostly, I presumed, to keep tabs on their own implementations. You can imagine my surprise, then, to see the log files ticking by at a fair old rate, with URLs from Yahoo! Pipes, but mostly from social music service Last.fm.

A bit of investigation led me to a blog post describing how to subscribe to a Last.fm shoutbox using my hAtom to Atom service. This is a superb example of the utility of hAtom. Last.fm don’t have a dedicated feed for their shoutboxes, but because they’re nicely marked up with hAtom, they can be converted to Atom on the fly. Awesome.

Now, about my smoking server. At the moment I don’t use any caching on the hAtom to Atom service. Every request to the service causes me to make a request out to the destination server, which then does its thing and returns me the result. I take that result, process it and pass it back to my user. Any caching I can implement to cut down that process for common requests seems like the right thing to do – even though I’m not really having any problem serving the volume of requests at the moment.

However, I don’t want to get in the way of those who are using the service as a method of testing their hAtom markup, and an unexpected caching layer could cause havoc there. I’ve considered implementing a no-cache flag, but it’s all too easy for people to forget to remove that or to use it without being fully aware of the implications.

I think what I’ll do is selectively apply caching to known URL patterns (like Last.fm shoutboxes) where I know that retaining the result for 10 or 15 minutes really won’t be a problem, and perhaps drop a comment into the result indicating the time at which it was cached.