All in the <head> – Ponderings and code by Drew McLellan –

Blog Data Exchange

Many thousands of people keep weblogs, and most of those use some sort of content management system to enable them to easily manage their output. Whilst some use their own bespoke systems, many use specific blogging products either on their own server (like Movable Type) or as a remote service (like Blogger). One thing is certain, however, and that is that sooner or later nearly every blog-keeper is going to want to switch to a different management tool.

This can be for a number of reasons. Some people simply outgrow the facilities of services like Blogger and want to move to something more fully-featured. As many of these tools are developed by volunteers, it’s not unusual to see support drying up and a product’s life coming to a natural conclusion (à la Greymatter). Sometimes a new tool will come onto the market that has a different feature-set or approach, and that in itself will entice a switch. Whatever the reason, switching is common and the need to move data from one system to another becomes extremely important to the individual user.

Whilst many blogging tools offer data import facilities, these often rely on transferring data from one database structure to the other – the net result being that if either side changes its data structure the import routine has to be rewritten. This is labour-intensive, and it’s obviously hard to get this sort of information freely shared between developers. What’s more, some systems can run on a number of different database systems, meaning that the import routine needs to be able to deal with multiple configurations of even just one version of a competitor’s product. You can quickly see that this route is never going to be satisfactory.

The obvious conclusion is that we need a common data exchange format – probably in XML – that all the blogging CMSs can read and write. The difficulty is then getting the developers to implement yet another XML dialect into their tools… unless you use a format that they’ve already integrated – couldn’t Atom do all this?

I’ve not waded through the Atom spec in huge detail lately, and I know it’s constantly evolving, but it strikes me that this should be easy. We already have a machine-readable data exchange format that all the blogging tools are supporting; all you should need is one mother of an article feed and one mother of a comments feed, and that’s your data import (and export) done. Easy. If you really wanted to get fancypants, I guess you could approach the issue on a more interactive level and have the receiving tool act as an Atom client and retrieve the posts one by one. Alternatively, the exporting tool could post the articles one by one to the receiving tool, also via the Atom API. There seem to be a lot of possibilities, and I feel that this is so obvious that I’m either missing something, or it’s already in the Grand Plan.
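To make the "one mother of an article feed" idea a bit more concrete, here is a rough Python sketch of the round trip: one tool dumps every post into a single Atom feed, and another reads it back in. Treat it as an illustration rather than a working importer. The function names `export_feed` and `import_feed` and the dictionary keys are made up for this example, the namespace is the one from Atom as it was later standardised (an assumption, given that the spec was still moving), and the feed leaves out elements a valid Atom feed would need, such as a feed-level id and author.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace of Atom 1.0 as eventually standardised.
ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)

# Hypothetical mapping between our post dictionaries and Atom entry elements.
FIELDS = {"id": "id", "title": "title", "updated": "updated", "body": "content"}


def atom(tag):
    """Return the namespace-qualified name for an Atom element."""
    return f"{{{ATOM_NS}}}{tag}"


def export_feed(blog_title, posts):
    """Serialise every post into one big Atom feed (hypothetical exporter)."""
    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("title")).text = blog_title
    for post in posts:
        entry = ET.SubElement(feed, atom("entry"))
        for key, tag in FIELDS.items():
            child = ET.SubElement(entry, atom(tag))
            if tag == "content":
                # The post body is carried as escaped HTML.
                child.set("type", "html")
            child.text = post[key]
    return ET.tostring(feed, encoding="unicode")


def import_feed(xml_text):
    """Read the posts back out of the feed (hypothetical importer)."""
    feed = ET.fromstring(xml_text)
    return [
        {key: entry.findtext(atom(tag)) for key, tag in FIELDS.items()}
        for entry in feed.findall(atom("entry"))
    ]


if __name__ == "__main__":
    posts = [{
        "id": "tag:example.org,2004:post-1",
        "title": "Blog Data Exchange",
        "updated": "2004-01-01T00:00:00Z",
        "body": "<p>Couldn't Atom do all this?</p>",
    }]
    # The importing tool should recover exactly what the exporter wrote.
    print(import_feed(export_feed("Example blog", posts)) == posts)
```

A comments feed could presumably be handled the same way, with each comment as an entry that points back at its article. The interactive variant would swap the single big document for one HTTP request per entry.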