Many thousands of people keep weblogs, and most of those use some sort of content management system to enable them to easily manage their output. Whilst some use their own bespoke systems, many use specific blogging products either on their own server (like Movable Type) or as a remote service (like Blogger). One thing is certain, however, and that is that sooner or later nearly every blog-keeper is going to want to switch to a different management tool.
This can be for a number of reasons. Some simply outgrow the facilities of services like Blogger and want to move to something more fully-featured. As many of these tools are developed by volunteers, it’s not unusual to see support dry up and a product’s life come to a natural conclusion (a la Gray Matter). Sometimes a new tool will come onto the market that has a different feature-set or approach, and that in itself will entice a switch. Whatever the reason, switching is common, and the need to move data from one system to another becomes extremely important to the individual user.
Whilst many blogging tools offer data import facilities, these often rely on transferring data from one database structure to the other – the net result being that if either side changes its data structure the import routine has to be rewritten. This is labour-intensive, and it’s obviously hard to get this sort of information freely shared between developers. What’s more, some systems can run on a number of different database systems, meaning that the import routine needs to be able to deal with multiple configurations of even just one version of a competitor’s product. You can quickly see that this route is never going to be satisfactory.
The obvious conclusion is that we need a common data exchange format – probably in XML – that all the blogging CMSs can read and write. The difficulty is then getting the developers to implement yet another XML dialect into their tools… unless you use a format that they’ve already integrated – couldn’t Atom do all this?
I’ve not waded through the Atom spec in huge detail lately, and I know it’s constantly evolving, but it strikes me that this should be easy. We already have a machine-readable data exchange format that all the blogging tools are supporting; all you should need is one mother of an article feed and one mother of a comments feed, and that’s your data import (and export) done. Easy. If you really wanted to get fancypants, I guess you could approach the issue on a more interactive level and get the receiving tool to act as an Atom client and retrieve the posts one-by-one. Alternatively, the exporting tool could post the articles one-by-one to the receiving tool, also via the API. There seem to be a lot of possibilities, and I feel that this is so obvious that I’m either missing something, or it’s already in the Grand Plan.
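To make the "one mother of an article feed" idea a bit more concrete, here is a rough sketch in Python of what the export and import sides might look like. It is only an illustration under some assumptions: it uses the Atom namespace as later standardised (`http://www.w3.org/2005/Atom`), the function names `export_feed` and `import_feed` and the post fields (`id`, `title`, `updated`, `body`) are made up for the example, and the output is a bare-bones feed rather than one checked against the spec.

```python
import xml.etree.ElementTree as ET

# Assumption: the Atom namespace as later standardised.
ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)


def _tag(name):
    """Return a namespace-qualified Atom tag name."""
    return f"{{{ATOM_NS}}}{name}"


def export_feed(blog_title, posts):
    """Serialise a list of post dicts into a single Atom feed string.

    Hypothetical helper: each post is a dict with 'id', 'title',
    'updated' and 'body' keys. The feed is minimal, not validated.
    """
    feed = ET.Element(_tag("feed"))
    ET.SubElement(feed, _tag("title")).text = blog_title
    for post in posts:
        entry = ET.SubElement(feed, _tag("entry"))
        ET.SubElement(entry, _tag("id")).text = post["id"]
        ET.SubElement(entry, _tag("title")).text = post["title"]
        ET.SubElement(entry, _tag("updated")).text = post["updated"]
        content = ET.SubElement(entry, _tag("content"), {"type": "html"})
        content.text = post["body"]
    return ET.tostring(feed, encoding="unicode")


def import_feed(xml_text):
    """Read the entries of an Atom feed string back into post dicts."""
    root = ET.fromstring(xml_text)
    posts = []
    for entry in root.findall(_tag("entry")):
        posts.append({
            "id": entry.findtext(_tag("id")),
            "title": entry.findtext(_tag("title")),
            "updated": entry.findtext(_tag("updated")),
            "body": entry.findtext(_tag("content")),
        })
    return posts


if __name__ == "__main__":
    old_blog = [
        {
            "id": "tag:example.org,2004:post-1",
            "title": "First post",
            "updated": "2004-03-01T12:00:00Z",
            "body": "<p>Hello, world.</p>",
        },
    ]
    exported = export_feed("My old blog", old_blog)
    print(import_feed(exported) == old_blog)
```

The point is that neither side needs to know anything about the other's database: the old tool writes entries, the new tool reads entries. A comments feed could presumably be handled the same way, though how each comment gets tied back to its article is exactly the sort of detail that would need agreeing on.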



Comments
I suppose that there might be some datafields that aren’t addressed, but not being a blogger I wouldn’t know.
Haven’t looked at Atom. Only thing that comes to mind was reading a while ago about several blogs getting into “wars” over whether to allow an Atom client to be able to read malformed Atom feeds.
Although I don’t think it was implemented for the reasons stated, I think it would be possible to set up a script that would recursively ‘browse’ through the old archives and insert the entries into the new archives via the Atom API.
If you are interested,
http://bitworking.org/news/Atom_Archive_Format
The hard part is always agreeing on the particular schema, but if this is all in Atom, well, hey! What more do you want for nuthin? Rubber Biscuit?
Anywhoo – very nice site – I particularly like the logo.