All in the <head>

– Ponderings & code by Drew McLellan –

– Live from The Internets since 2003 –

Blog Data Exchange

2 March 2004

Many thousands of people keep weblogs, and most of those use some sort of content management system to enable them to easily manage their output. Whilst some use their own bespoke systems, many use specific blogging products either on their own server (like Movable Type) or as a remote service (like Blogger). One thing is certain, however, and that is that sooner or later nearly every blog-keeper is going to want to switch to a different management tool.

This can be for a number of reasons. Some simply outgrow the facilities of services like Blogger and want to move to something more fully-featured. As many of these tools are developed by volunteers, it’s not unusual to see support drying up and a product’s life coming to a natural conclusion (à la Greymatter). Sometimes a new tool will come onto the market that has a different feature-set or approach, and that in itself will entice a switch. Whatever the reason, switching is common and the need to move data from one system to another becomes extremely important to the individual user.

Whilst many blogging tools offer data import facilities, these often rely on transferring data from one database structure to the other – the net result being that if either side changes its data structure the import routine has to be rewritten. This is labour-intensive, and it’s obviously hard to get this sort of information freely shared between developers. What’s more, some systems can run on a number of different database systems, meaning that the import routine needs to be able to deal with multiple configurations of even just one version of a competitor’s product. You can quickly see that this route is never going to be satisfactory.

The obvious conclusion is that we need a common data exchange format – probably in XML – that all the blogging CMSs can read and write. The difficulty is then getting the developers to implement yet another XML dialect into their tools… unless you use a format that they’ve already integrated – couldn’t Atom do all this?

I’ve not waded through the Atom spec in huge detail lately, and I know it’s constantly evolving, but it strikes me that this should be easy. We already have a machine-readable data exchange format that all the blogging tools are supporting. All you should need is one mother of an article feed and one mother of a comments feed, and that’s your data import (and export) done. Easy. If you really wanted to get fancypants, I guess you could approach the issue on a more interactive level and get the receiving tool to act as an Atom client and retrieve the posts one-by-one. Alternatively, the exporting tool could post the articles one-by-one to the receiving tool, also via the API. There seem to be a lot of possibilities, and I feel that this is so obvious that I’m either missing something, or it’s already in the Grand Plan.
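To make the "one mother of an article feed" idea a little more concrete, here is a rough Python sketch of the round trip: one tool dumps every post into a single Atom feed, and another reads it back. Everything in it is an assumption for illustration. The function names `export_feed` and `import_feed`, the post fields, and the choice of the Atom 1.0 namespace are mine, and a draft of the format may well use a different namespace. It also skips the feed-level metadata a real feed would need, so treat it as a sketch rather than a validated exporter.

```python
import xml.etree.ElementTree as ET

# Assumption: the Atom 1.0 namespace. Other drafts of the format differ.
ATOM_NS = "http://www.w3.org/2005/Atom"


def _tag(name):
    """Build a namespace-qualified tag name in ElementTree's {uri}name form."""
    return f"{{{ATOM_NS}}}{name}"


def export_feed(posts, blog_title):
    """Export side: serialise every post into one big Atom feed string."""
    feed = ET.Element(_tag("feed"))
    ET.SubElement(feed, _tag("title")).text = blog_title
    for post in posts:
        entry = ET.SubElement(feed, _tag("entry"))
        ET.SubElement(entry, _tag("id")).text = post["id"]
        ET.SubElement(entry, _tag("title")).text = post["title"]
        ET.SubElement(entry, _tag("updated")).text = post["updated"]
        # The post body travels as escaped HTML inside <content type="html">.
        content = ET.SubElement(entry, _tag("content"), {"type": "html"})
        content.text = post["body"]
    return ET.tostring(feed, encoding="unicode")


def import_feed(xml_text):
    """Import side: read the feed back into plain dicts for the new tool."""
    ns = {"a": ATOM_NS}
    root = ET.fromstring(xml_text)
    return [
        {
            "id": entry.findtext("a:id", default="", namespaces=ns),
            "title": entry.findtext("a:title", default="", namespaces=ns),
            "updated": entry.findtext("a:updated", default="", namespaces=ns),
            "body": entry.findtext("a:content", default="", namespaces=ns),
        }
        for entry in root.findall("a:entry", ns)
    ]


if __name__ == "__main__":
    # Hypothetical sample data standing in for the old tool's datastore.
    old_posts = [
        {
            "id": "tag:example.org,2004:post-1",
            "title": "Blog Data Exchange",
            "updated": "2004-03-02T00:00:00Z",
            "body": "<p>Hello & welcome</p>",
        }
    ]
    feed_xml = export_feed(old_posts, "Example Blog")
    print(import_feed(feed_xml) == old_posts)
```

Neither function knows anything about the other tool’s database, which is the whole appeal: the feed is the only contract between the two. A comments feed could presumably be handled the same way.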

- Drew McLellan

Comments

  1. § Danilo: Before I got to the Atom paragraphs, I was thinking, why not just export everything as RSS and then have a tool that imports the feed into whatever datastore the new tool uses?

    I suppose that there might be some datafields that aren’t addressed, but not being a blogger I wouldn’t know.

    Haven’t looked at Atom. Only thing that comes to mind was reading a while ago about several blogs getting into “wars” over whether to allow an Atom client to be able to read malformed Atom feeds.
  2. § Ronan: Drew, as far as I know the feature you are looking for is included in the current Atom specification.

    Although I don’t think it was implemented for the reasons stated, I think it would be possible to set up a script that would recursively ‘browse’ through the old archives and insert the entries into the new archives via the Atom API.

    If you are interested,

    http://bitworking.org/news/Atom_Archive_Format
  3. § Nick: Since I’m swimming in XML right now, this seems like an excellent idea.

    The hard part is always agreeing on the particular schema, but if this is all in Atom, well, hey! What more do you want for nuthin? Rubber Biscuit?

    Anywhoo – very nice site – I particularly like the logo.


About Drew McLellan

Drew McLellan has been hacking on the web since around 1996 following an unfortunate incident with a margarine tub. Since then he’s spread himself between both front- and back-end development projects, and now is Director and Senior Web Developer at edgeofmyseat.com in Maidenhead, UK (GEO: 51.5217, -0.7177). Prior to this, Drew was a Web Developer for Yahoo!, and before that primarily worked as a technical lead within design and branding agencies for clients such as Nissan, Goodyear Dunlop, Siemens/Bosch, Cadburys, ICI Dulux and Virgin.net. Somewhere along the way, Drew managed to get himself embroiled with Dreamweaver and was made an early Macromedia Evangelist for that product. This led to book deals, public appearances, fame, glory, and his eventual downfall.

Picking himself up again, Drew is now a strong advocate for best practices, and stood as Group Lead for The Web Standards Project 2006-08. He has had articles published by A List Apart, Adobe, and O’Reilly Media’s XML.com, mostly due to mistaken identity. Drew is a proponent of the lower-case semantic web, and is currently expending energies in the direction of the microformats movement, with particular interests in making parsers an off-the-shelf commodity and developing simple UI conventions. He writes here at all in the head and, with a little help from his friends, at 24 ways.