All in the <head>

– Ponderings & code by Drew McLellan –

– Live from The Internets since 2003 –

About

About Drew McLellan

Drew McLellan has been hacking on the web since around 1996 following an unfortunate incident with a margarine tub. Since then he’s spread himself between both front- and back-end development projects, and now is Director and Senior Web Developer at edgeofmyseat.com in Maidenhead, UK (GEO: 51.5217, -0.7177). Prior to this, Drew was a Web Developer for Yahoo!, and before that primarily worked as a technical lead within design and branding agencies for clients such as Nissan, Goodyear Dunlop, Siemens/Bosch, Cadburys, ICI Dulux and Virgin.net. Somewhere along the way, Drew managed to get himself embroiled with Dreamweaver and was made an early Macromedia Evangelist for that product. This led to book deals, public appearances, fame, glory, and his eventual downfall.

Picking himself up again, Drew is now a strong advocate for best practices, and is currently Group Lead for The Web Standards Project. He has had articles published by A List Apart, Adobe, and O’Reilly Media’s XML.com, mostly due to mistaken identity. Drew is a proponent of the lower-case semantic web, and is currently expending energies in the direction of the microformats movement, with particular interests in making parsers an off-the-shelf commodity and developing simple UI conventions. He writes here at all in the head and, with a little help from his friends, at 24 ways.

IWMW, Amazon Web Services and hKit

17 July 2007

Today I attended the Institutional Web Management Workshop at the University of York to give a presentation on microformats. I was delivering a revised version of the Can Your Website Be Your API presentation I’ve given a couple of times over the last year, and failed to appreciate that each time I add new material it takes a bit longer to get through to the end. Who knew? Anyway, I managed to squeeze it into 45 minutes, and it seemed to be well received.

Presenting before me was Jeff Barr, Web Services Evangelist from Amazon. I first met Jeff at d.Construct last year where, unsurprisingly, he was presenting on the same topic of Amazon’s web services. Last time I was podcasting the session, and so could only give half my attention to the content of Jeff’s presentation, so was pleased to get the opportunity to properly soak it up this time.

My conclusion? Amazon have some really excellent, low cost services. S3 (the online storage service) I already knew about and understood to a degree. The concept is simple – you fling some files up into the cloud, and there they stay, stored redundantly on Amazon’s servers. What I didn’t fully appreciate was that access can be finely controlled through an ACL – meaning that not only can backups be safely kept private, but resources such as web assets or ‘downloads’ (software or podcasts or whatever) can be made fully public and therefore take advantage of Amazon’s high availability infrastructure. Of course, S3 charges on the basis of both storage space and data transfer (so you may want to think twice about using it to publish a freely available podcast, for example), but for things that really matter those costs seem very reasonable.
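To make the storage-versus-transfer trade-off a little more concrete, here is a back-of-envelope sketch in Python. The per-gigabyte rates are placeholders made up purely for illustration, not Amazon's actual prices, and the function name is hypothetical.

```python
def monthly_storage_cost(stored_gb, transferred_gb, storage_rate, transfer_rate):
    """Estimate a monthly bill from storage used and data transferred.

    Rates are in dollars per GB and are supplied by the caller; none of
    the figures used below come from Amazon's price list.
    """
    if min(stored_gb, transferred_gb, storage_rate, transfer_rate) < 0:
        raise ValueError("sizes and rates must be non-negative")
    return stored_gb * storage_rate + transferred_gb * transfer_rate


# Placeholder rates, assumed for illustration only.
STORAGE_RATE = 0.10   # dollars per GB stored per month (assumption)
TRANSFER_RATE = 0.20  # dollars per GB transferred out (assumption)

# A private 5 GB backup that is almost never downloaded.
backup = monthly_storage_cost(5, 0.1, STORAGE_RATE, TRANSFER_RATE)

# A small podcast file that is downloaded often: 500 GB of transfer in total.
podcast = monthly_storage_cost(0.05, 500, STORAGE_RATE, TRANSFER_RATE)

print(f"backup:  ${backup:.2f}")
print(f"podcast: ${podcast:.2f}")
```

Whatever the real rates turn out to be, the shape of the sum is the point: a backup is dominated by the storage term, while a popular public download is dominated by the transfer term.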

What I really missed last time (or perhaps the service wasn’t available or ready at that point) was the potential of their EC2 (Elastic Compute Cloud) service. This is basically a service where you can rent virtual servers by the ‘compute hour’. I’m not sure of the finer details of how that works, but the concept is that you can programmatically bring servers online to perform whatever task you like, as you need it, just in time. That task can be almost anything – from performing a big batch job like processing a bunch of images, through to just providing an additional web server to help cope with load. The virtual servers have a good spec (something like 1.7GHz CPU, I think 1.7GB RAM and 160GB disc), and data transfer between them and the S3 storage system is free. If you have a bunch of data on S3 you could bring up an EC2 server to grab it, process it and put it back, and you only pay for the compute time, not the transfer in or out of either EC2 or S3.

For most standard web applications, it’s not often all that useful to be able to bring up an additional database server out in the cloud to help you with load. That sort of thing needs to be designed for from the start, and for a lot of applications just wouldn’t work architecturally anyway. Another option is to host your entire application out on the cloud using a bunch of EC2 servers, full time. Depending on your needs, that could be cost effective compared to renting from a conventional hosting company. You do need to have quite a bit of trust in Amazon at that point, but I suspect many would consider Amazon more trustworthy than a lot of fly-by-night hosting companies anyway. The big advantage of hosting entirely on EC2, of course, would be that if you experience a spike in traffic you can just bring more servers online, right in the same data centre as your primary servers, and you only pay for what you use. Once traffic subsides, you can drop back down to normal. (It’s worth noting at this point that EC2 has also accounted for the need for multiple servers to share a secure local networking environment.)
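The scale-up-then-drop-back idea can be sketched in a few lines of Python. This is only an illustration of the decision logic, under the assumption that each server handles a known request rate; the function names are hypothetical, and the `start` and `stop` callables are stand-ins for whatever actually launches or terminates a server, not Amazon API calls.

```python
import math


def servers_needed(requests_per_sec, capacity_per_server, minimum=1, maximum=10):
    """Return how many servers to run for the current load.

    capacity_per_server is the request rate one server handles comfortably.
    The result is clamped between minimum and maximum so that a traffic
    spike cannot run up an unbounded bill.
    """
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    wanted = math.ceil(requests_per_sec / capacity_per_server)
    return max(minimum, min(maximum, wanted))


def rebalance(running, requests_per_sec, capacity_per_server, start, stop):
    """Start or stop servers until the pool size matches the load.

    start and stop are caller-supplied placeholders (assumptions), called
    once per server to bring online or shut down.
    """
    target = servers_needed(requests_per_sec, capacity_per_server)
    for _ in range(max(0, target - running)):
        start()
    for _ in range(max(0, running - target)):
        stop()
    return target


# Record what would happen instead of touching any real servers.
events = []
pool = rebalance(2, 450, 100,
                 lambda: events.append("start"),
                 lambda: events.append("stop"))
print(pool, events)

# Traffic subsides, so the pool drops back down.
pool_after = rebalance(pool, 80, 100,
                       lambda: events.append("start"),
                       lambda: events.append("stop"))
print(pool_after, events)
```

Capping the pool with `maximum` is a deliberate choice here: since you pay for what you use, it seems sensible to bound what a runaway spike can cost.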

This got me to thinking. For the last year or so, I’ve been hosting a service at tools.microformatic.com for people wishing to make casual use of the hKit microformat parser to extract microformatted data from a page. Pass in a URI and an output format (either plain text, serialised PHP or JSON) and the service fetches the page, parses it and returns the result. It’s very similar conceptually to how Technorati’s hosted version of X2V works.
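As a rough sketch of what a call to a service like this might look like from the client side, the Python below builds a request URL from a page URI and an output format. The endpoint (`example.org`), the `uri` and `output` parameter names, and the short format names are all assumptions for illustration; the real service may well use different ones.

```python
from urllib.parse import urlencode

# The three output formats described above; these short names are assumptions.
OUTPUT_FORMATS = {"text", "php", "json"}


def build_parse_request(endpoint, page_uri, output_format="json"):
    """Build a GET URL asking a hosted parser to fetch and parse page_uri.

    The "uri" and "output" query parameter names are hypothetical.
    """
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format!r}")
    # urlencode percent-escapes the page URI so it survives as a single value.
    query = urlencode({"uri": page_uri, "output": output_format})
    return f"{endpoint}?{query}"


# example.org is a stand-in endpoint, not the real service address.
request_url = build_parse_request(
    "http://example.org/parse",
    "http://example.com/contact/",
    "json",
)
print(request_url)
```

Because the whole request is just a URL, any client that can issue an HTTP GET can use a service of this kind, and the service itself needs to keep no state between requests.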

Now this is all well and good. It works just fine for people running tests to validate that they’ve implemented a particular microformat in an understandable way, and it copes with reasonable traffic as we saw recently with the Last.fm shoutbox thing, which is another service on the same box. However, it’s not a redundant, scalable and utterly reliable system that you could start building applications on top of. So what if I were to reimplement this service on top of EC2? There are no databases involved; in fact the service holds no data at all, so architecturally dealing with extra load should literally be a case of bringing another server online. Amazon claims to have 99.99% uptime on these things, which sounds pretty astonishing for such a low cost.

It certainly sounds like something that would be more reliable and dependable than my little server on its own, and possibly something that people would feel comfortable enough to build services on top of. With the cost from Amazon being as low as it is, it’s certainly in the realms of something that could be paid for by running a bit of advertising or perhaps seeking the odd bit of micropatronage.

EC2 is still in beta, but if I can manage to get access it might be something worth giving a go.

- Drew McLellan

Comments

  1. § Matt Dawson:

    So just for clarification, is it your understanding that EC2 lets you set thresholds that bring additional servers online only as needed? A kind of “set it and forget it” kind of arrangement? That’d be pretty amazing.

    Best of luck. I hope you end up going this direction, as I’m really intrigued by the possibilities.

  2. § Drew McLellan:

    From what I understand, Matt, there’s no magic involved but you could have a script on one server start up other servers when various conditions were met. Sort of like calling for backup.

  3. § Gareth Rushgrove:

    I like the idea of the microformats services being on something like this – maybe even having a central reference point for lots of services? I keep meaning to write up a little service I knocked up around geo. Send it a URL and it redirects you to the relevant Google map.

    I’ve had an EC2 account for a while, although I haven’t actually done anything with it. The project I’m working on at the moment is making pretty good use out of S3 though – the cost saving and ease of use are pretty impressive.

  4. § Jeff Barr:

    Hi Drew, great review, thanks.

    You can go to http://aws.amazon.com/ec2 and sign up for the EC2 beta now, no waiting. Enjoyed your talk too.

    I have been a big fan of microformats for several years. Your use of highlighted HTML was a nice presentation technique, by the way.

Work With Me

At edgeofmyseat.com we build custom content management systems, ecommerce solutions and develop web apps.

Affiliation

  • Web Standards Project
  • Britpack
  • 24 ways