All in the <head>

– Ponderings & code by Drew McLellan –

– Live from The Internets since 2003 –

About

How To Make Your Website Fast

61 days ago

Since launching the new Greenbelt website this week, one thing a lot of people have commented on and asked about is the speed. It’s noticeably fast. I’ve never heard someone complain that they really like a site, but goshdarnit if it weren’t so fast. Fast beats slow, every time. So we wanted to make it fast from the outset.

That said, boring doesn’t beat fast, and neither does not reflecting the brand. Greenbelt Festival is a buzzing, vibrant event. It also has a strong identity developed by our designer on this project, Wilf Whitty at Ratiotype. A craigslist-style non-design wasn’t going to cut it, no matter how fast that may be. As a result, the site is full of big photos and iconography, even video and audio. And it’s still fast.

I don’t even remotely claim to be any sort of authority on the subject, but I can tell you what worked for us.

Start with good hosting

It frequently surprises me how little some designers and developers appear to care about the quality of their hosting. They’ll spend days, weeks, months crafting a site and then launch it onto $3 per month crappy shared hosting.

It should go without saying that if you’re paying $3 per month for hosting, that hosting is going to be over-sold. Putting networked hardware in data centres, keeping it cooled, powered and staffed costs quite a lot of money. Simple economics dictate that if you’re not paying very much money for that service, then the hosting company are going to have to make it up on volume. That means lots of customers per server – probably more customers per server than will be acceptable if you care about the response time of your website.

A reasonable rule of thumb is that shared hosting will not be fast. If you care about speed you need to think about a virtualised server (VPS-style, cloud or traditional) which has CPU and RAM resources reserved for it, not in contention with other customers. If you want more grunt, a dedicated server is a good option.

The Greenbelt site is on a dedicated server with Memset, whose data centres are located here in the UK, geographically close to the majority of the site’s traffic. As a straightforward PHP and MySQL site with reasonably predictable traffic and no need to scale up at the drop of a hat, there’s insufficient benefit to using dynamically provisioned cloud hosting. Just a good quality, reasonably priced, solid dedicated box with a high quality, reliable hosting company. Not glamourous, just smart.

Cache it all the way

I’ve become a massive fan of Varnish of late. It’s an HTTP cache (or reverse proxy) that sits on port 80 in front of your web server. If the web server’s response is cachable, it keeps a copy in memory (by default) and serves it up the next time that same page is requested. Done right, it can dramatically reduce the number of requests hitting your backend web server, whilst serving precompiled pages super-fast from memory.

Good use of Varnish can make your site much faster, however, it is no silver bullet. The caveat “if the web server’s response is cachable” turns out to be a very important one. You really need to design your site from the ground up to use a front end cache in order to make the best use of it.

As soon as you’ve identified the user with a cookie (including something like a PHP session, which of course uses cookies) then the request will hit your backend web server. Unless configured otherwise (as we have) that would include things like Google Analytics cookies, which of course, would be every request from any JavaScript-enabled browser. If you static assets (images, CSS, JavaScript) from the same domain, by default the cache will be blown on those, too, as soon as a cookie is set. So you have to design for that.

So while Varnish will help to take the load and shorten response times on common pages like your site’s front page, you can’t rely on it as an end-all solution for speeding up a slow site. If your backend app is slow, your site will still be slow for a lot of requests.

It’s a bit like putting WP Super Cache on a WordPress site. It will mask the underlying issue to an extent, but it won’t solve the underlying problem.

Your CMS or app has to be fast

The Greenbelt site runs on a custom CMS. The details of why (people always ask, as if it were heretical for developers to, you know, write their own code) are probably best saved for another post.

When developing the CMS, I set a target time for each page to be compiled, and had the code time itself and output the result at the bottom of the page. Working locally, on a MacBook Pro, obviously the build times would be significantly slower than the production web server, but the key is relative speed between pages. On my dev system, I wanted to have a regular page build in less than 0.01 seconds, and only to go above that if absolutely necessary for complex pages. The front page – an important one for speed – builds in around 0.003 seconds.

My constantly outputting the build time, I was able to keep track of the implications of every bit of code as I was writing it – which is absolutely the best time to fix any issues.

The general approach is the same as taken in Perch – do as much of the work at edit time as possible. When an author writes a blog post using Textile markup, we translate it to HTML and store it that way too. When a content-based page is published, we compile it against its templates, and store a copy of each region as HTML. At runtime, we just perform a simple query to retrieve the precompiled parts and assemble them into an otherwise dynamic page.

Anything that happens at runtime that is expensive to produce and doesn’t need to be bang-up-to-date gets cached for at least an hour. That includes things like search facet displays from Solr, the latest tweet from Twitter, blog post listings. If the result of an action is likely to be the same the next time it’s performed, do it once and cache it for a while. As I said, cache it all the way.

Optimise the front end

The majority of the time between the user requesting a page and it finishing loading is spent not at the server, but in the web browser. Entire books have been written about optimising the load time of your pages. All I can say is read them and implement all the advice that applies to you. It’s not ultra-nerdery for bored front end engineers, this stuff actually works.

Some key tools I found useful were Google Page Speed for monitoring testing, and the Network panel in webkit’s Web Inspector tools. I used Dustin’s script.js for asynchronous JavaScript loading, which I found to be much faster in practise than Steve Souders ControlJS, although not without a few bugs in older IEs.

I combined most of my JavaScript and minified it using YUI Compressor as a build option on the server. I found that the gains from minifying CSS (just a few kilobytes) weren’t worth the loss of line numbers when debugging.

All the site’s images are managed by a new Media Management System, which I’ll write about another time. Those are all served from subdomains (m1 – m4.greenbelt.org.uk) and are handled by nginx rather than the Apache 2 server that handles the PHP page requests. The routing of the requests to different backend servers is handled in the Varnish configuration.

Other shared static resources (like a copy of jQuery and script.js itself) are served from another subdomain, again through nginx and Varnish.

Why the subdomains? Despite the requests ending up on the same server, the subdomains help increase the number of resources the browser will download in parallel, as browsers limit on a per-domain basis. I may have gone a bit overboard on the subdomains, truth by told, but this one site is part of a larger system of sites and apps, and it serves a broader purpose.

Depending on the width of your browser window, Page Speed ranks the front page at around 92%. The points are docked (and change) due to the ‘scaled images’ rule. The rule says you shouldn’t have larger images that are scaled down in your HTML. Instead you should scale the images first and display them at 100%. As this site has a responsive layout, the images scale to fit at any window size, so that rule is a red herring in this case.

Follow the rules as best you can, but remember it’s fine to ignore ones that simply don’t apply.

To conclude

That was a lot of words to explain what I hope is a simple point. There’s no silver bullet to making a slow site fast. You must take a holistic approach. High performance runs the entire way through from the hardware it’s hosted on, through the app that builds the pages, to the server software that delivers the pages and the front end code that displays them in a browser. Speed is a feature that you must design, not just a bit of configuration done at the end.

- Drew McLellan

Ideas of March

62 days ago

If there’s one thing more tedious than blogging about blogging, it’s blogging about lack of blogging. But this post isn’t about that. It’s about the importance of blogs and the importance of sharing ideas and debate with follow designers and developers in a permanent, referable way.

Since last year’s Ideas of March post (in which, lest we forget, I committed to, ahem, blog more) I’ve posted just three times. But those are three substantial, mainly technical posts, which not only would I have not been able to express in any meaningful way on Twitter, but that have been useful for referring back to frequently since they were written.

A blog post, of course, offers the freedom to say as much or as little as the subject requires, and crucially, it’s available to refer back to and for others to find in the future. You can’t really do that with Twitter. All ideas and opinions are widely spread for those there in the moment, but are almost impossible to find and reconstruct after the fact. In 2009 I started archiving my tweets, but even so, I’d missed the first 10,000, and there’s no hope of tracing any conversations.

Often, I can’t find something I know I tweeted the day before. Conversely, as an example, here’s the post from 2003 when Dave announced the CSS Zen Garden, which took me roughly 30 seconds to find. Or Dan’s entire Simple Quiz archive, equally so. Everything I’ve posted to this blog since 2003 (most of it fairly throw-away) is still available as it was and where it was the day it was posted.

Permanence and findability are important for ideas to spread and grow. Twitter is a fragile and fleeting place. Give your ideas and thoughts the permanent home they deserve. Here’s how you can join in the blog revival:

  • Write a post called Ideas of March.
  • List some of the reasons you like blogs.
  • Pledge to blog more the rest of the month.
  • Share your thoughts on Twitter with the #ideasofmarch hashtag.

Will you join us?

- Drew McLellan

How A Missing Favicon Broke My App for Chrome Users

92 days ago

Excuse me while I let off some steam. I’ve just spent many hours debugging an authentication issue that was preventing Google Chrome users from logging into a PHP web app we’re currently working on. Here are the gory details for your amusement.

The app uses OAuth to authenticate against the client’s central auth service. That means instead of logging in to our app directly, the process is more like signing into Twitter from a third party – we send the user away to log in, they authenticate on the central server as normal and get sent back to us with a token which we can then use to log them in to our app.

The issue we were seeing is that Chrome users were clicking the login link, being sent out to authenticate, coming back and still being told they weren’t logged in. In all other browsers, returning from a successful login allowed access the app as expected. Puzzling.

As I went through debugging, it became clear that the point at which the process was failing was when the CSRF tokens were being compared. When we create the log in link, we add a CSRF token to the URL and at the same time store the same token in the user’s PHP session. On returning from authenticating, the token is handed back to us and we can compare it to the copy in the session storage to make sure there’s no funny business going on.

Poking further, it became clear that the reason the CSRF tokens didn’t match was due to the user being given a new PHP session when they returned. So when we compared the token, there was no stored value to compare it to. Very odd indeed.

Usually, if sessions go awry, the best thing to do is turn to the raw HTTP headers. A PHP session is of course based on a cookie – the server issues a session ID in a cookie and the browser presents it back to the server with each request. If the session ID was changing, that had to mean that either the browser wasn’t sending the cookie, or the server was issuing a new one. Either way, that would show up in the request and response headers.

Using the Network tab of the Chrome web inspector, I could see the headers for each file. I could see the cookie being set with the page response, and then presented back with the stylesheet request. Clicking out to log in and coming back to my page, I could see the cookie being presented again… but wait! With a different session ID. When did that change?

Double checking the headers again, I could see that no new cookie was being sent in the response to any of the items in the Network inspector. But there are requests that happen that the network inspector doesn’t show you. Like requesting a favicon.

As this app is in development, it doesn’t have a favicon, so the request should result in a 404. Except in this case, it didn’t. The server is configured (and to be fair, misconfigured) to route any requests that are not for real files or directories to my index.php page. This is so the app can have /any/url/it/pleases/ and it just gets routed through to my app to parse internally. A front controller.

This meant that instead of getting a 404 from Apache, the favicon request was being handed back the app’s home page. Niiice. But that shouldn’t be an issue, as even a request for a favicon is sent along with the cookie information, so PHP should pick up the session and carry on as normal. But the plot thickens. Apache is sitting behind Varnish, which I’d configured to strip cookies from any request for static files such as images, CSS, JavaScript, and, you know, favicons.

That meant that by the time the request got passed through to Apache, the cookie had been stripped and the request looked like a new user. Who was then given a brand new PHP session. As soon as I put a favicon in place, bingo, Chrome users could log in.

I’m not sure if Chrome requests the favicon at the beginning of the request, or right at the end, but either way, it was enough to change the user’s session cookie between the end of one page load and the beginning of the next.

A perfect storm of a bit of misconfiguration on my part, a lack of transparency from Chrome’s network inspector, and a missing favicon.

- Drew McLellan

Photographs

Work With Me

edgeofmyseat.com logo

At edgeofmyseat.com we build custom content management systems, ecommerce solutions and develop web apps.

Follow me

Recent Links

Affiliation

  • Web Standards Project
  • Britpack
  • 24 ways

I made

Perch - a really little cms

About Drew McLellan

Photo of Drew McLellan

Drew McLellan (@drewm) has been hacking on the web since around 1996 following an unfortunate incident with a margarine tub. Since then he’s spread himself between both front- and back-end development projects, and now is Director and Senior Web Developer at edgeofmyseat.com in Maidenhead, UK (GEO: 51.5217, -0.7177). Prior to this, Drew was a Web Developer for Yahoo!, and before that primarily worked as a technical lead within design and branding agencies for clients such as Nissan, Goodyear Dunlop, Siemens/Bosch, Cadburys, ICI Dulux and Virgin.net. Somewhere along the way, Drew managed to get himself embroiled with Dreamweaver and was made an early Macromedia Evangelist for that product. This lead to book deals, public appearances, fame, glory, and his eventual downfall.

Picking himself up again, Drew is now a strong advocate for best practises, and stood as Group Lead for The Web Standards Project 2006-08. He has had articles published by A List Apart, Adobe, and O’Reilly Media’s XML.com, mostly due to mistaken identity. Drew is a proponent of the lower-case semantic web, and is currently expending energies in the direction of the microformats movement, with particular interests in making parsers an off-the-shelf commodity and developing simple UI conventions. He writes here at all in the head and, with a little help from his friends, at 24 ways.