All in the <head>

– Ponderings & code by Drew McLellan –

– Live from The Internets since 2003 –

The Curse of max_file_uploads

1 June 2010

Today marks a year since we shipped the first version of Perch, and we celebrated by putting out another big release. We’ve been following a strategy of shipping medium-sized updates regularly throughout the year (July, October, December, February), each time fixing any issues that users have reported and always adding some new functionality to make it worth the trouble of updating.

This latest release has been a big one. We’ve reflected that in the version number, jumping from 1.2.6 to 1.5. As well as the usual fixes and features, we’ve added an entire developer API enabling the extension of Perch through apps. The first app we’ve launched with is for the creation of new pages.

One bug that cropped up late in the development cycle had to do with a change to PHP that caught us by surprise. PHP 5.2.12 had added a new INI directive called max_file_uploads, designed to prevent DoS attacks. The supposed attack would work by uploading a huge number of files to a server, filling up the available space in its temp folder. The default setting for max_file_uploads is 20 files, and of course we’re at the point now where PHP 5.2.12 and greater are becoming reasonably common in the wild.

So how is this an issue for Perch? Well, Perch enables users to upload images and files as content to their site. A template for an item of content might have a couple of image upload fields. If you allow your content region to hold multiple items, these all appear on one long edit form in Perch. Take a region with 10 items, each with 2 upload fields, and you suddenly have a form with 20 file fields on it.

Initially, I didn’t think this was going to be a problem, because that’s not typically how users add content. They don’t add 10 empty items and then go through and fill them in with content. They add one at a time, and so typically will only be uploading one or two files at a time – nowhere near the default limit of 20. But here’s the catch:

max_file_uploads counts empty upload fields as if they were being used. This means that the limit is not on how many files are uploaded, but on how many upload fields you have in your HTML form. If you have 21 file fields, you can’t even upload one single file unless it’s in one of the first 20 fields.
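To make the effect concrete, here is a rough model of that behaviour in JavaScript. It is only a sketch of the counting rule as described above, not how PHP implements it internally; the function name acceptedUploads and the shape of the fields array are made up for illustration.

```javascript
// Toy model of the behaviour described above (not PHP's real implementation).
// Each entry in `fields` is the filename submitted for one file input;
// an empty string means the user left that field blank.
function acceptedUploads(fields, maxFileUploads) {
    var accepted = [];
    for (var i = 0; i < fields.length; i++) {
        // Every field counts towards the limit, filled in or not.
        if (i >= maxFileUploads) break;
        if (fields[i] !== '') accepted.push(fields[i]);
    }
    return accepted;
}

// 21 file fields, with only the last one filled in.
var fields = [];
for (var n = 0; n < 20; n++) fields.push('');
fields.push('photo.jpg');

console.log(acceptedUploads(fields, 20)); // []
```

Under this model the one real file is lost, because it sits in the 21st field and the 20 blank fields ahead of it have already used up the allowance.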

This issue was logged as PHP Bug #50749, but marked as “bogus” due to what sounds like a design flaw in how PHP handles uploads. The idiocy continues, however, as unlike most other PHP INI directives, this one can’t be overridden in a local .htaccess file. It gets set once, for the entire server and the individual site owner has no control over the setting.

This is pretty bad news for Perch, as the way our interface works means that it’s easy for users to end up with more than 20 fields on a form, and so it looks like we’re going to need to redesign how the UI works to get around a fairly dubious security setting.

A JavaScript workaround

Obviously, until we can restructure to work around the issue, we need something in place to fix the issue for existing customers. We make a point of not building with a dependency on JavaScript, but in this case the only solution I could find without rebuilding the UI (which wasn’t an option this late in the cycle) was to paper over the cracks with some help from jQuery.

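// Check each file input's live value; a [value=""] selector may not reflect it in later jQuery versions.
$('form').submit(function(){
    $('input:file').filter(function(){
        return this.value === '';
    }).attr('disabled', true);
});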

That should be fairly self-evident, but on submit of the form, it finds any empty file input fields and marks them as disabled. In every browser I tested, a disabled field is left out of the submission entirely, and so the server never knows the field existed. Any field with a value submits as normal.
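For anyone not using jQuery, the same idea can be sketched in plain JavaScript. This is an illustration only, not the code that shipped in Perch; the function name disableEmptyFileInputs is hypothetical, and the browser wiring assumes the script runs after the forms exist in the page and that addEventListener is available.

```javascript
// Disable every empty file input in an array-like collection of form controls.
// Returns the number of inputs that were disabled.
function disableEmptyFileInputs(controls) {
    var count = 0;
    for (var i = 0; i < controls.length; i++) {
        var el = controls[i];
        if (el.type === 'file' && el.value === '' && !el.disabled) {
            el.disabled = true; // disabled controls are left out of the submission
            count++;
        }
    }
    return count;
}

// In a browser, run the check whenever a form is submitted.
// The guard lets the snippet load outside a browser as well.
if (typeof document !== 'undefined') {
    for (var f = 0; f < document.forms.length; f++) {
        document.forms[f].addEventListener('submit', function () {
            // `this` is the form being submitted
            disableEmptyFileInputs(this.elements);
        });
    }
}
```

Keeping the check in its own function, separate from the event wiring, means it can be exercised with plain objects that have type, value and disabled properties.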

If my tone sounds a little hacked off, it’s because this has annoyed me a bit. I do appreciate the need to improve security all the time, absolutely. I think mostly it’s that Bug #50749 was marked as “bogus” that annoys me so much.

The bug reporter had the same concerns as me. This security setting was not backward compatible. It was not something that had been deprecated and then gradually removed. There’s nothing at all wrong with having forms with lots of file upload fields. This change broke existing functionality, without warning.

For me as a developer of commercial PHP-based software, to have that concern marked as bogus feels like a direct insult. For my customers, software that was valid and worked well, suddenly broke due to a change in PHP. Their concerns are not bogus either – they’re very real. PHP can screw me about as much as it wants – I’m a developer and I’ll cope. But please keep things stable for my customers.

- Drew McLellan

Comments

  1. § Robert:

    I feel your pain… Reminds me of our surprise when we learned the hard way that the wizards of PHP chose to report GD’s JPEG support by returning “JPEG” instead of “JPG” in gd_info().

    Of course this is no problem at all from a developer’s view, but, as you say, why on earth are the PHP folk so intensely tempted to break existing applications out of the blue every now and then?

  2. § Robert Ketter:

    Very well written Drew. Your hard work is truly appreciated and your products are wonderful. Keep up the GREAT work.

  3. § Peter:

    Welcome to PHP. They’ve always done things like this. Point out an unwelcome implementation detail peeking out and they’ll say it’s not a bug but a feature.

    For example, it’s absurd how many bug reports they’ve gotten about the fact that PHP just segfaults when you cause a stack overflow, and they just keep marking it as bogus. There’s not a single high-level language around anymore that doesn’t throw an exception when you run out of stack, but they don’t care. They also always choose performance over security and convenience. It’s not funny!

  4. § Nick Fitzsimons:

    Marking that bug report as “Bogus” strikes me as bogus. I just ran a quick test to confirm my suspicions, and the request body of a multipart/form-data HTTP POST with an unused file input followed by a used one is as follows (trimmed for brevity):

    ------WebKitFormBoundarydAjSeDgYBTLp3DWX
    Content-Disposition: form-data; name="file0"; filename=""

    ------WebKitFormBoundarydAjSeDgYBTLp3DWX
    Content-Disposition: form-data; name="file1"; filename="somefile.txt"
    Content-Type: text/plain

    This is the contents of the file.
    ------WebKitFormBoundarydAjSeDgYBTLp3DWX--

    As this is basic MIME stuff, I would expect other browsers to behave the same way, apart from their method of generating the boundary string. A quick check shows that IE6 differs only in sending the full local path of the file for the second field’s filename value, but still the empty string for the first.

    So all that stuff about having to “create the filename and sit and wait for the data, even if none ever comes” doesn’t seem to make sense: one can easily identify a file whose filename is the empty string (and which has no content) and not increment the count of files received.

  5. § Drew McLellan:

    I’d pretty much drawn the same conclusion, Nick. I don’t know how the file upload is implemented at a low level in PHP, but it strikes me that with the information available, an empty upload should not consume significant resources. If it does, then it sounds like a weak design.

  6. § Francis:

    nice work around,

    I had run into similar problems in the past and I crafted a javascript solution too, :)

    somewhat different approach though,

    in my case the user specifies the number of file fields (fewer than 20) he/she requires for an upload session

    and I let javascript write the specified field numbers to the DOM.

    in that way the user only has the number of fields needed on the page at any time thus the problem of empty fields on the server is taken care of…

    what do you think?

About Drew McLellan

Drew McLellan (@drewm) has been hacking on the web since around 1996 following an unfortunate incident with a margarine tub. Since then he’s spread himself between both front- and back-end development projects, and now is Director and Senior Web Developer at edgeofmyseat.com in Maidenhead, UK (GEO: 51.5217, -0.7177). Prior to this, Drew was a Web Developer for Yahoo!, and before that primarily worked as a technical lead within design and branding agencies for clients such as Nissan, Goodyear Dunlop, Siemens/Bosch, Cadburys, ICI Dulux and Virgin.net. Somewhere along the way, Drew managed to get himself embroiled with Dreamweaver and was made an early Macromedia Evangelist for that product. This led to book deals, public appearances, fame, glory, and his eventual downfall.

Picking himself up again, Drew is now a strong advocate for best practices, and stood as Group Lead for The Web Standards Project 2006-08. He has had articles published by A List Apart, Adobe, and O’Reilly Media’s XML.com, mostly due to mistaken identity. Drew is a proponent of the lower-case semantic web, and is currently expending energies in the direction of the microformats movement, with particular interests in making parsers an off-the-shelf commodity and developing simple UI conventions. He writes here at all in the head and, with a little help from his friends, at 24 ways.