Implementing Webmentions

In a world before social media, a lot of online communities existed around blog comments. The particular community I was part of – web standards – was all built up around the personal websites of those involved.

As social media sites gained traction, those communities moved away from blog commenting systems. Instead of reacting to a post underneath the post, most people will now react with a URL someplace else. That might be a tweet, a Reddit post, a Facebook emission, basically anywhere that combines an audience with the ability to comment on a URL.

Oh man, the memories of dynamic text replacement and the lengths we went to just to get some non-standard text. https://t.co/f0whYW6hh1
— One Bright Light ☣️ (@onebrightlight) July 13, 2017

Whether you think that’s a good thing or not isn’t really worth debating – it’s just the way it is now, things change, no big deal. However, something valuable that has been lost is the ability to see others’ reactions when viewing a post. Comments from others can add so much to a post, and that overview is lost when the comments exist elsewhere.

This is what webmentions do

Webmention is a W3C Recommendation that solves a big part of this. It describes a system for one site to notify another when it links to it. It’s similar in concept to Pingback for those who remember that, just with all the lessons learned from Pingback informing the design.

The flow goes something like this.

Frankie posts a blog entry.
Alex has thoughts in response, so also posts a blog entry linking to Frankie’s.
Alex’s publishing software finds the link and fetches Frankie’s post, finding the URL of Frankie’s Webmention endpoint in the document.
Alex’s software sends a notification to the endpoint.
Frankie’s software then fetches Alex’s post to verify that it really does link back, and then chooses how to display the reaction alongside Frankie’s post.

The end result is that by being notified of the external reaction, the publisher is able to aggregate those reactions and collect them together with the original content.

The reactions can be comments, but also likes or reposts, which is quite a nice touch. For the nuts and bolts of how that works, Jeremy explains it better than I could.

Beyond blogs

Not two minutes ago was I talking about the reactions occurring in places other than blogs, so what about that, hotshot? It would be totally possible for services like Twitter and Facebook to implement Webmention themselves, in the meantime there are services like Bridgy that can act as a proxy for you. They’ll monitor your social feed and then send corresponding webmentions as required. Nice, right?

Challenges

I’ve been implementing Webmention for the Perch Blog add-on, which has by and large been straightforward. For sending webmentions, I was able to make use of Aaron Parecki’s PHP client, but the process for receiving mentions is very much implementation-specific so you’re on your own when it comes to how to actually deal with an incoming mention.

Keeping it asynchronous

In order for your mention endpoint not to be a vector for a DoS attack, the spec highly recommends that you make processing of incoming mentions asynchronous. I believe this was a lesson learned from Pingback.

In practise that means doing as little work as possible when receiving the mention, just minimally validating it and adding it to a job queue. Then you’d have another worker pick up and process those jobs at a rate you control.

In Perch we have a central task scheduler, so that’s fine for this purpose. My job queue is a basic MySQL database table, and I have a scheduled task to pick up the next job and process it once a minute.

I work in publishing, dhaaaling

Another issue that popped up for me in Perch was that we didn’t have any sort of post published event I could hook into for sending webmentions out to any URLs we link to. Blog posts have a publish status (usually draft or published in 99% of cases) but they also have a publish date which is dynamically filtered to make posts visible when the date is reached.

If we sent our outgoing webmentions as soon as a post was marked as published, it still might not be visible on the site due to the date filter, causing the process to fail.

The solution was to go back to the task scheduler and again run a task to find newly published posts and fire off a publish event. This is an API event that any other add-on can listen for, so opens up options for us to do this like auto-tweeting of blog posts in the future.

Updating reactions

A massive improvement of webmentions over most commenting systems is the affordance in the spec for updating a reaction. If you change a post, your software will re-notify the URLs you link to, sending out more webmention notifications.

A naive implementation would then pull in duplicate content, so it’s important to understand this process and know how to deal with updating (or removing) a reaction when a duplicate notification comes along. For us, that meant also thinking carefully about the moderation logic to try to do the right thing around deciding which content should be re-moderated when it changes.

Finding the target

One interesting problem I hit in my endpoint code was trying to figure out which blog post was being reacted to when a mention was received. The mention includes a source URL (the thing linking to you) and a target URL (the URL on your site they link to) which in many cases should be enough.

For Perch, we don’t actually know what content you’re displaying on any given URL. It’s a completely flexible system where the CMS doesn’t try to impose a structure on your site – you build the pages you want and pull out the content you want onto those pages. From the URL alone, we can’t tell what content is being displayed.

This required going back to the spec and confirming two things:

The endpoint advertised with a post is scoped to that one URL. i.e. this is the endpoint that should be used for reacting to content on this page. If it’s another page, you should check that page for its endpoint.
If an endpoint URL has query string parameters, those must be preserved.

The combination of those two factors means that I can provide an endpoint URL that has the ID of the post built into it. When a mention comes in, I don’t need to look at the target but instead the endpoint URL itself.

It’s possible that Bridgy might not be compliant with the spec on this point, so it’s something I’m actively testing on this blog first.

Comments disabled

With that, after about fifteen years of having them enabled, I’ve disabled comments on this blog. I’m still displaying all the old comments, of course, but for the moment at least I’m only accepting reactions via webmentions.

– 16 July 2017 –