All in the <head>

– Ponderings & code by Drew McLellan –

– Live from The Internets since 2003 –

Podcast Aggregators Should Support Cookies

8 March 2005

Here’s a rough idea I’d appreciate feedback on.

One of the main principles behind podcasting is that the podcaster publishes an RSS feed detailing their most recent releases. At least that is the current model, with nearly all podcasters viewing their shows as a rolling series. Consider a different model, however, where a set of shows might need to be heard sequentially. This might be “Teach yourself Spanish in 24 hours” or a serialised novel or anything of that kind. Using a ‘most recent’ RSS feed would be like dipping into a series of 24 halfway through – it just wouldn’t work.

For sequential podcasts a more controlled RSS feed is required to drip feed the shows in the right order, as and when they become available. Obviously the key to this is to personalise the RSS feed to the individual and keep track of their position in the sequence on the server side. The trouble arises in identifying the user.

If someone is discovering the podcast via the web, then offering a personalised feed is easy. At the simplest level a unique ID could be generated for every page load, thus ensuring the user has a unique feed address. More complex systems might require registration. This, however, doesn’t address the great many users subscribing via one of the various podcast directories, or being passed the address manually by a friend and so on. Additionally, if a naive user manages to share their personalised feed address with others, the whole sequence will get mixed up, spoiling everyone’s enjoyment.
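
To make the simplest version concrete, here's a rough Python sketch of a feed that hands out a unique address on every page load and keeps each address's position on the server. The class name, the example.com address and the rule that every poll advances the sequence by one show are all my assumptions for illustration, not a finished design.

```python
import secrets


class SequentialFeed:
    """Drip-feeds a fixed series of shows, one per poll, per feed address."""

    def __init__(self, episodes, base_url="http://example.com/feed/"):
        self.episodes = list(episodes)
        self.base_url = base_url   # hypothetical address, for illustration
        self.positions = {}        # token -> index of the next show to send

    def new_feed_url(self):
        """Called on every page load: mint a token and a unique address."""
        token = secrets.token_urlsafe(8)
        self.positions[token] = 0
        return self.base_url + token

    def poll(self, feed_url):
        """Return the next show for this address, or None if there is none."""
        token = feed_url.rsplit("/", 1)[-1]
        position = self.positions.get(token)
        if position is None or position >= len(self.episodes):
            return None
        self.positions[token] = position + 1
        return self.episodes[position]
```

Note that the position belongs to the address, not to the person. If two listeners poll the same address, each call to `poll()` advances the shared position, so each of them misses shows.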

What might be more useful is if the podcast aggregator (podcatcher) supported cookies. By setting a new cookie with a unique ID, or reading in an existing cookie for return users, the server could uniquely identify the user and therefore their position in the sequence.
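
On the server side this could be as small as the following Python sketch, which uses the standard library's `http.cookies` module. The cookie name, the one-year lifetime and the function name are assumptions of mine; the function takes the raw `Cookie` request header and returns the listener's ID, plus a `Set-Cookie` value when a new ID had to be minted.

```python
import secrets
from http.cookies import SimpleCookie

COOKIE_NAME = "listener_id"  # hypothetical cookie name


def identify_listener(cookie_header):
    """Return (listener_id, set_cookie_value).

    set_cookie_value is None for a returning listener, or the value to send
    in a Set-Cookie response header when a new ID was generated.
    """
    cookies = SimpleCookie()
    if cookie_header:
        cookies.load(cookie_header)
    if COOKIE_NAME in cookies:
        # Returning listener: the podcatcher sent our cookie back.
        return cookies[COOKIE_NAME].value, None

    # New listener: mint an ID and ask the podcatcher to remember it.
    listener_id = secrets.token_hex(8)
    fresh = SimpleCookie()
    fresh[COOKIE_NAME] = listener_id
    fresh[COOKIE_NAME]["path"] = "/"
    fresh[COOKIE_NAME]["max-age"] = 60 * 60 * 24 * 365  # assumed: one year
    return listener_id, fresh[COOKIE_NAME].OutputString()
```

The returned ID would then be the key under which the server stores the listener's position in the sequence.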

This does, however, raise a few issues. Firstly it complicates the model. One of the really appealing aspects of podcasting is the simplicity of the model. Ultimately there must be some trade off between simplicity and more advanced functionality.

The second issue is that of course this would require support from the major podcatchers out there, of which there is an increasing number almost daily. However, we’re probably at a stage where something like this could be introduced without too much trouble. The format is young, and all the clients are under current development – there isn’t really any legacy stuff out there yet. From a programming point of view, adding a bit of HTTP manipulation is a little extra work, but hopefully wouldn’t be too onerous.
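
To give a feel for how little work that might be, here's a minimal sketch of a cookie-aware fetch for a podcatcher written in Python, using only the standard library's `http.cookiejar` and `urllib.request`. The function names are mine, and a real client would also want to save the jar to disk between runs (`http.cookiejar.MozillaCookieJar` can do that), which I've left out.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener(jar=None):
    """Return (opener, jar): an opener that stores and replays cookies."""
    if jar is None:
        jar = http.cookiejar.CookieJar()
    processor = urllib.request.HTTPCookieProcessor(jar)
    opener = urllib.request.build_opener(processor)
    return opener, jar


def fetch_feed(opener, url, timeout=30):
    """Fetch a feed; cookies are sent and recorded by the opener."""
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

With its default policy the jar only returns a cookie to the host that set it, so one feed's cookie is not sent along with requests for another feed on a different domain.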

The third issue is one of portability, and it’s not one I have an answer for straight off. Cookies are ultimately tied to the user agent. If you download from multiple machines or decide to try out a different aggregator, the cookie data would not be ported. To enable the user to download from multiple machines, or change aggregators, there’d need to be some method of porting that data across or syncing up with an external service. That’s a whole different kettle of chips when it comes to programming effort, as well as user experience.

The final issue is that of privacy. It’s a social issue however, as cookies tend to get a bad press for little reason. People think they’re being tracked and that the world is out to get them and their nastyass data. Aggregators would need to employ a similar security model to that of standard browsers – with access to cookies limited by domain. As each feed would be managing its own cookies, there’s no cross contamination and so the privacy issue is moot. Of course it would be friendly to give the user the option to accept the cookie or reject it. Any such dialogue should be non-alarmist where possible.

So that’s the rough idea. I like the idea of cookies above authentication or any other such method as it’s already established technology and it also is non-specific. There could be a thousand different use cases that I’ve not dreamed of that cookies could be a solution to. I just wanted to get this out there for feedback.

- Drew McLellan

Comments

  1. § Olly: Why not have the user register on your site, then provide them with a link to the RSS feed which contains a variable to track what they have downloaded? Such as:

    http://example.com/rss.php?userid=”bob”

    Then you can track which podcasts they have seen, and with a bit more wizardry, which they have downloaded. This also means that it doesn’t have to be implemented at the client side and would work with any aggregator capable of handling enclosures.

    On the downside the user would have to register an account to subscribe, which could be slightly off-putting.

    P.S. The comment form could be improved by indicating which fields are required and which not.
  2. § James Stewart: If it’s a case of tracking running order, then could that perhaps be done by extending the feed with an extra namespace? A ‘programme sequence’ namespace could easily be added to an atom feed, and it would then just be a case of adding something like 5 to indicate the fifth in a series?

    It would require that aggregator writers were making use of extended namespaces, but would save any HTTP manipulation, and could keep the personalisation on the client-side.
  3. § MH: Sorry if this is slightly off topic, but: what is it about “podcasting” that is explicitly tied to iPods?
  4. § Drew McLellan: Olly: re-read the paragraph starting “If someone is discovering the podcast via the web”. :-)

    James: My concern with this approach is that in a very long series the feed would have to be exceptionally large and effectively include every item in the series.

    MH: nothing.
  5. § Lance Robinson: This is interesting, and definitely worth conversation, IMO.

    It seems to me that this could be done much more simply by having the website make use of the If-Modified-Since header provided by the catcher software in its request for the feed. Your feed could contain only items newer than that If-Modified-Since date.

    If it’s a new subscription, there will of course be no If-Modified-Since header. If it’s not a new subscription, any decent catcher software is going to include this header.
  6. § Tor: I was going to suggest something along the lines of what James said, but even that seems too complicated.

    I see no reason why any authentication should be needed, as it’s the responsibility of the mediaplayer/podcatching software to keep track of what has been downloaded and/or listened to.

    If you want people to be able to get the beginning of your series, just make sure all items are in the rss feed, and the enclosures have an easily recognized sequence in their naming. It’s the catcher’s responsibility to download the enclosures according to the pubDate, preferably using a user-defined option.

    I.E. User says, get these enclosures for this feed chronologically.
  7. § Erwin: Using the pubDate would be possible to download the files in chronological order if everyone would put it in their feed. Believe me, that’s not the case. And if they put it in, they will have to make sure it follows the standards, which is also not always the case. And while we’re at it, could people also start using guids? please?

    Adding cookie support to Doppler (as that’s the client I’m doing) is not too difficult, but I’m not so sure if that’s the way to go in this case. But if people manage to convince me, I’m up to it.
  8. § Greg Smith: Any solution for this that can be done server side is better because 1) this is a niche problem, 2) niche problems are best solved “in the niche” (i.e. solve on the server, not all the clients).

    That said, you may be able to use the ETag as well to identify clients; as a “poor man’s cookie”. The ETag is one of the ways (along with If-Modified-Since) that clients implement Conditional GET. ETag is a unique identifier that is given by the server and regurgitated by the client back to the server on the next download.

    The difference between ETag and If-Modified-Since is the HTTP version (1.0 vs. 1.1). I don’t know if this brings any restrictions with it. I know that FeederReader implements both If-Modified-Since and ETag headers. (see http://fishbowl.pastiche.org/2002/10/21/http_conditional_get_for_rss_hackers ) What I do is just save the ETag and Last-Modified headers and spit them back out to the server on the next download. I do not do any processing of these headers. You may want to ask other client developers whether they do the same. I would think at least for ETag, they would. If-Modified-Since can be generated by the client, but it is dangerous to do so because of the potential for time difference between client and server. I’d go with ETag if you can.

    This in combination with a HTTP redirect may give you the capability you need. I would expect (hope?) that many clients already handle HTTP redirects as well as the Conditional GET mechanism.

    Greg Smith
    Author, FeederReader – The Pocket PC RSS, podcatcher, videocatcher
    www.FeederReader.com – Download on the Road
  9. § Erwin: Like FeederReader, Doppler also supports ETag and If-Modified-Since headers and, like FeederReader, Doppler does not parse them but just includes them in the following requests. I think Greg made a good point.
  10. § W.B. McNamara: I’d suggest that authenticated feeds might be a better approach for this.

    Feed authentication is already a known quantity and is supported in an increasing number of feed readers. It’s not universal, but you’ve got a working base from which to build.

    As noted in another comment, this also pushes the issue over on to the server side, which I believe is a good thing. The server decides who you are and takes on the responsibility for showing you the appropriate content, regardless of the computer/device/reader that you’re using to access the feed.

    And as you mentioned in the original post, for better or worse cookies have a pretty bad rep at this point. From an adoption standpoint “authenticated feeds to ensure that no one else can access your content” sounds a lot more appealing to users than “cookies so that we can track everything that you do.” It may come down to splitting hairs, but I suspect that this could make a difference to a fair number of people.

    This does add another step to the process of subscribing (though there are a variety of ways that one might simplify it), but it seems like the benefits could be made to outweigh the drawbacks.
  11. § Olly: Yep, probably should have read that paragraph more thoroughly. ;-)
  12. § Drew McLellan: W.B. – I have given some thought to authentication, and in particular to the authentication model provided natively by HTTP. Whilst I’m confident that it can be used well for restricting access to feeds I keep coming back to the same point.

    The issue discussed above simply isn’t authentication. It’s identification that doesn’t necessarily need to be authenticated. Access is not being restricted as such, it’s simply a case of knowing who’s on the line. It’s the username without the password.

    Whilst authentication can be used as a catchall to cover identification issues, I worry that it over complicates the vast majority of identification cases. It’d be like every website that keeps track of a session (via cookies) requiring an account and authentication. It’s a sledgehammer to crack a nut.

    Authentication is needed too, but at the moment it’s identification that I’m interested in.
  13. § Greg Smith: Drew, I agree with your latest comments as well as your sense that some commenters are not on the same track. To me, my solution still stands because it offers the user/users a big capability that you required (and I agree is a good goal): everyone uses the same URL. And no other information within the URL is necessary to identify an individual. Cookies will take time to implement in all clients, ETag is here now (for at least two clients ;-) )

    Greg Smith
    Author, FeederReader – The Pocket PC RSS, podcatcher, videocatcher
    www.FeederReader.com – Download on the Road
  14. § Drew McLellan: OK, I just did some reading around ETags – an aspect of HTTP that I was previously unfamiliar with. It sounds like this could be a workable solution. It addresses the problem I outlined and does so with minimal fuss and effort on anyone’s part, which seems pretty elegant.

    So let’s turn to the other aspect of this problem – giving the user enough control to be able to use different computers or to switch clients. I’d be happy if the ETag value was included in the standard OPML import/export routines that many podcatchers seem to be supporting. Much as I hate the misuse of OPML for all this stuff, do you guys think exporting the ETag value along with the feed address would be workable?
  15. § Greg Smith: Now you’re speaking at almost crossed purposes. You want a “clean” URL that can be distributed to anyone, then you want OPML, which is a format to exchange lists of RSS feeds, to uniquely identify a feed (by also listing an ETag).

    My gut reaction is that this is unlikely to be implemented. But I can’t think of why it wouldn’t work if you got client developers to agree to do it. It’s taken quite a while to get OPML input and output in many clients. I think it would be much harder to convince clients to add this niche feature.

    If you want non-unique feeds that keep track of where you are (within one client) and are universally distributable, put the value in the ETag. If you want unique feeds that keep track of where you are when distributed, put the value in the URL.

    Greg Smith
    Author, FeederReader – The Pocket PC RSS, podcatcher, videocatcher
    www.FeederReader.com – Download on the Road
  16. § Drew McLellan: Fair point, Greg. Exporting the ETag would reintroduce some of the initial problem.

    Do you think it’s much of an issue for people to only be able to download a sequential cast from one client on one machine?
  17. § W.B. McNamara: Hey, all—
    I may have to accept that I may be going off in a slightly different direction. :)

    A question, though, since I haven’t yet gotten to reading up on ETags: do Web based readers (Bloglines) or feed aggregation/enhancement services like FeedBurner have any effect on this (or any cookie based) approach? Basically I’m wondering about cases where the client machine isn’t hitting the original source feed/server directly, but rather a cached or alternate version.

    Thanks, – Whit
  18. § Drew McLellan: Whit – yes, the model falls down in the case of preaggregation. Despite preaggregation being The Next Big Thing, I don’t think it’s something to worry too much about.
  19. § Greg Smith: To answer the last few posts:

    I don’t think it’s much of a problem to be able to download a sequential cast to only one device. There are several basic use models of aggregators that I can think of: 1) download to desktop, then listen on desktop (i.e. desktop-based aggregator), 2) download to desktop, transfer to portable MP3 player (i.e. iPodder), and 3) download directly to mobile device (i.e. FeederReader on a Pocket PC). All of these involve one download client on one device.

    Hmmmm…you got me thinking on the preaggregation problem. I wonder if there’s a way to do it with redirects on the enclosures. I can’t think of a solution off hand. I’ll post back if I can think it through. I’m thinking of a feed with a constantly increasing counter in the enclosure filename (with no text in the feed, because the actual file returned would be different for each person). Tie this to some logic on enclosure URI redirect, but I don’t think there is a way to uniquely identify the client on the enclosure download request. Hmmmm…I think we’re back to cookies ;-) If I think of anything, I’ll let you know (but don’t get your hopes up!)

    Greg Smith
    Author, FeederReader – The Pocket PC RSS, podcatcher, videocatcher
    www.FeederReader.com – Download on the Road
  20. § Rex Riley: Great wad through the tech solutions you’ve posted, thoughtful posting them where a Guru would simmer his ego by not.

    Cookies support spam… or how else do you explain getting spam from eBay on a non-published email address? I was using the same browser to access email and eBay, the same day. eBay must be able to read cookies or some other Firefox attribute which caches email addresses.
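
For anyone wanting to experiment with the ETag idea Greg describes in the comments above, here's a rough server-side sketch in Python. It assumes the client simply echoes the last ETag it was given back in an `If-None-Match` request header, as Greg and Erwin say their clients do. The function name, the in-memory `positions` dict and the one-show-per-request rule are my own assumptions. Be aware that this bends the usual meaning of an ETag, which is meant to identify a version of the content rather than a client.

```python
import secrets


def serve_feed(if_none_match, positions, episodes):
    """Return (status, etag, episode) for one feed request.

    if_none_match -- value of the If-None-Match request header, or None
    positions     -- dict mapping client ID to index of the next show
    episodes      -- the full series, in listening order
    """
    client_id = if_none_match.strip('"') if if_none_match else None
    if client_id not in positions:
        # New subscriber (or an ETag we never issued): mint an identifier.
        client_id = secrets.token_hex(8)
        positions[client_id] = 0

    etag = '"%s"' % client_id  # the "poor man's cookie"
    position = positions[client_id]
    if position >= len(episodes):
        return 304, etag, None  # nothing new for this client
    positions[client_id] = position + 1
    return 200, etag, episodes[position]
```

Everyone subscribes to the same address, and the position is tied to whichever client holds the ETag, so it doesn't follow the listener to another machine or another podcatcher.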

About Drew McLellan

Drew McLellan has been hacking on the web since around 1996 following an unfortunate incident with a margarine tub. Since then he’s spread himself between both front- and back-end development projects, and now is Director and Senior Web Developer at edgeofmyseat.com in Maidenhead, UK (GEO: 51.5217, -0.7177). Prior to this, Drew was a Web Developer for Yahoo!, and before that primarily worked as a technical lead within design and branding agencies for clients such as Nissan, Goodyear Dunlop, Siemens/Bosch, Cadburys, ICI Dulux and Virgin.net. Somewhere along the way, Drew managed to get himself embroiled with Dreamweaver and was made an early Macromedia Evangelist for that product. This led to book deals, public appearances, fame, glory, and his eventual downfall.

Picking himself up again, Drew is now a strong advocate for best practices, and stood as Group Lead for The Web Standards Project 2006-08. He has had articles published by A List Apart, Adobe, and O’Reilly Media’s XML.com, mostly due to mistaken identity. Drew is a proponent of the lower-case semantic web, and is currently expending energies in the direction of the microformats movement, with particular interests in making parsers an off-the-shelf commodity and developing simple UI conventions. He writes here at all in the head and, with a little help from his friends, at 24 ways.