Fiction

A Fake Internet Presence,
since 1994


RSS as DDoS
2004-07-22

A follow-up to an earlier article about RSS Growing pains. It makes it pretty obvious that serving RSS with straight Apache for popular sites is probably a bad idea, but you won't know that until you "hit the wall," as he states.

RSS is too dumb to do anything to really stop this, but you would think that aggregators would be a bit smarter. Clients could just have a random offset (jitter) setting in their fetcher, so they wouldn't all fetch every hour on the hour, for instance. If the load were evenly distributed throughout the hour, you'd still have (clients * 24) extra requests a day on your systems, but it could be as much as a couple of orders of magnitude less "pop".
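To make the idea concrete, here is a minimal sketch of a per-client offset, assuming hourly polling. The names `pick_offset`, `next_fetch_time`, and `peak_requests_per_minute` are made up for illustration and don't come from any real aggregator.

```python
import random
from collections import Counter

FETCH_INTERVAL = 3600  # seconds; hourly polling, as in the example above


def pick_offset(rng, interval=FETCH_INTERVAL):
    """Pick this client's offset once, uniformly in [0, interval)."""
    return rng.random() * interval


def next_fetch_time(now, offset, interval=FETCH_INTERVAL):
    """Return the first time strictly after `now` that lands on the offset."""
    periods = (now - offset) // interval + 1
    return offset + periods * interval


def peak_requests_per_minute(num_clients, rng, interval=FETCH_INTERVAL):
    """Simulate many clients and return the busiest one-minute bucket."""
    buckets = Counter(
        int(pick_offset(rng, interval) // 60) for _ in range(num_clients)
    )
    return max(buckets.values())


if __name__ == "__main__":
    rng = random.Random()
    # Without an offset, all 10000 clients land in the same minute.
    print("peak with offsets:", peak_requests_per_minute(10000, rng))
```

The total number of requests is unchanged; only the peak shrinks, since each client keeps its own slot within the hour instead of piling onto minute zero.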

To fix the protocol, one could imagine server-side aggregators (hmm, that is confusing terminology) which could combine multiple feeds, so that a client could request all of its feeds from a single source. This could be combined with pingers such that these "clusterers" (ugh) would get pushed updates from the people publishing the feeds. The original feeds could even contain pointers to clusterers which support their feed.
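A rough sketch of what such a "clusterer" might look like, purely as an in-memory toy. The class and method names are hypothetical, and a real one would need HTTP, authentication for publishers, and some expiry policy.

```python
class Clusterer:
    """Toy clusterer: publishers push feed updates, clients fetch in bulk."""

    def __init__(self):
        self._latest = {}  # feed URL -> most recently pushed feed body

    def push(self, feed_url, body):
        """Called on behalf of a publisher (the pinger role) when a feed changes."""
        self._latest[feed_url] = body

    def fetch_many(self, feed_urls):
        """Return every requested feed the clusterer holds, in one response."""
        return {url: self._latest[url] for url in feed_urls if url in self._latest}


if __name__ == "__main__":
    c = Clusterer()
    c.push("http://a.example/rss", "<rss>a</rss>")
    # Feeds the clusterer doesn't carry are simply absent from the result,
    # so the client knows to fetch those from the original source.
    print(c.fetch_many(["http://a.example/rss", "http://b.example/rss"]))
```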

In a perfect world, those writing these clients would actually support their own clients in this fashion so their clients wouldn't wreak havoc on the world. They don't have to handle all the feeds, just the most popular ones. I.e., the client would fetch the feed from the primary source, tell the mother service all the feeds it fetches (anonymously, of course), and then for any feed with more than, say, 1000 subscriptions, the mother service would tell the client to fetch the feed from them instead. Good citizen and all that. Plus, it would allow the client software to report aggregate statistics about subscribership across the RSS world, much like Bloglines does now.
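The mother-service flow could be sketched like this. Everything here is an assumption for illustration: the `MotherService` class, the `mirror_base` URL format, and the in-memory counter. The only number taken from the text is the 1000-subscription threshold.

```python
from collections import Counter
from urllib.parse import quote

POPULAR_THRESHOLD = 1000  # "more than, say, 1000 subscriptions"


class MotherService:
    """Counts anonymous subscription reports and redirects popular feeds."""

    def __init__(self, mirror_base, threshold=POPULAR_THRESHOLD):
        self.mirror_base = mirror_base  # hypothetical mirror URL prefix
        self.threshold = threshold
        self.subscriptions = Counter()

    def report(self, feed_urls):
        """Record one client's feed list; no client identity is stored."""
        self.subscriptions.update(set(feed_urls))

    def source_for(self, feed_url):
        """Tell a client where to fetch a feed from."""
        if self.subscriptions[feed_url] > self.threshold:
            return self.mirror_base + quote(feed_url, safe="")
        return feed_url  # not popular enough; go to the primary source


if __name__ == "__main__":
    svc = MotherService("https://mother.example/feed?url=")
    for _ in range(1001):
        svc.report(["http://popular.example/rss"])
    print(svc.source_for("http://popular.example/rss"))
    print(svc.source_for("http://quiet.example/rss"))
```

Note that because reports are anonymous, this sketch can't tell a repeat report from a new subscriber, so a real service would need some way to avoid double counting. The same counter is what would feed the aggregate subscribership statistics.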

The next step after that would of course be some sort of P2P mechanism for distribution, hijacking one of the existing protocols. BitTorrent has been mentioned in the past, but that seems too one-shot to me, though I'm not an expert. You should probably run this service separately from the primary one, though (different ports or whatever; no need to clutter the service with RSS feeds).

The most obvious answer from the server side is to serve your RSS feeds off of something like Squid. It can handle a much larger number of simultaneous transactions due to its async nature, and the caching isn't a bad thing either. It might also help when you have a really large number of feeds. It will be interesting to see whether the GG2 feeds become very popular, for instance. Well, interesting to me, since I'll have to fix the problem...



Copyright (C) 2020 Brandon Long. All Rights Reserved.
blong@fiction.net / Terms of Service

The "I work for a big public company" disclaimer:
The views expressed on these pages are mine alone and not those of my employer.
I am not now, nor have I ever been employed to speak for anyone.
Well, except my own company, but that's gone now.