The importance of serial media vs. sampled and Google Reader

The blogging world was stunned by the recent announcement by Google that it will be shutting down Google Reader later this year. Due to my consulting relationship with Google I won't comment too much on their reasoning, though I will note that I believe it's possible the majority of regular readers of this blog, and many others, come via Google Reader, so this shutdown has a potentially large effect here. Of particular note is Google's statement that usage of Reader has been in decline, and that social media platforms have become the way to reach readers.

Those platforms are certainly effective. I have noticed that when I make blog posts and put up updates about them on Google Plus and Facebook, it is common that more people will comment on the social network than comment here on the blog. It's easy, and indeed more social. People tend to comment in the community in which they encounter an article, even though in theory the most visibility should be at the root article, where people arrive from all origins.

However, I want to talk a bit about online publishing history, including USENET and RSS, and the importance of concepts within them. In 2004 I first commented on the idea of serial vs. browsed media, and later expanded this taxonomy to include sampled media such as Twitter and social media in the mix. I now identify the following important elements of an online medium:

  • Is it browsed, serial or to be sampled?
  • Is there a core concept of new messages vs. already-read messages?
  • If serial or sampled, is it presented in chronological order or sorted by some metric of importance?
  • Is it designed to make it easy to write and post or easy to read and consume?

Online media began with E-mail and the mailing list in the 60s and 70s, with the 70s seeing the expansion to online message boards including Plato, BBSs, Compuserve and USENET. E-mail is a serial medium. In a serial medium, messages have a chronological order, and there is a concept of messages that are "read" and "unread." A good serial reader, at a minimum, has a way to present only the unread messages, typically in chronological order. You can thus process messages as they come, and when you are done with them, they move out of your view.

E-mail is largely used to read messages one at a time, but the online message boards, notably USENET, advanced this with the idea of moving messages from unread to read in bulk. A typical USENET reader presents the subject lines of all threads with new or unread messages. The user selects which ones to read -- almost never all of them -- and after this is done, all the messages, even those that were not actually read, are marked as read and not normally shown again. While it is generally expected that you will read all the messages in your personal inbox one by one, with message streams it is expected you will only read those of particular interest, though this depends on the volume.
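As a rough illustration (my own sketch, not any particular newsreader's implementation), the serial read/unread model with USENET-style bulk catch-up can be modeled as a chronological list with a high-water mark:

```python
from dataclasses import dataclass, field

@dataclass
class SerialFeed:
    """Minimal model of a serial medium: messages in chronological
    order, with a high-water mark separating read from unread."""
    messages: list = field(default_factory=list)  # (msg_id, subject), oldest first
    last_read: int = -1  # index of the newest message marked read

    def unread(self):
        """Present only the unread messages, in chronological order."""
        return self.messages[self.last_read + 1:]

    def catch_up(self):
        """USENET-style bulk catch-up: mark everything read, even
        messages that were never actually opened."""
        self.last_read = len(self.messages) - 1

feed = SerialFeed()
feed.messages = [(1, "Hello"), (2, "Re: Hello"), (3, "New thread")]
print(feed.unread())   # all three messages are unread
feed.catch_up()
print(feed.unread())   # [] -- everything has moved out of view
```

The key property is the second call: once you catch up, skipped messages never reappear, which is exactly what distinguishes this from a browsed medium.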

Echoes of this can be found in older media. With the newspaper, almost nobody would read every story, though you would skim all the headlines. Once done, the newspaper was discarded, even the stories that were skipped over. Magazines were similar, but being less frequent, more of their stories would actually be read.

USENET newsreaders were the best at handling this mode of reading. The earliest ones had keyboard interfaces that allowed touch typists to process many thousands of new items in just a few minutes, glancing over headlines, picking stories and then reading them. My favourite was TRN, based on RN by Perl creator Larry Wall and enhanced by Wayne Davison (whom I hired at ClariNet in part because of his work on that.) To my great surprise, even as the USENET readers faded, no new tool emerged capable of handling a large volume of messages as quickly.

In fact, the 1990s saw a switch for most to browsed media. Most web message boards were quite poor and slow to use, many did not even do the most fundamental thing of remembering what you had read and offering a "what's new for me?" view. In reaction to the rise of browsed media, people wishing to publish serially developed RSS. RSS was a bit of a kludge, in that your reader had to regularly poll every site to see if something was new, but outside of mailing lists, it became the most usable way to track serial feeds. In time, people also learned to like doing this online, using tools like Bloglines (which became the leader and then foolishly shut down for a few months) and Google Reader (which also became the leader and now is shutting down.) Online feed readers allow you to roam from device to device and read your feeds, and people like that.
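The polling kludge can be made concrete. An RSS reader fetches each feed on a timer and compares item GUIDs against what it has already seen; nothing pushes updates to it. Here is a minimal sketch using only the standard library (the feed content is invented for illustration):

```python
import xml.etree.ElementTree as ET

def new_items(rss_xml, seen_guids):
    """Parse an RSS 2.0 document and return titles of items whose
    <guid> we have not seen before. A reader must poll every feed
    like this on a timer -- RSS has no push mechanism, which is the
    kludge described above."""
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        guid = item.findtext("guid") or item.findtext("link")
        if guid and guid not in seen_guids:
            seen_guids.add(guid)
            fresh.append(item.findtext("title"))
    return fresh

feed_xml = """<rss><channel>
  <item><guid>p1</guid><title>First post</title></item>
  <item><guid>p2</guid><title>Second post</title></item>
</channel></rss>"""

seen = set()
print(new_items(feed_xml, seen))  # ['First post', 'Second post']
print(new_items(feed_xml, seen))  # [] -- nothing new on the next poll
```

The `seen` set is the reader's entire memory; an online reader like Google Reader kept that state server-side, which is what let you roam from device to device.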

This was followed by the discovery of feeds by the social networks. Sites like Facebook began as something browsed, but made a big change when they discovered the value of feeds and serial reading. This time it was different because most people linked to too many friends, and the volume of a typical social network feed is overwhelming and too full of material that is uninteresting to you, even if it's from people you know. Online discussion groups consisted of material that was usually about a topic of interest to you, but not necessarily from people you knew. The online groups however often became communities, and people met new friends there, so nothing is entirely pure.

Nonetheless, the firehose of social network feeds gave us a new online media class: sampled media. You can't read the whole feed so you only see what's going on when you happen to read. The feed is so fast that most social networks never even implemented a concept of "she's seen this, don't show it to her again." They rely on you to stop reading when you start seeing stuff you saw before.

Another orthogonal axis in this spectrum of online media is how the material is sorted. The classic method was chronological, but as feeds grew, systems have tried to sort in some sort of priority or importance order. This is even becoming true in mailboxes. Newspapers were always this way -- the most important stories were on the front page or front pages of sections. You could read until you got bored and know you had surely hit the most important stuff. Likewise with some social media, they now promote stories that others are reading or liking or commenting on, and you can read until you get bored. This classification is usually algorithmic, though based on human activity, while a newspaper's sorting is done by a human editor.

Still missing from all of this are the "slow feeds," including a blog like this where there may only be a few posts a week. With these feeds you typically want to consider every item. You don't want to miss an item just because you didn't happen to do a reading session shortly after it was posted. These fare terribly in places like Twitter, and even on Facebook unless they get a lot of attention so the priority system keeps them around. As far as I've seen, the priority systems don't pay attention to the fact that there is often a difference in the thoughtfulness between those who post once an hour and those who post once a week.

RSS and mailing lists fill this role today, and it's a hugely important one. If RSS is wounded by recent events, there is a challenge for social media to fulfill the need.

  • We need a way for authors to prioritize what they put out, to make the difference between ephemeral "updates on my day" and "topical notes" and "timeless essays" be clear. Authors could put "expirations" on posts (with a default of "a few days") but it's debatable if they would use such a UI.
  • Since UI gets in the way, it would be good if systems automatically saw the difference between frequent posters and rare ones, and gave more weight per post to the rare posts. In some ways authors could implicitly have so many points per day, which they can "spend" all at once or spread out over many posts.
  • We need a way to follow the non-ephemeral posts of writers we like in a way that always displays the important ones, even on busy days, and even weeks later.
  • It must be platform agnostic (as RSS was.) I want to see all my followed writers in one place, whether they post on Facebook, G+, Twitter or their own site.
  • In general, those who wish to post on their own site should not be impeded. While the fancy social networks have a zillion coders working on them and do fancy things, there is nothing like the true potential for innovation that comes from having a million different sites from a million developers.
  • The "What's new for me?" view is crucial and should be well developed. For extra credit, the USENET style where you checked off the interesting ones and then saw them with minimal UI (scrolling or hitting spacebar/pagedown) has a lot of value.
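The "points per day" idea in the list above can be sketched simply; the budget numbers here are my own illustrative assumptions, not a proposal for specific values:

```python
def post_weight(posts_in_window, points_per_day=10.0, window_days=1.0):
    """Give each author a fixed budget of 'points' per window, spread
    evenly over their posts in that window: a once-a-week essayist's
    single post outweighs any one post from an hourly poster."""
    budget = points_per_day * window_days
    return budget / max(posts_in_window, 1)

# An author who posts once a day spends the whole budget on one post;
# one who posts 24 times a day gets a fraction of a point per post.
print(post_weight(1))    # 10.0
print(post_weight(24))   # roughly 0.42
```

A feed ranker could then sort items by this weight, so rare posters surface even on busy days, without requiring authors to touch any UI.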

Some of these the social feed sites could do easily. Others, like being cross platform, they will resist. Social sites have tended to hate aggregators that serve actual user wishes and put everything together in one place. In particular, it's harder to figure out how to integrate monetization into this approach, be it by advertising or other means. This is also the thing that's hurt the web-based RSS readers, since they have costs to operate and few sources of income.

Can we see a return to the serial? Is this just a yearning for the glory days by an old hand, something that the new generation sees no value in? I don't think so. Everybody sees the way we are drowning in information, and everybody hopes for something to deliver the magic -- give me what I want as quickly as possible, without the FOMO (fear of missing out) that drives people to be such rapacious consumers of online media, to the detriment of the things they need to get done. In particular, we're really short of solutions for what to do when we return from a long trip, or sometimes even a weekend, offline and the messages have just piled up even more. A move away from RSS, or at least the style of reading it represents, is probably not the answer.

Comments

Thank you for clarifying something which I had not realized about myself. There are parts of the day when I prefer serial consumption; during those times I read RSS feeds or the newspaper (on-line). There are other parts of the day when I just want a minute or two of distraction so I bounce through Google+. If either went away, I would feel a hole in my life.

Not coincidentally, I use Google Reader like I used to use my old usenet reader: with the keyboard.

You missed, in your capsule history of net.media, the briefly lived predecessor of RSS - the thing that some of us hoped that RSS would be, but never became - the thing that was infinitely more useful than RSS.

I don't know of a proper term for it, so I'll make up my own. Plural. "Spider diffing". "Logical combining". "Reading machines".

Briefly, back in the beginning of the web, many of us would run indexing spiders. They would keep track of websites you were interested in, and/or pages you were interested in - typically that you had told them about, although sometimes they would spider links to different pages and/or different sites. And they would give you a list that might look like:

* Sites that you are interested in that have new content

* New pages, on sites that you are interested in.

* Pages that you are interested in, that have changes that you might find interesting.

Doesn't this sound like RSS?

Key difference: you were in control of the filtering. The summarization. Not the website provider.

You ran the spider, the indexer. You kept track of what you were interested in, not the website manager.

Instead of receiving multiple RSS updates every time something changed (which you may or may not be interested in - which the RSS provider thinks you might be interested in, modulo your RSS filters), you would NOT receive a summary, but would PERCEIVE a summary, when you got around to it, of the changes.

Instead of
delta1
delta2
delta3
You would receive
delta123
Or, rather:
delta1 would be available for a while. Then, if you had not read it, it might be replaced by delta12. Then, if you had not read that, by delta123.
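That combining scheme can be sketched as a stateful filter; this is my own reconstruction of the commenter's idea, with invented names:

```python
class ChangeCombiner:
    """One pending summary per page: new changes merge into the
    existing unseen delta (delta1 -> delta12 -> delta123) instead of
    queuing as separate updates. Reading a page clears its slot."""
    def __init__(self):
        self.pending = {}  # page -> list of combined, unseen changes

    def record(self, page, change):
        """A change arrived; fold it into whatever is already pending."""
        self.pending.setdefault(page, []).append(change)

    def read(self, page):
        """The reader perceives one combined delta, whenever it gets
        around to looking, and the slot resets."""
        return self.pending.pop(page, [])

c = ChangeCombiner()
c.record("faq.html", "delta1")
c.record("faq.html", "delta2")
c.record("faq.html", "delta3")
print(c.read("faq.html"))  # one combined delta: ['delta1', 'delta2', 'delta3']
print(c.read("faq.html"))  # [] -- already seen
```

The contrast with plain RSS is that the reader's state, not the publisher's output, determines what is presented: three pushed updates collapse into one perceived summary.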

Many of us hoped that RSS would evolve in this direction. Perhaps it has - in theory, you can accomplish almost anything you want by writing your own filters on top of an RSS stream. Filters that accomplish the above "change combining". In practice... well: (1) how many people use RSS tools that sophisticated? And (2) many websites, small personal websites, don't provide proper RSS feeds. Big commercial websites have "good" RSS feeds (for a definition of good that is somewhere between you and the website/RSS manager). Personal websites on some platforms have RSS feeds. But many do not. I blush to admit that my websites do not provide what I would call proper summarizations and notifications.

I think that RSS was part of the corporatization of the web. It's good enough for companies that are willing to manage RSS. Not so good for some of us on the fringe.

Now, the overhead of the spiders and indexers was high. Robots dragged sites down. Syndication lessened robot overhead - and like I said, in theory proper stateful RSS reader filters could accomplish what the old technology did.

But, take this to the extreme: in theory, you don't need indexers or web crawlers a la Google. In theory, indexers could just monitor the RSS stream, if a usable RSS stream were being produced by every website in the world - or at least those you were interested in. In practice: web crawling indexers are needed.

Look at all the thrashes wikis, etc., have wrt notification tools.
http://twiki.org/cgi-bin/view/Plugins/NotificationPlugin
http://twiki.org/cgi-bin/view/TWiki/WebNotify

http://www.mediawiki.org/wiki/Manual:Configuration_settings#Email_notification_.28Enotif.29_settings

In an ideal world, notification tools, both active (send me an email when this page changes) and passive (the next time I am reading my background interests, tell me to look here) would not need to be part of a website, blog, or wiki. In an ideal world they would be pervasive utilities.

Wouldn't it be nice if the sort of summarization and notification that I mention above could be accomplished by piggybacking on top of Google's indexing webcrawlers? Or Bing's? Or ... Reduce the robot load. Surely you could sell more ads if you knew what people were looking for. ... Maybe that's not such a good idea.

--

There's every chance that I have just embarrassed myself - that somebody has solved this problem already in a well known way. E.g. by creating a decent stateful RSS summarizer - although that's only a partial solution. E.g. some site that makes web crawling available, so that you can do the old style summarization in a web/socially acceptable way. Ideally by leveraging both RSS and website diffing, possibly in a way so that others can share. Of course, one can always fall back to the old robot-ful ways of doing it yourself (complicated somewhat by having to avoid anti-robot measures, and by having to deal with dynamic content, inserted ads, and so on). Well, if so, then at least I'll learn something I can use. If not... well, please point me to it!

--

Like I said, I don't know what the proper name for this is.

I call it a "reading machine", since that is what it is supposed to do: help you read. Help you track what you read. Etc. Like the coloring of links that you have read in your browser. But it should be so much more: tracking what you want to read, but have not read yet; what you should be reading.

I don't bemoan the closing of Google Reader, since it fell so far short of what it could and should have been.

On to Memex!!!!

I found that approach to be reader unfriendly. It's not at all practical to read "recent changes." The web is (and definitely was at the time) a browsed medium. Attempts to do serial media on it have always seemed bolted on the side, never quite right. Oh, there have been many efforts.

What I describe, what I used to have, is NOT serial.

RSS is largely serial: a serial stream of updates.

What I want is a browseable interface, but where the first level interface is a summary of how much of interest to me there is at a particular site, or newsgroup, or ...

Many sites provide this - but for those sites by themselves. I haven't found many that span multiple sites [*]. Therefore, I conclude that your objection that this has been tried and failed is incorrect. What I am wishing for is a utility, easily applicable to many sites that do not support RSS.

I feel obliged to add Note*, since I should admit that Google Alerts were something like a cross-site service for this. http://www.google.com/alerts . But Google Alerts are email based - and that's exactly the wrong interface for so much of this. They also require me to write the queries - which is a good start, but so primitive. Proximity? Bayesian? And there is little summarization.

What I wish for is something like

"You are interested in news about ARM/Intel/AMD CPUs/GPUs? There are 65 ARM Cortex A15 articles, 1 Intel Haswell, 9878 Intel bullshit marketing articles (I can dream, eh?) ..."

First of all: As someone who used to read ClariNet at a terminal in the USC library for hours at a time, and for the past several years has handled the guilt instilled by seeing "753 unread items" in Google Reader by going through every item in order (and afterwards feeling like I actually knew what the hell was going on in the world), I know what I want to know about and I want to be sure I know about it. I don't want to have potentially valuable-to-me stuff slip through the cracks. So I am sorry to see Reader go, but I have signed up with a couple of alternative RSS readers (NewsBlur and SwarmIQ), and maybe that will re-solve the problem.

Second, your point about over-posting is right on. No matter how interesting a piece of content a Twitter user puts up, if I go to their page and see lots of Foursquare check-ins, Pope jokes, complaints about lost luggage or barely readable hashtag bait, I can't follow them. The signal-to-noise has to be kept high simply because there is so much signal out there.

Third, and off-topic, I saw your presentation at the 2009 Singularity Summit and was amazed -- I went around telling people about self-driving cars and directing them to your blog. Keep up the good work!

I too got to your blogs through Google Reader. It was great to put all my RSS feeds that I cared about in *one* place.

But no more...sigh.

Any good web based replacements?

Randy

http://www.feedly.com/ has engineered a very quick drop-in replacement for Google Reader. It simply logs into your existing Google Reader feed and goes from there. I presume they will at some point move that to their own site.

It's a bit more "modern" than the Google Reader UI. Which some people may like and some people may not like. I've only used it for a couple of days... So I'm not quite as used to it yet as Google Reader.

I did find this Blog entry through it!

For better or worse I can live with it. Not as big a move as switching from Usenet with trn was :-)

BTW, Google officially announced the death of iGoogle last summer as Nov 17(?) 2012... About the only thing that changed on that date was that the message that iGoogle was terminated as of that date disappeared. Still using it for a couple of things. So it may be that while officially Google Reader is terminated this summer, it may actually remain available for some time after that.

I suspect that a LOT more people use Google Reader than iGoogle.

Yes, there are other readers, and most Google Reader users will switch. However, the effect is huge. The majority of readers of this RSS feed come via Google Reader, and the readers in 2nd and 3rd place are very, very distant. It is inevitable in such a switch that some readers will be lost, perhaps many of them. People who are on the edge -- and there are always many of those -- don't like the inconvenience of moving. You may say they were not very serious readers, but it is important to serve both the dedicated and the casual reader.

Bloglines had been the leader, and when it shut down, there was a serious drop in readership that took at least a year to recover.

I predict the end of Google Reader will deal a significant blow to the RSS world.

For articles popular on news.ycombinator.com, I find that the comments on news.ycombinator.com are often of higher quality than comments on the original post. In part, that's probably because news.ycombinator.com sorts the comments with the most upvotes to the top, whereas the original site puts them in chronological order by posting time, so on news.ycombinator.com it's typically possible to skim or read the comments until you reach the ones of mediocre quality and then stop. Also, news.ycombinator.com lets you read all the recent comments by a particular person, and I don't know of a good search tool to do that for comments on the original sites.

However, for a really thought provoking article, news.ycombinator.com doesn't provide any reasonable mechanism to have a discussion about things one realizes about an article a day after reading it, and I think there's an unsolved problem there.

I find that, when not using any sort of rss reader, ignoring blogs with occasional updates when I'm busy, and then going to those blogs when I'm bored, seems to be more effective at balancing how much time I spend reading with how much time I have than trying to aggregate everything all in one place.

I'd love for there to be a good mechanism by which I could ignore news.google.com for a week, and then see a list of everything that's been an editor's pick in the last week. Likewise, I don't think news.ycombinator.com/best necessarily shows me everything that I'd get by looking at news.ycombinator.com once a day.

But some of these sites that are popular are paid for by their advertisers, and not the people looking at them, and so they are optimized for page views rather than efficient content distribution.


Well then, we should create a grass roots effort to convince Google to keep their Reader!!

(How do you do that?)
