Media

Twitter clients, only shorten URLs as much as you truly need to and make them readable

I think URL shorteners are a curse, but thanks to Twitter they are growing vastly in use. If you don’t know, URL shorteners are sites that will generate a compact encoded URL for you, turning a very long link into a short one that’s easier to cut and paste, and in particular these days, one that fits in the 140-character constraint on Twitter.

I understand the attraction, and not just on Twitter. Some sites generate hugely long URLs which fold over many lines if put in text files or entered for display in comments and other locations. The result, though, is that you can no longer tell from the URL where the link will take you. This hurts the UI of the web, and makes it possible to fool people into going to attack sites or Rick Astley videos. Because of this, some better Twitter clients re-expand the shortened URLs when displaying on a larger screen.

Anyway, here’s an idea for the Twitter clients and URL shorteners, if they must be used. In a tweet, figure out how much room there is for the compacted URL, and work with a shortener that will let you generate a URL of exactly that length. If that length leaves some room, try to put in some elements from the original URL so I can see them. For example, you can probably fit the domain name, especially if you strip off the “www.” from it (in the visible part, not in the real URL). Try to keep as many elements that look like real words as you can, and strip things that look like encoded binary strings and numbers. In the end you’ll need something to make the short URL unique, but not much. And if a short URL has already been created for the target, re-use it.
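
To make the idea concrete, here is a rough sketch in Python (the names are invented and no particular shortener’s API is assumed) of how a client might build the readable part of a short URL, keeping the domain and word-like path pieces and trimming to whatever room the tweet has left:

    import re

    def readable_slug(long_url, max_len, unique_code):
        """Build a human-readable label for a short URL that fits in max_len
        characters: keep the domain (minus "www."), keep path pieces that look
        like real words, drop numeric or encoded junk, and end with the short
        unique code supplied by the shortening service."""
        m = re.match(r'https?://([^/]+)(/[^?#]*)?', long_url)
        host = m.group(1) if m else long_url
        if host.startswith('www.'):
            host = host[4:]
        parts = re.split(r'[/_\-.]+', m.group(2) or '') if m else []
        words = [p for p in parts if p.isalpha() and len(p) > 2]
        slug = host
        for word in words:
            # Stop adding words once the unique code would no longer fit.
            if len(slug) + 1 + len(word) + 1 + len(unique_code) > max_len:
                break
            slug += '/' + word
        return (slug + '-' + unique_code)[:max_len]

    # readable_slug("http://www.example.com/2009/10/robocars-article?id=83a9f2", 40, "x7q")
    # -> "example.com/robocars/article-x7q"

The visible label and the real redirect target would of course be stored separately by the shortener, so the readable part is for humans only.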

Google just launched its own URL shortener. I’m not quite sure what the motives of URL shortener sites are. Sometimes I see redirects that pause at the intermediate site, but nobody wants that, so few shorteners do it. The search engines must have started ignoring URL redirect sites long ago when it comes to PageRank. They take donations and run ads on the pages where people create the tiny URLs, but the ones used on Twitter are almost all automatically generated, so the user never sees the site.

Wanted: An IRC Bot to gateway to a twitter backchannel

It’s now becoming common to kludge a conference “backchannel” onto Twitter. I am quite ambivalent about this. I don’t think Twitter works nearly as well as an internal backchannel, even though there are some very nice and fancy Twitter clients that help make it look nicer.

But the real problem comes from the public/private confusion. Tweets are (generally) public, and even if tagged by a hashtag to be seen by those tracking an event, they are also seen by your regular followers. This has the following consequences, good and bad.

  • Some people tweet a lot while in a conference. They use it as a backchannel. That’s overwhelming to their followers who are not at the conference, and it fills up the feed.
  • When multiple people do it, it’s almost like spam. I believe that conferences like using Twitter as a backchannel because it causes constant mentions of their conference to be broadcast out into the world.
  • While you can filter out a hashtag in many twitter clients, it’s work to do so, and the general flooding of the feed is annoying to many.
  • People tweeting at a conference are never sure who they are talking to. Some tweets will clearly be aimed at fellow conference attendees. But many are just repeats of salient lines said on stage, aimed only at the outsiders.
  • While you can use multiple tags and filters to divide up different concurrent sessions of a conference, this doesn’t work well.
  • The interface on Twitter is kludged on, and poor.
  • Twitter’s 140-character limit is a burden on a backchannel. Backchannel comments are inherently short, so no fixed limit is needed on them. Sure, sometimes you go longer, but never much longer.
  • The Twitter limit forces URLs to be put into URL shorteners, which obscure where they go and are generally a bane of the world.

Dedicated backchannels are better, I think. They don’t reach the outside world unless the outsiders decide to subscribe to them, but I think that’s a plus. I think the right answer is a dedicated, internal-only backchannel, combined with a minimal amount of tweeting to the public (not the meeting audience) for those who want to give their followers some snippets of the conferences their friends are going to. The public tweets may not use a hashtag at all, or a different one from the “official” backchannel as they are not meant for people at the conference.

The most common dedicated backchannel tool is IRC. While IRC has its flaws, it is much better at many things than any of the web applications I have seen for backchannel. It’s faster and has a wide variety of clients available to use with it. While this is rarely done, it is also possible for conferences to put an IRC server on their own LAN so the backchannel is entirely local, and even keeps working when the connection to the outside world gets congested, as is common on conference LANs. I’m not saying IRC is ideal, but until something better comes along, it works. Due to the speed, IRC backchannels tend to be much more rapid fire, with dialog, jokes, questions and answers. Some might view this as a bug, and there are arguments that slowing things down is good, but Twitter is not the way to attain that.

However, we won’t stop those who like to do it via Twitter. As noted, conferences like it because it spams the tweetsphere with mentions of their event.

I would love to see an IRC Bot designed to gateway with the Twitter world. Here are some of the features it might have.
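
As a sketch only (the tweet-fetching part is a placeholder, since the real Twitter search API and its authentication are not shown, and the server name is invented), such a bot could poll for new hashtag tweets and relay them into the conference channel:

    import socket
    import time

    IRC_SERVER, IRC_PORT = "irc.example.org", 6667   # hypothetical conference server
    CHANNEL, NICK = "#backchannel", "tweetgate"

    def fetch_hashtag_tweets(hashtag, since_id):
        """Placeholder: return [(tweet_id, user, text), ...] newer than since_id.
        A real bot would call the Twitter search API here."""
        return []

    def run_gateway(hashtag):
        irc = socket.create_connection((IRC_SERVER, IRC_PORT))
        irc.sendall(f"NICK {NICK}\r\nUSER {NICK} 0 * :twitter gateway\r\n".encode())
        irc.sendall(f"JOIN {CHANNEL}\r\n".encode())
        last_id = 0
        while True:
            for tweet_id, user, text in fetch_hashtag_tweets(hashtag, last_id):
                last_id = max(last_id, tweet_id)
                # Relay each hashtag tweet into the channel, credited to its author.
                irc.sendall(f"PRIVMSG {CHANNEL} :<{user}> {text}\r\n".encode())
            time.sleep(30)   # poll politely; a real bot would also wait for the
                             # server welcome and answer PINGs

    # run_gateway("#conf09")

Going the other direction, echoing selected IRC lines out as tweets, would be the same loop in reverse.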

Make e-Ink tablets an add-on for our phone/PDAs, not stand-alone

It’s over 17 years since I first took a stab at e-Books, and while I was far too early, I must admit I had not predicted I would be that early. The market is now seeing a range of e-Ink based electronic book readers, such as the Kindle, and some reasonable adoption. But I don’t have one yet; instead, I read e-books on my tiny phone screen. Why?

The phone has the huge advantage that it is always with me. It gives me a book any time I am caught waiting. On a train, in a doctor’s office, there is always a way to catch up on reading. It’s not ideal, and I don’t use it to read at home in bed, but it’s there. The tablets are all large, and for a good reading experience, people like them even larger. This means they are only there when you make deliberate plans to read, and pack them in your bag.

I’m not that thrilled with e-Ink yet, both for its low contrast and the annoying way it has to flash black in order to reset, causing a distracting delay when turning the page. There are ways to help that, but as yet it suffers. e-Ink also can’t readily be used for annotation or interactive operation, so many devices will keep a strip of LCD for things like selecting from menus and the like. Many of the devices also waste a lot of space with a keyboard, and the Kindle includes a cellular radio in order to download books. e-Ink does have a huge advantage in battery life.

What makes sense to me instead would be a sheet (or two sheets, folded) of e-Ink with very little in the way of smarts inside the device. Instead, it would be designed so that a variety of cell phones could dock to the e-Ink sheet and provide the brains. Phones have different form factors, of course, and different connectors, though almost all can do USB (annoyingly only as a slave, but this can be kludged around). It would be necessary to make small plastic holders for the different phone models which can mate to a mount on the book display, ideally connecting the data port at the same time. The tablet of course should be able to connect to a laptop via USB (this time as a slave) and do the same reading actions. The docking can also be done, I am reminded by the commenters, by bluetooth, with interesting consequences.

This has many large advantages:

  • Done right, this tablet is a fair bit cheaper. It has minimal brains inside, and no cell phone. In fact, for most people, it also does not include the cost of a cell phone data service. (I presume with the Kindle the cost of that is split between the unit and the book sales, but either way, you pay for it.)
  • The cell phone provides an interactive LCD screen to use with all the reader’s interactive functions — book buying, annotating etc.
  • The cell phone provides a data connection for downloading books, newspapers and web pages.
  • The cell phone provides a keyboard for the few times you use a keyboard on an e-Book reader.
  • When you don’t have your e-Ink tablet, you still have all your books, and can still order books.

The main thing the cell phone doesn’t have is huge battery life. The truth is, however, that cell phones have excellent battery life if they are not turning on their screen or doing complex network apps. We do such activities of course, and they drain our batteries, but we expect that and thus charge regularly and carry more. I’m not too scared of the idea of not being able to read my books with the phone dead.

The tablet could also be used with a laptop, especially a netbook. Laptops can actually run for a very long time if you put them in a power conserving mode, turning off the screen and disks, possibly even suspending the CPU between complex operations.

However, there is no need to run the phone or laptop at all while reading. While I described the tablet as being dumb, it takes very little smarts for it to let you page through a pre-rendered book that was fed to it by the phone or laptop. That can be done with a low-power microcontroller. It just would not do any fancy interactive operations without turning on the phone or laptop. And indeed, for the plain reading of a single book, akin to what you can do with the paper version, it would be able to operate on its own.
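
Here is a sketch of how little logic that takes (an invented class, just to illustrate the division of labour): the phone renders the pages, and the tablet merely steps through stored bitmaps.

    class DumbTablet:
        """Page-flipping logic for a 'dumb' e-Ink tablet: the phone or laptop
        uploads pre-rendered page images, and a tiny microcontroller merely
        steps through them when the page buttons are pressed."""

        def __init__(self):
            self.pages = []      # bitmaps pushed over USB or bluetooth
            self.current = 0

        def load_book(self, rendered_pages):
            self.pages = list(rendered_pages)
            self.current = 0

        def press(self, direction):
            # direction is +1 (next page) or -1 (previous); clamp at the ends.
            if not self.pages:
                return None
            self.current = max(0, min(len(self.pages) - 1, self.current + direction))
            return self.pages[self.current]   # the bitmap to send to the e-Ink panel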

Of course, the vendors would not want to support every phone. But they could cut a deal to let people use old supported phones (which are in plentiful supply as people recycle phones constantly) with a minimal books-only data plan similar to the plans they have cut for the dedicated devices. In the GSM world, they could offer a special SIM good only for book operations for use in an older phone of the class they do support. And they could also build a custom module that slots perfectly into the tablet with the cell modem, small LCD screen and keyboard for those who still want a stand-alone device.

This approach also allows you to upgrade your tablet and your phone independently.

As noted, I think a folding tablet makes a lot of sense. This is true for two reasons. First, you get more screen real estate in half the width of tablet. Secondly, with two e-Ink panels, you can play some tricks so that you flash-refresh the panel you aren’t reading rather than the one you are finishing. While slightly distracting (depending on how it’s done), it means that when you want to switch to the next page, you do it with your eyes, with no delay. You have to push a button when you switch (even going from left to right, where it’s not apparently needed) so that the page you have fully finished refreshes while you are reading the next one. This could also be done with timings, or even with a small camera watching your eyes, though I was trying to make the tablet dumber and this takes CPU and power right now. I can imagine other tricks that would work, such as sensing how you hold the tablet (capacitive detection of your grip, or accelerometer detection of the angle).

The tablet could also be built so the two pages of e-Ink are on the front and back. In this case it would not fold, though a slipcover would be a good idea. A “flip tablet” would display page 1 to you with page 2 on the back. To read page 2 you would physically flip it over. It would detect that of course, and change page 1 to page 3 when it was on the other side. This would mean the distraction of the flash-refresh would not be visible to you, which is a nice plus.

Cutely, the flip tablet could detect which direction you flip it. So if you flip it counter clockwise, you get the next page. If you flip it clockwise you get the previous page. Changing direction means you might briefly see the flash while you are flipping the unit but the UI seems pretty good to me. For those who don’t like this interface, the unit could still hinge out in the middle to show both pages at once.

Bluetooth connection

Using bluetooth for the connection has a number of interesting consequences. It does use power, and does not allow exchange of power between the tablet and device, but it means you don’t have to physically put the phone on the tablet at all. This may be a pain in some circumstances (needing two hands to do interactive things) but in other circumstances having a remote control to use to flip pages can be a real win.

I have found a very nice way to do e-reading is to have the pages displayed in front of you, at eye height, rather than down low in your hands. In particular, if you can mount your tablet on the top back of an airplane seat, it is much more comfortable than holding a book or tablet in your hands. The main downside is that the overhead light does not shine on the page there, so you need a backlight or LED book light. The ability to do remote control from your phone, in your rested hand would be great. Unfortunately they have the strange idea that they want to ban bluetooth on planes, though it poses no risk. They don’t even like wires.

In the 90s I built a device for reading books on planes where I got a book holder (they do make those) and I rigged it to attach with velcro and hang from the back of the seat in front. In those days it was quite common to have velcro on the top of the seat. Combined with a book light, I found this to be way more comfortable than holding a book in my hands, and I read much more pleasantly. Today you might have to build it so that a plate wedges between the raised table and seatback and a rod sticks out to hold the tablet.

Or on some planes they could support e-books on the screen in the seatback, with the remote control that is in your armrest. Alas, they would indeed need to use bluetooth so your PDA could display the book on that screen. (In general, letting your PDA use the screen in front of you would be very nice. It’s too sucky a resolution for laptops, since it must have been designed years ago in an era of sucky resolution. Today 1650 x 950 displays cost $100.)

How to do a distributed Twitter (MSM)

Dave Winer recently made a call for an open source Twitter shell, which he suggests could perhaps be done with a JavaScript framework to let any site act like twitter.com. Many people are interested in this sort of suggestion, because while the folks at twitter.com are generally well loved and felt to be good actors, many people fear that no publishing system that becomes important should be controlled by just one company.

For success, such a system would need to be as easy to use and set up as Twitter for users, and pretty easy to set up for server operators. One thing it can’t do so easily, alas, is use a simple single namespace the way Twitter does. A distributed system probably has to make names be domains, like e-mail addresses. That almost surely means something longer than Twitter names and no use of the @name syntax popular in Twitter to refer to users. On the other hand, almost everybody already has a domain-based ID, i.e. their e-mail address. However, most people are afraid to use this ID in public where it might get spam. It’s a shame, but many might well prefer to get a different ID from their e-mail, or of course to use one at Twitter, which would now look like user@twitter.com to the outside world instead of @user within Twitter.

Naming problems aside, the denizens of the internet are certainly up to building a publish/subscribe based short message multicasting service, which is what Twitter is, using terms much older than the company. I might propose the name MSM for the technology (Multicast Short Message).
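
To illustrate what “publish/subscribe short message multicasting” means in code, here is a toy model (not Winer’s proposal or any real protocol): each site runs a hub that fans out messages from its domain-qualified users to whoever subscribes.

    from dataclasses import dataclass, field
    from typing import Callable, Dict, List

    @dataclass
    class ShortMessage:
        author: str       # domain-qualified ID, e.g. "user@twitter.com"
        text: str         # the short message body
        timestamp: float

    @dataclass
    class MulticastHub:
        """One site's hub: local users publish; local or remote parties subscribe."""
        subscribers: Dict[str, List[Callable[[ShortMessage], None]]] = field(default_factory=dict)

        def subscribe(self, author: str, deliver: Callable[[ShortMessage], None]) -> None:
            self.subscribers.setdefault(author, []).append(deliver)

        def publish(self, msg: ShortMessage) -> None:
            # Fan the message out to every subscriber of this author.
            for deliver in self.subscribers.get(msg.author, []):
                deliver(msg)

    # hub = MulticastHub()
    # hub.subscribe("someone@twitter.com", print)
    # hub.publish(ShortMessage("someone@twitter.com", "Hello, distributed world", 1257894000.0))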

What's this odd twitter spam about?

Some recent searches have revealed unusual activity on Twitter, and I wonder where it’s going. Narcissus searches on Twitter reveal a variety of accounts tweeting links to my blog and sites, for reasons not clearly apparent.

For example, a week ago, a half dozen identical Twitter accounts all tweeted my post about electric cars playing music. All the accounts had pictures of models as their icon, and the exact same set of Twitter posts, which seem to be a random collection of blog and news URLs with a bit.ly pointer to the item, all posted via twitterfeed. These accounts seem to follow and be followed by about 500 accounts, presumably the same list.

Here are some of the accounts:

Then more recently I see another set of accounts which all follow about 20 people but are followed by about 200 to 500. They are all posting “from API” and again are just posting links, this time with tinyurl.com. The account names are odd, too.

  • sheen0uz
  • http://twitter.com/moshelir3u
  • http://twitter.com/felecin9v

These also seem to have cute girls as icons. However, strangely, many of the followers appear to be real, or at least some of them do. Why are people following a spam robot? Are the followers people who were paid to do it, or are they in some twitter-optimization scheme?

What I am curious about is the motive. Are they linking to real sites in the hope of gaining some sort of legitimacy in twitter indexing engines, so that later they can start linking to people who pay for it? (Twitter SEO?) Are they trying to form twitter equivalents of link farms? Are they just hoping that site authors will see the backlinks and look at them for some later purpose? (You would be amazed how many hits on a web server are there just to put a spammer in the “Referer” field, either to get you to look, or to show up in referer logs that some sites post to the web.)

Thoughts on what’s up?

Do you get Twitter? Is a "sampled" medium good or bad?

I just returned from Jeff Pulver’s “140 Characters” conference in L.A. which was about Twitter. I asked many people if they get Twitter — not if they understand how it’s useful, but why it is such a hot item, and whether it deserves to be, with billion dollar valuations and many talking about it as the most important platform.

Some suggested Twitter is not as big as it appears, with a larger churn than expected and some plateau appearing in new users. Others think it is still shooting for the moon.

The first value I found in Twitter was as a broadcast SMS. While I would not text all my friends when I go to a restaurant or a club, having a way for them to easily know that (and perhaps join me) is valuable. Other services have tried to do things like this, but Twitter is the one that succeeded, in spite of not being aimed at any specific application like this.

This explains the secret of Twitter. By being simple (and forcing brevity) it was able to be universal. By being more universal it could more easily attain critical mass within groups of friends. While an app dedicated to some social or location based application might do it better, it needs to get a critical mass of friends using it to work. Once Twitter got that mass, it had a leg up at being that platform.

At first, people wondered if Twitter’s simplicity (and requirement for brevity) was a bug or a feature. It definitely seems to have worked as a feature. By keeping things short, Twitter makes it less scary to follow people. It’s hard for me to get new subscribers to this blog, because subscribing to the blog means you will see my moderately long posts every day or two, and that’s an investment in reading. To subscribe to somebody’s Twitter feed is no big commitment. Thus people can get a million followers there, when no blog has that. In addition, the brevity makes it a good match for the mobile phone, which is the primary way people use Twitter. (Though usually the smart phone, not the old SMS way.)

And yet it is hard not to be frustrated at Twitter for being so simple. There are so many things people do with Twitter that could be done better by some more specialized or complex tool. Yet it does not happen.

Twitter has made me revise slightly my two axes of social media — serial vs. browsed and reader-friendly vs. writer friendly. Twitter is generally serial, and I would say it is writer-friendly (it is easy to tweet) but not so reader friendly (the volume gets too high.)

However, Twitter, in its latest mode, is something different. It is “sampled.” In normal serial media, you usually consume all of it. You come in to read and the tool shows you all the new items in the stream. Your goal is to read them all, and the publishers tend to expect it. Most Twitter users now follow far too many people to read it all, so the best they can do is sample — they come in at various times of day and find out what their stalkees are up to right then. Of course, other media have also been sampled, including newspapers and message boards, just because people don’t have time, or because they go away for too long to catch up. On Twitter, however, going away for even a couple of hours will give you too many tweets to catch up on.

This makes Twitter an odd choice as a publishing tool. If I publish on this blog, I expect most of my RSS subscribers will see it, even if they check a week later. If I tweet something, only a small fraction of the followers will see it — only if they happen to read shortly after I write it, and sometimes not even then. Perhaps some who follow only a few will see it later, or those who specifically check on my postings. (You can’t. Mine are protected, which turns out to be a mistake on Twitter but there are nasty privacy results from not being protected.)

TV has an unusual history in this regard. In the early days, there were so few stations that many people watched, at one time or another, all the major shows. As TV grew to many channels, it became a sampled medium. You would channel surf, and stop at things that were interesting, and know that most of the stream was going by. When the Tivo arose, TV became a subscription medium, where you identify the programs you like, and you see only those, with perhaps some suggestions thrown in to sample from.

Online media, however, and social media in particular were not intended to be sampled. Sure, everybody would just skip over the high volume of their mailing lists and news feeds when coming back from a vacation, but this was the exception and not the rule.

The question is, will Twitter’s nature as a sampled medium be a bug or a feature? It seems like a bug but so did the simplicity. It makes it easy to get followers, which the narcissists and the PR flacks love, but many of the tweets get missed (unless they get picked up as a meme and re-tweeted) and nobody loves that.

On Protection: It is typical to tweet not just blog-like items but the personal story of your day. Where you went and when. This is fine as a thing to tell friends in the moment, but with a public twitter feed, it’s being recorded forever by many different players. The ephemeral aspects of your life become permanent. But if you do protect your feed, you can’t do a lot of things on twitter. What you write won’t be seen by others who search for hashtags. You can’t reply to people who don’t follow you. You’re an outsider. The only way to solve this would be to make Twitter really proprietary, blocking all the services that are republishing it, analysing it and indexing it. In this case, dedicated applications make more sense. For example, while location based apps need my location, they don’t need to record it for more than a short period. They can safely erase it, and still provide me a good app. They can only do this if they are proprietary, because if they give my location to other tools it is hard to stop them from recording it, and making it all public. There’s no good answer here.

ClariNet history and the 20th anniversary of the dot-com

Twenty years ago (Monday) on June 8th, 1989, I did the public launch of ClariNet.com, my electronic newspaper business, which would be delivered using USENET protocols (there was no HTTP yet) over the internet.

ClariNet was the first company created to use the internet as its platform for business, and as such this event has a claim at being the birth of the “dot-com” concept which so affected the world in the two intervening decades. There are other definitions and other contenders which I discuss in the article below.

In those days, the internet consisted of regional networks, which were mostly non-profit cooperatives, and the government-funded “NSFNet” backbone which linked them up. That backbone had a no-commercial-use policy, but I found a way around it. In addition, a nascent commercial internet was arising with companies like UUNet and PSINet, and the seeds of internet-based business were growing. There was no web, of course. The internet’s community lived in e-mail and USENET. Those, and FTP file transfer, were the means of publishing. When Tim Berners-Lee coined the term “the web” a few years later, he would call all of these the web, with HTML/HTTP as a new addition and the glue connecting them.

I decided I should write a history of those early days, where the seeds of the company came from and what it was like before most of the world had even heard of the internet. It is a story of the origins and early perils and successes, and not so much of the boom times that came in the mid-90s. It also contains a few standalone anecdotes, such as the story of how I accidentally implemented a system so reliable, even those authorized to do so failed to shut it down (which I call “M5 reliability” after the Star Trek computer), stories of too-early eBook publishing and more.

There’s also a little bit about some of the other early internet and e-publishing businesses such as BBN, UUNet, Stargate, public access unix, Netcom, Comtex and the first Internet World trade show.

Extra, extra, read all about it: The history of ClariNet.com and the dawn of the dot-coms.

Towards better pseudonym posting on message boards - casual commenting.

As you may know, I allow anonymous comments on this blog. Generally, when a blog is small, you don’t want to do too much to discourage participation. Making people sign up for an account (particularly with email verification) is too much of a barrier when your comment volume is small. You can’t allow raw posting these days because of spammers — you need some sort of captcha or other proof-of-humanity — but in most cases moderate readership sites can allow fairly easy participation.

Once a site gets very popular, it probably wants to move to authenticated user posting only. In this case, once the comment forums are getting noisy, you want to raise the bar and discourage participation by people who are not serious. My sub blog on Battlestar Galactica has gotten quite popular of late, and is attracting 100 or more comments per post, even though it has only 1/10th the subscribers of the main blog. Almost all post using the anonymous mechanism which lets them fill in a name, but does nothing to verify it. Many still post under the default name of “Anonymous.”

Some sites let you login using external IDs, such as OpenID, or accounts at Google or Yahoo. On this site, you can log in using any ID from the drupal network, in theory.

However, drupal (which is the software running this site) and most other comment/board systems are not very good at providing an intermediate state, which I will call “casual comments.” Here’s what I would like to see:

  • Unauthenticated posters may fill in parameters as they can now (like name, email, URL) and check a box to be remembered. They would get a long-term cookie set. The first post would indicate the user was new.
  • Any future posts from that browser would use that remembered ID. In fact, they would need to delete the cookie or ask the site to do so in order to change the parameters.
  • If they use the cookie, they could do things like edit their postings and several of the things that registered users can do.
  • If they don’t pick a name, a random pseudonym would be assigned. The pseudonym would never be re-used.
  • Even people who don’t ask to be remembered would get a random pseudonym. Again, such pseudonyms would not be re-used by other posters or registered users. They might get a new one every time they post. Possibly it could be tied to their IP, though not necessarily traceable back to it, but of course IPs change at many ISPs.
  • If they lose the cookie (or move to another computer) they can’t post under that name, and must create a new one. If they want to post under the same name from many machines, create an account.
  • The casual commenters don’t need to do more special things like create new threads, and can be quite limited in other ways.

In essence, a mini-account with no authorization or verification. These pseudonyms would be marked as unverified in postings. A posting count might be displayed. A mechanism should also exist to convert the pseudonym to a real account you can login from. Indeed, for many sites the day will come when they want to turn off casual commenting if it is getting abused, and thus many casual commenters will want to convert their cookies into accounts.

The main goal would be to remove confusion over who is posting in anonymous postings, and to stop impersonation, or accusations of impersonation, among casual posters.

I don’t think it would be too hard to make a module for drupal to modify the comment system like this, if I knew drupal better.
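
For what it’s worth, the core of the casual-comment logic is small. Here is a rough sketch in plain Python (not a drupal module, and the names are invented):

    import secrets

    used_pseudonyms = set()   # names already taken by accounts or past casual posters

    def new_pseudonym():
        """Issue a random pseudonym that has never been used before."""
        while True:
            name = "guest-" + secrets.token_hex(3)
            if name not in used_pseudonyms:
                used_pseudonyms.add(name)
                return name

    def identify_casual_commenter(cookies, requested_name=None, remember=False):
        """Return (display_name, cookie_to_set) for an unauthenticated commenter."""
        if "casual_id" in cookies:
            # Returning commenter: the remembered identity wins over the form fields.
            return cookies["casual_id"], None
        name = requested_name or new_pseudonym()
        if remember:
            # Long-term cookie; lose it (or switch machines) and you post as someone new.
            return name, ("casual_id", name)
        return name, None

Converting such a cookie into a real account would then presumably just mean attaching the remembered name to a newly registered user.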

Going paperless by making manuals easier to find

As I move to get more paper out of my life, one thing I’m throwing away with more confidence is manuals. It’s pretty frequent that I can do a search for product model numbers or other things on a manual, and find a place to download the PDF. Then I can toss the manual. I need to download the PDF, because the company might die and their web site might go away.

I would like to make this even easier. For starters, it would be nice if the UPC database (UPCs are the bar codes found on all retail products) would also offer a link to all the manuals and paper that come with a product. I would then be able to just photograph the bar codes of all my products with my phone or camera, and cause automatic download or escrow of all the manuals. Perhaps a symbol next to the UPC could tell me this is guaranteed to work.

It would be even better if companies escrowed the manuals, which is to say paid a one-time fee to a trustable company which would promise to keep the documents online forever. This escrow service must itself be very solidly backed, perhaps by a consortium of all the major vendors with a pact that if any of them go under, the rest take up the slack of maintaining the site.

In fact, all free, public documents should have a code on them that can be turned into a URL where I can fetch the document, as PDF, HTML or even MSWord. Any attempt to scan such a document would pick up this code and know it doesn’t have to scan the rest unless it is marked up. For books, we should key off the ISBN as well as the UPC. Eventually one of the newer, compact 2-D “barcodes” could be used to encode a number to find the docs.
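
The client side of this would be trivial, assuming some hypothetical escrow registry keyed by UPC (the URL below is invented):

    import urllib.request

    MANUAL_REGISTRY = "https://manuals.example.org/upc/"   # hypothetical escrow site

    def fetch_manual(upc, dest):
        """Download the escrowed PDF manual for a product, keyed by its UPC code."""
        url = MANUAL_REGISTRY + upc + ".pdf"
        urllib.request.urlretrieve(url, dest)
        return dest

    # fetch_manual("012345678905", "toaster-manual.pdf")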

Of course, many products are now coming without manuals at all, and that’s largely fine with me.

Battlestar Galactica sub-blog returns to activity

Some of you may know that I started a sub-blog for my thoughts on my favourite SF TV show, Battlestar Galactica. This sub-blog was dormant while the show was off the air, but it’s started up again with new analysis as the first new episode of the final 10 (or 12) episodes airs tonight. (I will be missing watching it near-live as I will be giving a talk tonight on Robocars at the Future Salon in Palo Alto.) Reports are that one big mystery — the last Cylon — is revealed tonight.

So if you watch Battlestar Galactica, you may want to subscribe to the feed for the Battlestar Galactica Analysis Blog right here on this site. And I’ll go out on a limb and promote my two top candidates for the mystery Cylon.

Some recent posts of note:

Being the greatest athlete ever

NBC has had just a touch of coverage of Michael Phelps and his 8 gold medals, which in breaking Mark Spitz’s 7 from 1972 has him declared the greatest Olympic athlete, or even athlete of all time. And there’s no doubt he’s one of the greatest swimmers of all time and this is an incredible accomplishment. Couch potato that I am, I can hardly criticise him.

(We are of course watching the Olympics in HDTV using MythTV, but fast-forwarding over the major bulk of it. Endless beach volleyball, commercials and boring events whiz by. I can’t imagine watching without such a box. I would probably spend more time, which they would like, but be less satisfied and see fewer of the events I wish to.)

Phelps got 8 golds, but 3 of them were relays. He certainly contributed to those relays, may well have made the difference for the U.S. team, and allowed it to win a gold it would not have won without him. So it seems fair to add them, no?

No. The problem is you can’t win relay gold unless you are lucky enough to be a citizen of one of a few powerhouse swimming nations, in particular the USA and Australia, along with a few others. Almost no matter how brilliant you are, if you don’t compete for one of these countries, you have no chance at those medals. So only a subset of the world’s population even gets to compete for the chance to win 7 or 8 medals at the games. This applies to almost all team medals, be they relay or otherwise. Perhaps the truly determined can emigrate to a contending country. A pretty tall order.

Phelps won 5 individual golds, and that is also the record, though it is shared by 3 others. He has more golds than anybody, though other athletes have more total medals.

Of course, swimming is one of the special sports in which there are enough similar events that it is possible to attain a total like this. There are many sports that don’t even have 7 events a single person could compete in. (They may have more events but they will be divided by sex, or weight class.)

Shooting has potential for a star. It used to even be mixed (men and women) until they split it. It has 9 male events, and one could in theory be master of them all.

Track and Field has 47 events split over men and women. However, it is so specialized in how muscles are trained that nobody expects sprinters to compete in long events or vice versa. Often the best sprinter does well in Long Jump or Triple Jump, allowing the potential of a giant medal run for somebody able to go from 100m to 400m in range. In theory there are 8 individual events 400m or shorter.

And there are a few other places. But the point is that to do what Phelps (or Spitz) did, you have to be in a small subset of sports, and be from a small set of countries. There have been truly “cross sport” athletes at the Olympics but in today’s world of specialized training, it’s rare. If anybody managed to win multiple golds over different sports and beat this record, then the title of greatest Olympian would be very deserving. One place I could see some crossover is between high-diving and Trampoline. While a new event, Trampoline seems to be like doing 20 vaults or high dives in a row. And not that it wasn’t exciting to watch him race.

More Burning Man packing…

Guarantee CPM if you want me to join your ad network

If you run a web site of reasonable popularity, you probably get invitations to sign up for ad networks from time to time. They want you to try them out, and will sometimes talk a great talk about how well they will do.

I always tell them “put your money where your mouth is — guarantee at least some basic minimum during the trial.”

Most of them shut up when I ask for that, indicating they don’t really believe their own message. I get enough of these invitations that I wrote a page outlining what I want, and why I want it — and why everybody should want it.

If you have a web site with ads, and definitely if you have an ad network, consider reading what I want before I’ll try your ad network.

Just when you thought it was safe to buy a blu-ray player

The last week saw some serious signs that Blu-Ray could win the high-def DVD war over HD-DVD. Many people have been waiting for somebody to win the war so that they don’t end up buying a player and a video collection in the format that loses. (Strangely, the few players that supported both formats tended to cost much more than two individual players.)

Now there’s a report that the new profile for Blu-ray will obsolete many old players. So even those who made the right bet and didn’t get a PS3 may be just as screwed.

Something has amazed me since the days of the first Audio CD players in the 80s. The Audio CD redbook format was defined early, and it was a lot of work to get reasonable combined audio + data disks because of it. And long after burnable CDs became popular (and into DVDs), it’s been the case that many home players can’t read the disks at all until they are “finalized” and unable to take more data. There were many other problems. But that by itself is not the real issue, as there will always be demands you don’t anticipate.

But it’s not as though these devices don’t have a readily available means by which to be given new programming. They have a drive in them, and it would have been easy to issue CDs or DVDs with signed new firmwares on them. Indeed, since the disks have always been vastly huge compared to the firmwares of the devices they played on, it’s usually been the case that a disk wishing to use a new format could probably include new firmware for every known player in a small part of the disk. That’s certainly true for blu-ray.

Now of course if a player doesn’t have enough memory or CPU or graphics power, you can’t update it to do things it simply isn’t capable of doing. But you should be able to always update it to understand at least the structure of new formats, and know what they can use and what they can’t. Of course, all updates must be signed by a highly protected manufacturers key, so that attackers can’t hack your firmware, and the user should have to confirm on their remote that they want to accept the update. And yes, if that key is compromised and people don’t insert a disk with a revocation command on it quickly enough, there can be trouble. But it’s better than having players that slow down progress in the business.
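
The update logic a player would need is not complicated. Here is a sketch of it in Python (the signature check and flash routine are placeholders for whatever a manufacturer would actually use):

    def verify_signature(image, signature, maker_public_key):
        """Placeholder: a real player would verify an RSA/ECDSA signature made
        with the manufacturer's closely guarded private key."""
        raise NotImplementedError

    def flash_firmware(image):
        """Placeholder: write the image to flash, keeping the fallback loader intact."""
        raise NotImplementedError

    def maybe_update_from_disc(disc_files, user_confirms, maker_public_key):
        """Look for a properly signed firmware image on an inserted disc and,
        with the user's approval from the remote, install it."""
        image = disc_files.get("firmware.bin")
        signature = disc_files.get("firmware.sig")
        if image is None or signature is None:
            return False                          # nothing to install on this disc
        if not verify_signature(image, signature, maker_public_key):
            return False                          # reject anything not signed by the maker
        if not user_confirms("Install firmware update from this disc?"):
            return False                          # the user must approve on screen
        flash_firmware(image)
        return True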

(And yes, I realize that many early CD players did not have rewritable firmware, since ROM was cheaper than EEPROM and flash didn’t come along for a while. But it would have been worth it, and there’s no excuse for not having safely flashable firmware today in just about anything.)

And on another rant, I’ve always been amazed at the devices that do allow firmware flashing but don’t have a safety mechanism. There are many devices, some still made today, that can be turned into “bricks” if you flash buggy firmware, in that you can no longer flash new firmware. Every device should have, in unwritable storage, the most basic and well tested firmware reloader that can be invoked if the recently installed firmware has failed. Some devices have this but it’s taken a long time.

While I don’t really seek a game machine because it would suck up too much time, it may be time for a PS3 as a Blu-Ray player. They are not much more expensive than the standalone players, and of course do much more. If I wanted a game machine it would probably be a Wii. We found one this year for a gift for the nephews, but they got another one so I ended up selling it for a $100 profit on Craigslist, the prevailing market being what it was. Made an Egyptian boy very happy, as they are very hard to get over there.

Old think on data storage for movies

A story from the New York Times suggests it costs over $12,000/year to store a movie in digital form.

This number is entirely bogus, and based on old thinking, namely the assumptions of offline storage on DVDs and tapes. Offline media do degrade, and you must copy them before they have a chance to degrade, which takes people, though frankly it still should not be as expensive as this. To do my calculations, I am going to assume a movie needs 100gb of storage with low-loss lossy compression. You can scale the numbers up if you want to assume more; even at 1 TB it doesn’t change that much.

A film occupying 100gb of storage can go on about 20 DVDs (or 11 dual-layer ones), costing about $8. It can go on 4 independent sets of 20 DVDs for $32 in media. Ideally you could rack these in a DVD jukebox, but if they are just sleeved, then once a year a person could pull out the DVDs and put them in a reader which would test them. Any that tested fine would be re-sleeved; any that did not would be flagged so the matching disks could be pulled and copied to new media. (Probably better media, like Blu-ray.) There are algorithms to distribute the data so that a large number of the disks must fail in that year to actually lose something. Of course, you use different vaults around the world. When approaching the point where failure rates go up for the media, you re-burn new copies even if the old ones still test fine.

This takes human time, though not all that much. Perhaps half an hour of actual human time swapping disks though much more real time to burn them, but you don’t do just one at a time.

However, even better is the new style of archival — online storage. Hard disks are 20 cents/gigabyte and continuing to fall. NAS boxes are more expensive now, but there is no reason they won’t drop to very reasonable prices, so that a NAS case adds perhaps 5 cents/gigabyte (i.e. $100 for a 4x500gb drive box which lasts for 10-15 years). (NAS boxes are small boxes that hold a collection of drives and allow access to them over ethernet. No computer is needed.) They also cost about 2 cents/gb/year for power if on all the time, and some small amount for space, though they would tend to sit in computer centers that already exist.

Those are today’s prices, which will just get cheaper, except for the power. Much cheaper. If a drive lasts an average of 4 years before failing and a NAS lasts 10 years, this works out to 7.5 cents/gigabyte/year. Of course you will store your files redundantly, in 4 different places (which is actually overkill) and so it’s 30 cents/gigabyte/year.

Which is still just $30 for a 100gb file, or $300 for a TB.
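
Working the arithmetic out from the prices assumed above:

    # Archival cost per gigabyte-year, using the prices assumed in the text.
    drive_cost_per_gb = 0.20      # $100 for a 500 GB drive
    drive_life_years = 4
    nas_cost_per_gb = 0.05        # the enclosure's share, over its life
    nas_life_years = 10
    power_per_gb_year = 0.02

    per_gb_year = (drive_cost_per_gb / drive_life_years    # 0.050
                   + nas_cost_per_gb / nas_life_years      # 0.005
                   + power_per_gb_year)                    # 0.020 -> 7.5 cents/GB/year
    copies = 4
    print(round(copies * per_gb_year * 100, 2))            # 30.0 dollars/year for a 100 GB film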

Online storage is live. You can regularly check the integrity, all the time. You can either leave it off and spin it up every few days (to not use power) or just leave it on all the time. If one, two or three of the 4 disks fails, computers can copy the data to fresh disks in the network, and you are alive. Your disks should last 3 to 4 years but many will last much longer. You need a computer system to control all this, but you only need one for the entire cloud of NAS boxes, or at most a few. Its cost is low.

The real cost is people. But companies like Google have solved the problem of running large server farms. They tolerate single drive failures. The computers copy the data to new drives right away, and technicians go by every few days to pull old ones and slot in fresh ones for the next need — not for the same file. This takes just a few minutes of the tech’s time. And there is no rush to their work. For each 100gb file, you should expect to need a replacement about once every 4 years (i.e. the lifetime of an average drive.)

Now all this is at today’s price of $100 for a 500gb drive. But that’s dropping fast, faster than Moore’s law. The replacements will be 1TB and 2TB drives before long, and the cost will continue to fall. And this is with 4 copies of every file. You can actually get by with less using modern data distribution algorithms which can scatter a file of 100gb into 200 1gb pieces, for which almost half must be lost before the whole file is lost. Several data centers could burn down without losing any files if things are done right. I have not accounted for bandwidth here for replacements, which usually would be done in the same data center except in unusual circumstances.
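
The piece-scattering scheme mentioned above works out roughly like this, assuming an ideal erasure code where any 100 of the 200 pieces are enough to rebuild the file:

    # Erasure-coded spread vs. plain replication, using the figures in the text.
    data_pieces, total_pieces = 100, 200          # a 100 GB file in 200 one-GB pieces
    tolerated_losses = total_pieces - data_pieces # up to 100 pieces can vanish
    storage_overhead = total_pieces / data_pieces # 2x, versus 4x for four full copies
    print(tolerated_losses, storage_overhead)     # 100 2.0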

The biggest cost is the people to set all this up. However, presuming big demand, the cost per gigabyte for those people should become modest.

Writers' Strike threatening Porn Industry

The strike by screenwriters in the Porn Writers Guild of America is wreaking a less public havoc on the pornography industry. Porn writers, concerned about declining revenue from broadcast TV, also seek a greater share of revenue from the future growth areas of DVD and online sales.

“Online sales and DVD may one day be the prime sources of revenue in our industry,” stated union spokesman Seymour Beaver. “We want to be sure we get our fair share of that for providing the writing that makes this industry tick.”

“It’s getting terrible,” reported one porn consumer who refused to give his name. “I just saw Horny Nurses 14 and I have to tell you it was just a rehash of the plots from Horny Nurses 9 and 11. It’s like they didn’t even have a writer.”

“Fans are not going to put up with movies lacking in plot, character and dialogue, and that’s what they’ll get if they don’t meet our terms,” said Beaver. Beaver, who claims to have a copyright on the line, “Oh yes, baby, do it just like that, oh yeah” says he will not allow use of his lines without proper payment of residuals.

Some writers also fear that the move to online will result in customers simply downloading individual scenes rather than seeking movies with a cohesive story thread that makes you care about the characters. “I saw one movie with 5 scenes, and no character was in 2 of them,” complained one writer.

“What do people want? Movies where the actors just walk into a room, strip and just go at it? Where they always start with oral sex, then doggy, and then a money shot? Fans will walk if that’s all they get,” according to PWGA member Dick Member. “And don’t think about doing the lonely housewife and the pool-boy again. I own that.”

An industry spokesman said they had not yet seen any decline in revenues due to the strike, as they have about 2 million already-written scripts on the shelves. In addition, Hot Online Corporation spokesman Ivana Doit claimed their company is experimenting with a computer program that creates scripts through a secret algorithm. Scripts penned by the computer have already brought in a million in sales, claims Doit, but she would not indicate which films this applied to.

Converting vinyl to digital, watch the tone arm

After going through the VHS-to-digital process, which I lamented earlier, I started wondering about the state of digitizing old vinyl albums and tapes.

There are a few turntable/cd-writer combinations out there, but like most people today, I’m interested in the convenience of compressed digital audio, which means I don’t want to burn to CDs at all, nor would I want to burn to 70-minute CDs I have to change all the time just so I can compress later. But all this means I am probably not looking for audiophile quality, or I wouldn’t be making MP3s at all. (I might be making FLACs or sampling at a high rate, I suppose.)

What I would want is convenience and low price. Because if I have to spend $500 I probably would be better off buying my favourite 500 tracks at online music stores, which is much more convenient. (And of course, there is the argument over whether I should have to re-buy music I already own, but that’s another story. Some in the RIAA don’t even think I should be able to digitize my vinyl.)

For around $100 you can also get a “USB turntable.” I don’t have one yet, but the low end ones are very simple — a basic turntable with a USB sound chip in it. They just have you record into Audacity. Nothing very fancy. But I feel this is missing something.

Just as the VHS/DVD combo is able to make use of information like knowing the tape speed and length, detecting index marks and blank tape, so should our album recorder. It should have a simple sensor on the tone arm to track its position as it moves over the album (for example, a disk on the arm’s axis with rings of very fine lines and an optical sensor). It should be able to tell us when the album starts, when it ends, and also detect those 2-second-long periods between tracks when the tone arm is suddenly moving inward much faster than it normally is. Because that’s a far better way to break the album into tracks than silence detection. (Of course, you can also use CDDB/Freedb to get track lengths, but they are never perfect, so combining the arm data, net data and silence detection should get you perfect track splits.) It would also detect skips and repeats this way.
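
A sketch of the detection step, assuming the arm sensor reports its angle a few times a second and that the angle grows as the arm moves inward (the threshold is a guess a real device would calibrate):

    def find_track_breaks(arm_angles, samples_per_second, fast_threshold_deg_per_s=5.0):
        """Return the times (in seconds) where the tone arm starts sweeping inward
        much faster than normal playing speed, i.e. the gap between tracks."""
        breaks = []
        in_gap = False
        for i in range(1, len(arm_angles)):
            velocity = (arm_angles[i] - arm_angles[i - 1]) * samples_per_second
            if velocity > fast_threshold_deg_per_s and not in_gap:
                breaks.append(i / samples_per_second)   # start of a fast inward sweep
                in_gap = True
            elif velocity <= fast_threshold_deg_per_s:
                in_gap = False
        return breaks

Those break times could then be reconciled with the CDDB/Freedb track lengths and with silence detection to pick the final split points.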

The scarcity of Talent

At Supernova 2007, several of us engaged Andrew Keen over his controversial book "The Cult of the Amateur." I will admit to not yet having read the book. Reviews in the blogosphere are scathing, but of course the book is entirely critical of the blogosphere so that's not too unexpected.

However, one of the things Keen said he worries about is what he calls the "scarcity of talent." He believes the existing "professional" media system did a good enough job at encouraging, discovering and promoting the talent that's out there, and so the world doesn't get more than slush with all the new online media. The amount of talent, he felt, was very roughly constant.

I presented one interesting counter to this concept. I am from Canada. As you probably know, we excel at Hockey. Per capita certainly, and often on an absolute scale, Canada will beat any other nation in Hockey. This is only in part because of the professional leagues. We all play hockey when we are young, and this has no formal organization. The result is that more talented players arise. The same is true for the USA in Baseball but not in Soccer, and so on.

This suggests that however much one might view YouTube as a vaster wasteland of terrible video, the existence of things like YouTube will eventually generate more and better videographers, and the world will be richer for it, at least if the world wants videographers. One could argue this just takes them away from something else, but I doubt that accounts for all of it.

The Efficiency of Attention in Advertising

I’ve written before about the problems with TV advertising. Recently I’ve been thinking more about the efficiency of various methods of advertising — to the target, not to the advertiser. Almost all studies of advertising concern how effectively advertising turns into leads or sales, but rarely are the interests of the target of the ad considered directly.

I think that has to change, because we’re getting more tools to avoid advertising and getting more resistant. I refuse to watch TV with ads, because at $1.20 per hour of advertising watched, it’s a horrible bargain. I would rather pay if I could, and do indeed buy the DVDs in many cases, but mostly my MythTV skips the ads for me. The more able I am to do this, the more my desires as a target must be addressed.

Advertising isn’t totally valueless to the target. In fact, Google feels one big reason for their success is that they deliver ads you might actually care to look at. There are other forms of advertising with the same mantra out there, and they tend to do well, such as movie trailers and Superbowl ads.

Consider a video ad lasting 30 seconds, with a $10 CPM. That means the advertiser pays one cent per viewer of the ad. The viewer spends 30 seconds. On the other hand, a box with 3 or 4 Google ads, as you might see on this page, is typically scanned in well under a second. These ads also earn (as a group) about a $10 CPM though they are paid per click. Google doesn’t publish numbers, but let’s assume a $10 CPM and a 1% click-through on the box. It’s actually higher than this.

In the 30 seconds a TV ad takes, I can peruse perhaps 50 boxes, bars or banners of web ads. That will expose me to over 100 product offers that in theory match my interests, compared to 1 for the video ad. The video ad will of course be far more convincing as it is getting so much attention, but in terms of worthwhile products offered to me per second, it’s terrible.
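
The comparison in numbers, using the figures above (3 ads per box is the low end of the estimate):

    # Product offers seen per 30 seconds of attention, at a similar $10 CPM.
    video_offers = 1              # one 30-second spot pitches one product
    boxes_scanned = 50            # ad boxes, bars or banners skimmed in 30 seconds
    ads_per_box = 3               # low end of the 3-4 text ads per box
    web_offers = boxes_scanned * ads_per_box
    print(web_offers, "offers vs.", video_offers)   # 150 offers vs. 1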

It isn’t quite this simple though, since I will click on one ad for every minute spent looking at ads (not every minute on the web) and perhaps spend another minute looking in detail at what the ad had to offer. That particular, very well targeted site gains the wealth of attention the video ad demands, but far more efficiently.

I think this area is worthy of more study in the industry, and I think it’s a less understood reason why Google is getting rich, and old media are running scared. In the future, people will tolerate advertising less and less unless it is clearer to them what value they are getting for it. Simply being able to get free programming is not the value we’re looking for, or if it is, we want a better deal — more programming in exchange for our valuable attention. But we want more than that better deal. We want to be advertised to efficiently, in a way that considers our needs and value. The companies that get that will win; the dinosaurs will find themselves in the movie “The Sixth Sense” — dead people, who don’t know they’re dead.

Making instruments with the human voice

The human voice is a pretty versatile instrument, and many skilled vocalists have been able to do convincing imitations of other sounds, and we’ve all heard “human beat box” artists work with a microphone to do great sounds.

That got me thinking, could we train a choir to work together to sound like anything, starting with violins, and perhaps even a piano or more?

The idea would be to get some vocalists to make lots of sounds, both pure tones and more complex ones, and break them apart with spectrum analysis. Do the same for the target sound — try to break it up into components that might plausibly be made by human vocal cords.

Then find a way to easily add the human sounds together to sound like the instrument.  Each singer might focus on one of the harmonics or other tonal qualities of the instrument.  Do it first in the computer, and then see if the people can do it together, without being distracted.  Then work on doing the attack and decay and other artifacts of the start and end of notes.
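
The analysis half of that is straightforward to prototype. Here is a rough sketch using numpy (the hard human part, getting singers to actually produce and hold their assigned components, is left out):

    import numpy as np

    def assign_harmonics(samples, sample_rate, num_singers):
        """Find the strongest frequency components of a recorded instrument note
        and hand one (frequency, relative strength) pair to each singer -- a crude
        additive-synthesis breakdown of the target sound."""
        spectrum = np.abs(np.fft.rfft(samples))
        freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
        loudest_bins = np.argsort(spectrum)[-num_singers:]   # the num_singers biggest peaks
        return sorted((float(freqs[i]), float(spectrum[i])) for i in loudest_bins)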

If it all worked, it would be a fun gag for a choir to suddenly sound like a piano or violin playing a popular piece.   Purer tones like a flute might be harder than complex tones.  Percussion is obviously possible though it might need some amplification.  Indeed, amplification to adjust the levels properly might help a lot but would be slightly more artificial than hearing this without any electronics.   Who knows, perhaps a choir could even sound like an orchestra playing the opening to Beethoven’s 5th, something everybody knows well.

Please release HD movies on regular DVDs

If you’ve looked around, you’ve probably noticed that a high-def DVD player, be it HD-DVD or Blu-ray, is expensive. Expect to pay $500 or so unless you get one bundled with a game console, where they are subsidized.

Now they won’t follow this suggestion, but the reality is they didn’t need to make the move to these new DVD formats. Regular old DVD can actually handle pretty decent HDTV movies. Not as good as the new formats, but a lot better than plain DVD. I’ve seen videos with the latest codecs that pack a quite nice HD picture into 2.5 to 3 gigabytes for an hour. I’ve even seen it in less, down to 1.5 gigabytes (actually less than SD DVDs) at 720p 24 fps, though you do notice some problems. But it’s still way better than a standard DVD. Even so, a dual-layer DVD can bring about 9 gb, and a double-sided dual-layer DVD gives you 18gb if you are willing to flip the disk over to get at special features or the 2nd half of a very long movie. Or of course just do 2-disk sets.
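
Working out those numbers, at the capacities quoted above:

    # How much HD fits on ordinary DVD media at the bitrate quoted above.
    gb_per_hour = 3.0                          # decent 720p with a modern codec
    avg_mbps = gb_per_hour * 8000 / 3600       # average bitrate that implies
    print(round(avg_mbps, 1), "Mbit/s average")
    for name, capacity_gb in [("standard DVD", 4.7), ("dual-layer DVD", 9.0),
                              ("double-sided dual-layer DVD", 18.0)]:
        print(name, round(capacity_gb / gb_per_hour, 1), "hours of HD")
    # ~6.7 Mbit/s; roughly 1.6, 3.0 and 6.0 hours respectively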

Now you might feel that the DVD industry would not want to make a new slew of regular DVD players with the fancier chips in them able to do these mp4 codecs when something clearly better is around the corner. And if they did do this, it would delay adoption of whatever high def DVD format they are backing in the format wars. But in fact, these disks could have been readily playable already, with no change, for the millions who watch DVDs on laptops and media center PCs. More than will have HD DVD or Blu-Ray for some time to come, even with the boost the Playstation 3 gives to Blu-Ray.
