Almost everybody has a 1080p HD camera with them — almost all phones and pocket cameras do this. HD looks great but the future’s video displays will do 4K, 8K and full eye-resolution VR, and so our video today will look blurry the way old NTSC video looks blurry to us. In a bizarre twist, in the middle of the 20th century, everything was shot on film at a resolution comparable to HD. But from the 70s to 90s our TV shows were shot on NTSC tape, and thus dropped in resolution. That’s why you can watch Star Trek in high-def but not “The Wire.”
I predict that complex software in the future will be able to do a very good job of increasing the resolution of video. One way it will do this is through making full 3-D models of things in the scene using data from the video and elsewhere, and re-rendering at higher resolution. Another way it will do this is to take advantage of the “sub-pixel” resolution techniques you can do with video. One video frame only has the pixels it has, but as the camera moves or things move in a shot, we get multiple frames that tell us more information. If the camera moves half a pixel, you suddenly have a lot more detail. Over lots of frames you can gather even more.
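Here is a toy sketch of the sub-pixel idea in one dimension. It assumes the half-pixel shift between the two frames is known exactly, which real software would have to estimate from the footage; the names `capture` and `interleave` and the striped scene are made up for illustration.

```python
def capture(scene, pixel_width, offset):
    """Simulate a low-res sensor: each pixel averages `pixel_width` scene
    samples, with the sampling grid starting `offset` samples into the scene."""
    frame = []
    for start in range(offset, len(scene) - pixel_width + 1, pixel_width):
        window = scene[start:start + pixel_width]
        frame.append(sum(window) / pixel_width)
    return frame


def interleave(frame_a, frame_b):
    """Merge two frames whose grids differ by half a pixel into one
    sequence with twice the sampling density."""
    combined = []
    for a, b in zip(frame_a, frame_b):
        combined.extend([a, b])
    return combined


# Hypothetical scene: stripes exactly one low-res pixel wide.
scene = [0, 10, 10, 0, 0, 10, 10, 0]
frame_a = capture(scene, pixel_width=2, offset=0)  # grid straddles every stripe edge
frame_b = capture(scene, pixel_width=2, offset=1)  # camera moved half a pixel
combined = interleave(frame_a, frame_b)
```

On its own, `frame_a` is a flat gray because every pixel averages half a bright stripe and half a dark one. The shifted `frame_b` lines up with the stripes and reveals them, and the interleaved result has twice as many samples as either frame. Real video would also need motion estimation and some deblurring, which this sketch skips.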
This will already happen with today’s videos, but what if we help them out? For example, if you have still photographs of the things in the video, this will allow clever software to fill in more detail. At first, it will look strange, but eventually the uncanny valley will be crossed and it will just look sharp. Today I suspect most people shooting video on still cameras also shoot some stills, so this will help, but there’s not quite enough information if things are moving quickly, or new sides of objects are exposed. A still of your friend can help render them in high-res in a video, but not if they turn around. For that the software just has to guess.
We might improve this process by designing video systems that capture high-res still frames as often as they can and embed them in the video. Storage is cheap, so why not?
A typical digital video/still camera has 16 to 20 million pixels today. When it shoots 1080p HD video, it combines those pixels together, so that there are 6 to 10 still pixels going into every video pixel. Ideally this is done by hardware right in the imaging chip, but it can also be done to a lesser extent in software. A few cameras already shoot 4K, and this will become common in the next couple of years. In this case, they may just use the pixels one for one, since it’s not so easy to map a 16 megapixel 3:2 still array into a 16:9 8 megapixel 4K image. You can’t just combine 2 pixels per pixel.
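The arithmetic behind those ratios can be sketched roughly as follows. The function name is hypothetical, and it assumes the camera simply crops the 3:2 sensor to the 16:9 video shape before combining pixels, which real cameras may not do.

```python
def pixels_per_video_pixel(sensor_megapixels, video_width, video_height,
                           sensor_aspect=3 / 2):
    """Still pixels available for each video pixel after cropping the
    sensor to the video's aspect ratio (an assumed, simplified pipeline)."""
    video_aspect = video_width / video_height
    # A 3:2 sensor loses some rows when cropped to the wider 16:9 frame.
    usable_fraction = min(1.0, sensor_aspect / video_aspect)
    usable_pixels = sensor_megapixels * 1_000_000 * usable_fraction
    return usable_pixels / (video_width * video_height)


hd_ratio = pixels_per_video_pixel(16, 1920, 1080)      # roughly 6.5
uhd_ratio = pixels_per_video_pixel(16, 3840, 2160)     # roughly 1.6
```

With a 16 megapixel sensor this gives about 6.5 still pixels per 1080p pixel, but only about 1.6 per 4K pixel, which is why a clean 2-into-1 combination is not available for 4K.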
Most still cameras won’t shoot a full-resolution video (i.e. a 6K or 8K video) for several reasons:
As designed, you simply can’t pull that much data off the chip per unit time. It’s a huge amount of data. Even with today’s cheap storage, it’s also a lot to store.
Still camera systems tend to compress jpegs, but you want a video compression algorithm to record a video even if you can afford the storage for that.
Nobody has displays to display 6K or 8K video, and only a few people have 4K displays (though this will change), so demand is not high enough to justify these costs.
When you combine pixels, you get less noise and can shoot in lower light. That’s why your camera can make a decent night-time video without blurring, but it can’t shoot a decent still in that lighting.
What is possible is a sensor which is able to record video (at the desired 30fps or 60fps rate) and also pull off full-resolution stills at some lower frame rate, as long as the scene is bright enough. That frame rate might be something like 5 or even 10 fps as cameras get better. In addition, hardware compression would combine the stills and the video frames to eliminate the great redundancy, though only to a limited extent because our purpose is to save information for the future.
Thus, if we hand the software of the future an HD video along with 3 to 5 frames/second of 16-megapixel stills, I am confident it will be able to make a very decent 4K video from it most of the time, and often a decent 6K or 8K video. As noted, a lot of that can happen even without the stills, but they will just improve the situation. Those situations where it can’t (fast changing objects) are also situations where video gets blurred and we are tolerant of lower resolution.
It’s a bit harder if you are already shooting 4K. To do this well, we might like a 38 megapixel still sensor, with 4 pixels for every pixel in the video. That’s the cutting edge in high-end consumer gear today, and will get easier to buy, but we now run into the limitations of our lenses. Most lenses can’t deliver 38 million pixels — not even many of the high-end professional photographer lenses can do that. So it might not deliver that complete 8K experience, but it will get a lot closer than you can from an “ordinary” 4K video.
If you haven’t seen 8K video, it’s amazing. Sharp has been showing their one-of-a-kind 8K video display at CES for a few years. It looks much more realistic than 3D videos of lower resolution. 8K video can subtend over 100 degrees of viewing angle at one pixel per minute of arc, which is about the resolution of the sensors in your eye. (Not quite, as your eye also does sub-pixel tricks!) At 60 degrees — which is more than any TV is set up to subtend — it’s the full resolution of your eyes, and provides an actual limit on what we’re likely to want in a display.
And we could be shooting video for that future display today, before the technology to shoot that video natively exists.
Recently I tried the Facebook/Oculus Rift Crescent Bay prototype. It has more resolution (I will guess 1280 x 1600 per eye or similar) and runs at 90 frames/second. It also has better head tracking, so you can walk around a small space with some realism, but only a very small space. Still, it was much more impressive than the DK2 and a sign of where things are going. I could still see a faint screen door effect, and they were annoyed that I could see it.
We still have a lot of resolution gain left to go. The human eye sees about a minute of arc, which means about 5,000 pixels for a 90 degree field of view. Since we have some ability for sub-pixel resolution, it might be suggested that 10,000 pixels of width is needed to reproduce the world. But that’s not that many Moore’s law generations from where we are today. The graphics rendering problem is harder, though with high frame rates, if you can track the eyes, you need only render full resolution where the fovea of the eye is. This actually gives a boost to onto-the-eye systems like a contact lens projector or the rumoured Magic Leap technology which may project with lasers onto the retina, as they actually need to render far fewer pixels. (Get really clever, and realize the optic nerve only has about 600,000 neurons, and in theory you can get full real-world resolution with half a megapixel if you do it right.)
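The pixel counts here are simple angular arithmetic, sketched below. The function names are mine, and the half-arc-minute figure is only an assumed stand-in for "sub-pixel" acuity, not a measured value.

```python
ARCMIN_PER_DEGREE = 60


def pixels_needed(fov_degrees, arcmin_per_pixel=1.0):
    """Horizontal pixels needed to cover a field of view at a given acuity."""
    return fov_degrees * ARCMIN_PER_DEGREE / arcmin_per_pixel


def fov_covered(pixel_width, arcmin_per_pixel=1.0):
    """Degrees of view a display of `pixel_width` pixels covers at that acuity."""
    return pixel_width * arcmin_per_pixel / ARCMIN_PER_DEGREE


eye_limit = pixels_needed(90)            # one pixel per arc minute
sub_pixel_limit = pixels_needed(90, 0.5) # assumed half-arc-minute acuity
eight_k_fov = fov_covered(7680)          # an 8K frame is 7680 pixels wide
```

At one pixel per arc minute, 90 degrees needs 5,400 pixels, and doubling the density for sub-pixel acuity gives 10,800. Run the other way, a 7680-pixel-wide 8K image covers 128 degrees at one pixel per arc minute.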
Walking around Rome, I realized something else — we are now digitizing our world, at least the popular outdoor spaces, at a very high resolution. That’s because millions of tourists are taking billions of pictures every day of everything from every angle, in every lighting. Software of the future will be able to produce very accurate 3D representations of all these spaces, both with real data and reasonably interpolated data. They will use our photographs today and the better photographs tomorrow to produce a highly accurate version of our world today.
This means that anybody in the future will be able to take a highly realistic walk around the early 21st century version of almost everything. Even many interiors will be captured in smaller numbers of photos. Only things that are normally covered or hidden will not be recorded, but in most cases it should be possible to figure out what was there. This will be trivial for fairly permanent things, like the ruins in Rome, but even possible for things that changed from day to day in our highly photographed world. A bit of AI will be able to turn the people in photos into 3-D animated models that can move within these VRs.
It will also be possible to extend this VR back into the past. The 20th century, before the advent of the digital camera, was not nearly so photographed, but it was still photographed quite a lot. For persistent things, the combination of modern (and future) recordings with older, less frequent and lower resolution recordings should still allow the creation of a fairly accurate model. The further back in time we go, the more interpolation and eventually artistic interpretation you will need, but very realistic seeming experiences will be possible. Even some of the 19th century should be doable, at least in some areas.
This is a good thing, because as I have written, the world’s tourist destinations are unable to bear the brunt of the rising middle class. As the Chinese, Indians and other nations get richer and begin to tour the world, their greater numbers will overcrowd those destinations even more than the waves of Americans, Germans and Japanese that already mobbed them in the 20th century. Indeed, with walking chairs (successors of the BigDog Robot) every spot will be accessible to everybody of any level of physical ability.
VR offers one answer to this. In VR, people will visit such places and get the views and the sounds — and perhaps even the smells. They will get a view captured at the perfect time in the perfect light, perhaps while the location is closed for digitization and thus empty of crowds. It might be, in many ways, a superior experience. That experience might satisfy people, though some might find themselves more driven to visit the real thing.
In the future, everybody will have had a chance to visit all the world’s great sites in VR while they are young. In fact, doing so might take no more than a few weekends, changing the nature of tourism greatly. This doesn’t alter the demand for the other half of tourism — true experience of the culture, eating the food, interacting with the locals and making friends. But so much commercial tourism — people being herded in tour groups to major sites and museums, then eating at tour-group restaurants — can be replaced.
I expect VR to reproduce the sights and sounds and a few other things. Special rooms could also reproduce winds and even some movement (for example, the feeling of being on a ship.) Right now, walking is harder to reproduce. With the OR Crescent Bay you could only walk 2-3 feet, but one could imagine warehouse size spaces or even outdoor stadia where large amounts of real walking might be possible if the simulated surface is also flat. Simulating walking over rough surfaces and stairs offers real challenges. I have tried systems where you walk inside a sphere but they don’t yet quite do it for me. I’ve also seen a system where you are held in place and move your feet in slippery socks on a smooth surface. Fun, but not quite there. Your body knows when it is staying in one place, at least for now. Touching other things in a realistic way would require a very involved robotic system — not impossible, but quite difficult.
Also interesting will be immersive augmented reality. There are a few ways I know of that people are developing it:
With a VR headset, bring in the real world with cameras, modify it and present that view to the screens, so they are seeing the world through the headset. This provides a complete image, but the real world is reduced significantly in quality, at least for now, and latency must be extremely low.
With a semi-transparent screen, show the augmentation with the real world behind it. This is very difficult outdoors, and you can’t really stop bright items from the background mixing with your augmentation. Focus depth is an issue here (and is with most other systems.) In some plans, the screens have LCDs that can go opaque to block the background where an augmentation is being placed.
CastAR has you place retroreflective cloth in your environment, and it can present objects on that cloth. They do not blend with the existing reality, but replace it where the cloth is.
Projecting into the eye with lasers from glasses, or on a contact lens can be brighter than the outside world, but again you can’t really paint over the bright objects in your environment.
Getting back to Rome, my goal would be to create an augmented reality that let you walk around ancient Rome, seeing the buildings as they were. The people around you would be converted to Romans, and the modern roads and buildings would be turned into areas you can’t enter (since we don’t want to see the cars, and turning them into fast chariots would look silly.) There have been attempts to create a virtual walk through ancient Rome, but being able to do it in the real location would be very cool.
The Olympics are coming up, and I have a request for you, NBC Sports. It’s the 21st century, and media technologies have changed a lot. It’s not just the old TV of the 1900s.
Every Olympics, you broadcast the opening ceremony, which is always huge, expensive and spectacular. But your judgment is that we need running commentary, even when music is playing or especially poignant moments are playing out. OK, I get that, perhaps a majority of the audience wants and needs that commentary. Another part of the audience would rather see the ceremony as is, with minimal commentary.
This being the 21st century, you don’t have to choose only one. Almost every TV out there now supports multiple audio channels, either via the SAP channel (where it still exists) or more likely through the multiple audio channels of digital TV. In addition, they all support multiple channels of captions, too.
So please give us the audio without your announcers on one of the alternate audio channels. Give us their commentary on a caption channel, so if we want to read it without interfering with the music, we can read it.
If you like, do a channel where the commentary is only on the left channel. Clever viewers can then mix the commentary at whatever volume they like using the balance control. Sure, you lose stereo, but this is much more valuable.
I know you might take this as an insult. You work hard on your coverage and hire good people to do it. And so do it — but give your viewers the choice when the live audio track is an important part of the event, as it is for the opening and closing ceremonies, medal ceremonies and a few other events.
The blogging world was stunned by the recent announcement by Google that it will be shutting down Google Reader later this year. Due to my consulting relationship with Google I won’t comment too much on their reasoning, though I will note that I believe it’s possible the majority of regular readers of this blog, and many others, come via Google Reader so this shutdown has a potentially large effect here. Of particular note is Google’s statement that usage of Reader has been in decline, and that social media platforms have become the way to reach readers.
The effectiveness of those platforms is strong. I have certainly noticed that when I make blog posts and put up updates about them on Google Plus and Facebook, it is common that more people will comment on the social network than comment here on the blog. It’s easy, and indeed more social. People tend to comment in the community in which they encounter an article, even though in theory the most visibility should be at the root article, where people go from all origins.
However, I want to talk a bit about online publishing history, including USENET and RSS, and the importance of concepts within them. In 2004 I first commented on the idea of serial vs. browsed media, and later expanded this taxonomy to include sampled media such as Twitter and social media in the mix. I now identify the following important elements of an online medium:
Is it browsed, serial or to be sampled?
Is there a core concept of new messages vs. already-read messages?
If serial or sampled, is it presented in chronological order or sorted by some metric of importance?
Is it designed to make it easy to write and post or easy to read and consume?
Online media began with E-mail and the mailing list in the 60s and 70s, with the 70s seeing the expansion to online message boards including Plato, BBSs, Compuserve and USENET. E-mail is a serial medium. In a serial medium, messages have a chronological order, and there is a concept of messages that are “read” and “unread.” A good serial reader, at a minimum, has a way to present only the unread messages, typically in chronological order. You can thus process messages as they came, and when you are done with them, they move out of your view.
E-mail is largely used to read messages one at a time, but the online message boards, notably USENET, advanced this with the idea of moving messages from unread to read in bulk. A typical USENET reader presents the subject lines of all threads with new or unread messages. The user selects which ones to read (almost never all of them) and after this is done, all the messages, even those that were not actually read, are marked as read and not normally shown again. While it is generally expected that you will read all the messages in your personal inbox one by one, with message streams it is expected you will only read those of particular interest, though this depends on the volume.
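The read/unread model with bulk catch-up can be sketched as a small data structure. This is only an illustration of the concept, not how any real newsreader stores its state; the class and method names are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: int
    thread: str
    body: str


@dataclass
class SerialReader:
    messages: list = field(default_factory=list)
    read_ids: set = field(default_factory=set)

    def post(self, message):
        self.messages.append(message)

    def unread_threads(self):
        """Subjects of threads holding at least one unread message, in arrival order."""
        subjects = []
        for m in self.messages:
            if m.msg_id not in self.read_ids and m.thread not in subjects:
                subjects.append(m.thread)
        return subjects

    def read_thread(self, thread):
        """Return the unread bodies in one thread and mark them read."""
        unread = [m for m in self.messages
                  if m.thread == thread and m.msg_id not in self.read_ids]
        self.read_ids.update(m.msg_id for m in unread)
        return [m.body for m in unread]

    def catch_up(self):
        """Mark everything read, including messages never opened."""
        self.read_ids.update(m.msg_id for m in self.messages)
```

A session would call `unread_threads()` to get the subject lines, `read_thread()` on the few of interest, then `catch_up()` so the skipped messages never come back. Anything posted afterwards shows up as new.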
Echoes of this can be found in older media. With the newspaper, almost nobody would read every story, though you would skim all the headlines. Once done, the newspaper was discarded, even the stories that were skipped over. Magazines were similar but being less frequent, more stories would be actually read.
USENET newsreaders were the best at handling this mode of reading. The earliest ones had keyboard interfaces that allowed touch typists to process many thousands of new items in just a few minutes, glancing over headlines, picking stories and then reading them. My favourite was TRN, based on RN by Perl creator Larry Wall and enhanced by Wayne Davison (whom I hired at ClariNet in part because of his work on that.) To my great surprise, even as the USENET readers faded, no new tool emerged capable of handling a large volume of messages as quickly.
In fact, the 1990s saw a switch for most to browsed media. Most web message boards were quite poor and slow to use, and many did not even do the most fundamental thing of remembering what you had read and offering a “what’s new for me?” view. In reaction to the rise of browsed media, people wishing to publish serially developed RSS. RSS was a bit of a kludge, in that your reader had to regularly poll every site to see if something was new, but outside of mailing lists, it became the most usable way to track serial feeds. In time, people also learned to like doing this online, using tools like Bloglines (which became the leader and then foolishly shut down for a few months) and Google Reader (which also became the leader and now is shutting down.) Online feed readers allow you to roam from device to device and read your feeds, and people like that.
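The polling kludge looks roughly like this on the reader's side: fetch the feed again, and work out which items you have not seen. The sketch below skips the network fetch and parses a made-up feed string; the function name and sample feed are hypothetical, and falling back to the link when an item has no guid is my assumption about sensible behaviour.

```python
import xml.etree.ElementTree as ET


def new_items(rss_xml, seen_guids):
    """Return titles of items not seen on earlier polls, and remember them."""
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        # Identify an item by its guid, falling back to its link.
        guid = item.findtext("guid") or item.findtext("link")
        if guid and guid not in seen_guids:
            seen_guids.add(guid)
            fresh.append(item.findtext("title", default=""))
    return fresh


SAMPLE_FEED = """<rss version="2.0"><channel><title>Example</title>
<item><title>First post</title><guid>post-1</guid></item>
<item><title>Second post</title><guid>post-2</guid></item>
</channel></rss>"""

seen = {"post-1"}                       # state kept from an earlier poll
first_poll = new_items(SAMPLE_FEED, seen)
second_poll = new_items(SAMPLE_FEED, seen)
```

The first poll reports only the second post, and polling again with nothing changed reports nothing. The reader has to carry the "seen" set itself, since the feed has no idea who has read what.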
Last month, I invited Gregory Benford and Larry Niven, two of the most respected writers of hard SF, to come and give a talk at Google about their new book “Bowl of Heaven.” Here’s a YouTube video of my session. They did a review of the history of SF about “big dumb objects”: stories like Niven’s Ringworld, where a huge construct is a central part of the story.
Tonight I will be on a panel at the Palo Alto International Film Festival at 5pm. Not on robocars, but on the role of science fiction in movies in changing the world. (In a past life, I published science fiction and am on this panel by virtue of my faculty position at Singularity University.)
I’m watching the Olympics, and my primary tool as always is MythTV. Once you do this, it seems hard to imagine watching them almost any other way. Certainly not real time with the commercials, and not even with other DVR systems. MythTV offers a really wide variety of fast forward speeds and programmable seeks. This includes the ability to watch at up to 2x speed with the audio still present (pitch adjusted to be natural) and a smooth 3x speed which is actually pretty good for watching a lot of sports. In addition you can quickly access 5x, 10x, 30x, 60x, 120x and 180x for moving along, as well as jumps back and forth by some fixed amount you set (like 2 minutes or 10 minutes) and random access to any minute. Finally it offers a forward skip (which I set to 20 seconds) and a backwards skip (I set it to 8 seconds.)
MythTV even lets you customize these numbers so you use different numbers for the Olympics compared to other recordings. For example the jumps are normally +/- 10 minutes and plus 30 seconds for commercial skip, but Myth has automatic commercial skip.
A nice mode allows you to go to smooth 3x speed with closed captions, though it does not feature the very nice ability I’ve seen elsewhere of turning on CC when the sound is off (by mute or FF) and turning it off when sound returns. I would like a single button to put me into 3xFF + CC and take me out of it.
Anyway, this is all very complex but well worth learning because once you learn it you can consume your sports much, much faster than in other ways, and that means you can see more of the sports that interest you, and less of the sports, commercials and heart-warming stories of triumph over adversity that you don’t. With more than 24 hours a day of coverage it is essential you have tools to help you do this.
I have a number of improvements I would like to see in MythTV like a smooth 5x or 10x FF (pre-computed in advance) and the above macro for CC/FF swap. In addition, since the captions tend to lag by 2-3 seconds it would be cool to have a time-sync for the CC. Of course the network, doing such a long tape delay, should do that for you, putting the CC into the text accurately and at the moment the words are said. You could write software to do that even with human typed captions, since the speech-recognition software can easily figure out what words match once it has both the audio and the words. Nice product idea for somebody.
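A rough sketch of how that caption time-sync could work: line up the caption words against the speech recognizer's words, then take the typical time difference as the lag. The function name and the sample timings are invented, and this assumes the recognizer gives per-word timestamps.

```python
from difflib import SequenceMatcher
from statistics import median


def estimate_caption_lag(spoken, captions):
    """Estimate caption delay in seconds.

    `spoken` and `captions` are lists of (word, time_in_seconds); the
    spoken list is assumed to come from speech recognition on the audio.
    """
    spoken_words = [word.lower() for word, _ in spoken]
    caption_words = [word.lower() for word, _ in captions]
    matcher = SequenceMatcher(a=spoken_words, b=caption_words, autojunk=False)
    lags = []
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            lags.append(captions[block.b + k][1] - spoken[block.a + k][1])
    # The median shrugs off the odd mismatched or mistimed word.
    return median(lags) if lags else 0.0


# Hypothetical data: the human typist dropped one word and runs 2.5 s late.
spoken = [("and", 0.0), ("she", 0.4), ("sticks", 0.8), ("the", 1.1), ("landing", 1.4)]
captions = [("And", 2.5), ("she", 2.9), ("sticks", 3.3), ("landing", 3.9)]
lag = estimate_caption_lag(spoken, captions)
```

Here the estimate comes out at about 2.5 seconds even though the captioner skipped a word, so the player could shift the captions earlier by that amount.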
Watching on the web
This time, various networks have put up extensive web offerings, and indeed on NBC this is the only way to watch many events live, or at all. Web offerings are good, though not quite at the quality of over-the-air HDTV, and quality matters here. But the web offerings have some failings.
I found that this recent article from the editor of the MIT Tech Review on why apps for publishers are a bad idea touched on a number of key issues I have been observing since I first got into internet publishing in the 80s. I recommend the article, but if you insist, the short summary is that publishers of newspapers and magazines flocked to the idea of doing iPad apps because they could finally make something that they sort of recognized as similar to a traditional publication; something they controlled and laid out, that was a combined unit. So they spent lots of money and ran into nightmares (having to design for both landscape and portrait on the tablet, as well as possibly on the phones or even Android) and didn’t end up selling many subscriptions.
Since the dawn of publishing there has been a battle between design and content. This is not a battle that has or should have a single winner. Design is important to enjoyment of content, and products with better design are more loved by consumers and represent some of the biggest success stories. Creators of the content — the text in this case — point out that it is the text where you find the true value, the thing people are actually coming for. And on the technology side, the value of having a wide variety of platforms for content — from 30” desktop displays to laptops to tablets to phones, from colour video displays to static e-ink — is essential to a thriving marketplace and to innovation. Yet design remains so important that people will favour the iPhone just because they are all the same size, and most Android apps still can’t be used on Google TV.
This is also the war between things like PDF, which attempts to bring all the elements of paper-based design onto the computer, and the purest form of SGMLs, including both original and modern HTML. Between WYSIWYG and formatting languages, between semantic markup and design markup. This battle is quite old, and still going on. In the case of many designers, that is all they do, and the idea that a program should lay out text and other elements to fit a wide variety of display sizes and properties is anathema. To technologists, that layout should be fixed is almost as anathema.
Also included in this battle are the forces of centralization (everything on the web or in the cloud) and the distributed world (custom code on your personal device) and their cousins online and offline reading. A full treatise on all elements of this battle would take a book for it is far from simple.
I sit mostly with the technologists, eager to divide design from content. I still write all my documents in text formatting languages with visible markup and use WYSIWYG text editors only rarely. An ideal system that does both is still hard to find. Yet I can’t deny the value and success of good design and believe the best path is to compromise in this battle. We need compromises in design and layout, we need compromises between the cloud and the dedicated application. End-user control leads to some amount of chaos. It’s chaos that is feared by designers and publishers and software creators, but it is also the chaos that gives us most of our good innovations, which come from the edge.
Let’s consider all the battles I perceive for the soul of how computing, networks and media work:
The design vs. semantics battle (outlined above)
The cloud vs. personal device
Mobile, small and limited in input vs. tethered, large screen and rich in input
Central control vs. the distributed bazaar (with so many aspects, such as)
The destination (Facebook) vs. the portal (search engine)
The designed, uniform, curated experience (Apple) vs. the semi-curated (Android) vs. the entirely open (free software)
The social vs. the individual (and social comment threads vs. private blogs and sites)
The serial (email/blogs/RSS/USENET) vs. the browsed (web/wikis) vs. the sampled (Facebook/Twitter)
The reader-friendly (fancy sites, well filtered feeds) vs. writer friendly (social/wiki)
In most of these battles both sides have virtues, and I don’t know what the outcomes will be, but the original MITTR article contained some lessons for understanding them.
It’s been interesting to see how TV shows from the 60s and 70s are being made available in HDTV formats. I’ve watched a few of Classic Star Trek, where they not only rescanned the old film at better resolution, but also created new computer graphics to replace the old 60s-era opticals. (Oddly, because the relative budget for these graphics is small, some of the graphics look a bit cheesy in a different way, even though much higher in technical quality.)
The earliest TV was shot live. My mother was a TV star in the 50s and 60s, but this was before videotape was cheap. Her shows all were done live, and the only recording was a Kinescope — a film shot off the TV monitor. These kinneys are low quality and often blown out. The higher budget shows were all shot and edited on film, and can all be turned into HD. Then broadcast quality videotape got cheap enough that cheaper shows, and then even expensive shows began being shot on it. This period will be known in the future as a strange resolution “dark ages” when the quality of the recordings dropped. No doubt they will find today’s HD recordings low-res as well, and many productions are now being shot on “4K” cameras which have about 8 megapixels.
But I predict the future holds a surprise for us. We can’t do it yet, but I imagine software will arise that will be able to take old, low quality videos and turn them into something better. They will do this by actually modeling the scenes that were shot to create higher-resolution images and models of all the things which appear in the scene. In order to do this, it will be necessary that everything move. Either it has to move (as people do) or the camera must pan over it. In some cases having multiple camera views may help.
When an object moves against a video camera, it is possible to capture a static image of it in sub-pixel resolution. That’s because the multiple frames can be combined to generate more information than is visible in any one frame. A video taken with a low-res camera that slowly pans over an object (in both dimensions) can produce a hi-res still. In addition, for most TV shows, a variety of production stills are also taken at high resolution, and from a variety of angles. They are taken for publicity, and also for continuity. If these exist, it makes the situation even easier.
In media today, it’s common to talk about three screens: Desktop, mobile and TV. Many people watch TV on the first two now, and tools like Google TV and the old WebTV try to bring interactive, internet style content to the TV. People like to call the desktop the “lean forward” screen where you use a keyboard and have lots of interactivity, while the TV is the “lean back” couch-potato screen. The tablet is also distinguishing itself a bit from the small screen normally found in mobile.
More and more people also find great value in having an always-on screen where they can go to quickly ask questions or do tasks like E-mail.
I forecast we will soon see the development of a “fourth screen” which is a mostly-always-on wall panel meant to be used with almost no interaction at all. It’s not a thing to stare at like the TV (though it could turn into one) nor a thing to do interactive web sessions on. The goal is to have minimal UI and be a little bit psychic about what to show.
One could start by showing stuff that’s always of use. The current weather forecast, for example, and selected unusual headlines. Whether each member of the household has new mail, and if it makes sense from a privacy standpoint, possibly summaries of that mail. Likewise the most recent status from feeds on Twitter or Facebook or other streams. One could easily fill a screen with these things so you need a particularly good filter to find what’s relevant. Upcoming calendar events (with warnings) also make sense.
Some things would show only when important. For example, when getting ready to go out, I almost always want to see the traffic map. Or rather, I want to see it if it has traffic jams on it, no need to show it when it’s green — if it’s not showing I know all is good. I may not need to see the weather if it’s forecast sunny either. Or if it’s raining right now. But if it’s clear now and going to rain later I want to see that. Many city transit systems have a site that tracks when the next bus or train will come to my stop — I want to see that, and perhaps at morning commute time even get an audio alert if something unusual is up or if I need to leave right now to catch the street car. A view from the security camera at the door should only show if somebody is at the door.
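The "show it only when it matters" rules might be written something like this. It is just a sketch: the function name and the state keys (`leaving_soon`, `rain_later` and so on) are invented, and a real panel would need far more inputs and a smarter filter.

```python
def panels_to_show(state):
    """Return the panels worth showing now; an empty list means all is well."""
    panels = []
    # Traffic map only if I'm about to go out and there is actually a jam.
    if state["leaving_soon"] and state["traffic_jam"]:
        panels.append("traffic")
    # Weather only if it is going to change on me: clear now, rain later.
    if not state["raining_now"] and state["rain_later"]:
        panels.append("weather")
    # Door camera only when somebody is at the door.
    if state["someone_at_door"]:
        panels.append("door camera")
    return panels


quiet_morning = {"leaving_soon": True, "traffic_jam": False,
                 "raining_now": False, "rain_later": False,
                 "someone_at_door": False}
busy_morning = {"leaving_soon": True, "traffic_jam": True,
                "raining_now": False, "rain_later": True,
                "someone_at_door": True}
```

The design choice is that absence carries information: on the quiet morning nothing shows at all, and that blank screen is itself the message that the roads are green and the sky is staying clear.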
There are so many things I want to see that we will need some UI for the less popular ones. But it should be a simple UI, with no need to find a remote (though if I have a remote, any remote, the screen should be able to use it.) Speech commands would be good to temporarily see other screens and modes. A webcam (and eventually Kinect style sensor) for gestural UI would be nice, letting me swipe or wave to get other screens.
Like me, you probably have a dozen “universal” remote controls gathered over the years. With each new device and remote you go through a process to try to figure out special codes to enter into the remote to train it to operate your other devices. And it’s never very good, except perhaps in the expensive remotes with screens and macros.
The first universal remotes had to do this because they were made after the TVs and other devices, and had to control old ones. But the idea’s been around for decades, and I think we have it backwards. It’s not the remote that should work with any TV, it’s the TV that should work with any remote. I’m not even sure in most cases we need to have the remote come with the TV, though I know they like designing special magic buttons and layouts for each new remote.
It would be trivial for any TV or other device that displays video to figure out exactly what sort of remote you are pointing at it, and then figure out what to do with all its buttons. Since these devices now all have USB plugs and internet connections, they can even get their data updated. With the TV in a remote setting mode (which you must of course reach by the few keys on the TV) a few buttons from any remote should let the TV figure out what it’s seeing. If it can’t tell two remotes apart, it can ask on the screen for you to push specific buttons until you see a picture of your remote on the screen and confirm.
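Here is a minimal sketch of how a TV might narrow down which remote it is seeing. The remote names, button names and code values in `REMOTE_DB` are all invented for illustration. A real database would hold full IR protocol details rather than single integers.

```python
# Hypothetical database: remote model -> button name -> IR code.
REMOTE_DB = {
    "remote_a": {"power": 0x10, "vol_up": 0x11, "ch_up": 0x12},
    "remote_b": {"power": 0x10, "vol_up": 0x21, "ch_up": 0x22},
    "remote_c": {"power": 0x30, "vol_up": 0x31, "ch_up": 0x32},
}


def identify_remote(observed_codes, db=REMOTE_DB):
    """Return the remotes consistent with every code seen so far."""
    candidates = set(db)
    for code in observed_codes:
        candidates = {name for name in candidates
                      if code in db[name].values()}
    return sorted(candidates)


def next_button_to_ask(candidates, db=REMOTE_DB):
    """Pick a button whose code differs across all remaining candidates,
    so one more press settles the question. Returns None if no such
    button exists or there is nothing left to disambiguate."""
    if len(candidates) < 2:
        return None
    shared = set.intersection(*(set(db[name]) for name in candidates))
    for button in sorted(shared):
        codes = {db[name][button] for name in candidates}
        if len(codes) == len(candidates):
            return button
    return None
```

If `identify_remote` still returns more than one candidate, the TV would prompt for the button named by `next_button_to_ask` and try again with the extra code.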
If it can’t figure it out, it can still learn the codes from any remote by remembering them. It would prompt you to “push the button you want to use to change the channel,” you would push it, and it would know. You could also tweak any remote. But most people would see the very simple interface of “press these keys and we’ll figure out which remote you have.” This also makes it easy to have more than one device of the same type. In particular, it makes it easy to not have so many “modes” where you have to tell the remote you want to control the TV now, then the satellite box, then the stereo, then the DVD player. Instead just tell the TV “ignore the buttons I am about to press” (for example the volume buttons) and tell the stereo to obey them. Or program a button to do different things on different devices: not a macro where a smart remote sends all the codes needed to tell the TV and stereo to switch inputs while turning on the DVD player, but just each box responding in its own way.
For outlying cases, you could tell the user to program their universal remote for some well established old devices. Every universal remote there is can control a Sony TV, for example. That guarantees the TV will know at least one set of codes.
The TVs and other devices might as well recognize all the infrared keyboards out there while they are at it.
Of course, as TVs figure out how to do this, the remotes can change. They can become a bit more standardized, and instead of trying to figure everything out, they can be the dumb device and the AV equipment can be the smart device. It’s the AV equipment that has storage, a screen, audio and so much more.
You can also train devices to understand there are multiple remotes that belong to some people. For example, the adult remote can be different from the child’s remote, and only the adult remote can see the Playboy channel, and is kept private. The child’s remote can also be limited to a number of hours of TV as I first suggested six years ago at the birth of this blog.
You can even fix the annoying problem of most remote protocols — “on” and “off” are the same button. This makes it very hard to do things like macro control because you can’t be sure what that code can do. You can have a “turn everything off” button that really works (I presume some of the ones out there use hidden non-toggle codes when they can) or codes to do things like switch on the DVD if it’s not already on, switch video and audio inputs to it, and start playing — something many systems have tried to do but rarely done well.
There are a few things to tweak to make sure “IR blasters” work properly. (These are IR outputs found on DVRs which send commands to cable and satellite boxes to change their channel etc. They are a horrible kludge, and the best way to get rid of them is the new protocols that connect the devices up to IP or the new IP over HDMI 1.4, or failing that the badly-done anynet.)
But the key point here is this: Remotes put the smarts in the wrong place.
I’m starting to say that Curling might be the best Olympic sport. Why?
It’s the most dominated by strategy. It also requires precision and grace, but above all the other Olympic sports, long pauses to think about the game are part of the game. If you haven’t guessed, I like strategy.
Yes, other sports have in-game strategy, of course, particularly the team sports. And since the gold medalist from 25 years ago in almost every sport would barely qualify, you can make a case that all the sports are mostly mental in their way. But with curling, it’s right there, and I think it edges out the others in how important it is.
While it requires precision and athletic skill, it does not require strength and endurance to the human limits. As such, skilled players of all ages can compete. (Indeed, the fact that out-of-shape curlers can compete has caused some criticism.) A few other sports, like sharpshooting and equestrian events, also demand skill over youth. All the other sports give a strong advantage to those at the prime age.
Mixed curling is possible, and there are even tournaments. There’s debate on whether completely free mixing would work, but I think there should be more mixed sports, and more encouragement of it. (Many of the team sports could be made mixed. Mixed tennis used to be in the Olympics and is returning.)
The games are tense and exciting, and you don’t need a clock, judge or computer to tell you who is winning.
On the downside, not everybody is familiar with the game, the games can take quite a long time and the tournament even longer for just one medal, and compared to a multi-person race it’s a slow game. It’s not slow compared to an event that is many hours of time trials, though those events have brief bursts of high-speed excitement mixed in with waiting. And yes, I’m watching Canada-v-USA hockey now too.
These days it is getting very common to make videos of presentations, and even to do live streams of them. And most of these presentations have slides in Powerpoint or Keynote or whatever. But this always sucks, because the camera operator — if there is one — never moves between the speaker and the slide the way I want. You can’t please everybody of course.
In the proprietary “web meeting” space there are several tools that will let people do a video presentation and sync it with slides, ideally by pre-transmitting the slide deck so it is rendered in full resolution at the other end, along with the video. In this industry there are also some video players where you can seek along in the video and it automatically seeks within the slides. This can be a bit complex if the slides try to do funny animations but it can be done.
Obviously it would be nice to see a flash player that understands it is playing a video and also playing slides (even video of slides, though it would be better to do it in higher quality since it isn’t usually full motion video.) Sites like youtube could support it. However, getting the synchronization requires that you have a program on the presenting computer, which you may not readily control.
One simple idea would be a button the camera operator could push to say “Copy this frame to the slide window.” When there is a new slide, the camera would move or switch over to it, the button would be pushed, and the camera could go immediately back to the speaker. Usually though the camera crew has access to the projector feed and would not need to actually point a camera; in fact some systems “switch” to the slides by just changing the video feed. A program which sends the projector feed with huge compression (in other words, an I frame for any slide change and nothing after) would also work well. No need to send all the fancy transitions.
But it would be good to send the slides not as mpeg, but as PNG style, to be sharper if you can’t get access to the slides themselves. I want a free tool, so I can’t ask for the world, yet, but even something as basic as this would make my watching of remote presentations and talks much better. And it would make people watching my talks have a better time too — a dozen or so of them are out on the web.
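The "send a frame only when the slide changes" idea can be sketched very simply. This version assumes each frame of the projector feed is a flat list of grayscale pixel values. The threshold is an arbitrary guess that a real tool would need to tune, and nothing here is a real video API.

```python
def mean_abs_diff(a, b):
    """Average per-pixel difference between two equal-sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def keyframe_indices(frames, threshold=10.0):
    """Return the indices of frames worth sending as full still images.

    A frame is sent when it differs enough from the last frame sent,
    which is taken to mean the slide changed. Everything in between,
    including small flicker and noise, is skipped.
    """
    keyframes = []
    last_sent = None
    for i, frame in enumerate(frames):
        if last_sent is None or mean_abs_diff(frame, last_sent) > threshold:
            keyframes.append(i)
            last_sent = frame
    return keyframes
```

Each selected frame could then be saved losslessly (PNG style) along with its timestamp, which is all a player needs to keep the slide window in sync when seeking.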
I’m in O’Hare waiting to fly to Munich for DLD. More details to come.
I think URL shorteners are a curse, but thanks to Twitter they are growing vastly in use. If you don’t know, URL shorteners are sites that will generate a compact encoded URL for you to turn a very long link into a short one that’s easier to cut and paste, and in particular these days, one that fits in the 140 character constraint on Twitter.
I understand the attraction, and not just on twitter. Some sites generate hugely long URLs which fold over many lines if put in text files or entered for display in comments and other locations. The result, though, is that you can no longer determine where the link will take you from the URL. This hurts the UI of the web, and makes it possible to fool people into going to attack sites or Rick Astley videos. Because of this, some better twitter clients re-expand the shortened URLs when displaying on a larger screen.
Anyway, here’s an idea for the Twitter clients and URL shorteners, if they must be used. In a tweet, figure out how much room there is to put the compacted URL, and work with a shortener that will let you generate a URL of exactly that length. And if that length has some room, try to put in some elements from the original URL so I can see them. For example, you can probably fit the domain name, especially if you strip off the “www.” from it (in the visible part, not in the real URL.) Try to leave as many things that look like real words, and strip things that look like character encoded binary codes and numbers. Of course, in the end you’ll need something to make the short URL unique, but not that much. Of course, if there already is a URL created for the target, re-use that.
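To make that concrete, here is a minimal sketch of building a short URL that fills the available room with readable hints from the original. The shortener host `s.example` and the `unique_id` argument are hypothetical. A real shortener would assign the unique part itself and store the mapping.

```python
import re
from urllib.parse import urlparse


def readable_short_url(original_url, max_len, unique_id, host="s.example"):
    """Build a short URL of at most max_len characters that keeps
    recognizable pieces of the original (domain, then path words)."""
    parsed = urlparse(original_url)
    domain = parsed.netloc.lower()
    if domain.startswith("www."):
        domain = domain[4:]  # strip only in the visible part
    # Keep path pieces that look like real words; drop numbers,
    # short fragments and encoded-looking blobs.
    words = [w for w in re.split(r"[^A-Za-z0-9]+", parsed.path)
             if re.fullmatch(r"[A-Za-z]{3,}", w)]
    short = f"{host}/{unique_id}"
    if len(short) > max_len:
        raise ValueError("not enough room even for the unique part")
    # Append hints in order until the next one would not fit.
    for hint in [domain] + words:
        candidate = f"{short}-{hint}"
        if len(candidate) > max_len:
            break
        short = candidate
    return short
```

The unique part goes first so the shortener can resolve the link from it alone and treat everything after it as decoration for the reader.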
Google just did its own URL shortener. I’m not quite sure what the motives of URL shortener sites are. While sometimes I see redirects that pause at the intermediate site, nobody wants that and so few ever use such sites. The search engines must have started ignoring URL redirect sites when it comes to pagerank long ago. They take donations and run ads on the pages where people create the tiny URLs, but when it comes to ones used on Twitter, these are almost all automatically generated, so the user never sees the site.
It’s now becoming common to kludge a conference “backchannel” onto Twitter. I am quite ambivalent about this. I don’t think Twitter works nearly as well as an internal backchannel, even though there are some very nice and fancy twitter clients to help make this look nicer.
But the real problem comes from the public/private confusion. Tweets are (generally) public, and even if tagged by a hashtag to be seen by those tracking an event, they are also seen by your regular followers. This has the following consequences, good and bad.
Some people tweet a lot while in a conference. They use it as a backchannel. That’s overwhelming to their followers who are not at the conference, and it fills up the feed.
When multiple people do it, it’s almost like a spam. I believe that conferences like using Twitter as backchannel because it causes constant mentions of their conference to be broadcast out into the world.
While you can filter out a hashtag in many twitter clients, it’s work to do so, and the general flooding of the feed is annoying to many.
People tweeting at a conference are never sure about who they are talking to. Some tweets will clearly be aimed at fellow conference attendees. But many are just repeats of salient lines said on stage, aimed only at the outsiders.
While you can use multiple tags and filters to divide up different concurrent sessions of a conference, this doesn’t work well.
The interface on Twitter is kludged on, and poor.
Twitter’s 140 character limit is a burden on backchannel. Backchannel comments are inherently short, and no fixed limit is needed on them. Sure, sometimes you go longer but never much longer.
The Twitter limit forces URLs to be put into URL shorteners, which obscure where they go and are generally a bane of the world.
Dedicated backchannels are better, I think. They don’t reach the outside world unless the outsiders decide to subscribe to them, but I think that’s a plus. I think the right answer is a dedicated, internal-only backchannel, combined with a minimal amount of tweeting to the public (not the meeting audience) for those who want to give their followers some snippets of the conferences their friends are going to. The public tweets may not use a hashtag at all, or a different one from the “official” backchannel as they are not meant for people at the conference.
The most common dedicated backchannel tool is IRC. While IRC has its flaws, it is much better at many things than any of the web applications I have seen for backchannel. It’s faster and has a wide variety of clients available to use with it. While this is rarely done, it is also possible for conferences to put an IRC server on their own LAN so the backchannel is entirely local, and even keeps working when the connection to the outside world gets congested, as is common on conference LANs. I’m not saying IRC is ideal, but until something better comes along, it works. Due to the speed, IRC backchannels tend to be much more rapid fire, with dialog, jokes, questions and answers. Some might view this as a bug, and there are arguments that slowing things down is good, but Twitter is not the way to attain that.
However, we won’t stop those who like to do it via Twitter. As noted, conferences like it because it spams the tweetsphere with mentions of their event.
I would love to see an IRC Bot designed to gateway with the Twitter world. Here are some of the features it might have.
It’s over 17 years since I first took a stab at e-Books, and while I was far too early, I must admit I had not predicted I would be that early. The market is now seeing a range of e-Ink based electronic book readers, such as the Kindle, and some reasonable adoption. But I don’t have one yet. But I do read e-books on my tiny phone screen. Why?
The phone has the huge advantage that it is always with me. It gives me a book any time I am caught waiting. On a train, in a doctor’s office, there is always a way to catch up on reading. It’s not ideal, and I don’t use it to read at home in bed, but it’s there. The tablets are all large, and for a good reading experience, people like them even larger. This means they are only there when you make deliberate plans to read, and pack them in your bag.
I’m not that thrilled with e-Ink yet, both for its low contrast and the annoying way it has to flash black in order to reset, causing a distracting delay when turning the page. There are ways to help that, but as yet it suffers. e-Ink also can’t readily be used for annotation or interactive operation, so many devices will keep a strip of LCD for things like selecting from menus and the like. Many of the devices also waste a lot of space with a keyboard, and the Kindle includes a cellular radio in order to download books. e-Ink does have a huge advantage in battery life.
What makes sense to me instead would be a sheet (or two sheets, folded) of e-Ink with very little in the way of smarts inside the device. Instead, it would be designed so that a variety of cell phones could dock to the e-Ink sheet and provide the brains. Phones have different form factors, of course, and different connectors though almost all can do USB. (Though annoyingly only as a slave, but this can be kludged around.) It would be necessary to make small plastic holders for the different phone models which can mate to a mount on the book display, ideally connecting the data port at the same time. The tablet of course should be able to connect to a laptop via USB (this time as a slave) but do the same reading actions. The docking can also be, I am reminded by the commenters, done by bluetooth, with interesting consequences.
This has many large advantages:
Done right, this tablet is a fair bit cheaper. It has minimal brains inside, and no cell phone. In fact, for most people, it also does not include the cost of a cell phone data service. (I presume with the Kindle the cost of that is split between the unit and the book sales, but either way, you pay for it.)
The cell phone provides an interactive LCD screen to use with all the reader’s interactive functions — book buying, annotating etc.
The cell phone provides a data connection for downloading books, newspapers and web pages.
The cell phone provides a keyboard for the few times you use a keyboard on an e-Book reader
When you don’t have your e-Ink tablet, you still have all your books, and can still order books.
The main thing the cell phone doesn’t have is huge battery life. The truth is, however, that cell phones have excellent battery life if they are not turning on their screen or doing complex network apps. We do such activities of course, and they drain our batteries, but we expect that and thus charge regularly and carry more. I’m not too scared at the idea of not being able to read my books with the phone dead.
The tablet could also be used with a laptop, especially a netbook. Laptops can actually run for a very long time if you put them in a power conserving mode, turning off the screen and disks, possibly even suspending the CPU between complex operations.
However, there is no need to run it at all. While I described the tablet as being dumb, it takes very little smarts for it to let you page through a pre-rendered book that was fed to it by the phone or laptop. That can be done with a low power microcontroller. It just would not do any fancy interactive operations without turning on the phone or laptop. And indeed, for the plain reading of a single book, akin to what you can do with the paper version, it would be able to operate on its own.
Of course, the vendors would not want to support every phone. But they could cut a deal to let people use old supported phones (which are in plentiful supply as people recycle phones constantly) with a minimal books-only data plan similar to the plans they have cut for the dedicated devices. In the GSM world, they could offer a special SIM good only for book operations for use in an older phone of the class they do support. And they could also build a custom module that slots perfectly into the tablet with the cell modem, small LCD screen and keyboard for those who still want a stand-alone device.
This approach also allows you to upgrade your tablet and your phone independently.
As noted, I think a folding tablet makes a lot of sense. This is true for two reasons. First, you get more screen real estate in half the width of tablet. Secondly, with two e-Ink panels, you can play some tricks so that you flash-refresh the panel you aren’t reading rather than the one you are finishing. While slightly distracting (depending how it’s done) it means that when you want to switch to the next page, you do it with your eyes, with no delay. You have to push a button when you switch (even going from left to right though it’s not apparently needed) so that the page you have fully finished refreshes while you are reading the next one. This could also be done with timings. Or even with a small camera watching your eyes, though I was trying to make the tablet dumber and this takes CPU and power right now. I can imagine other tricks that would work, such as how you hold the tablet (capacitive detection of your grip, or accelerometer detection of the angle.)
The tablet could also be built so the two pages of e-Ink are on the front and back. In this case it would not fold, though a slipcover would be a good idea. A “flip tablet” would display page 1 to you with page 2 on the back. To read page 2 you would physically flip it over. It would detect that of course, and change page 1 to page 3 when it was on the other side. This would mean the distraction of the flash-refresh would not be visible to you, which is a nice plus.
Cutely, the flip tablet could detect which direction you flip it. So if you flip it counter clockwise, you get the next page. If you flip it clockwise you get the previous page. Changing direction means you might briefly see the flash while you are flipping the unit but the UI seems pretty good to me. For those who don’t like this interface, the unit could still hinge out in the middle to show both pages at once.
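The flip logic can be sketched as a tiny state machine. This is only an illustration of the idea, assuming the tablet always preloads the hidden side with the page you would reach by continuing in the same direction. The class and the `"ccw"`/`"cw"` labels are made up for the example.

```python
class FlipTablet:
    """Two e-Ink faces: one visible, one hidden and preloaded."""

    def __init__(self, page_count):
        self.page_count = page_count
        self.visible = 0            # page index facing the reader
        self.hidden = 1 if page_count > 1 else None
        self.flashes_seen = 0       # refreshes the reader had to watch

    def flip(self, direction):
        """Flip "ccw" for the next page or "cw" for the previous one."""
        if direction not in ("ccw", "cw"):
            raise ValueError("direction must be 'ccw' or 'cw'")
        step = 1 if direction == "ccw" else -1
        target = self.visible + step
        if not 0 <= target < self.page_count:
            return self.visible     # already at the first or last page
        if self.hidden != target:
            # The hidden side held the wrong page (direction changed),
            # so it must refresh while the reader is looking at it.
            self.flashes_seen += 1
        self.visible = target
        # The side now facing away refreshes out of sight.
        upcoming = target + step
        self.hidden = upcoming if 0 <= upcoming < self.page_count else None
        return self.visible
```

Reading straight through never shows a flash. Only a change of direction does, which matches the behaviour described above.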
Using bluetooth for the connection has a number of interesting consequences. It does use power, and does not allow exchange of power between the tablet and device, but it means you don’t have to physically put the phone on the tablet at all. This may be a pain in some circumstances (needing two hands to do interactive things) but in other circumstances having a remote control to use to flip pages can be a real win.
I have found a very nice way to do e-reading is to have the pages displayed in front of you, at eye height, rather than down low in your hands. In particular, if you can mount your tablet on the top back of an airplane seat, it is much more comfortable than holding a book or tablet in your hands. The main downside is that the overhead light does not shine on the page there, so you need a backlight or LED book light. The ability to do remote control from your phone, in your rested hand would be great. Unfortunately they have the strange idea that they want to ban bluetooth on planes, though it poses no risk. They don’t even like wires.
In the 90s I built a device for reading books on planes where I got a book holder (they do make those) and I rigged it to attach with velcro and hang from the back of the seat in front. In those days it was quite common to have velcro on the top of the seat. Combined with a book light, I found this to be way more comfortable than holding a book in my hands, and I read much more pleasantly. Today you might have to build it so that a plate wedges between the raised table and seatback and a rod sticks out to hold the tablet.
Or on some planes they could support e-books on the screen in the seatback, with the remote control that is in your armrest. Alas, they would indeed need to use bluetooth so your PDA could display the book on that screen. (In general, letting your PDA use the screen in front of you would be very nice. It’s too sucky a resolution for laptops, since it must have been designed years ago in an era of sucky resolution. Today 1650 x 950 displays cost $100.)
For success, such a system would need to be as easy to use and set up as twitter for users, and pretty easy to set up for server operators. One thing it can’t do so easily, alas, is use a simple single namespace the way twitter does. A distributed system probably has to make names be domains, like E-mail addresses. That almost surely means something longer than twitter names and no use of the @name syntax popular in Twitter to refer to users. On the other hand, almost everybody already has a domain based ID, i.e. their E-mail address. However, most people are afraid to use this ID in public where it might get spam. It’s a shame, but many might well prefer to get a different ID from their E-mail, or of course to use one at twitter, which would now look like email@example.com to the outside world instead of @user within twitter.
Naming problems aside, the denizens of the internet are certainly up to building a publish/subscribe based short message multicasting service, which is what twitter is, to use terms much older than the company. I might propose the name MSM for the technology (Multicast Short Message).
Some recent searches have revealed unusual activity on twitter, and I wonder where it’s going. Narcissus searches on twitter reveal a variety of accounts tweeting links into my blog and sites, for reasons not clearly apparent.
For example, a week ago, a half dozen identical twitter accounts all tweeted my post about electric cars playing music. All the accounts had pictures of models as their icon, and the exact same set of twitter posts, which seem to be a random collection of blog and news URLs with a bit.ly pointer to the item, all posted via twitterfeed. These accounts seem to follow and be followed by about 500, presumably the same list.
Then more recently I see another set of accounts which all follow about 20 people but are followed by about 200 to 500. They are all posting “from API” and again are just posting links, this time with tinyurl.com. The account names are odd, too.
These also seem to have cute girls as icons. However, strangely, the many followers appear to be real, or at least some of them appear to be. Why are people following a spam robot? Are the followers people who were paid to do it, or are they in some twitter-optimization scheme?
What I am curious about is the motive. Are they linking to real sites in the hope of gaining some sort of legitimacy in twitter indexing engines, so that later they can start linking to people who pay for it? (Twitter SEO?) Are they trying to form twitter equivalents of link farms? Are they just hoping that site authors will see the backlinks and look at them for some later purpose? (You would be amazed how many hits on a web server are there just to put a spammer in the “Referer” field, either to get you to look, or to show up in referer logs that some sites post to the web.)
I just returned from Jeff Pulver’s “140 Characters” conference in L.A. which was about Twitter. I asked many people if they get Twitter — not if they understand how it’s useful, but why it is such a hot item, and whether it deserves to be, with billion dollar valuations and many talking about it as the most important platform.
Some suggested Twitter is not as big as it appears, with a larger churn than expected and some plateau appearing in new users. Others think it is still shooting for the moon.
The first value in twitter I found was as a broadcast SMS. While I would not text all my friends when I go to a restaurant or a club, having a way so that they will easily know that (and might join me) is valuable. Other services have tried to do things like this but Twitter is the one that succeeded in spite of not being aimed at any specific application like this.
This explains the secret of Twitter. By being simple (and forcing brevity) it was able to be universal. By being more universal it could more easily attain critical mass within groups of friends. While an app dedicated to some social or location based application might do it better, it needs to get a critical mass of friends using it to work. Once Twitter got that mass, it had a leg up at being that platform.
At first, people wondered if Twitter’s simplicity (and requirement for brevity) was a bug or a feature. It definitely seems to have worked as a feature. By keeping things short, Twitter makes it less scary to follow people. It’s hard for me to get new subscribers to this blog, because subscribing to the blog means you will see my moderately long posts every day or two, and that’s an investment in reading. To subscribe to somebody’s Twitter feed is no big commitment. Thus people can get a million followers there, when no blog has that. In addition, the brevity makes it a good match for the mobile phone, which is the primary way people use Twitter. (Though usually the smart phone, not the old SMS way.)
And yet it is hard not to be frustrated at Twitter for being so simple. There are so many things people do with Twitter that could be done better by some more specialized or complex tool. Yet it does not happen.
However, Twitter, in its latest mode, is something different. It is “sampled.” In normal serial media, you usually consume all of it. You come in to read and the tool shows you all the new items in the stream. Your goal is to read them all, and the publishers tend to expect it. Most Twitter users now follow far too many people to read it all, so the best they can do is sample: they come in at various times of day and find out what their stalkees are up to right then. Of course, other media have also been sampled, including newspapers and message boards, just because people don’t have time, or because they go away for too long to catch up. On Twitter, however, going away for even a couple of hours will give you too many tweets to catch up on.
This makes Twitter an odd choice as a publishing tool. If I publish on this blog, I expect most of my RSS subscribers will see it, even if they check a week later. If I tweet something, only a small fraction of the followers will see it — only if they happen to read shortly after I write it, and sometimes not even then. Perhaps some who follow only a few will see it later, or those who specifically check on my postings. (You can’t. Mine are protected, which turns out to be a mistake on Twitter but there are nasty privacy results from not being protected.)
TV has an unusual history in this regard. In the early days, there were so few stations that many people watched, at one time or another, all the major shows. As TV grew to many channels, it became a sampled medium. You would channel surf, and stop at things that were interesting, and know that most of the stream was going by. When the Tivo arose, TV became a subscription medium, where you identify the programs you like, and you see only those, with perhaps some suggestions thrown in to sample from.
Online media, however, and social media in particular were not intended to be sampled. Sure, everybody would just skip over the high volume of their mailing lists and news feeds when coming back from a vacation, but this was the exception and not the rule.
The question is, will Twitter’s nature as a sampled medium be a bug or a feature? It seems like a bug but so did the simplicity. It makes it easy to get followers, which the narcissists and the PR flacks love, but many of the tweets get missed (unless they get picked up as a meme and re-tweeted) and nobody loves that.
On Protection: It is typical to tweet not just blog-like items but the personal story of your day. Where you went and when. This is fine as a thing to tell friends in the moment, but with a public twitter feed, it’s being recorded forever by many different players. The ephemeral aspects of your life become permanent. But if you do protect your feed, you can’t do a lot of things on twitter. What you write won’t be seen by others who search for hashtags. You can’t reply to people who don’t follow you. You’re an outsider. The only way to solve this would be to make Twitter really proprietary, blocking all the services that are republishing it, analysing it and indexing it. In this case, dedicated applications make more sense. For example, while location based apps need my location, they don’t need to record it for more than a short period. They can safely erase it, and still provide me a good app. They can only do this if they are proprietary, because if they give my location to other tools it is hard to stop them from recording it, and making it all public. There’s no good answer here.
Twenty years ago (Monday) on June 8th, 1989, I did the public launch of ClariNet.com, my electronic newspaper business, which would be delivered using USENET protocols (there was no HTTP yet) over the internet.
ClariNet was the first company created to use the internet as its platform for business, and as such this event has a claim at being the birth of the “dot-com” concept which so affected the world in the two intervening decades. There are other definitions and other contenders which I discuss in the article below.
In those days, the internet consisted of regional networks, who were mostly non-profit cooperatives, and the government funded “NSFNet” backbone which linked them up. That backbone had a no-commercial-use policy, but I found a way around it. In addition, a nascent commercial internet was arising with companies like UUNet and PSINet, and the seeds of internet-based business were growing. There was no web, of course. The internet’s community lived in e-Mail and USENET. Those, and FTP file transfer were the means of publishing. When Tim Berners-Lee would coin the term “the web” a few years later, he would call all these the web, and HTML/HTTP a new addition and glue connecting them.
I decided I should write a history of those early days, where the seeds of the company came from and what it was like before most of the world had even heard of the internet. It is a story of the origins and early perils and successes, and not so much of the boom times that came in the mid-90s. It also contains a few standalone anecdotes, such as the story of how I accidentally implemented a system so reliable, even those authorized to do so failed to shut it down (which I call “M5 reliability” after the Star Trek computer), stories of too-early eBook publishing and more.
There’s also a little bit about some of the other early internet and e-publishing businesses such as BBN, UUNet, Stargate, public access unix, Netcom, Comtex and the first Internet World trade show.