Not everybody loves video calls, but there are times when they are great. I like them with family, and I try to insist on them when negotiating, because body language is important. So I’ve watched as we’ve increased the quality and ease of use.
The ultimate goals would be “retinal” resolution — where the resolution surpasses your eye — along with high dynamic range, stereo, light field, telepresence mobility and VR/AR with headset image removal. Eventually we’ll be able to make a video call or telepresence experience so good it’s a little hard to tell from actually being there. This will affect how much we fly for business meetings, travel inside towns, life for bedridden and low mobility people and more.
Here’s a proposal for how to provide that very high or retinal resolution without needing hundreds of megabits of high quality bandwidth.
Many people have observed that the human eye is high resolution only in the center of attention, the part of the retina known as the fovea centralis. If you make a display that’s sharp where a person is looking, and blurry out at the edges, the eye won’t notice, at least until it quickly moves to another section of the image and the brain shows you the tunnel vision.
Decades ago, people designing flight simulators combined “gaze tracking,” where you spot in real time where a person is looking, with the foveal concept, so that the simulator only rendered the scene in high resolution where the pilot’s eyes were. In those days in particular, rendering a whole immersive scene at high resolution wasn’t possible. Even today it’s a bit expensive. The trick is you have to be fast: when the eye darts to a new location, you have to render it at high-res within milliseconds, or we notice. Of course, to an outside viewer, such a system looks crazy, and with today’s technology, it’s still challenging to make it work.
With a video call, it’s even more challenging. If a person moves their eyes (or in AR/VR their head) and you need to get a high resolution stream of the new point of attention, it can take a long time — perhaps hundreds of milliseconds — to send that signal to the remote camera, have it adjust the feed, and then get that new feed back to you. There is no way the user will not see their new target as blurry for way too long. While it would still be workable, it will not be comfortable or seem real. For VR video conferencing it’s even an issue for people turning their head. For now, to get a high resolution remote VR experience would require sending probably a half-sphere of full resolution video. The delay is probably tolerable if the person wants to turn their head enough to look behind them.
One opposite approach being taken for low bandwidth video is the use of “avatars” — animated cartoons of the other speaker which are driven by motion capture on the other end. You’ve seen characters in movies like Sméagol, the blue Na’vi of the movie Avatar and perhaps the young Jeff Bridges (acted by old Jeff Bridges) in Tron: Legacy. Cartoon avatars are preferred because of what we call the Uncanny Valley — people notice flaws in attempts at total realism and just ignore them in cartoonish renderings. But we are now able to do moderately decent realistic renderings, and this is slowly improving.
My thought is to combine foveal video with animated avatars for brief moments after saccades and then gently blend them towards the true image when it arrives. Here’s how.
The remote camera will send video with increasing resolution towards the foveal attention point. It will also be scanning the entire scene and making a capture of all motion of the face and body, probably with the use of 3D scanning techniques like time-of-flight or structured light. It will also be, in background bandwidth, updating the static model of the people in the scene and the room.
Upon a saccade, the viewer’s display will immediately (within milliseconds) combine the blurry image of the new target with the motion capture data, along with the face model data received, and render a generated view of the new target. It will transmit the new target to the remote.
The remote, when receiving the new target, will now switch the primary video stream to a foveal density video of it.
When the new video stream starts arriving, the viewer’s display will attempt to blend the two, creating a plausible transition between the rendered scene and the real scene, gradually correcting any differences between them until the video is 100% real.
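The blending step can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a plain linear crossfade, frames are flat lists of grayscale pixel values, and the names blend_weight, blend_frame and blend_ms are hypothetical. A real system would also have to correct geometry and motion, not just mix pixels.

```python
def blend_weight(elapsed_ms: float, blend_ms: float) -> float:
    """Fraction of the real video to show: 0.0 is all rendered, 1.0 is all real."""
    if blend_ms <= 0:
        return 1.0
    return min(1.0, max(0.0, elapsed_ms / blend_ms))


def blend_frame(rendered, real, weight):
    """Linearly mix two equally sized frames given as flat lists of pixel values."""
    if len(rendered) != len(real):
        raise ValueError("frames must have the same size")
    return [(1.0 - weight) * r + weight * v for r, v in zip(rendered, real)]


# Hypothetical 4-pixel frames: the generated view is slightly wrong.
rendered = [100.0, 120.0, 90.0, 60.0]
real = [110.0, 118.0, 95.0, 60.0]

# elapsed_ms counts from the moment the real foveal stream starts arriving.
for elapsed_ms in (0, 50, 100, 200):
    w = blend_weight(elapsed_ms, blend_ms=200)
    print(elapsed_ms, blend_frame(rendered, real, w))
```

The blend_ms parameter is the trade-off knob: a longer blend makes corrections less jarring, but keeps the generated image on screen longer.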
In addition, both systems will be making predictions about what the likely target of next attention is. We tend to focus our eyes on certain places, notably the mouth and eyes, so there are some places that are more likely to be looked at next. Some portion of the spare bandwidth would be allocated to also sending those at higher resolution — either full resolution if possible, or with better resolution to improve the quality of the animated rendering.
The animated rendering will, today, both be slightly wrong, and also suffer from the uncanny valley problem. My hope is that if this is short lived enough, it will be less noticeable, or not be that bothersome. It will be possible to trade off how long it takes to blend the generated video over to the real video. The longer you take, the less jarring any error correction will be, but the longer the image is “uncanny.”
There are 100 million photoreceptors in the whole eye, but only about a million nerve fibers going out. It would still be expensive to deliver full resolution at the attention spot and the most likely next spots, but it’s much less bandwidth than sending the whole scene. Even if full resolution is not delivered, much better resolution can be offered.
Stereo and simulated 3D
You can also do this in stereo to provide 3D. Another interesting approach was done at CMU called pseudo 3D. I recommend you check out the video. This system captures the background and moves the flat head against it as the viewer moves their head. The result looks surprisingly good.
I have so much paper that I’ve been on a slow quest to scan things. So I have high speed scanners and other tools, but it remains a great deal of work to get it done, especially reliably enough that you would throw away the scanned papers. I have done around 10 posts on digitizing and gathered them under that tag.
Recently, I was asked by a friend who could not figure out what to do with the papers of a deceased parent. Scanning them on your own or in scanning shops is time consuming and expensive, so a new thought came to me.
Set up a scanning table by mounting a camera that shoots 4K video looking down on the table. I have tripods that have an arm that extends out but there are many ways to mount it. Light the table brightly, and bring your papers. Then start the 4K video and start slapping the pages down (or pulling them off) as fast as you can.
There is no software today that can turn that video into a well scanned document. But there will be. Truth is, we could write it today, but nobody has. If you scan this way, you’re making the bet that somebody will. Even if nobody does, you can still go into the video and find any page and pull it out by hand; it will just be a lot of work, and you would only do this for single pages, not for whole documents. You are literally saving the document “for the future” because you are depending on future technology to easily extract it.
HBO released a new version of “Westworld” based on the old movie about a robot-based western theme park. The show hasn’t excited me yet — it repeats many of the old tropes on robots/AI becoming aware — but I’m interested in the same thing the original talked about — simulated experiences for entertainment.
The new show misses what’s changed since the original. I think it’s more likely they will build a world like this with a combination of VR, AI and specialty remotely controlled actuators rather than with independent self-contained robots.
One can understand the appeal of presenting the simulation in a mostly real environment. But the advantages of the VR experience are many. In particular, with the top-quality, retinal resolution light-field VR we hope to see in the future, the big advantage is you don’t need to make the physical things look real. You will have synthetic bodies, but they only have to feel right, and only just where you touch them. They don’t have to look right. In particular, they can have cables coming out of them connecting them to external computing and power. You don’t see the cables, nor the other manipulators that are keeping the cables out of your way (even briefly unplugging them) as you and they move.
This is important to get data to the devices — they are not robots as their control logic is elsewhere, though we will call them robots — but even more important for power. Perhaps the most science fictional thing about most TV robots is that they can run for days on internal power. That’s actually very hard.
The VR has to be much better than we have today, but it’s not as much of a leap as the robots in the show. It needs to be at full retinal resolution (though only in the spot your eyes are looking) and it needs to be able to simulate the “light field” which means making the light from different distances converge correctly so you focus your eyes at those distances. It has to be lightweight enough that you forget you have it on. It has to have an amazing frame-rate and accuracy, and we are years from that. It would be nice if it were also untethered, but the option is also open for a tether which is suspended from the ceiling and constantly moved by manipulators so you never feel its weight or encounter it with your arms. (That might include short disconnections.) However, a tracking laser combined with wireless power could also do the trick to give us full bandwidth and full power without weight.
It’s probably not possible to let you touch the area around your eyes and not feel a headset, but add a little SF magic and it might be reduced to feeling like a pair of glasses.
The advantages of this are huge:
You don’t have to make anything look realistic, you just need to be able to render that in VR.
You don’t even have to build things that nobody will touch, or go to, including most backgrounds and scenery.
You don’t even need to keep rooms around, if you can quickly have machines put in the props when needed before a player enters the room.
In many cases, instead of some physical objects, a very fast manipulator might be able to quickly place in your way textures and surfaces you are about to touch. For example, imagine if, instead of a wall, a machine with a few squares of wall surface quickly holds one out anywhere you’re about to touch. Instead of a door there is just a robot arm holding a handle that moves as you push and turn it.
Proven tricks in VR can get people to turn around without realizing it, letting you create vast virtual spaces in small physical ones. The spaces will be designed to match what the technology can do, of course.
You will also control the audio and cancel sounds, so your behind-the-scenes manipulations don’t need to be fully silent.
You do it all with central computers, you don’t try to fit it all inside a robot.
You can change it all up any time.
In some cases, you need the player to “play along” and remember not to do things that would break the illusion. Don’t try to run into that wall or swing from that light fixture. Most people would play along.
For a lot more money, you might some day be able to do something more like Westworld. That has its advantages too:
Of course, the player is not wearing any gear, which will improve the reality of the experience. They can touch their faces and ears.
Superb rendering and matching are not needed, nor the light field or anything else. You just need your robots to get past the uncanny valley.
You can use real settings (like a remote landscape for a western) though you may have a few anachronisms. (Planes flying overhead, houses in the distance.)
The same transmitted power and laser tricks could work for the robots, but transmitting enough power to power a horse is a great deal more than enough to power a headset. All this must be kept fully hidden.
The latter experience will be made too, but it will be more static and cost a lot more money.
Yes, there will be sex
Warning: We’re going to get a bit squicky here for some folks.
Westworld is on HBO, so of course there is sex, though mostly just a more advanced vision of the classic sex robot idea. I think that VR will change sex much sooner. In fact, there is already a small VR porn industry, and even some primitive haptic devices which tie into what’s going on in the porn. I have not tried them but do not imagine them to be very sophisticated as yet, but that will change. Indeed, it will change to the point where porn of this sort becomes a substitute for prostitution, with some strong advantages over the real thing (including, of course, the questions of legality and exploitation of humans.)
We’re on the cusp of a new wave of virtual reality and augmented reality technology. The most exciting is probably the Magic Leap. I have yet to look through it, but friends who have describe it as hard to tell from actual physical objects in your environment. The Hololens (which I have looked through) is not that good, and has a very limited field of view, but it already shows good potential.
It’s becoming easier and easier to create VR versions of both fictional and real environments. Every historical documentary show seems to include a nice model reconstructing what something used to look like, and this is going to get better and better with time.
This will be an interesting solution for many of the world’s museums and historical sites. A few years from now, every visit to a ruin or historical building won’t just include a boring and slow audioguide, but some AR glasses to allow you to see a model of what the building was really like in its glory. Not just a building — it should be possible to walk around ancient Rome or other towns and do this as well.
Now with VR you’ll be able to do that in your own home if you like, but you won’t be able to walk very far in that space. (There are tricks that let you fool people into thinking they walked further but they are just not the same as walking in the real space with the real geometry.) They will also be able to populate the space with recordings or animations of people in period costumes doing period things.
This is good news for historical museums. Many of them have very few actual interesting artifacts to see, so they end up just being placards and photos and videos and other multimedia presentations. Things I could easily see on the museum web site; their only virtue is that I am reading the text and looking at the picture in the greatly changed remains of where it happened. These days, I tend to skip museums that have become little more than multimedia. But going to see the virtual recreation will be a different story, I predict.
Soon will be the time for museum and tourist organizations to start considering what spaces will be good for this. You don’t need to restore or rebuild that old castle, as long as it’s safe to walk around. You just need to instrument it with tracking sensors for the AR gear and build and refine those models. Over time, the resolution of the AR glasses will approach that of the eyes, and the reality of the models will improve too. In time, many will feel like they got an experience very close to going back in time and seeing it as it was.
Well, not quite as it was. It will be full of tourists from the future, including yourself. AR keeps them present, which is good because you don’t want to bump into them. A more advanced system will cover the tourists in period clothing, or even replace their faces. You would probably light the space somewhat dimly to assure the AR can cover up what it needs to cover up, while still keeping enough good vision of the floor so you don’t trip.
Of course, if you cover everything up with the AR, you could just do this in a warehouse, and that will happen too. You would need to reproduce the staircases of the recreated building but could possibly get away with producing very little else. As long as the other visitors don’t walk through walls the walls don’t have to be there. This might be popular (since it needs no travel) but many of us still do have an attraction to the idea that we’re standing in the actual old place, not in our hometown. And the museums would also have rooms with real world artifacts to examine, if they have them.
Last year, I wrote a few posts on the attack on Science Fiction’s Hugo awards, concluding in the end that only human defence can counter human attack. A large fraction of the SF community felt that one could design an algorithm to reduce the effect of collusion, which in 2015 dominated the nomination system. (It probably will dominate it again in 2016.) The system proposed, known as “E Pluribus Hugo,” attempted to defeat collusion (or “slates”) by giving each nomination entry less weight when a nomination ballot was doing very well and getting several of its choices onto the final ballot. More details can be found on the blog where the proposal was worked out.
The process passed the first round of approval, but does not come into effect unless it is ratified at the 2016 meeting, and then it applies to the 2017 nominations. As such, the 2016 awards will be as vulnerable to the slates as before. However, there are vastly more slate nominators this year, presuming all those who joined in last year to support the slates continue to do so.
Recently, my colleague Bruce Schneier was given the opportunity to run the new system on the nomination data from 2015. The final results of that test are not yet published, but a summary was reported today in File 770 and the results are very poor. This is, sadly, what I predicted when I did my own modelling. In my models, I considered some simple strategies a clever slate might apply, but it turns out that these strategies may have been naturally present in the 2015 nominations, and as predicted, the “EPH” system only marginally improved the results. The slates still massively dominated the final ballots, though they no longer swept all 5 slots. I consider the slates taking 3 or 4 slots, with only 1 or 2 non-slate nominees making the cut, to be a failure almost as bad as the sweeps that did happen. In fact, I consider even one nomination through collusion to be a failure, though there are obviously degrees of failure. As I predicted, a slate of the size seen in the final Hugo results of 2015 should be able to obtain between 3 and 4 of the 5 slots in most cases. The new test suggests they could do this even with a much smaller slate group, such as the one they had in the 2015 nominations.
Another proposal — that there be only 4 nominations on each nominating ballot but 6 nominees on the final ballot — improves this. If the slates can take only 3, then this means 3 non-slate nominees probably make the ballot.
An alternative - Make Room, Make Room!
First, let me say I am not a fan of algorithmic fixes to this problem. Changing the rules — which takes 2 years — can only “fight the last war.” You can create a defence against slates, but it may not work against modifications of the slate approach, or other attacks not yet invented.
Nonetheless, it is possible to improve the algorithmic approach to attain the real goal, which is to restore the award as closely as possible to what it was when people nominated independently: to allow the voters to see the top 5 “natural” nominees, and award the best one the Hugo award, if it is worthy.
The approach is as follows: When slate voting is present, automatically increase the number of nominees so that 5 non-slate candidates are also on the ballot along with the slates.
To do this, you need a formula which estimates if a winning candidate is probably present due to slate voting. The formula does not have to be simple, and it is OK if it occasionally identifies a non-slate candidate as being from a slate.
Calculate the top 5 nominees by the traditional “approval” style ballot.
If 2 or more pass the “slate test” which tries to measure if they appear disproportionately together on too many ballots, then increase the number of nominees until 5 entries do not meet the slate condition.
As a result, if there is a slate of 5, you may see the total pool of nominees increased to 10. If there are no slates, there would be only 5 nominees. (Ties for last place, as always, could increase the number slightly.)
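The steps above can be sketched as follows. This is a minimal illustration, not a proposed rule text: the function name expand_ballot is hypothetical, the slate test is passed in as a black-box function is_slate, ties are broken alphabetically for simplicity, and the tie-for-last-place extension is left out.

```python
from collections import Counter


def expand_ballot(ballots, is_slate, natural_slots=5):
    """Return the finalists in approval order.

    If fewer than 2 of the top entries trip the slate test, the ballot is
    the ordinary top `natural_slots`. Otherwise keep adding entries until
    `natural_slots` of them do NOT trip the test.
    """
    counts = Counter(work for ballot in ballots for work in set(ballot))
    # Most-nominated first; alphabetical tie-break is an assumption.
    ranked = [w for w, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

    top = ranked[:natural_slots]
    if sum(1 for w in top if is_slate(w)) < 2:
        return top

    finalists, natural = [], 0
    for work in ranked:
        finalists.append(work)
        if not is_slate(work):
            natural += 1
            if natural == natural_slots:
                break
    return finalists


# Made-up data: 10 slate ballots naming S1..S5, plus independent nominations.
slate = ["S1", "S2", "S3", "S4", "S5"]
ballots = [slate] * 10
for name, votes in [("N1", 9), ("N2", 8), ("N3", 7), ("N4", 6), ("N5", 5), ("N6", 4)]:
    ballots += [[name]] * votes

finalists = expand_ballot(ballots, is_slate=lambda w: w in slate)
print(len(finalists), finalists)
```

With a slate of 5 present, the pool grows to 10 and all five natural nominees are still listed. With no slate, the function returns the ordinary top 5.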
Let’s consider the advantages of this approach:
While ideally it’s simple, the slate test formula does not need to be understood by the typical voter or nominator. All they need to know is that the nominees listed are the top nominees.
Likewise, there is no strategy in nominating. Your ballot is not reduced in strength if it has multiple winners. It’s pure approval.
If a candidate is falsely identified as passing the slate test — for example a lot of Doctor Who fans all nominate the same episodes — the worst thing that happens is we get a few extra nominees we should not have gotten. Not ideal, but pretty tame as a failure mode.
Likewise, for those promoting slates, they can’t claim their nominations are denied to them by a cabal or conspiracy.
All the nominees who would have been nominated in the absence of slate efforts get nominated; nobody’s work is displaced.
Fans can decide for themselves how they want to consider the larger pool of nominees. Based on 2015’s final results (with many “No Awards”) it appears fans wish to judge some works as there unfairly and discount them. Fans who wish it would have the option of deciding for themselves which nominees are important, and acting as though those are all that was on the ballot.
If it is effective, it gives the slates so little that many of them are likely to just give up. It will be much harder to convince large numbers of supporters to spend money to become members of conventions just so a few writers can get ignored Hugo nominations with asterisks beside them.
It has a few downsides, and a vulnerability.
The increase in the number of nominees (only while under slate attack) will frustrate some, particularly those who feel a duty to read all works before voting.
All the slate candidates get on the ballot, along with all the natural ones. The first is annoying, but it’s hardly a downside compared to having some of the natural ones not make it. A variant could block any work that fits the slate test but scored below 5th, but that introduces a slight (and probably un-needed) bit of bias.
You need a bigger area for nominees at the ceremony, and a bigger party, if they want to show up and be sneered at. The meaning of “Hugo Nominee” is diminished (but not as much as it’s been diminished by recent events.)
As an algorithmic approach it is still vulnerable to some attacks (one detailed below) as well as new attacks not yet thought of.
In particular, if slates are fully coordinated and can distribute their strength, it is necessary to combine this with an EPH style algorithm or they can put 10 or more slate candidates on the ballot.
All algorithmic approaches are vulnerable to a difficult but possible attack by slates. If the slate knows its strength and knows the likely range of the top “natural” nominees, it can in theory choose a number of slots it can safely win, and name only that many choices, and divide them up among supporters. Instead of having 240 people cast ballots with the 3 choices, they can have 3 groups of 80 cast ballots for one choice only. No simple algorithm can detect that or respond to it, including this one. This is a more difficult attack than the current slates can carry off, as they are not that unified. However, if you raise the bar, they may rise to it as well.
All algorithmic approaches are also vulnerable to a less ambitious colluding group, that simply wants to get one work on the ballot by acting together. That can be done with a small group, and no algorithm can stop it. This displaces a natural candidate and wins a nomination, but probably not the award. Scientologists were accused of doing this for L. Ron Hubbard’s work in the past.
The best way to work out the formula would be through study of real data with and without slates. One candidate would be to take all nominees present on more than 5% of ballots, and pairwise compare them to find out what fraction of the time the pair are found together on ballots. Then detect pairs which are together a great deal more than that. How much more would be learned from analysis of real data. Of course, the slates will know the formula, so it must be difficult to defeat it even knowing it. As noted, false positives are not a serious problem if they are uncommon. False negatives are worse, but still better than alternatives.
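One way to sketch such a pairwise test is below. Everything specific here is an assumption for illustration: the function name slate_suspects, the use of Jaccard overlap (ballots naming both works divided by ballots naming either) as the "found together" measure, and the 0.8 threshold, which in practice would have to come from analysis of real data. The example also shows the split-ballot attack described earlier slipping past the test.

```python
from collections import Counter
from itertools import combinations


def slate_suspects(ballots, min_share=0.05, overlap_threshold=0.8):
    """Flag works that appear together on ballots far more than expected.

    Only works on more than `min_share` of ballots are compared. A pair is
    flagged when its Jaccard overlap reaches `overlap_threshold`.
    """
    sets = [set(b) for b in ballots]
    counts = Counter(w for s in sets for w in s)
    frequent = {w for w, c in counts.items() if c > min_share * len(ballots)}

    together = Counter()
    for s in sets:
        for pair in combinations(sorted(s & frequent), 2):
            together[pair] += 1

    suspects = set()
    for (a, b), t in together.items():
        if t / (counts[a] + counts[b] - t) >= overlap_threshold:
            suspects.update((a, b))
    return suspects


# Made-up independent nominations: pairs overlap, but only partially.
natural = [["N1", "N2"]] * 300 + [["N2", "N3"]] * 250 + [["N1", "N3"]] * 210

# 240 colluders all naming the same 3 works, versus 3 groups of 80 naming 1 each.
coordinated = natural + [["S1", "S2", "S3"]] * 240
split = natural + [["S1"]] * 80 + [["S2"]] * 80 + [["S3"]] * 80

print(slate_suspects(coordinated))
print(slate_suspects(split))
```

The coordinated slate is flagged because its three works always travel together. The split version produces no co-occurrence at all, so nothing is flagged.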
So what else?
At the core is the idea of providing voters with information on who the natural nominees would have been, and allowing them to use the STV voting system of the final ballot to enact their will. This was done in 2015, but simply to give No Award in many of the categories — it was necessary to destroy the award in order to save it.
As such, I believe there is a reason why every other system (including the WSFS site selection) uses a democratic process, such as write-in, to deal with problems in nominations. Democratic approaches use human judgment, and as such they are not just a response to slates, but to any attack.
As such, I believe a better system is to publish a longer list of nominees — 10 or more — but to publish them sorted according to how many nominations they got. This allows voters to decide what they think the “real top 5” was and to vote on that if they desire. Because a slate can’t act in secret, this is robust against slates and even against the “slate of one” described above. Revealing the sort order is a slight compromise, but a far lesser one than accepting that most natural nominees are pushed off the ballot.
The advantages of this approach:
It is not simply a defence against slates, it is a defence against any effort to corrupt the nominations, as long as it is detected and fans believe it.
It requires no algorithms or judgment by officials. It is entirely democratic.
It is completely fair to all comers, even the slate members.
The downsides are:
As above, there are a lot more nominees, so the meaning of being a nominee changes.
Some fans will feel bound to read/examine more than 5 nominees, which produces extra work on their part.
The extra information (sorting order) was never revealed before, and may have subtle effects on voting strategy. So far, this appears to be pretty minor, but it’s untested. With STV voting, there is about as little strategy as can be. Some voters might be very slightly more likely to rank a work that sorted low in first place, to bump its chances, but really, they should not do that unless they truly want it to win — in which case it is always right to rank it first.
It may need to add EPH style counting if slates get a high level of coordination.
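The sorted long list itself is simple to produce. A minimal sketch, assuming a hypothetical function name long_list, a list size of 10, and alphabetical tie-breaking:

```python
from collections import Counter


def long_list(ballots, size=10):
    """Return up to `size` (work, nomination count) pairs, most nominated first."""
    counts = Counter(work for ballot in ballots for work in set(ballot))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:size]


# Made-up data: a 3-work slate on 6 ballots plus a few independent nominations.
sample = [["S1", "S2", "S3"]] * 6 + [["A"]] * 5 + [["B"]] * 4 + [["A", "C"]] * 2
published = long_list(sample)
for work, n in published:
    print(n, work)
```

Nothing is removed or judged by the software. Voters see the counts and decide for themselves which entries they regard as the real top 5.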
Another surprisingly strong approach would be simply to add a rule saying, “The Hugo Administrators should increase the number of nominees in any category if their considered analysis leaves them convinced that some nominees made the final ballot through means other than the nominations of fans acting independently, adding one slot for each work judged to fail that test, but adding no more than 6 slots.” This has tended to be less popular, in spite of its simplicity and flexibility - it even deals with single-candidate campaigns — because some fans have an intense aversion to any use of human judgment by the Hugo administrators.
The advantages of this approach:
Very simple (for voters at least)
Very robust against any attempt to corrupt the nominations that the admins can detect. So robust that it makes it not worth trying to corrupt the nominations, since that often costs money.
Does not require constant changes to the WSFS constitution to adapt to new strategies, nor give new strategies a 2 year “free shot” before the rules change.
If administrators act incorrectly, the worst they do is just briefly increase the number of nominees in some categories.
If there are no people trying to corrupt the system in a way admins can see, we get the original system we had before, in all its glory and flaws.
The admins get access to data which can’t be released to the public to make their evaluations, so they can be smarter about it.
The downsides are:
Clearly a burden for the administrators to do a good job and act fairly.
People will criticise and second guess. It may be a good idea to have a post-event release of any methodology so people learn what to do and not do.
There is the risk of admins acting improperly. This is already present of course, but traditionally they have wanted to exercise very little judgment.
I wrote earlier on the drama that ensued when a group of SF writers led a campaign to warp the nomination process by getting a small but sufficiently large group of supporters to collude on nominating a slate of candidates. The way the process works, with the nomination being a sampling process where a thousand nominators choose from thousands of works, it takes only 100-200 people working together to completely take over the process, and in some cases, they did, to much uproar.
In the aftermath, there was much debate about what to do about it. Changes to the rules are in the works, but due to a deliberate ratification process, they mostly can’t take effect until the 2017 award.
One popular proposal, called E Pluribus Hugo, appeals, at least initially, to the nerdy mathematician in many of us. Game theory tries to design voting systems that resist attack. This is such a proposal, which works to diminish the effect that slate collusion can have, so that a slate of 5 might get fewer than 5 (perhaps just 1 or 2) onto the ballot. It is complex but aimed to make it possible for people to largely nominate the same way as before. My fear is that it modestly increases the reward for “strategic” voting. With strategic voting, you are not colluding, but you deliberately leave choices you like off your ballot to improve the chances of other choices you like more.
Facebook’s ARPU (average revenue per user, annualized) in the last quarter was just under $10, declining slightly in the USA and Canada, and a much lower 80 cents in the rest of the world. This is quite a bit less than Google’s which hovers well over $40.
That number has been mostly growing (it shrank last quarter for the first time) but it’s fairly low. I can solidly say I would happily pay $10 a year — even $50 a year — for a Facebook which was not simply advertising-free, but more importantly motivated only to please its customers and not advertisers. Why can’t I get that?
One reason is that it’s not that simple. If Facebook had to actually charge, it would not get nearly as many users as it does being free and ad-supported. It is frictionless to join and participate in FB, and that’s important with the natural monopolies that apply to social media. You dare not do anything that would scare away users.
Valley of Distraction
Being advertising supported bends how Facebook operates, as it will any company. The most obvious thing is the annoying ads. Particularly annoying are the ads which show up in my feed, often marked with “Friend X liked this company.” I am starting to warn my friends to please not like the pages of anybody who buys ads on FB, because these ads are even more distracting than regular ads. Also extra distracting are ads which are “just off the bulls-eye,” which is to say they are directed at me (based on what FB knows about me) and thus likely to distract me, but which turn out to be completely useless. That’s worse than an ad which was not well aimed and so doesn’t distract me at all with its uselessness. There is a “valley of distraction” when it comes to targeting ads:
Ads about things I am researching or may want to buy can be actually valuable to me, and also rewarding to the advertiser.
Ads about things I am interested in, but have already bought or would not buy via an ad are highly distracting but provide no value to the advertiser and negative value to me.
Ads about things I have no interest in tend to be only mildly distracting if they are off to the side and not blinky/flashy/pop-up style.
As sites get better at ad targeting, they generate more of the middle type.
Facebook’s need to monetize with advertising gives them strong incentives to be less protective of privacy. All social networks have an anti-privacy incentive, because the more they can get you to share with more people, the more they can make things happen on their site, and the more they can attract other users. But advertising adds to this. Without ads, FB would focus only on attracting and retaining customers by serving them, which would be good for users.
As the old saying goes, “If you’re not paying, you’re not the customer, you’re the product.” To give credit to many web companies, in spite of the reality of this, they actually work hard to reduce the truth of this statement, but they can never do it entirely.
How we monetize the web
When I created the first internet based publication in 1989, I did it by selling subscriptions. There really wasn’t a way to do it with advertising at that time, but I lamented the eventual switch that later came which has made advertising the overwhelmingly dominant means of monetizing the web. There are a few for-pay sites but they are very few and specialized. I lament that forces pushed the web that way, and have always wished for a mechanism to make it easier, if not as easy, to monetize a web site with payment from customers. That’s why I promoted ideas like microrefunds as well as selling books in flat-rate pools like my Library of Tomorrow back in 1992. (Fortunately this concept is now starting to get some traction in some areas, like Amazon’s Kindle Unlimited.)
I’m also very interested in the way that low-friction digital currencies like Bitcoin and in particular Dogecoin have made it workable to give donations and tips. Dogecoin started as a joke, but because people viewed it as a joke, they were willing to build easy and low security means of tipping people. The lack of value attached to Dogecoin meant people were more willing to play around with such approaches. Perhaps Bitcoin’s greatest flaw is that because its transactions are irrevocable, you must make the engine that spends them secure, and in turn, that makes it harder to use. Easy to spend means easy to lose, or easy to steal, and that’s a rule that’s hard to break. The credit card system, in order to be easy to spend, solves the problem of being easy to steal by allowing chargebacks or other human fixes when problems occur. While we can do better at making digital money easy to spend and not quite so easy to steal, it’s hard to figure out how to be perfect at that without something akin to chargebacks.
To monetize the web without advertising, we need a truly frictionless money. Advertising provides a money whose only friction is the annoyance of the advertising. To consume an ad-supported product you need do nothing but waste a little time. It’s a fairly passive thing. To consume a consumer-paid product, you must pay, and that creates three frictions:
The spending itself — though if it’s low that should be tolerable
The mental cost of thinking about the spending — which often exceeds the monetary cost on tiny transactions
The user interface cost of your means of payment.
You can’t eliminate #1 of course, but you can realize that the monetary cost is less than the negatives introduced by advertising. Eliminating #2 and #3 in a secure way is the challenge, and indeed it is the challenge which I devised the microrefund concept to address.
Will we pay the cost?
I think lots of people would pay $10/year for Facebook, particularly if alternatives also charged money. It’s a bargain at that price. But would people pay the $50 that Google makes from them? Again, I think Google is a bargain at that price, but for a lot of the world, that could be a lot of money, and that’s Google’s average revenue, not its revenue for me. (I click on ads so rarely that I think their revenue from me is actually a lot lower.)
I already bought my ticket on Iberia!
At the same time, Google’s ads are among the least painful. The ads on search are marked and isolated, and largely text based. The only really bad ads Google is doing are the ones in the valley of distraction in Adsense. As I wrote earlier, we are all constantly seeing ads for things we already bought.
And so, even though a Google search might only cost you a couple of pennies, I doubt we could move Google to payment supported even if we could remove all the friction from it.
This is not true for many other sites, though. Video sites would be a great target for frictionless payment, since showing a 30 second video ad to watch a 2 minute video is a terrible bargain, yet we see it happen frequently. There are many sites that do much worse than Google at monetizing themselves through advertising, and that would welcome a way to get more decent revenues via payment — though of course they can’t get greedy or the friction of the payment itself will reduce their business.
In addition, there are zillions of small sites and sites about topics of no commercial value who can’t make much money from advertising at all. Some of these sites probably don’t even exist because they can’t become going concerns in the current regime of monetizing the web — what fraction of the web are we missing because we have only one practical way to monetize it?
Last week’s Hugo Awards point of crisis caused a firestorm even outside the SF community. I felt it time to record some additional thoughts beyond the summary of many proposals I did.
It’s not about the politics
I think all sides have made an error by bringing the politics and personal faults of either side into the mix. Making it about the politics legitimises the underlying actions for some. As such, I want to remove that from the discussion as much as possible. That’s why in the prior post I proposed an alternate history.
What are the goals of the award?
Awards are funny beasts. They are almost all given out by societies. The Motion Picture Academy does the Oscars, and the Worldcons do the Hugos. The Hugos, though, are overtly a “fan” award (unlike the Nebulas which are a writer’s award, and the Oscars which are a Hollywood pro’s award.) They represent the view of fans who go to the Worldcons, but they have always been eager for more fans to join that community. But the award does not belong to the public, it belongs to that community.
While the award is done with voting and ballots, I believe it is really a measurement, which is to say, a survey. We want to measure the aggregate opinion of the community on what the best of the year was. The opinions are, of course, subjective, but the aggregate opinion is an objective fact, if we could learn it.
In particular, I would venture we wish to know which works would get the most support among fans, if the fans had the time to fairly judge all serious contenders. Of course, not everybody reads everything, and not everybody votes, so we can’t ever know that precisely, but if we did know it, it’s what we would want to give the award to.
To get closer to that, we use a 2 step process, beginning with a nomination ballot. Survey the community, and try to come up with a good estimate of the best contenders based on fan opinion. This both honours the nominees but more importantly it now gives the members the chance to more fully evaluate them and make a fair comparison. To help, in a process I began 22 years ago, the members get access to electronic versions of almost all the nominees, and a few months in which to evaluate them.
Then the final ballot is run, and if things have gone well, we’ve identified what truly is the best loved work of the informed and well-read fans. Understand again, the choices of the fans are opinions, but the result of the process is our best estimate of a fact — a fact about the opinions.
The process is designed to help obtain that winner, and there are several sub-goals:
The process should, of course, get as close to the truth as it can. In the end, the most people should feel it was the best choice.
The process should be fair, and appear to be fair
The process should be easy to participate in, administer and to understand
The process should not encourage any member to not express their true opinion on their ballot. If they lie on their ballot, how can we know the true best aggregate of their opinions?
As such, ballots should be generated independently, and there should be very little “strategy” to the system which encourages members to falsely represent their views to help one candidate over another.
It should encourage participation, and the number of nominees has to be small enough that it’s reasonable for people to fairly evaluate them all
A tall order, when we add a new element — people willing to abuse the rules to alter the results away from the true opinion of the fans. In this case, we had this through collusion. Two related parties published “slates” — the analog of political parties — and their followers carried them out, voting for most or all of the slate instead of voting their own independent and true opinion.
This corrupts the system greatly because when everybody else nominates independently, their nominations are broadly distributed among a large number of potential candidates. A group that colludes and concentrates their choices will easily dominate, even if it’s a small minority of the community. A survey of opinion becomes completely invalid if the respondents collude or don’t express their true views. Done in this way, I would go so far as to describe it as cheating, even though it is done within the context of the rules.
Proposals that are robust against collusion
Collusion is actually fairly obvious if the group is of decent size. Their efforts stick out clearly in a sea of broadly distributed independent nominations. There are algorithms which make it less powerful. There are other algorithms that effectively promote ballot concentration even among independent nominators so that the collusion is less useful.
A wide variety have been discussed. Their broad approaches include:
Systems that diminish the power of a nominating ballot as more of its choices are declared winners. Effectively, the more you get of what you asked for, the less likely you will get more of it. This mostly prevents a sweep of all nominations, and also increases diversity in the final result, even the true diversity of the independent nominators.
Systems which attempt to “maximize happiness,” which is to say try to make the most people pleased with the ballot by adding up for each person the fraction of their choices that won and maximizing that. This requires that nominators not all nominate 5 items, and makes a ballot with just one nomination quite strong. Similar systems allow putting weight on nominations to make some stronger than others.
Public voting, where people can see running tallies, and respond to collusion with their own counter-nominations.
Reduction of the number of nominations for each member, to stop sweeps.
The proposals work to varying degrees, but they all significantly increase the “strategy” component for an individual voter. It becomes the norm that if you have just a little information about what the most common popular choices will be, that your wisest course to get the ballot you want will be to deliberately remove certain works from your ballot.
Some members would ignore this and nominate honestly. Many, however, would read articles about strategy, and either practice it or wonder if they were doing the right thing. In addition to debates about collusion, there would be debates on how strategy affected the ballot.
Certain variants of multi-candidate STV help against collusion and have less strategy, but most of the methods proposed have a lot.
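To make the first approach in the list above concrete, here is a minimal sketch of one well-known member of that family, sequential proportional approval voting. It is a stand-in chosen for illustration, not any of the specific proposals discussed here. The assumption is that a ballot counts for 1/(1+k) once k of its choices have already been selected; the ballots and the function name are invented for the example.

```python
from fractions import Fraction

def sequential_reweighted_nominees(ballots, seats):
    """Pick `seats` nominees; a ballot counts 1/(1+k) once k of its picks have won."""
    candidates = sorted({work for ballot in ballots for work in ballot})
    winners = []
    for _ in range(seats):
        def score(work):
            total = Fraction(0)
            for ballot in ballots:
                if work in ballot:
                    already_won = sum(1 for w in ballot if w in winners)
                    total += Fraction(1, 1 + already_won)
            return total
        remaining = [c for c in candidates if c not in winners]
        if not remaining:
            break
        # max() keeps the first of equal scores, so ties go to alphabetical order.
        winners.append(max(remaining, key=score))
    return winners

# Toy data: 10 slate ballots naming the same five works, 15 independent ballots.
slate = [{"S1", "S2", "S3", "S4", "S5"}] * 10
independents = [{"A", "B"}] * 6 + [{"C", "D"}] * 5 + [{"E"}] * 4
nominees = sequential_reweighted_nominees(slate + independents, seats=5)
print(nominees)
```

With a plain count, the five slate works (10 nominations each) would sweep this toy category. With the reweighting, the slate places two works and the independents place three. The same example hints at the strategy problem: the single-work ballots for "E" keep their full weight to the end.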
In addition, all the systems permit at least one, and as many as 2 or 3 slate-choice nominees onto the final ballot. While members will probably know which ones those are, this is still not desired. First of all, these placements displace other works which would otherwise have made the ballot. You could increase the size of the final ballot, but then you would need to know how many slate choices will be on it.
It should be clear that when others do not collude, slate collusion is very powerful. In many political systems, it is actually considered a great result if a party with 20% of the voters gains 20% of the “victories.” Here, we have a situation with 2,000 nominators, and where just 100 colluding members can saturate some categories and get several entries into all of them, and with 10% (the likely amount in 2015) they can get a large fraction of them. As such it is not proportional representation at all.
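A rough simulation shows why concentration beats numbers. It uses the 2,000 nominators and 100 colluders mentioned above; everything else is an assumption made for illustration, in particular that each independent nominator picks 5 works uniformly at random from 500 plausible candidates. Real nominations are less evenly spread, so treat this as a sketch of the mechanism, not a model of any actual year.

```python
import random
from collections import Counter

def simulate_nominations(n_independent=1900, n_slate=100, n_works=500,
                         picks=5, seed=0):
    """Tally one category: independents scatter, slate voters all name the same works."""
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(n_independent):
        # Hypothetical independent nominator: 5 distinct works chosen at random.
        for work in rng.sample(range(n_works), picks):
            tally[f"work-{work}"] += 1
    for i in range(picks):
        # Every slate voter nominates the same five works.
        tally[f"slate-{i}"] += n_slate
    return tally

tally = simulate_nominations()
finalists = [name for name, _ in tally.most_common(5)]
print(finalists)
```

Under these assumptions each independent work averages about 19 nominations, while each slate work gets exactly 100, so the 5% of nominators who collude take every slot.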
Fighting human attackers with human defence
After considering the risks of confusion and strategy with all these systems, I have been led to the conclusion that the only solid response to organized attackers on the nomination system is a system of human judgement. Instead of hard and fast voting rules, the time has come, regrettably, to have people judge if the system is under attack, and give them the power to fix it.
This is hardly anything new, it’s how almost all systems of governance work. It may be a hubris to suggest the award can get by without it. Like the good systems of governance this must be done with impartiality, transparency and accountability, but it must be done.
I see a few variants which could be used. Enforcement would most probably be done by the Hugo Committee, which is normally a special subcommittee of the group running the Worldcon. However, it need not be them, and could be a different subcommittee, or an elected body.
While some of the variants I describe below add complexity, it is not necessary to do them. One important thing about the rule of justice is that you don’t have to get it exactly precise. You get it in broad strokes and you trust people. Sometimes it fails. Mostly it works, unless you bring in the wrong incentives.
As such, some of these proposals work by not changing almost anything about the “user experience” of the system. You can do this with people nominating and voting as they always did, and relying on human vigilance to deflect attacks. You can also use the humans for more than that.
A broad rule against collusion and other clear ethical violations
The rule could be as broad as to prohibit “any actions which clearly compromise the honesty and independence of ballots.” There would be some clarifications, to indicate this does not forbid ordinary lobbying and promotion, but does prohibit collusion, vote buying, paying for memberships which vote as you instruct and similar actions. The examples would not draw hard lines, but give guidance.
Explicit rules about specific acts
The rule could be much more explicit, with less discretion, with specific unethical acts. It turns out that collusion can be detected by the appearance of patterns in the ballots which are extremely unlikely to occur in a proper independent sample. You don’t even need to know who was involved or prove that anybody agreed to any particular conspiracy.
The big challenge with explicit rules (which take 2 years to change) is that clever human attackers can find holes, and exploit them, and you can’t fix it then, or in the next year.
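As a very rough illustration of what a ballot pattern can look like, the sketch below counts nominating ballots that are identical in every choice. The function name and both thresholds are hypothetical, and a real test would have to estimate how likely each repeat is under independent nomination and also catch near-identical ballots. This only shows the simplest signal.

```python
from collections import Counter

def repeated_ballots(ballots, min_size=4, min_repeats=3):
    """Return nearly full ballots that occur at least `min_repeats` times, ignoring order."""
    counts = Counter(frozenset(b) for b in ballots if len(b) >= min_size)
    return {tuple(sorted(b)): n for b, n in counts.items() if n >= min_repeats}

# Invented ballots for one category.
ballots = [
    ["A", "B", "C", "D", "E"],
    ["A", "F", "G", "H", "I"],
    ["S1", "S2", "S3", "S4", "S5"],
    ["S5", "S4", "S3", "S2", "S1"],
    ["S1", "S2", "S3", "S4", "S5"],
    ["B", "C"],
]
flagged = repeated_ballots(ballots)
print(flagged)
```

Short ballots are skipped because two people independently nominating the same one or two popular works is unremarkable. Note that the check needs no names and no proof of an agreement, only the ballots.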
Delegation of nominating power or judicial power to a sub group elected by the members
Judicial power to fix problems with a ballot could fall to a committee chosen by members. This group would be chosen by a well established voting system, similar to those discussed for the nomination. Here, proportional representation makes sense, so if a group is 10% of the members it should have 10% of this committee. It won’t do it much good, though, if the others all oppose them. Unlike books, the delegates would be human beings, able to learn and reason. With 2,000 members, and 50 members per delegate, there would be 40 on the judicial committee, and it could probably be trusted to act fairly with that many people. In addition, action could require some sort of supermajority. If a 2/3 supermajority were needed, attackers would need to be 1/3 of all members.
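The committee arithmetic can be checked in a few lines. The member count, the members-per-delegate ratio, the 10% bloc and the 2/3 threshold come from the paragraph above; the variable names and the integer rounding are assumptions of this sketch.

```python
import math

members = 2000
members_per_delegate = 50
seats = members // members_per_delegate        # size of the judicial committee

bloc_percent = 10                              # a group with 10% of the members
bloc_seats = seats * bloc_percent // 100       # its proportional share of seats

needed_to_act = math.ceil(2 * seats / 3)       # votes needed for a 2/3 supermajority
needed_to_block = seats - needed_to_act + 1    # smallest bloc that can stop an action

print(seats, bloc_seats, needed_to_act, needed_to_block)
```

On these numbers a 10% group holds 4 of 40 seats, action takes 27 votes, and a bloc needs 14 seats (just over a third) to block a ruling on its own.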
This council could perhaps be given only the power to add nominations — beyond the normal fixed count — and not to remove them. Thus if there are inappropriate nominations, they could only express their opinion on that, and leave it to the voters what to do with those candidates, including not reading them and not ranking them.
Instead of judicial power, it might be simpler to appoint pure nominating power to delegates. Collusion is useless here because in effect all members are now colluding about their different interests, but in an honest way. Unlike pure direct democracy, the delegates, not unlike an award jury, would be expected to listen to members (and even look at nominating ballots done by them) but charged with coming up with the best consensus on the goal stated above. Such jurors would not simply vote their preferences. They would swear to attempt to examine as many works as possible in their efforts. They would suggest works to others and expect them to be likely to look at them. They would expect to be heavily lobbied and promoted to, but as long as it’s pure speech (no bribes other than free books and perhaps some nice parties) they would be expected to not be fooled so easily by such efforts.
As above, a nominating body might also only start with a member nominating system and add candidates to it and express rulings about why. In many awards, the primary function of the award jury is not to bypass the membership ballot, but to add one or two works that were obscure and the members may have missed. This is not a bad function, so long as the “real ballot” (the one you feel a duty to evaluate) is not too large.
Transparency and accountability
There is one barrier to transparency, in that releasing preliminary results biases the electorate in the final ballot, which would remain a direct survey of members with no intermediaries — though still the potential to look for attacks and corruption. There could also be auditors, who are barred from voting in the awards and are allowed to see all that goes on. Auditors might be people from the prior worldcon or some other different source, or fans chosen at random.
Finally, decisions could be appealed to the business meeting. This requires a business meeting after the Hugos. Attackers would probably always appeal any ruling against them. Appeals can’t alter nominations, obviously, or restore candidates who were eliminated.
All the above requires the two year ratification process and could not come into effect (mostly) until 2017. To deal with the current cheating and the promised cheating in 2016, the following are recommended.
Downplay the 2015 Hugo Award, perhaps with sufficient fans supporting this that all categories (including untainted ones) have no award given.
Conduct a parallel award under a new system, and fête it like the Hugos, though they would not use that name.
Pass new proposed rules including a special rule for 2016
If 2016’s award is also compromised, do the same. However, at the 2016 business meeting, ratify a short-term amendment proposed in 2015 declaring the alternate awards to be the Hugo awards if run under the new rules, and discarding the uncounted results of the 2016 Hugos conducted under the old system. Another amendment would permit winners of the 2015 alternate award to say they are Hugo winners.
If the attackers gave up, and 2016’s awards run normally, do not ratify the emergency plan, and instead ratify the new system that is robust against attack for use in 2017.
Since 1992 I have had a long association with the Hugo Awards for SF & Fantasy given by the World Science Fiction Society/Convention. In 1993 I published the Hugo and Nebula Anthology which was for some time the largest anthology of current fiction ever published, and one of the earliest major e-book projects. While I did it as a commercial venture, in the years to come it became the norm for the award organizers to publish an electronic anthology of willing nominees for free to the voters.
This year, things are highly controversial, because a group of fans/editors/writers calling themselves the “Sad Puppies,” had great success with a campaign to dominate the nominations for the awards. They published a slate of recommended nominations and a sufficient number of people sent in nominating ballots with that slate so that it dominated most of the award categories. Some categories are entirely the slate, only one was not affected.
It’s important to understand the nominating and voting on the Hugos is done by members of the World SF Society, which is to say people who attend the World SF Convention (Worldcon) or who purchase special “supporting” memberships which don’t let you go but give you voting rights. This is a self-selected group, but in spite of that, it has mostly managed to run a reasonably independent vote to select the greatest works of the year. The group is not large, and in many categories, it can take only a score or two of nominations to make the ballot, and victory margins are often small. As such, it’s always been possible, and not even particularly hard, to subvert the process with any concerted effort. It’s even possible to do it with money, because you can just buy memberships which can nominate or vote, so long as a real unique person is behind each ballot.
The nominating group is self-selected, but it’s mostly a group that joins because they care about SF and its fandom, and as such, this keeps the award voting more independent than you would expect for a self-selected group. But this has changed.
The reasoning behind the Sad Puppy effort is complex and there is much contentious debate you can find on the web, and I’m about to get into some inside baseball, so if you don’t care about the Hugos, or the social dynamics of awards and conventions, you may want to skip this post.
I’m sure you’ve seen it. Shop for something and pretty quickly, half the ads you see on the web relate to that thing. And you keep seeing those ads, even after you have made your purchase, sometimes for weeks on end.
At first blush, it makes sense: the whole reason the ad companies (like Google and the rest) want to track more about us is to deliver ads that target our interests. The obvious value is in making advertising effective for advertisers, but it’s also argued that web surfers derive more value from ads that might interest them than we do from generic ads with little relevance to our lives. It’s one of the reasons that text ads on search have been such a success.
Anything in the ad industry worth doing seems to them to be worth overdoing, I fear, and I think this is backfiring. That’s because the ads that pop up for products I have already bought are both completely useless and much more annoying than generic ads. They are annoying because they distract my attention too well — I have been thinking about those products, I may be holding them in my hands, so of course my eyes are drawn to photos of things like what I just bought.
I already bought my ticket on Iberia!
This extends beyond the web. Woe to me for searching for hotel rooms and flights these days. I am bombarded after this with not just ads but emails wanting to make sure I had gotten a room or other travel service. They accept that if I book a flight, I don’t need another flight but surely need a room, but of course quite often I don’t need a room and may not even be shopping for one. It’s way worse than the typical spam. I’ve seen ads for travel services a month after I took the trip.
Yes, that Iberia ad I screen captured on the right is the ad showing to me on my own blog — 5 days after I booked a trip to Spain on USAir that uses Iberia as a codeshare. (Come see me at the Singularity Summit in Sevilla on March 12-14!)
I am not sure how to solve this. I am not really interested in telling the ad engines what I have done to make them go away. That’s more annoyance, and gives them even more information just to be rid of another annoyance.
It does make us wonder — what is advertising like if it gets really, really good? I mean good beyond the ads John Anderton sees in Minority Report as he walks past the billboards. What if every ad is actually about something you want to buy? It will be much more effective for advertisers of course, but will that cause them to cut back on the ads to reduce the brain bandwidth it takes from us? Would companies like Google say, “Hey, we are making a $200 CPM here, so let’s only run ads 1/10th of the time that we did when we made a $20 CPM?” Somehow I doubt it.
Over the past 14 years, there has been only one constant in my TV viewing, and that’s The Daily Show. I first loved it with Craig Kilborn, and even more under Jon Stewart. I’ve seen almost all of them, even after going away for a few weeks, because when you drop the interview and commercials, it’s a pretty quick play. Jon Stewart’s decision to leave got a much stronger reaction from me than any other TV show news, though I think the show will survive.
I don’t know how many viewers are like me, but I think that TDS is one of the most commercially valuable programs on TV. It is the primary reason I have not “cut the cord” (or rather turned off the satellite.) I want to get it in HD, with the ability to skip commercials, at 8pm on the night that it was made. No other show I watch regularly meets this test. I turned off my last network show last year — I had been continuing to watch just the “Weekend Update” part of SNL along with 1 or 2 sketches. It always surprised me that the Daily Show team could produce a better satirical newscast than the SNL writers, even though SNL’s team had more money and a whole week to produce much less material.
The reason I call it that valuable is that by and large, I am paying $45/month for satellite primarily to get that show. Sure, I watch other shows, but in a pinch, I would be willing to watch these other shows much later through other channels, like Netflix, DVD or online video stores at reasonable prices. I want the Daily Show as soon as I can, which is 8pm on the west coast. On the east coast, the 11pm arrival is a bit late.
I could watch it on their web site, but that’s the next day, and with forced watching of commercials. My time is too valuable to me to watch commercials — I would much rather pay to see it without them. (As I have pointed out there, you receive around $1-$2 in value for every hour of commercials you watch on regular TV, though the online edition only plays 4 ads instead of the more typical 12-15 of broadcast that I never see.)
In the early days at BitTorrent when we were trying to run a video store, I really wanted us to do a deal with Viacom/Comedy Central/TDS. In my plan, they would release the show to us (in HD before the cable systems moved to HD) as soon as possible (ie. before 11pm Eastern) and with unbleeped audio and no commercials. In other words, a superior product. I felt we could offer them more revenue per pay subscriber than they were getting from advertising. That’s because the typical half-hour show only brings in around 15 cents per broadcast viewer, presuming a $10 CPM. They were not interested, in part because some people didn’t want to go online, or had a bad view of BitTorrent (though the company that makes the software is not involved in any copyright infringement done with the tools.)
It may also have been that they knew some of that true value. Viacom requires cable and satellite companies to buy a bundle of channels from them, even though the channels show ads. Evidence suggests that the bundle of Viacom channels (including Comedy Central, MTV and Nickelodeon) costs around $2.80 per household per month. While there are people like me who watch only Comedy Central from the Viacom bundle, most people probably watch 2 or more of them. They should be happy to get $5/month from a single household for a single show, but they are very committed to the bundling, and the cable companies, who don’t like the bundles, would get upset if Viacom sold individual shows like this and cable subscribers cut the cord.
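The revenue comparison behind these figures is simple arithmetic. The $10 CPM, the roughly 15 cents per viewer per episode, the $5/month offer and the $2.80 bundle price are taken from the text above. The 15 ads per episode and 16 episodes per month are assumptions of this sketch (four nights a week for four weeks), as are the variable names.

```python
cpm_dollars = 10            # advertiser pays $10 per thousand viewers of one ad
ads_per_episode = 15        # assumed ad count for a broadcast half-hour
episodes_per_month = 16     # assumed: four nights a week, four weeks

# $10 per 1,000 impressions is 1 cent per viewer per ad.
cents_per_impression = cpm_dollars * 100 // 1000
ad_cents_per_episode = ads_per_episode * cents_per_impression
ad_cents_per_month = ad_cents_per_episode * episodes_per_month

subscription_cents_per_month = 500   # the $5/month a single household might pay
bundle_cents_per_month = 280         # the whole Viacom bundle, per household

print(ad_cents_per_episode, ad_cents_per_month)
print(subscription_cents_per_month - ad_cents_per_month)
```

Under those assumptions, advertising on this one show is worth about $2.40 per viewer per month, so a $5 subscription would roughly double it, and would also exceed what the entire bundle brings in per household.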
In spite of this, I think the cord cutting and unbundling are inevitable. The forces are too strong. Dish Network’s supposedly bold venture with Sling, which provides 20 channels of medium popularity for $20/month over the internet, only offers live streaming — no time-shifting, no fast forwarding — so it’s a completely uninteresting product to me.
As much as I love Jon Stewart, I think The Daily Show will survive his transition just fine. That’s because it was actually pretty funny with Craig Kilborn. Stewart improved it, but he is just one part of a group of writers, producers and other on-air talent, including those who came from a revolving door with The Onion. There are other folks who can pull it off.
TDS is available a day late on Amazon Instant Video and next day on Google Play — for $3/episode, or almost $50/month. You can get cable for a lot less than that. It’s on iTunes for $2/episode or $10/month, and the latter price is reasonable. Does anybody know when it gets released? The price difference is rather large.
When Southwest started using tablets for in-flight entertainment, I lauded it. Everybody has been baffled by just how incredibly poor most in-flight video systems are. They tend to be very slow, with poor interfaces and low resolution screens. Even today it’s common to face a small widescreen that takes a widescreen film, letterboxes it and then pillarboxes it, with only an option to stretch it and make it look wrong. All this driven by a very large box in somebody’s footwell.
I found out one reason why these systems are so outdated. Apparently, all seatback screens have to be safety tested, to make sure that if you are launched forward and hit your head on the screen, it is not more dangerous than it needs to be. Such testing takes time and money, so these systems are only updated every 10 years. The process of redesigning, testing and installing takes long enough that it’s pretty sure the IFE system will seem like a dinosaur compared to your phone or tablet.
One airline is planning to just safety test a plastic case for the seatback into which they can insert different panels as they develop. Other airlines are moving to tablets, or providing you movies on your own tablet, though primarily they have fallen into the Apple walled garden and are doing it only for the iPad.
The natural desire is just to forget the airline system and bring your own choice of entertainment on your own tablet. This is magnified by the hugely annoying system which freezes the IFE system on every announcement. Not just the safety announcements. Not just the announcements in your language, but also the announcement that duty free shopping has begun in English, French and Chinese. While a few airlines let you start your movie right after boarding, you don’t want to do it, as you will get so many interruptions until the flight levels off that it will drive you crazy. The airline provided tablet services also do this interruption, so your own tablet is better.
In the further interests of safety, new rules insist you can only use the airline’s earbud headphones during takeoff and landing, not your nice noise-cancelling headphones. But you didn’t pick up earbuds since you have the nicer ones. The theory is, your nice headphones might make you miss a safety announcement when landing, even though they tend to block background noise and actually make speech clearer.
One of the better IFE systems is the one on Emirates. This one, I am told, knows who you are, and if you pause a show on one flight, it picks up there on your next flight. (Compare that to so many systems that often forget where you were in the film on the same flight, and also don’t warn you if you won’t be able to finish the movie before the system is turned off.)
Using your own tablet
It turns out to be no picnic using your own tablet.
You have to remember to pre-load the video, of course
You have to pay for it, which is annoying if:
The airline is already paying for it and providing it free in the IFE
You have it on netflix/etc. and could watch it at home at no cost
You wish to start a movie one day and finish it on another flight, but don’t want to pay to “own” the movie. (Because of this I mostly watch TV shows, which only have a $3 “own” price and no rental price.)
How to fix this:
IFE systems should know who I am, know my language, know if I have already seen the safety briefing, and not interrupt me for anything but new or plane-specific safety announcements in my chosen language.
Like the Emirates systems, they should know where I am in each movie, as well as my tastes.
How to know the language of the announcement? Well, you could have a button for the FA to push, but today software is able to figure out the language pretty reliably, so an automated system could learn the languages and the order in which they are done on that flight. Software could also spot phrases like “Safety announcement” at the start of a public address, or there could be a button.
Netflix should, like many other services, allow you to cache material for offline viewing. The material can have an expiration date, and the software can check when it’s online to update those dates, if you are really paranoid about people using the cache as a way to watch stuff after it leaves Netflix. Reportedly Amazon does this on the Kindle Fire.
Online video stores (iTunes, Google Play, etc.) should offer a “plane rental” which allows you to finish a movie after the day you start it. In fact, why not have that ability for a week or two on all rentals? It would not let you restart, only let you watch material you have not yet viewed, plus perhaps a minute ahead of that.
Perhaps I am greedy, but it would be nice if you could do a rental that lets 2 or more people in a household watch independently, so I watch it on my flight and she watches it on hers.
If necessary, noise-cancelling headphones should have a “landing mode” that mixes in more outside sound, and a little airplane icon on them, so that we can keep them on during takeoff and landing. Or get rid of this pretty silly rule.
Choosing your film
There’s a lot of variance in the quality of in-flight films. Air Canada seems particularly good at choosing turkeys. Before they close the doors, I look up movies — if I can get the IFE system to work with all the announcements — in review sites to figure out what to watch. In November, at Dublin Web Summit, I met the developers of a travel app called Quicket, which specialized in having its resources offline. I suggested they include ratings for the movies on each flight — the airlines publish their catalog in advance — in the offline data, and in December they had implemented it. Great job, Quicket.
Almost everybody has a 1080p HD camera with them — almost all phones and pocket cameras do this. HD looks great but the future’s video displays will do 4K, 8K and full eye-resolution VR, and so our video today will look blurry the way old NTSC video looks blurry to us. In a bizarre twist, in the middle of the 20th century, everything was shot on film at a resolution comparable to HD. But from the 70s to 90s our TV shows were shot on NTSC tape, and thus dropped in resolution. That’s why you can watch Star Trek in high-def but not “The Wire.”
I predict that complex software in the future will be able to do a very good job of increasing the resolution of video. One way it will do this is through making full 3-D models of things in the scene using data from the video and elsewhere, and re-rendering at higher resolution. Another way it will do this is to take advantage of the “sub-pixel” resolution techniques you can do with video. One video frame only has the pixels it has, but as the camera moves or things move in a shot, we get multiple frames that tell us more information. If the camera moves half a pixel, you suddenly have a lot more detail. Over lots of frames you can gather even more.
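To make the half-pixel point concrete, here is a toy one-dimensional sketch in Python. It assumes we already know the camera shifted by exactly half a low-res pixel between two frames (real software would have to estimate that shift), and the names `sample` and `interleave` are made up for illustration. It is not how a real super-resolution system works in detail; it only shows where the extra information comes from.

```python
def sample(scene, offset, step=2):
    """Make a low-res 'frame': each pixel averages `step` fine samples.

    `offset` is where the first pixel starts, in fine samples, so an
    offset of 1 with step 2 is a half-pixel camera shift.
    """
    return [sum(scene[i:i + step]) / step
            for i in range(offset, len(scene) - step + 1, step)]


def interleave(frame_a, frame_b):
    """Merge two frames taken half a pixel apart into one denser sequence."""
    merged = []
    for a, b in zip(frame_a, frame_b):
        merged.extend([a, b])
    return merged


# A sharp edge that falls in the middle of a low-res pixel.
scene = [0, 0, 0, 1, 1, 1, 1, 1]
frame_a = sample(scene, offset=0)   # [0.0, 0.5, 1.0, 1.0]
frame_b = sample(scene, offset=1)   # [0.0, 1.0, 1.0]
print(interleave(frame_a, frame_b)) # [0.0, 0.0, 0.5, 1.0, 1.0, 1.0]
```

Frame A alone only says the edge is somewhere inside its second pixel (the 0.5). Frame B, shifted by half a pixel, shows a clean jump from 0 to 1, which pins the edge down to a position that neither frame's pixel grid could express on its own.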
This will already happen with today’s videos, but what if we help them out? For example, if you have still photographs of the things in the video, this will allow clever software to fill in more detail. At first, it will look strange, but eventually the uncanny valley will be crossed and it will just look sharp. Today I suspect most people shooting video on still cameras also shoot some stills, so this will help, but there’s not quite enough information if things are moving quickly, or new sides of objects are exposed. A still of your friend can help render them in high-res in a video, but not if they turn around. For that the software just has to guess.
We might improve this process by designing video systems that capture high-res still frames as often as they can and embed them in the video. Storage is cheap, so why not?
A typical digital video/still camera has 16 to 20 million pixels today. When it shoots 1080p HD video, it combines those pixels together, so that there are 6 to 10 still pixels going into every video pixel. Ideally this is done by hardware right in the imaging chip, but it can also be done to a lesser extent in software. A few cameras already shoot 4K, and this will become common in the next couple of years. In this case, they may just use the pixels one for one, since it’s not so easy to map a 16 megapixel 3:2 still array into a 16:9 8 megapixel 4K image. You can’t just combine 2 pixels per pixel.
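The arithmetic behind those ratios can be sketched in a few lines of Python. This is a rough back-of-the-envelope calculation, assuming the video uses the full width of a 3:2 sensor and crops the height to 16:9; the function name and parameters are mine, not anything a camera maker publishes.

```python
from fractions import Fraction


def stills_per_video_pixel(sensor_megapixels, video_width, video_height,
                           sensor_aspect=Fraction(3, 2)):
    """How many sensor pixels feed each video pixel, after cropping."""
    video_aspect = Fraction(video_width, video_height)
    # A wider video frame uses the full sensor width but only part of its height.
    used_fraction = min(Fraction(1), sensor_aspect / video_aspect)
    used_pixels = sensor_megapixels * 1_000_000 * used_fraction
    return float(used_pixels / (video_width * video_height))


for megapixels in (16, 20):
    ratio = stills_per_video_pixel(megapixels, 1920, 1080)
    print(f"{megapixels} MP sensor, 1080p: {ratio:.1f} sensor pixels per video pixel")

ratio_4k = stills_per_video_pixel(16, 3840, 2160)
print(f"16 MP sensor, 4K: {ratio_4k:.1f} sensor pixels per video pixel")
```

Under these assumptions the 1080p case comes out at roughly 6.5 to 8 sensor pixels per video pixel, and the 4K case at about 1.6, which is the awkward in-between number that makes clean pixel combining hard.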
Most still cameras won’t shoot a full-resolution video (ie. a 6K or 8K video) for several reasons:
As designed, you simply can’t pull that much data off the chip per unit time. It’s a huge amount of data. Even with today’s cheap storage, it’s also a lot to store.
Still camera systems tend to compress jpegs, but you want a video compression algorithm to record a video even if you can afford the storage for that.
Nobody has displays to display 6K or 8K video, and only a few people have 4K displays — though this will change — so demand is not high enough to justify these costs
When you combine pixels, you get less noise and can shoot in lower light. That’s why your camera can make a decent night-time video without blurring, but it can’t shoot a decent still in that lighting.
What is possible is a sensor which is able to record video (at the desired 30fps or 60fps rate) and also pull off full-resolution stills at some lower frame rate, as long as the scene is bright enough. That frame rate might be something like 5 or even 10 fps as cameras get better. In addition, hardware compression would combine the stills and the video frames to eliminate the great redundancy, though only to a limited extent because our purpose is to save information for the future.
Thus, if we hand the software of the future an HD video along with 3 to 5 frames/second of 16 megapixel stills, I am confident it will be able to make a very decent 4K video from it most of the time, and often a decent 6K or 8K video. As noted, a lot of that can happen even without the stills, but they will just improve the situation. Those situations where it can’t (fast-changing objects) are also situations where video gets blurred and we are tolerant of lower resolution.
It’s a bit harder if you are already shooting 4K. To do this well, we might like a 38 megapixel still sensor, with 4 pixels for every pixel in the video. That’s the cutting edge in high-end consumer gear today, and will get easier to buy, but we now run into the limitations of our lenses. Most lenses can’t deliver 38 million pixels — not even many of the high-end professional photographer lenses can do that. So it might not deliver that complete 8K experience, but it will get a lot closer than you can from an “ordinary” 4K video.
If you haven’t seen 8K video, it’s amazing. Sharp has been showing their one-of-a-kind 8K video display at CES for a few years. It looks much more realistic than 3D videos of lower resolution. 8K video can subtend over 100 degrees of viewing angle at one pixel per minute of arc, which is about the resolution of the sensors in your eye. (Not quite, as your eye also does sub-pixel tricks!) At 60 degrees — which is more than any TV is set up to subtend — it’s the full resolution of your eyes, and provides an actual limit on what we’re likely to want in a display.
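Those angle figures are simple to check. Here is a small Python sketch, assuming a plain linear relationship between pixels and viewing angle (which ignores the geometry of a flat screen at wide angles); the function names are just for illustration.

```python
ARCMIN_PER_DEGREE = 60


def degrees_covered(pixels_wide, arcmin_per_pixel=1.0):
    """Viewing angle a display can fill at a given angular pixel size."""
    return pixels_wide * arcmin_per_pixel / ARCMIN_PER_DEGREE


def pixels_per_arcmin(pixels_wide, field_of_view_degrees):
    """Pixel density when a display fills a given viewing angle."""
    return pixels_wide / (field_of_view_degrees * ARCMIN_PER_DEGREE)


print(degrees_covered(7680))         # 128.0 degrees at one pixel per arcminute
print(pixels_per_arcmin(7680, 60))   # a little over 2 pixels per arcminute
```

So an 8K-wide image covers about 128 degrees at one pixel per minute of arc, and at 60 degrees it delivers a bit more than two pixels per minute of arc, which leaves room for the eye's sub-pixel tricks.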
And we could be shooting video for that future display today, before the technology to shoot that video natively exists.
Recently I tried the Facebook/Oculus Rift Crescent Bay prototype. It has more resolution (I will guess 1280 x 1600 per eye or similar) and runs at 90 frames/second. It also has better head tracking, so you can walk around a small space with some realism, but only a very small space. Still, it was much more impressive than the DK2 and a sign of where things are going. I could still see a faint screen door effect, and they were annoyed that I could see it.
We still have a lot of resolution gain left to go. The human eye sees about a minute of arc, which means about 5,000 pixels for a 90 degree field of view. Since we have some ability for sub-pixel resolution, it might be suggested that 10,000 pixels of width is needed to reproduce the world. But that’s not that many Moore’s law generations from where we are today. The graphics rendering problem is harder, though with high frame rates, if you can track the eyes, you need only render full resolution where the fovea of the eye is. This actually gives a boost to onto-the-eye systems like a contact lens projector or the rumoured Magic Leap technology which may project with lasers onto the retina, as they actually need to render far fewer pixels. (Get really clever, and realize the optic nerve only has about 600,000 neurons, and in theory you can get full real-world resolution with half a megapixel if you do it right.)
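To get a feel for how much foveated rendering could save, here is a rough Python sketch. The specific numbers are my own assumptions for illustration only: a square 90 degree field of view, a generous 10 degree sharp region around the gaze point at one pixel per arcminute, and everything else rendered at 6 arcminutes per pixel.

```python
def pixel_budget(fov_deg, arcmin_per_pixel):
    """Pixels needed for a square field of view at a given angular pixel size."""
    side = fov_deg * 60 / arcmin_per_pixel
    return side * side


def foveated_budget(fov_deg=90, inset_deg=10, periphery_arcmin_per_pixel=6):
    """Sharp inset where the eye is looking, plus a coarse full-field layer."""
    sharp = pixel_budget(inset_deg, 1)
    coarse = pixel_budget(fov_deg, periphery_arcmin_per_pixel)
    return sharp + coarse


full = pixel_budget(90, 1)
foveated = foveated_budget()
print(f"full resolution everywhere: {full / 1e6:.1f} megapixels")
print(f"foveated (assumed numbers): {foveated / 1e6:.2f} megapixels")
print(f"savings: about {full / foveated:.0f}x")
```

With these made-up parameters the full-resolution frame is about 29 megapixels and the foveated one is a little over 1 megapixel. The exact ratio depends entirely on the assumed inset size and peripheral resolution, but the order of magnitude is the point.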
Walking around Rome, I realized something else — we are now digitizing our world, at least the popular outdoor spaces, at a very high resolution. That’s because millions of tourists are taking billions of pictures every day of everything from every angle, in every lighting. Software of the future will be able to produce very accurate 3D representations of all these spaces, both with real data and reasonably interpolated data. They will use our photographs today and the better photographs tomorrow to produce a highly accurate version of our world today.
This means that anybody in the future will be able to take a highly realistic walk around the early 21st century version of almost everything. Even many interiors will be captured in smaller numbers of photos. Only things that are normally covered or hidden will not be recorded, but in most cases it should be possible to figure out what was there. This will be trivial for fairly permanent things, like the ruins in Rome, but even possible for things that changed from day to day in our highly photographed world. A bit of AI will be able to turn the people in photos into 3-D animated models that can move within these VRs.
It will also be possible to extend this VR back into the past. The 20th century, before the advent of the digital camera, was not nearly so photographed, but it was still photographed quite a lot. For persistent things, the combination of modern (and future) recordings with older, less frequent and lower resolution recordings should still allow the creation of a fairly accurate model. The further back in time we go, the more interpolation and eventually artistic interpretation you will need, but very realistic seeming experiences will be possible. Even some of the 19th century should be doable, at least in some areas.
This is a good thing, because as I have written, the world’s tourist destinations are unable to bear the brunt of the rising middle class. As the Chinese, Indians and other nations get richer and begin to tour the world, their greater numbers will overcrowd those destinations even more than the waves of Americans, Germans and Japanese that already mobbed them in the 20th century. Indeed, with walking chairs (successors of the BigDog Robot) every spot will be accessible to everybody of any level of physical ability.
VR offers one answer to this. In VR, people will visit such places and get the views and the sounds — and perhaps even the smells. They will get a view captured at the perfect time in the perfect light, perhaps while the location is closed for digitization and thus empty of crowds. It might be, in many ways, a superior experience. That experience might satisfy people, though some might find themselves more driven to visit the real thing.
In the future, everybody will have had a chance to visit all the world’s great sites in VR while they are young. In fact, doing so might take no more than a few weekends, changing the nature of tourism greatly. This doesn’t alter the demand for the other half of tourism — true experience of the culture, eating the food, interacting with the locals and making friends. But so much commercial tourism — people being herded in tour groups to major sites and museums, then eating at tour-group restaurants — can be replaced.
I expect VR to reproduce the sights and sounds and a few other things. Special rooms could also reproduce winds and even some movement (for example, the feeling of being on a ship.) Right now, walking is harder to reproduce. With the OR Crescent Bay you could only walk 2-3 feet, but one could imagine warehouse size spaces or even outdoor stadia where large amounts of real walking might be possible if the simulated surface is also flat. Simulating walking over rough surfaces and stairs offers real challenges. I have tried systems where you walk inside a sphere but they don’t yet quite do it for me. I’ve also seen a system where you are held in place and move your feet in slippery socks on a smooth surface. Fun, but not quite there. Your body knows when it is staying in one place, at least for now. Touching other things in a realistic way would require a very involved robotic system — not impossible, but quite difficult.
Also interesting will be immersive augmented reality. There are a few approaches I know of that people are developing:
With a VR headset, bring in the real world with cameras, modify it and present that view to the screens, so they are seeing the world through the headset. This provides a complete image, but the real world is reduced significantly in quality, at least for now, and latency must be extremely low.
With a semi-transparent screen, show the augmentation with the real world behind it. This is very difficult outdoors, and you can’t really stop bright items from the background mixing with your augmentation. Focus depth is an issue here (and is with most other systems.) In some plans, the screens have LCDs that can go opaque to block the background where an augmentation is being placed.
CastAR has you place retroreflective cloth in your environment, and it can present objects on that cloth. They do not blend with the existing reality, but replace it where the cloth is.
Projecting into the eye with lasers from glasses, or on a contact lens can be brighter than the outside world, but again you can’t really paint over the bright objects in your environment.
Getting back to Rome, my goal would be to create an augmented reality that let you walk around ancient Rome, seeing the buildings as they were. The people around you would be converted to Romans, and the modern roads and buildings would be turned into areas you can’t enter (since we don’t want to see the cars, and turning them into fast chariots would look silly.) There have been attempts to create a virtual walk through ancient Rome, but being able to do it in the real location would be very cool.
The Olympics are coming up, and I have a request for you, NBC Sports. It’s the 21st century, and media technologies have changed a lot. It’s not just the old TV of the 1900s.
Each time, you broadcast the opening ceremony, which is always huge, expensive and spectacular. But your judgment is that we need running commentary, even when music is playing or especially poignant moments are playing out. OK, I get that, perhaps a majority of the audience wants and needs that commentary. Another part of the audience would rather see the ceremony as is, with minimal commentary.
This being the 21st century, you don’t have to choose only one. Almost every TV out there now supports multiple audio channels, either via the SAP channel (where it still exists) or, more likely, through the multiple audio channels of digital TV. They all support multiple channels of captions, too.
So please give us the audio without your announcers on one of the alternate audio channels. Give us their commentary on a caption channel, so if we want to read it without interfering with the music, we can read it.
If you like, do a channel where the commentary is only on the left channel. Clever viewers can then mix the commentary at whatever volume they like using the balance control. Sure, you lose stereo, but this is much more valuable.
I know you might take this as an insult. You work hard on your coverage and hire good people to do it. And so do it — but give your viewers the choice when the live audio track is an important part of the event, as it is for the opening and closing ceremonies, medal ceremonies and a few other events.
The blogging world was stunned by the recent announcement by Google that it will be shutting down Google Reader later this year. Due to my consulting relationship with Google I won’t comment too much on their reasoning, though I will note that I believe it’s possible the majority of regular readers of this blog, and many others, come via Google Reader so this shutdown has a potentially large effect here. Of particular note is Google’s statement that usage of Reader has been in decline, and that social media platforms have become the way to reach readers.
The effectiveness of those platforms is strong. I have certainly noticed that when I make blog posts and put up updates about them on Google Plus and Facebook, it is common that more people will comment on the social network than comment here on the blog. It’s easy, and indeed more social. People tend to comment in the community in which they encounter an article, even though in theory the most visibility should be at the root article, where people go from all origins.
However, I want to talk a bit about online publishing history, including USENET and RSS, and the importance of concepts within them. In 2004 I first commented on the idea of serial vs. browsed media, and later expanded this taxonomy to include sampled media such as Twitter and social media in the mix. I now identify the following important elements of an online medium:
Is it browsed, serial or to be sampled?
Is there a core concept of new messages vs. already-read messages?
If serial or sampled, is it presented in chronological order or sorted by some metric of importance?
Is it designed to make it easy to write and post or easy to read and consume?
Online media began with E-mail and the mailing list in the 60s and 70s, with the 70s seeing the expansion to online message boards including Plato, BBSs, Compuserve and USENET. E-mail is a serial medium. In a serial medium, messages have a chronological order, and there is a concept of messages that are “read” and “unread.” A good serial reader, at a minimum, has a way to present only the unread messages, typically in chronological order. You can thus process messages as they came, and when you are done with them, they move out of your view.
E-mail is largely used to read messages one at a time, but the online message boards, notably USENET, advanced this with the idea of moving messages from unread to read in bulk. A typical USENET reader presents the subject lines of all threads with new or unread messages. The user selects which ones to read (almost never all of them) and after this is done, all the messages, even those that were not actually read, are marked as read and not normally shown again. While it is generally expected that you will read all the messages in your personal inbox one by one, with message streams it is expected you will only read those of particular interest, though this depends on the volume.
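The core of a serial reader is small enough to sketch. Here is a minimal Python version of the read/unread bookkeeping with a bulk "catch up", under the assumption that messages are simply numbered in arrival order. The class and method names are invented for this example and are not taken from any actual newsreader.

```python
class SerialFeed:
    """Toy serial medium: ordered messages plus a record of what was read."""

    def __init__(self):
        self._messages = []   # (subject, body) pairs in arrival order
        self._read = set()    # indexes of messages already marked read

    def post(self, subject, body):
        self._messages.append((subject, body))

    def unread_subjects(self):
        """The 'what's new for me?' view: unread subject lines, oldest first."""
        return [(i, subject)
                for i, (subject, _) in enumerate(self._messages)
                if i not in self._read]

    def read(self, index):
        """Read one message and mark it as read."""
        self._read.add(index)
        return self._messages[index][1]

    def catch_up(self):
        """Mark everything as read, including messages that were skipped."""
        self._read.update(range(len(self._messages)))


feed = SerialFeed()
feed.post("Gaze tracking", "...")
feed.post("In-flight video", "...")
feed.post("Olympics audio", "...")
feed.read(1)              # pick the one interesting story
feed.catch_up()           # discard the rest, like yesterday's newspaper
feed.post("RSS readers", "...")
print(feed.unread_subjects())   # [(3, 'RSS readers')]
```

The important design choice is that `catch_up` marks skipped messages as read too. That is what lets a reader skim a large volume quickly and only ever be shown what arrived since the last visit.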
Echoes of this can be found in older media. With the newspaper, almost nobody would read every story, though you would skim all the headlines. Once done, the newspaper was discarded, even the stories that were skipped over. Magazines were similar but being less frequent, more stories would be actually read.
USENET newsreaders were the best at handling this mode of reading. The earliest ones had keyboard interfaces that allowed touch typists to process many thousands of new items in just a few minutes, glancing over headlines, picking stories and then reading them. My favourite was TRN, based on RN by Perl creator Larry Wall and enhanced by Wayne Davison (whom I hired at ClariNet in part because of his work on that.) To my great surprise, even as the USENET readers faded, no new tool emerged capable of handling a large volume of messages as quickly.
In fact, the 1990s saw a switch for most to browsed media. Most web message boards were quite poor and slow to use, and many did not even do the most fundamental thing of remembering what you had read and offering a “what’s new for me?” view. In reaction to the rise of browsed media, people wishing to publish serially developed RSS. RSS was a bit of a kludge, in that your reader had to regularly poll every site to see if something was new, but outside of mailing lists, it became the most usable way to track serial feeds. In time, people also learned to like doing this online, using tools like Bloglines (which became the leader and then foolishly shut down for a few months) and Google Reader (which also became the leader and now is shutting down.) Online feed readers allow you to roam from device to device and read your feeds, and people like that.
Last month, I invited Gregory Benford and Larry Niven, two of the most respected writers of hard SF, to come and give a talk at Google about their new book “Bowl of Heaven.” Here’s a YouTube video of my session. They did a review of the history of SF about “big dumb objects,” meaning stories like Niven’s Ringworld, where a huge construct is a central part of the story.
Tonight I will be on a panel at the Palo Alto International Film Festival at 5pm. Not on robocars, but on the role of science fiction in movies in changing the world. (In a past life, I published science fiction and am on this panel by virtue of my faculty position at Singularity University.)
I’m watching the Olympics, and my primary tool as always is MythTV. Once you do this, it seems hard to imagine watching them almost any other way. Certainly not real time with the commercials, and not even with other DVR systems. MythTV offers a really wide variety of fast forward speeds and programmable seeks. This includes the ability to watch at up to 2x speed with the audio still present (pitch adjusted to be natural) and a smooth 3x speed which is actually pretty good for watching a lot of sports. In addition you can quickly access 5x, 10x, 30x, 60x, 120x and 180x for moving along, as well as jumps back and forth by some fixed amount you set (like 2 minutes or 10 minutes) and random access to any minute. Finally it offers a forward skip (which I set to 20 seconds) and a backwards skip (I set it to 8 seconds.)
MythTV even lets you customize these numbers so you use different numbers for the Olympics compared to other recordings. For example the jumps are normally +/- 10 minutes and plus 30 seconds for commercial skip, but Myth has automatic commercial skip.
A nice mode allows you to go to smooth 3x speed with closed captions, though it does not feature the very nice ability I’ve seen elsewhere of turning on CC when the sound is off (by mute or FF) and turning it off when sound returns. I would like a single button to put me into 3xFF + CC and take me out of it.
Anyway, this is all very complex but well worth learning because once you learn it you can consume your sports much, much faster than in other ways, and that means you can see more of the sports that interest you, and less of the sports, commercials and heart-warming stories of triumph over adversity that you don’t. With more than 24 hours a day of coverage it is essential you have tools to help you do this.
I have a number of improvements I would like to see in MythTV like a smooth 5x or 10x FF (pre-computed in advance) and the above macro for CC/FF swap. In addition, since the captions tend to lag by 2-3 seconds it would be cool to have a time-sync for the CC. Of course the network, doing such a long tape delay, should do that for you, putting the CC into the text accurately and at the moment the words are said. You could write software to do that even with human typed captions, since the speech-recognition software can easily figure out what words match once it has both the audio and the words. Nice product idea for somebody.
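Here is a rough Python sketch of that caption re-sync idea. It assumes two inputs that I am making up for illustration: the caption words with the times they appeared on screen, and the same words as recognized from the audio with the times they were spoken. It lines the two word lists up with the standard library's `difflib.SequenceMatcher`, takes the median time difference over the words that match, and shifts the captions by that amount. A real product would want a lag that varies over time rather than one number.

```python
from difflib import SequenceMatcher
from statistics import median


def estimate_caption_lag(caption_words, spoken_words):
    """Median delay in seconds between spoken words and their captions.

    Both arguments are lists of (word, seconds) pairs.
    """
    caption_text = [word.lower() for word, _ in caption_words]
    spoken_text = [word.lower() for word, _ in spoken_words]
    matcher = SequenceMatcher(a=caption_text, b=spoken_text, autojunk=False)
    lags = []
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            caption_time = caption_words[block.a + k][1]
            spoken_time = spoken_words[block.b + k][1]
            lags.append(caption_time - spoken_time)
    return median(lags) if lags else 0.0


def resync(caption_words, lag):
    """Shift every caption earlier by the estimated lag."""
    return [(word, seconds - lag) for word, seconds in caption_words]


spoken = [("and", 10.0), ("she", 10.5), ("sticks", 11.0),
          ("the", 11.5), ("landing", 12.0)]
# Human-typed captions: 2.5 seconds late, with one word mistyped.
captions = [("and", 12.5), ("she", 13.0), ("stix", 13.5),
            ("the", 14.0), ("landing", 14.5)]

lag = estimate_caption_lag(captions, spoken)
print(lag)                       # 2.5
print(resync(captions, lag)[0])  # ('and', 10.0)
```

Using the median rather than the mean means a few wrongly matched words will not throw the estimate off much, and the mistyped word is simply skipped by the alignment.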
Watching on the web
This time, various networks have put up extensive web offerings, and indeed on NBC this is the only way to watch many events live, or at all. Web offerings are good, though not quite at the quality of over-the-air HDTV, and quality matters here. But the web offerings have some failings.
I found that this recent article from the editor of the MIT Tech Review, on why apps for publishers are a bad idea, touched on a number of key issues I have been observing since I first got into internet publishing in the 80s. I recommend the article, but if you insist, the short summary is that publishers of newspapers and magazines flocked to the idea of doing iPad apps because they could finally make something that they sort of recognized as similar to a traditional publication; something they controlled and laid out, that was a combined unit. So they spent lots of money and ran into nightmares (having to design for both landscape and portrait on the tablet, as well as possibly on the phones or even Android) and didn’t end up selling many subscriptions.
Since the dawn of publishing there has been a battle between design and content. This is not a battle that has or should have a single winner. Design is important to enjoyment of content, and products with better design are more loved by consumers and represent some of the biggest success stories. Creators of the content — the text in this case — point out that it is the text where you find the true value, the thing people are actually coming for. And on the technology side, the value of having a wide variety of platforms for content — from 30” desktop displays to laptops to tablets to phones, from colour video displays to static e-ink — is essential to a thriving marketplace and to innovation. Yet design remains so important that people will favour the iPhone just because they are all the same size, and most Android apps still can’t be used on Google TV.
This is also the war between things like PDF, which attempts to bring all the elements of paper-based design onto the computer, and the purest form of SGMLs, including both original and modern HTML. Between WYSIWYG and formatting languages, between semantic markup and design markup. This battle is quite old, and still going on. In the case of many designers, that is all they do, and the idea that a program should lay out text and other elements to fit a wide variety of display sizes and properties is anathema. To technologists, that layout should be fixed is almost as anathema.
Also included in this battle are the forces of centralization (everything on the web or in the cloud) and the distributed world (custom code on your personal device) and their cousins online and offline reading. A full treatise on all elements of this battle would take a book for it is far from simple.
I sit mostly with the technologists, eager to divide design from content. I still write all my documents in text formatting languages with visible markup and use WYSIWYG text editors only rarely. An ideal system that does both is still hard to find. Yet I can’t deny the value and success of good design and believe the best path is compromise in this battle. We need compromises in design and layout, we need compromises between the cloud and the dedicated application. End-user control leads to some amount of chaos. It’s chaos that is feared by designers and publishers and software creators, but it is also the chaos that gives us most of our good innovations, which come from the edge.
Let’s consider all the battles I perceive for the soul of how computing, networks and media work:
The design vs. semantics battle (outlined above)
The cloud vs. personal device
Mobile, small and limited in input vs. tethered, large screen and rich in input
Central control vs. the distributed bazaar (with so many aspects, such as)
The destination (facebook) vs. the portal (search engine)
The designed, uniform, curated experience (Apple) vs. the semi-curated (Android) vs. the entirely open (free software)
The social vs. the individual (and social comment threads vs. private blogs and sites)
The serial (email/blogs/RSS/USENET) vs. the browsed (web/wikis) vs. the sampled (facebook/twitter)
The reader-friendly (fancy sites, well filtered feeds) vs. writer friendly (social/wiki)
In most of these battles both sides have virtues, and I don’t know what the outcomes will be, but the original MITTR article contained some lessons for understanding them.