Archives

Where will 3-D cameras like Kinect lead?

This year, I bought Microsoft Kinect cameras for the nephews and niece. At first they will mostly play energetic Xbox games with them, but my hope is they will start to play with the things coming from the Kinect hacking community; the videos of the top hacks are quite interesting. At first, Microsoft wanted to lock down the Kinect and threatened the open source developers who reverse engineered the protocol and released drivers. Now Microsoft has official open drivers.

This camera produces a VGA colour video image combined with a Z (depth) value for each pixel. This makes it trivial to isolate objects in the view (like people and their hands and faces), and splitting foreground from background is easy. The camera is $150 today (when even a simple one-line LIDAR cost a fortune not long ago) and no doubt cameras like it will be cheap $30 consumer items in a few years’ time. As I understand it, the Kinect works using a mixture of triangulation (the sensor being in a different place from the emitter) combined with structured light (sending out arrays of dots and seeing how they are bent by the objects they hit). An earlier report that it used time-of-flight is disputed, and the structured-light approach implies it will get cheaper fast. Right now it doesn’t do close-up or very distant ranges, however. While projection takes power, meaning it won’t be available full time in mobile devices, it could still show up eventually in phones for short-duration 3-D measurement.
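To make the "splitting foreground from background is easy" claim concrete, here is a minimal sketch of depth thresholding in plain Python. It assumes the depth frame arrives as rows of millimetre values with 0 meaning "no reading"; that layout, the function name `foreground_mask` and the sample numbers are illustrative assumptions, not the format of any particular Kinect driver.

```python
def foreground_mask(depth_mm, near_mm, far_mm):
    """Return a mask that is True where a pixel's depth lies in [near_mm, far_mm].

    depth_mm is a list of rows of depth values in millimetres.
    A value of 0 is treated as "no reading" (an assumption about the sensor).
    """
    return [[d != 0 and near_mm <= d <= far_mm for d in row] for row in depth_mm]


# Tiny made-up 3x4 depth frame: a "person" about 1.2 m away, a wall at 3 m.
frame = [
    [3000, 3000, 3000, 3000],
    [3000, 1200, 1250, 3000],
    [0,    1210, 1240, 3000],
]

mask = foreground_mask(frame, near_mm=800, far_mm=2000)
foreground_pixels = sum(cell for row in mask for cell in row)
print(foreground_pixels)
```

With an ordinary colour camera the same split needs background modelling or a green screen; with a depth value per pixel it reduces to a range test like the one above.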

I agree with those who think that something big is coming from this. Obviously in games, but also perhaps in these other areas.

Gestural interfaces and the car

While people have already made “Minority Report” interfaces with the Kinect, studies show these are not very good for desktop computer use: your arms get tired and the gestures are not super accurate. They are good for places where your interaction with the computer will be short, or where using a keyboard is not practical.

One place that might make sense is in the car, at least before the robocar. Fiddling with the secondary controls in a car (such as the radio, phone, climate system or navigation) is always a pain and you’re really not supposed to look at your hands as you hunt for the buttons. But taking one hand off the wheel is OK. This can work as long as you don’t have to look at a screen for visual feedback, which is often the case with navigation systems. Feedback could come by audio or a heads up display. Speech is also popular here but it could be combined with gestures.

A gestural interface for the TV could also be nice: a remote control you can’t ever misplace. It would be easy to remember gestures for basic functions like volume and channel change and arrow keys (or mouse) in menus. More complex functions (like naming shows etc.) are best left to speech. Again speech and gestures should be combined in many cases, particularly when you have a risk that an accidental gesture or sound could issue a command you don’t like.

I also expect gestures to possibly control what I am calling the “4th screen” — namely an always-on wall display computer. (The first 3 screens are Computer, TV and mobile.) I expect most homes to eventually have a display that constantly shows useful information (as well as digital photos and TV) and you need a quick and unambiguous way to control it. Swiping is easy with gesture control so being able to just swipe between various screens (Time/weather, transit arrivals, traffic, pending emails, headlines) might be nice. Again in all cases the trick is not being fooled by accidental gestures while still making the gestures simple and easy.

In other areas of the car, things like assisted or automated parking, though not that hard to do today, become easier and cheaper.

Small scale robotics

I expect an explosion in hobby and home robotics based on these cameras. Forget about Roombas that bump into walls; finally, cheap robots will be able to see. They may not identify what they see precisely, though the 3D will help, but they won’t miss objects and will have a much easier time doing things like picking them up or avoiding them. LIDARs have been common in expensive robots for some time, but having this kind of sensing cheap will generate new consumer applications.

Mobile

There will be some gestural controls for phones, particularly when they are used in cars. I expect things to be more limited here, with big apps to come in games. However, history shows that most of the new sensors added to mobile devices cause an explosion of innovation so there will be plenty not yet thought of. 3-D maps of areas (particularly when range is longer which requires power) can also be used as a means of very accurate position detection. The static objects of a space are often unique and let you figure out where you are to high precision — this is how the Google robocars drive.

Security & facial recognition

3-D will probably become the norm in the security camera business. It also helps with facial recognition in many ways (both by isolating the face and allowing its shape to play a role) and recognition of other things like gait, body shape and animals. Face recognition might become common at ATMs or security doors, and be used when logging onto a computer. It also makes “presence” detection reliable, allowing computers to see how and where people are in a room and even a bit of what they are doing, without having to do object recognition. (Though as the Kinect hacks demonstrate, these cameras help object recognition as well.)

Face recognition is still error-prone of course, so its security uses will be initially limited, but it will get better at telling people apart.

Virtual worlds & video calls

While some might view this as gaming, we should also see these cameras heavily used in augmented reality and virtual world applications. It makes it easy to insert virtual objects into a view of the physical world and have a good sense of what’s in front and what’s behind. In video calling, the ability to tell the person from the background allows better compression, as well as blanking of the background for privacy. Effectively you get a “green screen” without the need for a green screen.

You can also do cool 3-D effects by getting an easy and cheap measurement of where the viewer’s head is. Moving a 3-D viewpoint in a generated or semi-generated world as the viewer moves her head creates a fun 3-D effect without glasses and now it will be cheap. (It only works for one viewer, though.) Likewise in video calls you can drop the other party into a different background and have them move within it in 3-D.

With multiple cameras it is also possible to build a more complete 3-D model of an entire scene, with textures to paint on it. Any natural scene can suddenly become something you can fly around.

Amateur video production

Some of the above effects are already showing up on YouTube. Soon everybody will be able to do it. The Kinect’s firmware already does “skeleton” detection, to map out the position of the limbs of a person in the view of the camera. That’s good for games but also allows motion capture for animation on the cheap. It also allows interesting live effects distorting the body or making light sabres glow. Expect people in their own homes to be making their own Avatar-like movies, at least on a smaller scale.

These cameras will become so popular we may need to start worrying about interference by their structured light. These are apps I thought of in just a few minutes. I am sure there will be tons more. If you have something cool to imagine, put it in the comments.

Happy Seasons to all! and a Merry New Year.

Drivers cost 1.7 million person-years every year in the USA, 3rd of all major causes

I’ve written frequently about how driving fatalities are the leading cause of death for people from age 5 to 45, and one of the leading overall causes of death. I write this because we hope that safe robocars, with a much lower accident rate, can eliminate much of this death.

Today I sought to calculate the toll in terms not of lives, but in years of life lost. Car accidents kill people young, while the biggest killers like heart disease/stroke, cancer and respiratory disease kill people when they are older. The CDC’s injury prevention dept. publishes a table of “Years of Potential Life Lost” (YPLL), which I had it calculate for a lifespan of 80 years. (People who die after 80 are not counted as having lost years of life, though a more accurate accounting might involve judging the average expected further lifespan for each age cohort and counting that as the YPLL.)
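The metric itself is simple arithmetic. As a minimal sketch (my own illustration, not the CDC's tool), each death contributes the gap between the reference age and the age at death, and deaths past the reference age contribute nothing. The function name and the three ages below are made up for the example and are not CDC figures.

```python
def years_of_potential_life_lost(ages_at_death, reference_age=80):
    """Sum the years between each age at death and the reference age.

    Deaths at or after the reference age count as zero years lost.
    """
    return sum(max(0, reference_age - age) for age in ages_at_death)


# Hypothetical ages at death, for illustration only.
sample_ages = [25, 40, 85]

ypll_to_80 = years_of_potential_life_lost(sample_ages)           # 55 + 40 + 0
ypll_to_65 = years_of_potential_life_lost(sample_ages, 65)       # 40 + 25 + 0
print(ypll_to_80, ypll_to_65)
```

This is why a cause that kills young people climbs the ranking: one death at 25 counts for as many years as eleven deaths at 75.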

The core result of the table though is quite striking. Auto accidents jump to #3 on the list from #7, and the ratios become much smaller. While each year almost a million die from cardiovascular causes and 40,000 from cars, the ratio of total years lost is closer to 4 to 1 for both cardiovascular disease and cancer, and the other leading causes are left far behind. (The only ones to compete with the cars are suicides and accidental poisoning which is much worse than I expected.)

The lesson: Work on safe robocars is even more vital than we might have thought, if you use this metric. It also seems that those interested in saving years of life may want to address the problem of accidental poisoning. Perhaps smart packaging or cheap poison detection could have a very big effect. (Update: This number includes non-intentional drug overdoses and deaths due to side effects of prescription drugs.) For suicide, this may suggest that our current approaches to treating depression need serious work. (For example, there are drugs that have surprising effectiveness on depression such as ketamine which are largely unused because they have recreational uses at higher doses and are thus highly controlled.) And if you can cure cancer, you would be doing everybody a solid.

Note: Stillbirths are not counted here. I would have expected the Perinatal causes to rank higher due to the large number of years erased. If you only do it to 65, thus counting what might get called “productive years,” the motor vehicle deaths take on a larger fraction of the pie. Productivity lost to long term disability is not counted here, though it is very common in non-fatal motor vehicle accidents. Traffic deaths are dropping, though, so the 2009 figures will be lower.

Banks: Give me two passwords

Passwords are in the news thanks to Gawker Media, which had its database of userids, emails and passwords hacked and published on the web. A big part of the fault is Gawker’s: it was saving user passwords (so it could email them) and thus was vulnerable. As I have written before, you should be very critical of any site that is able to email you your password if you forget it.
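For contrast, here is a minimal sketch of the common alternative, in which a site keeps only a salted, slow hash of each password and so has nothing it could email back. It uses Python's standard `hashlib.pbkdf2_hmac`; the function names and the iteration count are illustrative assumptions, not a description of what any particular site does.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, iterations, digest) to store instead of the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password, salt, iterations, expected_digest):
    """Re-derive the digest from the login attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected_digest)


salt, iterations, digest = hash_password("correct horse")
print(verify_password("correct horse", salt, iterations, digest))
print(verify_password("wrong guess", salt, iterations, digest))
```

A site built this way can only offer a password reset, never a reminder, which is the tell-tale sign to look for.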

Some of the advice to users in the wake of this has been to not use the same password on multiple sites, and that’s not at all practical in today’s world. I have passwords for many hundreds of sites. Most of them are like Gawker: accounts I was forced to create just to leave a comment on a message board. I use the same password for these “junk accounts.” It’s just not a big issue if somebody is able to leave a comment on a blog with my name, since my name was never verified in the first place. A different password for each site just isn’t something people can manage. There are password managers that try to solve this, creating different passwords for each site and remembering them, but these systems often have problems when roaming from computer to computer, or trying out new web browsers, or when sites change their login pages.

The long term solution is not passwords at all, it’s digital signature (though that has all the problems listed above) and it’s not to even have logins at all, but instead use authenticated actions so we are neither creating accounts to do simple actions nor using a federated identity monopoly (like Facebook Connect). This is better than OpenID too.

How Robocars affect the City, plus Masdar & City of Apple

I decided to gather together all my thoughts on how robocars will affect urban design. There are many things that might happen, though nobody knows enough urban planning to figure out just what will happen. However, I felt it worthwhile to outline the forces that might be at work so that urban geographers can speculate on what they will mean. It is hard to make firm predictions. For example, does the ability for a short pleasant trip make people want a Manhattan where everybody can get anywhere in 10 minutes, or does the ability to work or relax during trips make people not care about the duration and lead to more sprawl? It can go either way, or both.

Read Robocar influence on the future of cities.

Masdar Video

In other notes, now that Masdar’s PRT is in limited operation, there are more videos of it. Here is a CNN Report with good shots of the cars moving around. As noted before, the system is massively scaled back, and runs at ground level, underneath elevated pedestrian streets. The cars are guided by magnets but there is LIDAR to look for pedestrians and obstacles.

City of Apple

The designer of Masdar, Foster + Partners, has been retained to design the new “City of Apple” which is going to spring up literally a 5 minute walk from my house. Apple has purchased the large Cupertino tract that was a major HP facility (and which also held Tandem, which HP eventually bought) and a few other companies. This is about a mile from Apple’s main HQ in Cupertino. Speculation about the plan includes a transportation system of some kind, possibly a PRT like in Masdar. However, strangely, there is talk of an underground tunnel between the buildings, which makes almost no sense in this area, particularly since I can’t imagine it would be too hard to run elevated guideway along the side of Interstate 280 or even on the very wide Stevens Creek Boulevard.

Sadly, aside from Apple, there’s not a lot for the system to visit if it’s to be more than intra-company transport. The Valco mall and the Cupertino Village are popular but Cupertino doesn’t really have a walkable downtown to speak of.

Of course if Apple wants to tear down all the HP buildings and put up a new massive complex, it will be hard to call that a green move. The energy and greenhouse gases involved in replacing buildings are huge. For transportation, robocars could just make use of the existing highway between the two campuses. It’s not even impossible to imagine Apple building its own exits and bridges on the interstate — much cheaper than an underground tunnel.

Building a house organizing robot with image search

There are many fields that people expect robotics to change in the consumer space. I write regularly about transportation, and many feel that robots to assist the elderly will be the other big field. The first successful consumer robot (outside of entertainment) was the Roomba, a house cleaning robot. So I’ve often wondered about how far we are from a robot that can tidy up the house. People got excited when a PR2 robot was programmed to fold towels.

This is a hard problem because it seems such a robot needs to do general object recognition and manipulation, something we’re pretty far from doing. Special purpose household chore robots, like the Roomba, might appear first. (A gutter cleaner is already on the market.)

Recently I was pondering what we might do with a robot that is able to pick up objects gently, but isn’t that good at recognizing them. Such a robot might not identify the objects, but it could photograph them, and put them in bins. The members of the household could then go to their computers and see a visual catalog of all the things that have been put away, and an indicator of where it was put. This would make it easy to find objects.

The catalog could trivially be sorted by when the items were put away, which might well make it easy to browse for something put away recently. But the fact that we can’t do general object recognition does not mean we can’t do a lot of useful things with photographs and sensor readings (including precise weight and other factors) beyond that. One could certainly search by colour, by general size and shape, and by weight and other characteristics like rigidity. The item could be photographed in a 360 view by being spun on a table or in the grasping arm, or with a rotating camera. It could also be laser-scanned or 3D photographed with new cheap 3D camera techniques.
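A rough sketch of what such a catalog might look like, assuming the robot records a photo, a bin, a timestamp, a weight and an average colour for each item. Every name here (`StoredItem`, `search`, the file paths, the thresholds) is hypothetical and chosen only for illustration; a real system would use richer image features than a single mean colour.

```python
from dataclasses import dataclass


@dataclass
class StoredItem:
    photo_path: str   # hypothetical path to the robot's photo of the item
    bin_label: str    # where the robot put it
    stored_at: float  # seconds since the epoch
    weight_g: float
    mean_rgb: tuple   # average photo colour as (r, g, b), each 0-255


def colour_distance(a, b):
    """Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def search(catalog, target_rgb=None, max_colour_distance=80.0,
           min_weight_g=0.0, max_weight_g=float("inf")):
    """Filter by weight and (optionally) colour; most recently stored first."""
    matches = []
    for item in catalog:
        if not (min_weight_g <= item.weight_g <= max_weight_g):
            continue
        if (target_rgb is not None
                and colour_distance(item.mean_rgb, target_rgb) > max_colour_distance):
            continue
        matches.append(item)
    return sorted(matches, key=lambda item: item.stored_at, reverse=True)


# Made-up catalog entries for illustration.
catalog = [
    StoredItem("photos/0001.jpg", "bin 3", 1000.0, 150.0, (200, 30, 30)),
    StoredItem("photos/0002.jpg", "bin 7", 2000.0, 900.0, (20, 20, 220)),
    StoredItem("photos/0003.jpg", "bin 3", 3000.0, 120.0, (220, 40, 50)),
]

red_and_light = search(catalog, target_rgb=(210, 35, 40), max_weight_g=500.0)
for item in red_and_light:
    print(item.photo_path, item.bin_label)
```

Calling `search(catalog)` with no filters gives the plain "sorted by when it was put away" view; the colour and weight arguments narrow it down without the robot ever knowing what the object is.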

When looking for a specific object, one could find it by drawing a sketch of the object — software is already able to find photos that are similar to a sketch. But more is possible. Typing in the name of what you’re looking for could bring up the results of a web image search on that string, and you could find a photo of a similar object, and then ask the object search engine to find photos of objects that are similar. While ideally the object was photographed from all angles, there are already many comparison algorithms that survive scaling and rotation to match up objects.

The result would be a fairly workable search engine for the objects of your life that were picked up by the robot. I suspect that you could quickly find your item and learn just exactly where it was.

Certain types of objects could be recognized by the robot, such as books, papers and magazines. For those, bar-codes could be read, or printing could be scanned with OCR. Books might be shelved at random in the library but be easily found. Papers might be hard to manipulate but could at least be stacked, possibly with small divider sheets inserted between them with numbers on them, so that you could look for the top page of any collected group of papers and be told, “it’s under divider 20 in the stack of papers.”

SARTRE "road train" update

The folks at the SARTRE road train project have issued an update one year into their 3 year project. This is an EU-initiated project to build convoy technology, where a professional lead driver in a truck or bus is followed by a convoy of closely packed cars which automatically follow based on radio communications (and other signals) with the lead. They have released a new video on their progress from Volvo.

I have written before about the issues involved in this project and many of them remain. It’s the easiest way to get a robocar on the highway, but comes with a particularly high risk if it fails — and failure in the earliest stages of robocar projects is very likely.

In the video, some interesting elements include:

  • The building of a simulator to test driver attitudes and reactions. Generally quite positive, in that people are happy to trust the driving to the system and the lead driver. This will change a bit in a real car, since a simulator can only do so much.
  • They imagine people eating, drinking, listening to music and reading while in the convoys, but they don’t talk about the elephant in the car: sleeping. People doing anything else can quickly take the controls in a problem, but sleepers may not. And there’s also that act that we metaphorically call “sleeping together.”
  • Their simulations depict cars leaving the convoy from the middle. However, in this situation it seems you can’t give them too much brake-accelerator control for the difficult task of changing lanes when you are just a few feet from the cars in front and back of you. You must maintain the speed of the train until you have fully left its lane, but that means you can’t do the usual task of changing speed as you enter your new lane. Exit from the trains will need some work. (There are suggestions in the comments that make sense.)
  • They expect to have to make legal changes to allow this. However, since it’s an EU initiated project, they have a leg-up on that. This might pave the way for more robocar-friendly laws in Europe.
  • While they plan to do a live test by 2012, they are much more cautious on predicting when the trains might be common on the roads.
  • They do speculate on whether a simple robocar function for “stop and go” traffic, which is able to follow the car in front of you at lower speeds, might come first. Indeed, this is pretty easy, and not much more than a smarter version of existing auto-follow cruise control with steering and lane-following added.
  • Their main pitch is environmental, as drafting should save decent fuel. However, I think most people will be interested in the time saving, and I’ll be interested in how the public accepts it.