Not much new to report after the second game of the Watson Jeopardy Challenge. I've added a few updates to yesterday's post on Watson and the result was as expected, though Watson struggled a lot more in this game than in the prior round, deciding not to answer many questions due to low confidence and making a few mistakes. In a few cases it was saved by not buzzing fast enough even though it had over 50% confidence, as it would have answered slightly wrong.
The computer science world is abuzz, along with the game show world, over the showdown between IBM's "Watson" question-answering system and the best human players to play the game Jeopardy. The first game has been shown, with a crushing victory by Watson (in spite of a tie after the first half of the game).
Tomorrow's outcome is not in doubt. IBM would not have declared itself ready for the contest without being confident it would win, and they wouldn't be putting all the advertising out about the contest if they had lost. What's interesting is how they did it and what else they will be able to do with it.
Dealing with a general question has long been one of the hard problems in AI research. Watson isn't quite there yet but it's managed a great deal with a combination of algorithmic parsing and understanding combined with machine learning based on prior Jeopardy games. That's a must because Jeopardy "answers" (clues) are often written in obfuscated styles, with puns and many idioms, exactly the sorts of things most natural language systems have had a very hard time with.
Watson's problem is almost all understanding the question. Looking up obscure facts is not nearly so hard if you have a copy of Wikipedia and other databases on hand, particularly one parsed with other state-of-the-art natural language systems, which is what I presume they have. In fact, one would predict that Watson would do the best on the hardest $2,000 questions because these are usually hard because they refer to obscure knowledge, not because it is harder to understand the question. I expect that an evaluation of its results may show that its performance on hard questions is not much worse than on easy ones. (The main thing that would make easy questions easier would be the large number of articles in its database confirming the answer, and presumably boosting its confidence in its answer.) However, my intuition may be wrong here, in that most of Watson's problems came on the high-value questions.
Its confidence is important. If it does not feel confident it doesn't buzz in. And it has a serious advantage at buzzing in, since you can't buzz in right away on this game, and if you're an encyclopedia like the two human champions and Watson, buzzing in is a large part of the game. In fact, a fairer game, which Watson might not do as well at, would involve randomly choosing which of the players who buzz in in the first few tenths of a second gets to answer the question, eliminating any reaction time advantage. Watson gets the questions as text, which is also a bit unfair, unless it is given them one word at a time at human reading speed. It could do OCR on the screen but chances are it would read faster than the humans. Its confidence numbers and results are extremely impressive. One reason it doesn't buzz in is that even with 3,000 cores it takes 2-6 seconds to answer a question.
Indeed a totally fair contest would not have a buzz-in timing competition at all, and would just allow all players who buzz in to answer and gain or lose points based on their answer. (Answers would need to be given in parallel.)
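To make the two "fairer game" ideas concrete, here is a minimal Python sketch. It is my own illustration, not anything Jeopardy or IBM uses: the function names, the 0.3-second window (standing in for "the first few tenths of a second") and the sample players' timings are all assumptions.

```python
import random


def pick_responder(buzz_times, window=0.3, rng=random):
    """Randomly pick one player among those who buzzed within `window` seconds.

    buzz_times maps a player's name to the seconds elapsed between the buzzer
    opening and that player's buzz, or None if the player did not buzz.
    The 0.3 s default is an assumed value, not a figure from the show.
    """
    eligible = sorted(
        name
        for name, t in buzz_times.items()
        if t is not None and 0 <= t <= window
    )
    if not eligible:
        return None
    # Everyone inside the window has an equal chance, so raw reaction
    # time no longer decides who gets to answer.
    return rng.choice(eligible)


def score_all(answers, correct, value):
    """No buzzer race at all: everyone who buzzed answers in parallel.

    answers maps a player's name to the answer given, or None if that
    player chose not to buzz. Right answers gain `value`, wrong ones lose it.
    """
    return {
        name: (value if given == correct else -value)
        for name, given in answers.items()
        if given is not None
    }


if __name__ == "__main__":
    # Hypothetical timings, invented for the example.
    print(pick_responder({"Watson": 0.01, "Jennings": 0.20, "Rutter": 0.90}))
    print(score_all({"Watson": "Chicago", "Jennings": "Toronto", "Rutter": None},
                    "Chicago", 400))
```

Under the first rule a machine that always buzzes within a few milliseconds gains nothing over a human who buzzes inside the same window. Under the second rule the buzzer only signals willingness to answer.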
Watson's coders know by now that they probably should have coded it to receive wrong answers from other contestants. In one instance it repeated a wrong answer, and in another case it said "What is Leg?" after Jennings had incorrectly answered "What is missing an arm?" in a question about an Olympic athlete. The host declared that right, but the judges reversed that saying that it would be right if a human who was following up the wrong answer said it, but was a wrong answer without that context. This was edited out. Also edited out were 4 crashes by Watson that made the game take 4 hours instead of 30 minutes.
It did not happen in what aired so far, but in the trials, another error I saw Watson make was declining to answer a request to be more specific on an answer. Watson was programmed to give minimalist answers, which often the host will accept as correct, so why take a risk. If the host doesn't think you said enough he asks for a more specific answer. Watson sometimes said "I can be no more specific." From a pure gameplay standpoint, that's like saying, "I admit I am wrong." For points, one should say the best longer phrase containing the one-word answer, because it just might be right. Though it has a larger chance of looking really stupid -- see below for thoughts on that.
The shows also contain total love-fest pieces about IBM which make me amazed that IBM is not listed as a sponsor for the shows, other than perhaps in the name "The IBM Challenge." I am sure Jeopardy is getting great ratings (just having their two champs back would do that on its own but this will be even more) but I have to wonder if any other money is flowing.
Being an idiot savant
Watson doesn't really understand the Jeopardy clues, at least not as a human does. Like so many AI breakthroughs, this result comes from figuring out another way to attack the problem different from the method humans use. As a result, Watson sometimes puts out answers that are nonsense "idiot" answers from a human perspective. They cut back a lot on this by only having it answer when it has 50% confidence or higher, and in fact for most of its answers it has very impressive confidence numbers. But sometimes it gives such an answer. To the consternation of the Watson team, it did this on the Final Jeopardy clue, where it answered "Toronto" in the category "U.S. Cities."
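The 50% rule is simple to state in code. This is only a sketch of the idea as described above, not Watson's actual logic: `decide_buzz` and the shape of its input (a mapping from candidate answer to a confidence between 0 and 1) are hypothetical.

```python
def decide_buzz(candidates, threshold=0.5):
    """Return the top-ranked answer if its confidence clears the threshold.

    candidates maps each candidate answer to a confidence in [0, 1].
    Returns None when there are no candidates, or when even the best
    one falls below the threshold, meaning "don't buzz".
    """
    if not candidates:
        return None
    answer, confidence = max(candidates.items(), key=lambda kv: kv[1])
    return answer if confidence >= threshold else None
```

A rule like this can only apply where declining to answer is an option, which is not the case in Final Jeopardy, where every contestant must give a response.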
There are many fields that people expect robotics to change in the consumer space. I write regularly about transportation, and many feel that robots to assist the elderly will be the other big field. The first successful consumer robot (outside of entertainment) was the Roomba, a house cleaning robot. So I've often wondered about how far we are from a robot that can tidy up the house. People got excited when a PR2 robot was programmed to fold towels.
This is a hard problem because it seems such a robot needs to do general object recognition and manipulation, something we're pretty far from doing. Special purpose household chore robots, like the Roomba, might appear first. (A gutter cleaner is already on the market.)
Recently I was pondering what we might do with a robot that is able to pick up objects gently, but isn't that good at recognizing them. Such a robot might not identify the objects, but it could photograph them, and put them in bins. The members of the household could then go to their computers and see a visual catalog of all the things that have been put away, and an indicator of where it was put. This would make it easy to find objects.
The catalog could trivially be sorted by when the items were put away, which might well make it easy to browse for something put away recently. But the fact that we can't do general object recognition does not mean we can't do a lot of useful things with photographs and sensor readings (including precise weight and other factors) beyond that. One could certainly search by colour, by general size and shape, and by weight and other characteristics like rigidity. The item could be photographed in a 360 view by being spun on a table or in the grasping arm, or with a rotating camera. It could also be laser-scanned or 3D photographed with new cheap 3D camera techniques.
When looking for a specific object, one could find it by drawing a sketch of the object -- software is already able to find photos that are similar to a sketch. But more is possible. Typing in the name of what you're looking for could bring up the results of a web image search on that string, and you could find a photo of a similar object, and then ask the object search engine to find photos of objects that are similar. While ideally the object was photographed from all angles, there are already many comparison algorithms that survive scaling and rotation to match up objects.
The result would be a fairly workable search engine for the objects of your life that were picked up by the robot. I suspect that you could quickly find your item and learn just exactly where it was.
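The simplest layer of that catalog, before any sketch or image matching, might look like the Python below. It is a rough illustration under my own assumptions: the `StoredItem` fields, the function names and the idea of reducing colour to one label per item are invented for the example, and the image-similarity search is left out entirely.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class StoredItem:
    """One object the robot picked up, photographed, measured and binned."""
    photo_path: str         # where the photo (or 360 view) is stored
    bin_label: str          # where the robot put the object
    stored_at: datetime     # when it was put away
    weight_g: float         # measured weight in grams
    dominant_color: str     # coarse colour label derived from the photo
    longest_side_cm: float  # rough size estimate


def recent_first(catalog):
    """Browse the catalog with the most recently stored items first."""
    return sorted(catalog, key=lambda item: item.stored_at, reverse=True)


def search(catalog, color=None, min_weight_g=None, max_weight_g=None,
           max_side_cm=None):
    """Filter by whichever coarse attributes the user remembers."""
    results = []
    for item in catalog:
        if color is not None and item.dominant_color != color:
            continue
        if min_weight_g is not None and item.weight_g < min_weight_g:
            continue
        if max_weight_g is not None and item.weight_g > max_weight_g:
            continue
        if max_side_cm is not None and item.longest_side_cm > max_side_cm:
            continue
        results.append(item)
    # Recently stored matches are the most likely to be what is wanted.
    return recent_first(results)
```

Each hit carries its `bin_label`, so a query such as `search(catalog, color="red", max_weight_g=200)` answers "where is it?" directly.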
Certain types of objects could be recognized by the robot, such as books, papers and magazines. For those, bar-codes could be read, or printing could be scanned with OCR. Books might be shelved at random in the library but be easily found. Papers might be hard to manipulate but could at least be stacked, possibly with small divider sheets inserted between them with numbers on them, so that you could look for the top page of any collected group of papers and be told, "it's under divider 20 in the stack of papers."
A number of people have been hiring "virtual" assistants in lower-wage countries to do all the tasks in their life that don't require a personal presence. Such assistants are found starting at a few bucks an hour. I have not done it myself, since for some reason most of the things I feel I could pass on to such an assistant are things that involve some personal presence. (Though I suppose I could just ship off all the papers I need scanned and filed every few weeks to get that out of my life, but I want to have a scanner here too.)
Anyway, last weekend I was talking to an acquaintance about his use of such services. He has his assistant seducing women for him. His assistant, who is female and lives in India, logs onto his account on a popular dating site, browses profiles and (pretending to be him) makes connections with women on the site. She has e-mail conversations and arranges first dates. Then her employer reads the e-mail conversation and goes to the date. (Perhaps he also does a quick vet before arranging a date to be sure the assistant has chosen well, but I did not confirm that.)
I don't often write about robots that don't go on roads, but last night I stopped by Willow Garage, the robot startup created by my old friend Scott Hassan. Scott is investing in building open robotics platforms, and giving much of it out free to the world, because he thinks progress in robotics has been far too slow.
It is no coincidence that two friends of mine have both founded companies recently to build telepresence robots. These are easy to drive remote control robots which have a camera and screen at head height. You can inhabit the robot, and drive it around a flat area and talk to people by videoconferencing. You can join meetings, go visit people or inspect a factory. Companies building these robots, initially at high prices, intend to sell them both to executives who want to remotely tour remote offices and to companies who want to give cheaper remote employees a more physical presence back at HQ.
There are also a few super-cheap telepresence robots, such as the Spykee, which runs Skype video conferencing and can be had for as low as $150. It's not very good, and the camera is very low down, and there's no screen, but it shows just how cheap such a product can get.
(Image: "Anybots" QA telepresence robot)
When they get down to a price like that, it seems inevitable to me that we will see an emergency services robot on every block, primarily for use by the police. When there is a police, fire or ambulance call to an address, an officer could immediately connect to the robot on that block and drive it to the scene, to be telepresent. The robot would live in a small, powered protective closet, either paid for by the city or, more likely, just donated by some neighbour on the block who wants the fastest possible emergency response. Called into action, the robot's garage door would open and the robot would drive out, and probably be at the location of the emergency within 60 to 120 seconds, depending on how densely they are placed. In the meantime actual first responders might also be on the way.
What could such a robot do?
Watching and managing children is one of the major occupations of the human race. A true robot babysitter is still some time in the future, and getting robocars to the level that we will trust them as safe to carry children is also somewhat in the future, but it will still happen much sooner.
Today I want to explore the implications of a robocar that is ready to safely carry children of certain age ranges. This may be far away because people are of course highly protective of their children. They might trust a friend to drive a child, even though human driving records are poor, because the driver is putting her life on the line just as much as the child's, while the robot is just programmed to be safe, with no specific self-interest.
A child's robocar can be designed to higher safety standards than an adult's, with airbags in all directions, crumple zones designed for a single occupant in the center and the child in a 5-point seatbelt. As you know, with today's modern safety systems, racecar drivers routinely walk away from crashes at 150mph. Making a car that won't hurt the child in a 40mph crash is certainly doable, though not without expense. A robocar's ability to anticipate an accident might even allow it to swivel the seat around so that the child's back is to the accident, something even better than an airbag.
The big issue is supervision of smaller children. It's hard to say what age ranges of children people might want to send via robocar. In some ways infants are easiest, as you just strap them in and they don't do much. All small children today are strapped in solidly, and younger ones are in a rear facing seat where they don't even see the parent. (This is now recommended as safest up to age 4 but few parents do that.) Children need some supervision, though real problems for a strapped in child are rare. Of course, beyond a certain age, the children will be fully capable of riding with minimal supervision, and by 10-12, no direct supervision (but ability to call upon an adult at any time.)
There's a phenomenon we're seeing more and more often. A company screws over a customer, but this customer now has a means to reach a large audience through the internet, and as a result it becomes a PR disaster for the company. The most famous case recently was United Breaks Guitars where Nova Scotia musician David Carroll had his luggage mistreated and didn't get good service, so he wrote a funny song and music video about it.
Pew Research has released their recent study on the future of the internet and technology where they interviewed a wide range of technologists and futurists, including yours truly. It's fairly long, and the diverse opinions are perhaps too wide to be synthesized, but there is definitely some interesting stuff in there.
I've just returned from Denver and the World Science Fiction Convention (worldcon) where I spoke on issues such as privacy, DRM and creating new intelligent beings. However, I also attended a session on "hard" science fiction, and have some thoughts to relate from it.
Defining the sub-genres of SF, or any form of literature, is a constant topic for debate. No matter where you draw the lines, authors will work to bend them as well. Many people just give up and say "Science Fiction is what I point at when I say Science Fiction."
My most important essay to date
Today let me introduce a major new series of essays I have produced on "Robocars" -- computer-driven automobiles that can drive people, cargo, and themselves, without aid (or central control) on today's roads.
It began with the DARPA Grand Challenges convincing us that, if we truly want it, we can have robocars soon. And then they'll change the world. I've been blogging on this topic for some time, and as a result have built up what I hope is a worthwhile work of futurism laying out the consequences of, and path to, a robocar world.
I was intrigued by this report of a Russian chatbot fooling men into thinking it was a woman who was hot for them. The chatbot seduces men, and gets them to give personal information that can be used in identity theft. The story is scant on details, but I was wondering why this was taking place in Russia and not in richer places. As reported, this was considered a partial passing of the Turing Test.
Many in my futurist circles worry a lot about the future of AI that eventually becomes smarter than humans. There are those who don't think that's possible, but for a large crowd it's mostly a question of when, not if. How do you design something that becomes smarter than you, and doesn't come back to bite you?
I have written a few times before about Versed, the memory drug, and the ethical and metaphysical questions that surround it. I was pointed today to a story from Time about propofol, which, like the Men in Black neuralizer pen, can erase the last few minutes of your memory from before you are injected with it. This is different from Versed, which stops you from recording memories after you take it.
Here are three events coming up that I will be involved with.
Burning Man of course starts next weekend and consumes much of my time. While I'm not doing any bold new art project this year, maintaining my 3 main ones is plenty of work, as is the foolishly taken on job of village organizer and power grid coordinator. I must admit I often look back fondly on my first Burning Man, where we just arrived and were effectively spectators. But you only get to do that once.
If you go to the cities of Asia, one thing I find striking is how much more three-dimensional their urban streets are. By this I mean that you will regularly find busy retail shops and services on the higher floors of ordinary buildings, and even in the basement. Even in our business areas, above the ground floor is usually offices at most, rarely depending on walk-by traffic. There it's commonplace. I remember being in Hong Kong and asking natives to pick a restaurant for lunch.