Alan Turing proposed a simple test for machine intelligence. Based on a parlour game where players try to tell if a hidden person is a man or a woman just by passing notes, he suggested we define a computer as intelligent if people can’t tell it from a human being through conversations with both over a teletype.
This seemed like a great test, at least to those who accept that external equivalence is sufficient. In fact, to the surprise of many people, computers passed it long ago with ordinary, untrained examiners. Since then the test has been implicitly extended: the computer must be able to fool a trained examiner, typically an AI researcher, an expert in brain sciences, or both.
I am going to propose updating it further, in two steps. Turing proposed his test perhaps because at the time, computer speech synthesis did not exist, and video was in the distant future. He probably didn’t imagine that we would solve the problems of speech well before we got a handle on actual thought. Today a computer can, with a bit of care in programming inflections and such into the speech, sound very much like a human, and we’re much closer to making that perfect than we are to getting a Turing-level intelligence. Speech recognition is a bit behind, but it is also getting closer.
So my first updated proposal is to cast aside the teletype and make it a phone conversation. It must be impossible to tell the computer from another human over the phone or an even higher-fidelity audio channel.
The second update is to add video. We’re not as far along here, but again we see more progress, both in the generation of digital images of people, and in video processing for object recognition, face-reading and the like. The next stage requires the computer to be impossible to tell from a human in a high-fidelity video call. Perhaps with 3-D goggles it might even be a 3-D virtual reality experience.
A third potential update is further away, requiring a fully realistic android body. In this case, however, we don’t wish to constrain the designers too much, so the tester would probably not get to touch the body, or weigh it, or test whether it can eat, or whether it can stay away from a charging station for days, and so on. What we’re testing here is the being’s “presence”: fluidity of motion, body language and so on. I’m not sure we need this test, as we can do these things in the high-fidelity video call too.
Why these updates, which may appear to depart from the “purity” of the text conversation? For one, things like body language, nuance of voice and facial patterns are a large part of human communication and intelligence, so to truly accept that we have a being of human-level intelligence we would want to include them.
Secondly, however, passing this test is far more convincing to the general public. While the public is not very sophisticated and thus can even be fooled by an instant messaging chatbot, the feeling of equivalence will be much stronger when more senses are involved. I believe, for example, that it takes a much more sophisticated AI to trick even an unskilled human if presented through video, and not simply because of the problems of rendering realistic video. It’s because these communications channels are important, and in some cases felt more than they are examined. The public will understand this form of Turing test better, and more will accept the consequences of declaring a being as having passed it, which might include giving it rights, for example.
Though yes, the final test should still require a skilled tester.
