The name “artificial intelligence” covers a lot of disparate problem areas, united mainly by the fact that they involve complex inputs and outputs that are difficult to compute (or even check for correctness when supplied). One of the most interesting such areas is sensor-controlled behavior, in which a machine acts in the real world using information gathered from sensors such as sonars and cameras. This is a major focus of A.I. research at Yale.
The difference between sensor-controlled behavior and what computers usually do is that the input from a sensor is ambiguous. When a computer reads a record from a database, it can be certain what the record says. There may be philosophical doubt about whether an employee’s social-security number really succeeds in referring to a flesh-and-blood employee, but such doubts don’t affect how programs are written. As far as the computer system is concerned, the identifying number is the employee, and the system will happily, and successfully, use it to access all relevant data as long as no internal inconsistency develops.
Contrast that with a computer controlling a team of soccer-playing robots, whose only sensor is a camera mounted above the field. Several times per second, the camera tells the computer the pattern of illumination it is receiving, encoded as an array of numbers. (Actually, it’s three arrays, one for red, one for green, and one for blue.) The vision system must extract from this large set of numbers the locations of all the robots (on its team and the opponent’s) plus the ball. What it extracts is never an exact description: it is always noisy, and occasionally grossly wrong. In addition, by the time the description is available it is always slightly out of date. The computer must decide quickly how to alter the behavior of the robots, send them messages to accomplish that, and then process the next image.
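To make the robot-soccer loop concrete, here is a minimal Python sketch of one sensing step. It is an illustration under stated assumptions, not a description of any actual system: it assumes the ball is the only strongly red object on the field, that its velocity is roughly constant over the camera delay, and that simple exponential smoothing is an adequate stand-in for a real noise filter. The names find_ball, smooth, and predict are hypothetical.

```python
from typing import List, Optional, Tuple

Channel = List[List[int]]      # one color channel: rows of 0-255 intensities
Point = Tuple[float, float]    # (x, y) in pixel coordinates


def find_ball(red: Channel, green: Channel, blue: Channel,
              threshold: int = 200) -> Optional[Point]:
    """Return the centroid of strongly red pixels, or None if there are none.

    Assumption: the ball is the only strongly red object in view.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(red):
        for x, r in enumerate(row):
            if r >= threshold and green[y][x] < threshold and blue[y][x] < threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None            # the ball was not seen in this frame
    return (sum_x / count, sum_y / count)


def smooth(previous: Point, measured: Point, alpha: float = 0.5) -> Point:
    """Blend a noisy measurement with the previous estimate.

    Exponential smoothing is used here only as a simple stand-in filter.
    """
    return (previous[0] + alpha * (measured[0] - previous[0]),
            previous[1] + alpha * (measured[1] - previous[1]))


def predict(position: Point, velocity: Point, latency: float) -> Point:
    """Extrapolate a stale position forward by the processing delay.

    Assumption: velocity is constant over the (short) latency interval.
    """
    return (position[0] + velocity[0] * latency,
            position[1] + velocity[1] * latency)
```

A control program would call find_ball on each new frame, pass the result through smooth to damp the noise, and use predict to compensate for the delay before deciding what messages to send to the robots. A frame in which find_ball returns None, or returns a position wildly inconsistent with the previous estimate, corresponds to the "grossly wrong" case and would need separate handling.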
One might wonder why we choose to work in such a perversely difficult area. There are two obvious reasons. First, one ultimate goal of A.I. research is to understand how people are possible, i.e., how it is that an intelligent system can thrive in the real world. Our vision and other senses are so good that we can sometimes overlook the noise and errors they are prone to, when in fact we are faced with problems that are similar to those facing the robot-soccer player, but much worse. We will never understand human intelligence until we understand how the human brain extracts information from its environment and uses it to guide behavior.
Second, vision and robotics have many practical applications. Space exploration is more cost-effective when robots are the vanguard, as demonstrated dramatically by the Mars Pathfinder rover mission of 1997. Closer to home, we are already seeing commercially viable applications of the technology. For instance, TV networks can now produce three-dimensional views of an athletic event by combining several two-dimensional views, in essentially the same way animals manage stereo vision. There is now a burgeoning robotic-toy industry, and we can expect robots to appear in more complex roles in our lives. So far, the behaviors these robots can exhibit are quite primitive. Kids are satisfied with a robot that can utter a few phrases or wag its tail when hugged. But it quickly becomes clear even to a child that today’s toys are not really aware of what is going on around them. The main problem in making them aware is to provide them with better sensing, which means not only better sensors but better algorithms for processing the outputs from those sensors.
Research in this area at Yale is carried out by the Center for Computational Vision and Control, a joint effort of the Departments of Computer Science, Electrical Engineering, and Radiology. We will describe three of the ongoing projects in this area.