Machine Learning

Recent improvements in computers and communication technology have made staggering amounts of information available to us. Fortunately there has been a concomitant increase in processing power. One of the major challenges for computing lies in finding ways to organize the available processing power in order to extract maximum benefit from the available information. Different aspects of that problem are addressed by machine learning, knowledge discovery and data mining.

Every field of human endeavor is affected by these issues; therefore a few examples will suffice. In bioinformatics, there are opportunities to use information about the human genome to design more effective medications, information from medical records to track drug side-effects or emerging diseases, and information from medical imaging to understand the structure and function of the brain and other organs. In robotics, major challenges lie in making use of visual and other sensed information to allow a robot to adapt flexibly to its environment. In the domain of e-commerce, all but the simplest artificial agents designed to gather information and act on behalf of human users will have to be equipped with the ability to adapt their behavior successfully to unexpected conditions. The very large number of documents and databases available on the web has fostered the development of tools for abstracting, extracting, and combining sources in a variety of ways to make the information more usable to humans.

Machine learning is driven by the goal of making programs or agents that exhibit useful learning behavior, autonomously or in cooperation with teams of other agents, either human or artificial. One goal is software that is easier to use, e.g., a word-processing program that can guess from an example or two what text transformation a user wishes to make. Machine learning can be used in the design of a team of soccer-playing robots to allow the team to adapt its tactics to the observed behavior of an opposing team. At the level of hardware, sophisticated processor designs can adapt to characteristics of the program currently being run to achieve increases in speed.

As learning components are increasingly included in software and embedded systems, it is important to have a clear understanding of their theoretical and practical strengths and limitations. Computational learning theory–a theoretical branch of machine learning–develops and studies algorithmic models of learning, using tools from analysis of algorithms, theory of computation, probability and statistics, game theory, and cryptography. Fundamental models and results in computational learning theory have established the learnability or non-learnability of a number of classes of concepts from various kinds of information. Many exciting questions and areas remain open, including learnability of probabilistic models, learnability in games and other cooperative and competitive situations, and aspects of learnability in natural language. As one example, in the context of e-commerce it is important to study how a population of agents can learn and use information about the trustworthiness of the other agents in the population.

Knowledge discovery and data mining address the question of extracting usable information from very large and heterogeneous sources of information. In this context, even a linear-time algorithm can be too slow. One promising approach to a collection of text documents is to treat it as a matrix of the occurrences of distinct words in the documents, and to approximate the information in the matrix using a low-rank singular value decomposition. This has motivated the search for fast provable approximations of important matrix-vector computations based on weighted sampling.

Faculty members with interests in this area include Dana Angluin, Drew McDermott, Sahand Negahban, Martin Schultz, and Daniel Spielman.