Machine Learning

Machine Learning at Yale - External Web Site

Recent improvements in computers and communication technology have made staggering amounts of information available to us. Fortunately there has been a concomitant increase in processing power. One of the major challenges for computing lies in finding ways to organize the available processing power in order to extract maximum benefit from the available information. Different aspects of that problem are addressed by machine learning, knowledge discovery and data mining.

Every field of human endeavor is affected by these issues; therefore a few examples will suffice. In bioinformatics, there are opportunities to use information about the human genome to design more effective medications, information from medical records to track drug side-effects or emerging diseases, and information from medical imaging to understand the structure and function of the brain and other organs. In robotics, major challenges lie in making use of visual and other sensed information to allow a robot to adapt flexibly to its environment. In the domain of e-commerce, all but the simplest artificial agents designed to gather information and act on behalf of human users will have to be equipped with the ability to adapt their behavior successfully to unexpected conditions. The very large number of documents and databases available on the web has fostered the development of tools for abstracting, extracting, and combining sources in a variety of ways to make the information more usable to humans.

Machine learning is driven by the goal of making programs or agents that exhibit useful learning behavior, autonomously or in cooperation with teams of other agents, either human or artificial. One goal is software that is easier to use, e.g., a word-processing program that can guess from an example or two what text transformation a user wishes to make. Machine learning can be used in the design of a team of soccer-playing robots to allow the team to adapt its tactics to the observed behavior of an opposing team. At the level of hardware, sophisticated processor designs can adapt to characteristics of the program currently being run to achieve increases in speed.

As learning components are increasingly included in software and embedded systems, it is important to have a clear understanding of their theoretical and practical strengths and limitations. Computational learning theory – a theoretical branch of machine learning–develops and studies algorithmic models of learning, using tools from analysis of algorithms, theory of computation, probability and statistics, game theory, and cryptography. Fundamental models and results in computational learning theory have established the learnability or non-learnability of a number of classes of concepts from various kinds of information. Many exciting questions and areas remain open, including learnability of probabilistic models, learnability in games and other cooperative and competitive situations, and aspects of learnability in natural language. As one example, in the context of e-commerce it is important to study how a population of agents can learn and use information about the trustworthiness of the other agents in the population.

Knowledge discovery and data mining address the question of extracting usable information from very large and heterogeneous sources of information. In this context, even a linear-time algorithm can be too slow. One promising approach to a collection of text documents is to treat it as a matrix of the occurrences of distinct words in the documents, and to approximate the information in the matrix using a low-rank singular value decomposition. This has motivated the search for fast provable approximations of important matrix-vector computations based on weighted sampling.

Machine learning tools are increasingly controlling various aspects of modern society: from social interactions (e.g., Facebook, Twitter, Google, YouTube), economics (e.g., Uber, Airbnb, Banking), learning (e.g., Wikipedia, MOOCs), to governance (Judgements, Policing, Voting). These systems have a tremendous potential to change our lives for the better, but, via the ability to mimic and nudge human behavior, they also have the potential to be discriminatory, reinforce societal prejudices, and polarize opinions. Moreover, recent studies have demonstrated that these systems can be quite brittle and generally lack the required robustness to be deployed in various civil/military situations. Our goal is to develop a principled and thoughtful approach to the redesign of machine learning tools for society.

Many procedures in machine learning and nature at large—Bayesian inference, deep learning, protein folding—successfully solve non-convex optimization and sampling problems that are NP-hard, i.e., intractable on worst-case instances. Moreover, often nature or humans choose methods that are inefficient in the worst case to solve problems in P. Our goal is to develop a rigorous theory to resolve this mismatch between reality and the predictions of worst-case analysis.  Such a theory could identify structure in natural inputs that helps sidestep worst-case complexity, and guide the choice of ML algorithms and their parameters.

Faculty members with interests in this area include Dana Angluin, Amin Karbasi, Smita Krishnaswamy, John Lafferty, Sahand Negahban, Dragomir Radev, Brian Scassellati, Daniel Spielman, Marynel Vázquez, and Nisheeth Vishnoi.