Artificial Intelligence and Machine Learning

Artificial Intelligence is the study and practice of building systems that can solve complex tasks in ways that would traditionally require human intelligence.

Yale has a number of faculty working in Artificial Intelligence. Our main areas of research are Robotics, Natural Language Processing, Machine Learning, Computer Vision, and the emerging area of AI, Ethics and Society.

Faculty working in this area:

Faculty – Website
Dana Angluin
Yang Cai – Cai Group
Arman Cohan
Ronald Coifman
Ozan Erat
Tesca Fitzgerald – Fitzgerald Group
Robert Frank – Frank Group
Mark Gerstein – Gerstein Group
Amin Karbasi
Smita Krishnaswamy – Krishnaswamy Group
John Lafferty – StatML Group
Sahand Negahban
Danny Rakita – Rakita Group
Dragomir Radev – NLP Lab (LILY)
Brian Scassellati – Social Robotics Lab
Stephen Slade
Daniel Spielman – Spielman Group
Marynel Vázquez – Vázquez Group
Nisheeth Vishnoi
Andre Wibisono – Wibisono Group
Alex Wong – Wong Group
Rex Ying – Ying Group
Steven Zucker – Zucker Group


Highlights in this area:

Alex Wong – My work lies at the intersection of computer vision, machine learning, and robotics, with a focus on depth perception to enable spatial tasks like autonomous navigation and manipulation. Specifically, I develop perception systems that are fast, robust, and accurate, with the ability to learn causally online through interactions with the surrounding space and to adapt their inference to new environments without any human intervention. To this end, I am interested in: (i) designing approaches for learning from multiple modalities of data – e.g., images together with sparse range and inertial data – without the need for ground-truth annotations; (ii) exploiting regularities in our visual world, whether from our physical environment (e.g., the preponderance of flat surfaces in human-made scenes), the virtually infinite amount of synthetic data, or the abundance of pretrained machine learning models, to improve model accuracy while reducing complexity and run-time; and (iii) developing methods that are robust to adverse perturbations of the input – whether adversarially optimized or caused by natural phenomena like rain or snow – to support agents operating in novel environments.

Yang Cai is broadly interested in the theory of computation and its interface with economics, game theory, and machine learning. His current focus includes equilibrium computation in multi-agent machine learning, wherein agents learn, choose actions, and receive rewards in a shared environment. Example applications include generative adversarial networks, adversarial examples, and multi-robot interactions. His group is exploring which equilibrium concepts are meaningful in these environments and when they can realistically be computed. Another focus of his group is the design of information and incentive structures in markets, with an emphasis on the computational complexity, structural simplicity, and robustness of the design. He has received the Sloan Research Fellowship, the NSF CAREER Award, and the William Dawson Scholarship.
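As a minimal illustration of the equilibrium-computation setting described above (a sketch for intuition, not code from the Cai group), consider the simplest two-player zero-sum game, min over x, max over y of f(x, y) = x·y, whose unique equilibrium is (0, 0). Plain simultaneous gradient descent-ascent is known to spiral away from the equilibrium on this game, while an "optimistic" variant that extrapolates with the previous gradient converges; the learning rate and step counts below are illustrative choices.

```python
import numpy as np

# Toy bilinear min-max game: min_x max_y f(x, y) = x * y, equilibrium (0, 0).
# grad_x f = y (for the minimizing player), grad_y f = x (for the maximizer).

def gda(steps=2000, lr=0.1):
    """Plain simultaneous gradient descent-ascent (diverges on this game)."""
    x, y = 1.0, 1.0
    for _ in range(steps):
        gx, gy = y, x                      # simultaneous gradients
        x, y = x - lr * gx, y + lr * gy
    return x, y

def ogda(steps=2000, lr=0.1):
    """Optimistic gradient descent-ascent (converges on this game)."""
    x, y = 1.0, 1.0
    gx_prev, gy_prev = 0.0, 0.0
    for _ in range(steps):
        gx, gy = y, x
        # extrapolated step: current gradient plus a correction term
        x = x - lr * (2 * gx - gx_prev)
        y = y + lr * (2 * gy - gy_prev)
        gx_prev, gy_prev = gx, gy
    return x, y

print("GDA distance from equilibrium: ", np.hypot(*gda()))
print("OGDA distance from equilibrium:", np.hypot(*ogda()))
```

The contrast between the two runs is exactly the kind of question the paragraph above raises: whether a natural learning dynamic actually reaches a meaningful equilibrium.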

Smita Krishnaswamy – The primary focus of my research is on Machine Learning for extracting patterns and insights from scientific data in order to drive biomedical discovery. While much of AI has focused on matching known patterns for classification, there is a great need for using AI to find unknown patterns and to generate plausible scientific hypotheses. My work sits at the intersection of several fields, including applied math, deep learning, data geometry, topology, manifold learning, and graph signal processing, all serving to tackle key challenges in data science. The problems I address are motivated by the ubiquity of high-throughput, high-dimensional data in the biomedical sciences – a result of breakthroughs in measurement technologies like single-cell sequencing, proteomics, and fMRI, and of vast improvements in health record data collection and storage. While these large datasets, containing millions of cellular or patient observations, hold great potential for understanding generative mechanisms, the state space of the data, and the causal interactions driving development, disease, and progression, they also pose new challenges in terms of noise, missing data, measurement artifacts, and the so-called “curse of dimensionality.” My research addresses these issues by developing denoised data representations designed for data exploration, mechanistic understanding, and hypothesis generation. My lab is at the forefront of unsupervised learning, where we have uncovered a unifying principle in scientific data, particularly from biomedical systems: the manifold principle [Moon 2018], which previously appeared mainly as a mathematical construct. Biological entities have a myriad of parts (genes, epigenetics, proteins, signals) that can be measured, resulting in a high-dimensional ambient space, but by their nature they intrinsically lie in low-dimensional, smoothly varying spaces.
For instance, single-cell RNA-sequencing (scRNA-seq) measurements often have thousands of gene dimensions. These genes cannot act individually, uncoordinated with other genes; they must be informationally redundant, which lowers the intrinsic dimension to the 20–30 dimensions we see in practical datasets [van Dijk Cell 2018, Moon Nature Biotechnology 2019]. We have followed the far-reaching effects of the manifold assumption to devise both graph-spectral and deep learning methods, such as: MAGIC [van Dijk et al. Cell 2018], which denoises data by restoring it to low-frequency spectral dimensions; PHATE [Moon Nature Biotechnology 2019] and multiscale PHATE [Kuchroo Nature Biotechnology 2022], which create manifold- and affinity-preserving low-dimensional embeddings and visualizations; MELD, which helps understand the effects of experimental perturbation [Burkhardt et al. Nature Biotechnology 2021]; TrajectoryNet, which learns high-dimensional trajectories from static snapshot data [Tong ICML 2020]; and many more. The productivity of my lab and its impact on many fields can be seen in our broad publication profile, which includes machine learning venues such as ICML, NeurIPS, IDA (Intelligent Data Analysis), IJCNN, and CVPR; signal processing venues such as IEEE MLSP (Machine Learning for Signal Processing), IEEE Big Data, and the Journal of Signal Processing; applied math venues such as SAMPTA (Sampling Theory and Applications) and SIAM Data Mining; and biomedical journals such as Science, Cell, Nature, Nature Biotechnology, and Nature Methods. I have been recognized for these contributions with the NSF CAREER grant, the Sloan Faculty Fellowship, the FASEB Excellence in Science Award, two NIH (NIGMS) R01 grants, and a joint NSF/NIH grant, as well as grants from private foundations such as CZI, Novo Nordisk, and the Simons Foundation – all as PI or co-PI. In addition, I have been designated the Dean’s Faculty Fellow of the Yale School of Medicine.
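The graph-diffusion idea behind methods like MAGIC can be sketched in a few lines: build an affinity graph over the observations, row-normalize it into a Markov matrix P, and smooth the data with P^t, which damps high-frequency (noise-like) components while preserving the low-dimensional manifold structure. This is an illustrative toy, assuming a simple Gaussian kernel and synthetic data; it is not the published algorithm or the lab's code.

```python
import numpy as np

def diffuse(X, sigma=0.5, t=2):
    """Smooth the rows of X (observations x features) over a Gaussian-affinity graph."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    A = np.exp(-sq / (2 * sigma ** 2))                   # Gaussian affinities
    P = A / A.sum(axis=1, keepdims=True)                 # row-stochastic Markov matrix
    return np.linalg.matrix_power(P, t) @ X              # t-step diffusion of the data

rng = np.random.default_rng(0)
u = np.linspace(0, 1, 200)                               # intrinsic 1-D coordinate
phases = 2 * np.pi * np.arange(5) / 5
clean = np.sin(2 * np.pi * u[:, None] + phases[None, :]) # smooth 1-D curve in 5-D
noisy = clean + 0.3 * rng.standard_normal(clean.shape)   # high-dimensional noise

denoised = diffuse(noisy, sigma=0.5, t=2)
err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((denoised - clean) ** 2)
print(f"MSE before: {err_before:.4f}  after: {err_after:.4f}")
```

Because the clean points lie on a smooth one-dimensional curve, averaging each point with its graph neighbors removes much of the ambient noise – a small-scale analogue of the manifold principle described above.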

Rex Ying – My research spans three broad areas: deep learning for graphs, geometric representation learning, and real-world applications involving relational reasoning and modeling. In the past, I created widely used GNN algorithms such as GraphSAGE, PinSAGE, and GNNExplainer. In addition, I have worked on a variety of applications of graph learning in physical simulations, social networks, knowledge graphs, and biology. I developed the first billion-scale graph embedding service at Pinterest and a graph-based anomaly detection algorithm at Amazon.
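The core of a GraphSAGE-style layer is simple to state: each node's new embedding combines its own features with an aggregate (here, the mean) of its neighbors' features, followed by a nonlinearity and L2 normalization. The sketch below uses random weights and a toy graph purely for illustration; in practice the weight matrices are trained, and the published method also samples neighborhoods for scalability.

```python
import numpy as np

def sage_layer(H, adj, W_self, W_neigh):
    """One mean-aggregator layer. H: (n, d_in) node features; adj: neighbor lists."""
    out = []
    for v, nbrs in enumerate(adj):
        # mean-aggregate neighbor features (zeros if the node is isolated)
        h_nbr = H[nbrs].mean(axis=0) if nbrs else np.zeros(H.shape[1])
        h = np.maximum(0.0, H[v] @ W_self + h_nbr @ W_neigh)   # combine + ReLU
        out.append(h / (np.linalg.norm(h) + 1e-12))            # L2 normalize
    return np.stack(out)

rng = np.random.default_rng(0)
n, d_in, d_out = 5, 8, 4
H = rng.standard_normal((n, d_in))
adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4], [3]]    # toy undirected graph
W_self = 0.5 * rng.standard_normal((d_in, d_out))  # untrained, illustrative weights
W_neigh = 0.5 * rng.standard_normal((d_in, d_out))

Z = sage_layer(H, adj, W_self, W_neigh)
print(Z.shape)
```

Stacking k such layers lets each node's embedding draw on its k-hop neighborhood, which is what makes the approach scale to graphs far larger than the training set.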

Marynel Vázquez – Our work has shown that multi-party interactions offer novel opportunities for robots to shape and facilitate human-robot and human-human interactions. For example, we have shown that robot group influence can motivate human prosocial behavior towards an abused robot [HRI’20], proposed learning multi-modal robot gaze policies that shape human-human conversational dynamics [HRI’22], and contributed to creating a robot teleoperation system – called VectorConnect – to combat social isolation in children due to COVID-19 [HRI’21, Best Paper Nominee]. Logs and user feedback for VectorConnect suggested that telepresence robots have a role in providing a fun and safe mechanism for individuals to interact socially during infectious disease outbreaks. Our group has also proposed a data-driven perspective to unify siloed modeling efforts in multi-party Human-Robot Interaction (HRI). Multi-party HRI requires computational models that support a variable number of interactants and can reason about the individual, pairwise-interaction, and group factors that drive social encounters. How can we satisfy all these requirements? Our idea was to represent human-robot interactions with graph abstractions, because graphs can encode individual, pairwise-interaction, and group factors relevant to HRI in a well-organized manner. We then proposed to reason about these graphs with data-driven models that leverage the structure of the data, such as models that use Deep Sets architectures to summarize information about multiple interactants [CSCW’20] or message-passing Graph Neural Networks to reason simultaneously about individual behavior and whole-group interactions [Frontiers in Robotics & AI’22].
Notably, the idea of modeling human-robot interactions with graphs builds on a long history of work in social network analysis, but it requires addressing novel challenges due to robots’ embodied nature, e.g., having to reason jointly about social interactions and the spatial environmental constraints that influence social behavior [Frontiers in Robotics & AI’22]. Lastly, our work has led to novel systems that facilitate research and data collection for data-driven modeling in HRI. For instance, in social robot navigation, we proposed novel simulation environments to facilitate algorithm development and community benchmarking [HAI’20, Best Poster Nominee; RA-L’22]. We also proposed systems that embed interactive simulations in online surveys, allowing us to scale HRI data collection with crowdsourcing tools in a more immersive manner than video surveys [IROS’21].
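The Deep Sets idea mentioned above – summarizing a variable number of interactants in a way that does not depend on their ordering – amounts to computing rho(sum_i phi(x_i)), where phi encodes each element and rho reads out the pooled representation. The sketch below uses tiny random networks purely for illustration (the group's actual models are trained); it checks the key property, permutation invariance.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid, d_out = 6, 16, 3
W1 = rng.standard_normal((d_in, d_hid))   # illustrative, untrained weights
W2 = rng.standard_normal((d_hid, d_out))

def phi(x):
    """Per-element encoder (one ReLU layer)."""
    return np.maximum(0.0, x @ W1)

def rho(s):
    """Readout applied to the pooled representation."""
    return np.tanh(s @ W2)

def summarize(group):
    """Permutation-invariant summary of a group of any size: rho(sum phi(x_i))."""
    return rho(phi(group).sum(axis=0))

group = rng.standard_normal((4, d_in))    # e.g., features of 4 people near a robot
shuffled = group[[2, 0, 3, 1]]            # same people, different order
print(np.allclose(summarize(group), summarize(shuffled)))
```

Because the pooling step is a sum, the summary is identical for any ordering of the interactants, and the same model handles groups of any size – the two requirements the paragraph above identifies for multi-party HRI.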

Stephen Slade – My interests include goal-based decision making, explanation, philosophy of mind, interpersonal relationships, fintech, politics, and ethics. For more information, see my course on Automated Decision Systems.

Mark Gerstein – Current research foci in the lab include disease genomics (particularly neurogenomics and cancer genomics), human genome annotation, genomic privacy, network science, wearable and molecular image data analysis, text mining of the biological-science literature and macromolecular simulation.

Steven Zucker – The current research focus of the ZuckerLab includes (1) inferring manifolds of neurons in the mouse visual system, and associated theoretical problems in manifold learning; (2) applying topological methods to understand human shape inferences; and (3) applying computational modeling to understand the role of color in material and shape perception.