CS Talk - Omar Khattab

Event time: 
Monday, February 19, 2024 - 4:00pm
Location: 
AKW 200
51 Prospect Street
New Haven, CT 06511
Event description: 

Host: Smita Krishnaswamy

Title: Building More Reliable and Scalable AI Systems with Language Model Programming

Abstract:

It is now easy to build impressive demos with language models (LMs), but turning these into reliable systems currently requires brittle combinations of prompting, chaining, and finetuning. In this talk, I present LM programming, a systematic way to address this by defining and improving four layers of the LM stack. I start with how to adapt LMs to search for information most effectively (ColBERT, ColBERTv2, UDAPDR) and how to scale that to billions of tokens (PLAID). I then discuss the right architectures and supervision strategies (ColBERT-QA, Baleen, and Hindsight) for allowing LMs to search for and cite verifiable sources in their responses. This leads to DSPy, a programming model that forgoes ad-hoc techniques for using LMs and replaces them with composable modules and optimizers that can supervise complex LM programs. Even simple AI systems expressed in DSPy routinely outperform standard hand-crafted prompt pipelines, in some cases while using small LMs. I highlight how ColBERT and DSPy have sparked applications at dozens of leading tech companies, open-source communities, and research labs, and then conclude by discussing how DSPy enables a new degree of research modularity, one that stands to allow open research to once again lead the development of AI systems.
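For readers unfamiliar with the open-source DSPy library mentioned above, a minimal sketch of what "composable modules and optimizers" can look like in practice is shown below; the model name, training example, and metric here are illustrative placeholders, not material from the talk, and the exact API may differ across DSPy versions.

```python
import dspy
from dspy.teleprompt import BootstrapFewShot

# Configure an underlying LM (the model name is a placeholder).
lm = dspy.OpenAI(model="gpt-3.5-turbo")
dspy.settings.configure(lm=lm)

# A declarative module: the signature "question -> answer" states what the
# step should do; DSPy decides how to prompt the LM to do it.
qa = dspy.ChainOfThought("question -> answer")

# A toy training set and metric, for illustration only.
trainset = [
    dspy.Example(question="What year was ColBERT introduced?",
                 answer="2020").with_inputs("question"),
]

def exact_match(example, pred, trace=None):
    # Loose string-match metric used to score candidate prompts/demos.
    return example.answer.lower() in pred.answer.lower()

# An optimizer ("teleprompter") compiles the program, e.g. by bootstrapping
# few-shot demonstrations, instead of hand-tuning prompt strings.
optimizer = BootstrapFewShot(metric=exact_match)
compiled_qa = optimizer.compile(qa, trainset=trainset)

print(compiled_qa(question="What does ColBERT do?").answer)
```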

Bio:

Omar is a fifth-year CS Ph.D. candidate at Stanford NLP and an Apple Scholar in AI/ML. He is interested in Natural Language Processing (NLP) at scale, where systems capable of retrieval and reasoning can leverage massive text corpora to craft knowledgeable responses efficiently and transparently. Omar is the author of the ColBERT retrieval model, which has helped shape the modern landscape of neural information retrieval (IR), and the author of several early multi-stage retrieval-based LM systems like ColBERT-QA and Baleen. His recent work includes the DSPy programming model for building and optimizing reliable language model systems by bringing structure, composition, and automatic optimization into the space of prompting, finetuning, and chaining language models and retrieval models. Much of Omar's work forms the basis of influential open-source projects, and his lines of work on ColBERT and DSPy have sparked applications at dozens of academic research labs and leading tech companies, including Google, Meta, Amazon, IBM, VMware, Baidu, Huawei, AliExpress, and many others.