CS Colloquium
Ming-Hsuan Yang
UC Merced and Google DeepMind
Host: Alex Wong
Title: Video Understanding and Generation with Multimodal Foundation Models
Abstract:
Recent advances in vision and language models have greatly improved performance on a wide range of visual understanding and generation tasks. In this talk, I will present our latest research on effective tokenizers for transformers and discuss our efforts to adapt frozen large language models to a range of vision tasks, including visual classification, video-text retrieval, visual captioning, visual question answering, visual grounding, video generation, stylization, outpainting, and video-to-audio conversion. If time permits, I will also share some recent findings in 3D vision.
Bio:
Ming-Hsuan Yang is a Professor at UC Merced and a Research Scientist at Google DeepMind. He received the Google Faculty Award in 2009, the NSF CAREER Award in 2012, and the Nvidia Pioneer Research Award in 2017 and 2018. His paper awards include Best Paper Honorable Mentions at UIST 2017 and CVPR 2018, a Best Student Paper Honorable Mention at ACCV 2018, the Longuet-Higgins Prize (test-of-time award) at CVPR 2023, and the Best Paper Award at ICML 2024. He serves as Associate Editor-in-Chief of PAMI and as an Associate Editor of IJCV; previously, he was Editor-in-Chief of CVIU and program co-chair of ICCV 2019. He is a Fellow of both the IEEE and the ACM.