CS Colloquium - Michael Ryoo
Refreshments available at 3:45
Host: Marynel Vazquez
Title: Representation Learning for Video Understanding
This talk provides an overview of our recent progress in “representation learning” for video data. More than 400 hours of video are uploaded to YouTube every minute, individuals are recording their lives using smartphones and lifeloggers, and robots with cameras are becoming increasingly available in both public and private places. The objective is to make sense of all this visual data by enabling machines to abstract the visual information in such videos, detect interesting humans, objects, and events in them, and generate simple event-based summaries. We present convolutional neural network (CNN) models for 3D space-time video data and discuss how they convert high-dimensional video data into more machine-understandable representations. We talk about activity recognition models built on the concept of sub-events and super-events, designed to capture longer-term temporal information in continuous videos. We also illustrate how such video representation learning benefits robots operating in real-world environments while interacting with other objects, humans, and robots. This includes not only providing activity-level situation awareness to robots but also building fully convolutional predictive models for robot action imitation.
Michael S. Ryoo is an assistant professor in the Department of Computer Science at Indiana University. He is also currently a visiting faculty member at Google Brain. His research interests are in computer vision and robotics, with a particular emphasis on human activity recognition/learning, first-person vision, and robot learning. Before joining IU, Dr. Ryoo was a staff researcher in the Robotics Section of NASA’s Jet Propulsion Laboratory (JPL). Dr. Ryoo received his Ph.D. from the University of Texas at Austin in 2008 and his B.S. from the Korea Advanced Institute of Science and Technology (KAIST) in 2004. His paper on robot-centric activity recognition at ICRA 2016 won the Best Paper Award in Robot Vision, and his HRI 2015 paper was one of the two nominees for its Best Enabling Technology award. Dr. Ryoo has given tutorials on human activity recognition at major computer vision conferences, including CVPR 2011, 2014, and 2018, and he organized the workshop on Egocentric (First-Person) Vision at CVPR 2014 and CVPR 2016.