Mirella Lapata, School of Informatics at the University of Edinburgh
Host: Dragomir Radev
Title: Translating from Multiple Modalities to Text and Back
Recent years have witnessed the development of a wide range of computational tools that process and generate natural language text. Many of these have become familiar to mainstream computer users in the form of web search, question answering, sentiment analysis, and notably machine translation. The accessibility of the web could be further enhanced with applications that not only translate between different languages (e.g., from English to French) but also within the same language, between different modalities, or different data formats. The web is rife with non-linguistic data (e.g., video, images, source code) that cannot be indexed or searched since most retrieval tools operate over textual data.
In this talk I will argue that in order to render electronic data more accessible to individuals and computers alike, new types of translation models need to be developed. I will focus on three examples: text simplification, source code generation, and movie summarization. I will illustrate how recent advances in deep learning can be extended in order to induce general representations for different modalities and learn how to translate between these and natural language.
Mirella Lapata is a Professor at the School of Informatics at the University of Edinburgh. Her recent research interests are in natural language processing. She serves as an associate editor of the Journal of Artificial Intelligence Research (JAIR). She is the first recipient (2009) of the British Computer Society and Information Retrieval Specialist Group (BCS/IRSG) Karen Spärck Jones award. She has also received best paper awards at leading NLP conferences and financial support from the EPSRC (the UK Engineering and Physical Sciences Research Council) and ERC (the European Research Council).