A very successful research year for the Natural Language Processing group at Yale

June 7, 2018

The LILY (Language, Information, and Learning at Yale) Lab has published eight papers on Natural Language Processing (NLP) so far this year at three top-tier conferences: AAAI (Association for the Advancement of Artificial Intelligence), NAACL (North American Chapter of the Association for Computational Linguistics) and ACL (Association for Computational Linguistics). These publications are the result of collaboration among the PhD students in the LILY lab and more than a dozen Yale undergraduates and master’s students.

One of the papers, “TutorialBank: A Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation” by Alexander R. Fabbri, Irene Li, Prawat Trairatvorakul, Yijiao He, Wei Tai Ting, Robert Tung, Caitlin Westerfield and Dragomir R. Radev, will appear at ACL, the top venue in NLP, in July in Melbourne, Australia. This paper aims to facilitate NLP research and education through a dataset of over 7,500 hand-curated resources about NLP and related fields such as Information Retrieval, Machine Learning and Artificial Intelligence. The dataset is accompanied by an internally maintained catalog and search engine named AAN.

AAN (short for “All About NLP”) allows users to browse resources according to a taxonomy of over 300 topics developed by LILY. Given a query consisting of a title and an abstract, users can also find tutorials, surveys and other educational materials relevant to a project. More information about the search engine and other features of AAN can be found in this blog post.

The other published papers focus on natural language dialogue, translating sentences into database queries, part-of-speech tagging, text summarization and Tree Adjoining Grammar parsing. The full list of papers follows.

“TutorialBank: A Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation” by Alexander R. Fabbri, Irene Li, Prawat Trairatvorakul, Yijiao He, Wei Tai Ting, Robert Tung, Caitlin Westerfield and Dragomir R. Radev
“Improving Text-to-SQL Evaluation Methodology” by Catherine Finegan-Dollak, Jonathan K. Kummerfeld, Li Zhang, Karthik Ramanathan, Sesh Sadasivam, Rui Zhang and Dragomir Radev
“Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering” by Rui Zhang, Cicero Nogueira dos Santos, Michihiro Yasunaga, Bing Xiang and Dragomir Radev
“TypeSQL: Knowledge-based Type-Aware Neural Text-to-SQL Generation” by Tao Yu, Zifan Li, Zilin Zhang, Rui Zhang and Dragomir Radev
“Robust Multilingual Part-of-Speech Tagging via Adversarial Training” by Michihiro Yasunaga, Jungo Kasai and Dragomir Radev
“End-to-end Graph-based TAG Parsing with Neural Networks” by Jungo Kasai, Robert Frank, Pauli Xu, William Merrill and Owen Rambow
“Sentence Ordering using Recurrent Neural Networks” by Lajanugen Logeswaran, Honglak Lee and Dragomir Radev
“Addressee and Response Selection in Multi-Party Conversations with Speaker Interaction RNNs” by Rui Zhang, Honglak Lee, Lazaros Polymenakos and Dragomir Radev