Dragomir Radev, Ph.D. Computer Science; Columbia University

A. Bartlett Giamatti Professor of Computer Science
Address: 
Room 319, 17 Hillhouse Avenue, New Haven, CT 06511
203.436.4759

Dragomir Radev’s interests include Natural Language Processing (NLP), Artificial Intelligence, Computational Linguistics, Machine Learning, Information Retrieval, Text Summarization, Network Analysis, Text Mining, and Applications of NLP to Bioinformatics, Social Network Analysis, Political Science, and the Humanities.

Awards Dragomir Radev has received

  • Fellow of the ACM (Association for Computing Machinery) (2015)
  • University of Michigan Faculty Recognition Award (2013)
  • Linguistic Society of America: Linguistics, Language and the Public Award (2011) (as co-founder and program chair of NACLO)
  • Secretary of ACL (Association for Computational Linguistics) (2006-2015)
  • The Gosnell Prize for Excellence in Political Methodology (shared) (2006)
  • University of Michigan UROP Faculty Award for Outstanding Research Mentorship (2004)
     

Selected Publications

  • Rahul Jha, Amjad Abu-Jbara, Vahed Qazvinian, and Dragomir R. Radev. NLP-driven citation analysis for scientometrics. Journal of Natural Language Engineering, 2016.
  • Kathleen McKeown, Hal Daume III, Snigdha Chaturvedi, John Paparrizos, Kapil Thadani, Pablo Barrio, Or Biran, Suvarna Bothe, Michael Collins, Kenneth Fleischmann, Luis Gravano, Rahul Jha, Ben King, Kevin McInerney, Taesun Moon, Diarmuid O’Seaghdha, Dragomir Radev, Clay Templeton, and Simone Teufel. Predicting impact of scientific concepts using full text features. Journal of the American Society for Information Science and Technology, 2016.
  • Nikita Bhutani, H. V. Jagadish, and Dragomir Radev. Nested propositions in open information extraction. In EMNLP, 2016.
  • Catherine Finegan-Dollak, Reed Coke, Rui Zhang, Xiangyi Ye, and Dragomir Radev. Effects of creativity and cluster tightness on short text clustering performance. In ACL, August 2016.
  • Dragomir Radev, Amanda Stent, Joel Tetreault, Aasish Pappu, Aikaterini Iliakopoulou, Agustin Chanfreau, Paloma de Juan, Jordi Vallmitjana, Alejandro Jaimes, Rahul Jha, and Robert Mankoff. Humor in collective discourse: Unsupervised funniness detection in the New Yorker cartoon caption contest. In LREC, May 2016.
  • William Wang, Yashar Mehdad, Dragomir Radev, and Amanda Stent. A low-rank approximation approach to learning joint embeddings of news stories and images for timeline summarization. In HLT NAACL, June 2016.
  • Rui Zhang, Honglak Lee, and Dragomir Radev. Dependency sensitive convolutional neural networks for modeling sentences and documents. In HLT NAACL, June 2016.
  • Arzucan Ozgur, Junghuk Hur, Zuoshuang Xiang, Edison Ong, Dragomir R. Radev, and Yongqun He. Ignet: A centrality- and INO-based web system for analyzing and visualizing literature-mined networks. 2016.
  • Catherine Finegan-Dollak and Dragomir R. Radev. Sentence Simplification, Compression, and Disaggregation for Summarization of Sophisticated Documents. Journal of the American Society for Information Science and Technology, 2015.
  • Vivi Nastase, Rada Mihalcea, and Dragomir R. Radev. A survey of graphs in natural language processing. Journal of Natural Language Engineering, 2015.
  • Dragomir R. Radev, Mark Joseph, Bryan Gibson, and Pradeep Muthukrishnan. A Bibliometric and Network Analysis of the Field of Computational Linguistics. Journal of the American Society for Information Science and Technology, 2015.
  • Rahul Jha, Reed Coke, and Dragomir R. Radev. Surveyor: A system for generating coherent survey articles for scientific topics. In Proceedings of the Twenty-Ninth AAAI Conference, 2015.
  • Rahul Jha, Catherine Finegan-Dollak, Ben King, Reed Coke, and Dragomir R. Radev. Content models for survey generation: A factoid-based evaluation. In Proceedings of ACL 2015, Beijing, China, 2015.
  • Ahmed Hassan, Amjad Abu-Jbara, Wanchen Lu, and Dragomir R. Radev. A random walk based model for identifying semantic orientation. Computational Linguistics, 40, 2014.
  • Ben King, Rahul Jha, and Dragomir R. Radev. Heterogeneous networks and their applications: Scientometrics, name disambiguation, and topic modeling. Transactions of the Association for Computational Linguistics, 2:1-14, 2014.
  • Amjad Abu-Jbara, Jefferson Ezra, and Dragomir R. Radev. Purpose and polarity of citation: Towards NLP-based bibliometrics. In Proceedings of the North American Association for Computational Linguistics, 2013.
  • Amjad Abu-Jbara, Benjamin King, Mona Diab, and Dragomir R. Radev. Identifying opinion subgroups in Arabic online discussions. In Proceedings of The Association for Computational Linguistics (short paper), 2013.
  • Vahed Qazvinian, Dragomir R. Radev, Saif Mohammad, Bonnie Dorr, David Zajic, Michael Whidby, and Taesun Moon. Generating extractive summaries of scientific paradigms. Journal of Artificial Intelligence Research (JAIR), 46:165-201, 2013.
  • Rahul Jha, Amjad Abu-Jbara, and Dragomir R. Radev. A system for summarizing scientific topics starting from keywords. In Proceedings of The Association for Computational Linguistics (short paper), 2013.
  • Benjamin King, Rahul Jha, Dragomir R. Radev, and Robert Mankoff. Random walk factoid annotation for collective discourse. In Proceedings of The Association for Computational Linguistics (short paper), 2013.
  • Patrick Littell, Lori Levin, Jason Eisner, and Dragomir Radev. Introducing computational concepts in a linguistics olympiad. In Proceedings of the Fourth Workshop on Teaching NLP and CL, Association for Computational Linguistics, Sofia, Bulgaria, 2013.
  • Amjad Abu-Jbara and Dragomir R. Radev. Reference scope identification in citation sentences. In Proceedings of NAACL 2012, Montreal, QC, 2012.
  • Amjad Abu-Jbara, Pradeep Dasigi, Mona Diab, and Dragomir R. Radev. Subgroup detection in ideological discussions. In Proceedings of ACL 2012, Jeju Island, Korea, 2012.
  • Ahmed Hassan, Amjad Abu-Jbara, and Dragomir Radev. Detecting subgroups in online discussions by modeling positive and negative relations among participants. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 59-70, Jeju Island, Korea, July 2012. Association for Computational Linguistics.
  • Yoshinobu Kano, Jari Bjorne, Filip Ginter, Tapio Salakoski, Ekaterina Buyko, Udo Hahn, K. Bretonnel Cohen, Karin Verspoor, Christophe Roeder, Lawrence E. Hunter, Halil Kilicoglu, Sabine Bergler, Sofie Van Landeghem, Thomas Van Parys, Yves van de Peer, Makoto Miwa, Sophia Ananiadou, Mariana Neves, Alberto Pascual-Montano, Arzucan Ozgur, Dragomir R. Radev, Sebastian Riedel, Rune Saetre, Hong-Woo Chun, Jin-Dong Kim, Sampo Pyysalo, Tomoko Ohta, and Jun’ichi Tsujii. U-Compare bio-event meta-service: Compatible BioNLP event extraction services. BMC Bioinformatics, 2011.
  • Arzucan Ozgur, Zuoshuang Xiang, Dragomir R. Radev, and Yongqun He. Mining of vaccine-associated IFN-gamma gene interaction networks using the vaccine ontology. Journal of Biomedical Semantics, 2011.
  • Amjad Abu-Jbara and Dragomir R. Radev. Coherent citation-based summarization of scientific papers. In Proceedings of ACL 2011, Portland, Oregon, 2011.
  • Vahed Qazvinian and Dragomir Radev. Exploiting phase transition in similarity networks for clustering. In Proceedings of the AAAI 2011 Conference, San Francisco, CA, August 2011.
  • Vahed Qazvinian and Dragomir R. Radev. Learning from collective human behavior to introduce diversity in lexical choice. In Proceedings of ACL 2011, Portland, Oregon, 2011.
  • Vahed Qazvinian, Emily Rosengren, Dragomir R. Radev, and Qiaozhu Mei. Rumor has it: Identifying misinformation in microblogs. In Proceedings of EMNLP 2011, Edinburgh, UK, 2011.
  • Arzucan Ozgur, Zuoshuang Xiang, Dragomir R. Radev, and Yongqun He. Literature-based discovery of IFN-gamma and vaccine-mediated gene interaction networks. Journal of Biomedicine and Biotechnology, 2010.
  • Kevin Quinn, Burt Monroe, Michael Colaresi, Michael Crespin, and Dragomir R. Radev. How to analyze political attention with minimal assumptions and costs. American Journal of Political Science, 2010.
  • Ahmed Hassan and Dragomir R. Radev. Identifying text polarity using random walks. In Proceedings of ACL 2010, Uppsala, Sweden, 2010.
  • Ahmed Hassan, Vahed Qazvinian, and Dragomir R. Radev. What’s with the attitude? A study of participant attitude in multi-party online discussions. In Proceedings of EMNLP 2010, Cambridge, Massachusetts, 2010.
  • Qiaozhu Mei, Jian Guo, and Dragomir R. Radev. DivRank: the interplay of prestige and diversity in information networks. In Proceedings of SIGKDD 2010, Washington, DC, 2010.
  • Pradeep Muthukrishnan, Dragomir R. Radev, and Qiaozhu Mei. Edge weight regularization over multiple graphs for similarity learning. In IEEE ICDM, Sydney, Australia, 2010.
  • Vahed Qazvinian and Dragomir R. Radev. Identifying non-explicit citing sentences for citation-based summarization. In Proceedings of ACL 2010, Uppsala, Sweden, 2010.
  • Vahed Qazvinian, Dragomir R. Radev, and Arzucan Ozgur. Citation summarization through keyphrase extraction. In COLING 2010, Beijing, China, 2010.
  • Aleks Aris, Ben Shneiderman, Vahed Qazvinian, and Dragomir R. Radev. Visual overviews for discovering key papers and influences across research fronts. Journal of the American Society for Information Science and Technology, 2009.
  • Jahna Otterbacher, Gunes Erkan, and Dragomir Radev. Biased LexRank: Passage Retrieval using Random Walks with Question-Based Priors. Information Processing and Management, 45(1):42-54, 2009.
  • V. G. Tarcea, T. Weymouth, A. Ade, A. Bookvich, J. Gao, V. Mahavisno, Z. Wright, A. Chapman, M. Jayapandian, A. Ozgur, Y. Tian, J. Cavalcoli, B. Mirel, J. Patel, D. Radev, B. Athey, D. States, and H. V. Jagadish. Michigan molecular interactions (MiMI) r2: From interacting proteins to pathways. Nucleic Acids Research, 37:D642-D646, January 2009.
  • Ahmed Hassan, Dragomir R. Radev, Junghoo Cho, and Amruta Joshi. Content based recommendation and summarization in the blogosphere. In Proceedings of ICWSM 2009, San Jose, CA, 2009.
  • Saif Mohammad, Bonnie Dorr, Melissa Egan, Ahmed Hassan, Pradeep Muthukrishnan, Vahed Qazvinian, Dragomir R. Radev, and David Zajic. Generating surveys of scientific paradigms. In Proceedings of HLT-NAACL 2009, Boulder, CO, June 2009.
  • Arzucan Ozgur and Dragomir R. Radev. Detecting speculations and their scopes in scientific text. In EMNLP, Singapore, 2009.
  • Arzucan Ozgur and Dragomir R. Radev. Supervised classification for extracting biomedical events. In Proceedings of the BioNLP’09 Workshop Shared Task on Event Extraction at NAACL-HLT, Boulder, Colorado, June 2009.
  • Aaron Elkiss, Siwei Shen, Anthony Fader, Gunes Erkan, David States, and Dragomir R. Radev. Blind men and elephants: What do citation summaries tell us about a research article? Journal of the American Society for Information Science and Technology, 59(1):51-62, 2008.
  • Florian Leitner, Martin Krallinger, Carlos Rodriguez-Penagos, Joerg Hakenberg, Conrad Plake, Cheng-Ju Kuo, Chun-Nan Hsu, Richard Tzong-Han Tsai, Hsi-Chuan Hung, William W. Lau, Calvin A. Johnson, Rune Saetre, Kazuhiro Yoshida, Yan Hua Chen, Sun Kim, Soo-Yong Shin, Byoung-Tak Zhang, William A. Baumgartner Jr., Lawrence Hunter, Barry Haddow, Michael Matthew, Xinglong Wang, Patrick Ruch, Frederic Ehrler, Arzucan Ozgur, Gunes Erkan, Dragomir R. Radev, Michael Krauthammer, ThaiBinh Luong, Robert Hoffmann, Chris Sander, and Alfonso Valencia. Introducing meta-services for biomedical information extraction. Genome Biology, 9, September 2008.
  • Jahna Otterbacher and Dragomir Radev. Exploring Fact-focused Relevance and Novelty Detection. Journal of Documentation, 64(4), 2008.
  • Jahna Otterbacher, Dragomir Radev, and Omer Kareem. Hierarchical Summarization for Delivering Information to Mobile Devices. Information Processing and Management, 44(2):931-947, 2008.
  • Arzucan Ozgur, Thuy Vu, Gunes Erkan, and Dragomir R. Radev. Identifying gene-disease associations based on centrality on a literature mined gene interaction network. Bioinformatics, 24:i277-i285, 2008.
  • Ahmed Hassan, Anthony Fader, Michael Crespin, Kevin Quinn, Burt Monroe, Michael Colaresi, and Dragomir R. Radev. Tracking the dynamic evolution of participant salience in a discussion. In COLING 2008, Manchester, UK, 2008.
  • Pradeep Muthukrishnan, Joshua Gerrish, and Dragomir R. Radev. Detecting multiple facets of an event using graph-based unsupervised methods. In COLING 2008, Manchester, UK, 2008.
  • Vahed Qazvinian and Dragomir R. Radev. Scientific paper summarization using citation summary networks. In COLING 2008, Manchester, UK, 2008.
  • Dragomir R. Radev, Lori S. Levin, and Thomas E. Payne. The North American Computational Linguistics Olympiad (NACLO). In Proceedings, The Third Workshop on Issues in Teaching Computational Linguistics, Columbus, OH, 2008.
  • Gunes Erkan, Arzucan Ozgur, and Dragomir R. Radev. Extracting interacting protein pairs and evidence sentences by using dependency parsing and machine learning techniques. In Proceedings of the Second BioCreAtIvE Challenge Workshop - Critical Assessment of Information Extraction in Molecular Biology, April 23-25 2007.
  • Gunes Erkan, Arzucan Ozgur, and Dragomir R. Radev. Semi-supervised classification for extracting protein interaction sentences using dependency parsing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP ’07), Prague, Czech Republic, June 28-30 2007.
  • Anthony Fader, Dragomir R. Radev, Michael H. Crespin, Burt L. Monroe, Kevin M. Quinn, and Michael Colaresi. MavenRank: Identifying influential members of the U.S. Senate using lexical centrality. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP ’07), Prague, Czech Republic, June 28-30 2007.
  • Jahna Otterbacher, Dragomir Radev, and Omer Kareem. News to Go: Hierarchical Text Summarization for Mobile Devices. In 29th Annual ACM SIGIR Conference on Research and Development in Information Retrieval, Seattle, Washington, August 2006.
  • Kevin M. Quinn, Burt L. Monroe, Michael Colaresi, Michael H. Crespin, and Dragomir R. Radev. An automated method of topic-coding legislative speech over time with application to the 105th-108th U.S. Senate. In Midwest Political Science Association Meeting, 2006.
  • Wai Lam, Ki Chan, Dragomir Radev, Horacio Saggion, and Simone Teufel. Context-based generic cross-lingual retrieval of documents and automated summaries. Journal of the American Society for Information Science and Technology, 56(2), February 2005.
  • Dragomir R. Radev, Weiguo Fan, Hong Qi, Harris Wu, and Amardeep Grewal. Probabilistic question answering on the web. Journal of the American Society for Information Science and Technology, 56(3), March 2005.
  • Dragomir R. Radev, Jahna Otterbacher, Adam Winkel, and Sasha Blair-Goldensohn. NewsInEssence: Summarizing online news topics. Communications of the ACM, October 2005.
  • Jahna Otterbacher, Gunes Erkan, and Dragomir R. Radev. Using random walks for question-focused sentence retrieval. In Proceedings of HLT-EMNLP, Vancouver, BC, 2005.
  • Gunes Erkan and Dragomir R. Radev. LexRank: Graph-based centrality as salience in text summarization. Journal of Artificial Intelligence Research (JAIR), 2004.
  • Dragomir R. Radev, Hongyan Jing, Malgorzata Stys, and Daniel Tam. Centroid-based summarization of multiple documents. Information Processing and Management, 40:919-938, December 2004.
  • Franz Josef Och, Daniel Gildea, Sanjeev Khudanpur, Anoop Sarkar, Kenji Yamada, Alex Fraser, Shankar Kumar, Libin Shen, David Smith, Katherine Eng, Viren Jain, Zhen Jin, and Dragomir Radev. A smorgasbord of features for statistical machine translation. In Proceedings of HLT-NAACL 2004, Boston, MA, May 2004.
  • Jahna C. Otterbacher and Dragomir Radev. Comparing semantically related sentences: The case of paraphrase versus subsumption. COLING 2004, August 23rd-27th 2004.
  • Dragomir R. Radev, Hong Qi, Daniel Tam, and Adam Winkel. Computational linkuistics: word triggers across hyperlinks. In Proceedings of HLT-NAACL 2004 (short paper), 2004.
  • James Pustejovsky, Patrick Hanks, Roser Sauri, Andrew See, Robert Gaizauskas, Andrea Setzer, Dragomir Radev, Beth Sundheim, David Day, Lisa Ferro, and Marcia Lazo. The TIMEBANK Corpus. In Proceedings of Corpus Linguistics 2003, pages 647-656, Lancaster, UK, March 2003.
  • James Pustejovsky, Jose Castaño, Robert Ingria, Roser Sauri, Robert Gaizauskas, Andrea Setzer, Graham Katz, and Dragomir R. Radev. TimeML: Robust specification of event and temporal expressions in text. In Proceedings, AAAI Spring Symposium on New Directions in Question Answering, Stanford, CA, March 2003.
  • Dragomir R. Radev, Simone Teufel, Horacio Saggion, Wai Lam, John Blitzer, Hong Qi, Arda Celebi, Danyu Liu, and Elliott Drabek. Evaluation challenges in large-scale multi-document summarization: the MEAD project. In Proceedings of ACL 2003, Sapporo, Japan, 2003.
  • Zhu Zhang, Jahna Otterbacher, and Dragomir R. Radev. Combining labeled and unlabeled data for learning cross-document structural relationships. In Proceedings of ACM CIKM 2003, New Orleans, LA, November 2003.
  • Dragomir R. Radev, Kelsey Libner, and Weiguo Fan. Getting Answers to Natural Language Queries on the Web. Journal of the American Society for Information Science and Technology, 53(5):359-364, 2002.
  • Dragomir R. Radev, Weiguo Fan, Hong Qi, Harris Wu, and Amardeep Grewal. Probabilistic Question Answering from the Web. In The 11th International World Wide Web Conference, Honolulu, Hawaii, May 2002.
  • Horacio Saggion, Dragomir Radev, Simone Teufel, and Wai Lam. Meta-evaluation of summaries in a cross-lingual environment using content-based metrics. In Proceedings of COLING’2002, Taipei, Taiwan, August 2002.
  • Zhu Zhang, Sasha Blair-Goldensohn, and Dragomir Radev. Towards CST-enhanced summarization. In Proceedings of the AAAI 2002 Conference, Edmonton, Alberta, July - August 2002.
  • John Prager, Dragomir R. Radev, and Krzysztof Czuba. Answering what-is questions by virtual annotation. In Proceedings, HLT-2001, San Diego, CA, March 2001.
  • Dragomir R. Radev, Sasha Blair-Goldensohn, Zhu Zhang, and Revathi Sundara Raghavan. Interactive, domain-independent identification and summarization of topically related news articles. In Proceedings, 5th European Conference on Research and Advanced Technology for Digital Libraries, Darmstadt, Germany, September 2001.
  • Dragomir R. Radev, Hong Qi, Zhiping Zheng, Sasha Blair-Goldensohn, Zhu Zhang, Weiguo Fan, and John Prager. Mining the web for answers to natural language questions. In ACM CIKM 2001: Tenth International Conference on Information and Knowledge Management, Atlanta, GA, 2001.
  • John Prager, Eric Brown, Anni Coden, and Dragomir Radev. Question-answering by predictive annotation. In Proceedings, 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Athens, Greece, July 2000.
  • Dragomir R. Radev, John Prager, and Valerie Samn. Ranking potential answers to natural language questions. In Proceedings of the 6th Conference on Applied Natural Language Processing, Seattle, WA, May 2000.
  • Alfred Aho, Shih-Fu Chang, Kathleen McKeown, Dragomir Radev, John Smith, and Kazi Zaman. Columbia Digital News Project. International Journal of Digital Libraries, 1(4):377-385, 1998.
  • Dragomir R. Radev and Kathleen R. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469-500, September 1998.
  • Dragomir R. Radev. Learning correlations between linguistic indicators and semantic constraints: Reuse of context-dependent descriptions of entities. In Proceedings, 17th International Conference on Computational Linguistics and 36th Annual Meeting of the Association for Computational Linguistics COLING-ACL’98, Montreal, Canada, August 1998.
  • Alfred Aho, Shih-Fu Chang, Kathleen McKeown, Dragomir Radev, John Smith, and Kazi Zaman. Columbia Digital News System: An environment for briefing and search over multimedia information. In Proceedings, IEEE International Conference on the Advances of Digital Libraries ADL’97, Washington, DC, May 1997.
  • Dragomir R. Radev and Kathleen R. McKeown. Building a generation knowledge source using internet-accessible newswire. In Proceedings, Fifth ACL Conference on Applied Natural Language Processing ANLP’97, pages 221-228, Washington, DC, April 1997.
  • Evelyne Tzoukermann and Dragomir R. Radev. Using word class for part-of-speech disambiguation. In Proceedings, Fourth Workshop on Very Large Corpora WVLC’96, pages 1-13, Copenhagen, Denmark, August 1996. COLING.
  • Kathleen R. McKeown and Dragomir R. Radev. Generating summaries of multiple news articles. In Proceedings, ACM Conference on Research and Development in Information Retrieval SIGIR’95, pages 74-82, Seattle, Washington, July 1995.
  • Evelyne Tzoukermann, Dragomir R. Radev, and William A. Gale. Combining linguistic knowledge and statistical learning in French part-of-speech tagging. In Proceedings, EACL Workshop on Very Large Corpora WVLC’95, pages 51-57, Dublin, Ireland, February 1995. EACL.