CS Talk - Nick Duffield, Texas A&M University

Event time: 
Tuesday, November 29, 2016 - 4:00pm
AKW 200
51 Prospect Street
New Haven, CT 06511
Event description: 

Speaker: Nick Duffield, Texas A&M University

Title: The cost and benefit of reducing Big Data size and complexity

Host: Joan Feigenbaum


Sampling is a powerful approach to reducing Big Data to Small Data, relieving storage requirements and enabling faster query response when an approximate answer suffices. The first part of this talk describes a cost-based formulation for optimal data reduction that is used by a major ISP, and some new applications to subgraph counting in graph streams. The second part focuses on the use of machine learning methods to model the complex dependence between Internet user experience and the systems that provide services, and how this knowledge can be used to improve those services. The talk also touches on the dependence between the foundations and applications of data science, and the costs and benefits of interdisciplinary data science research.