CS Talk - Nick Duffield, Texas A&M University

Event time:

Tuesday, November 29, 2016 - 4:00pm

Location:

AKW 200

51 Prospect Street

New Haven, CT 06511

Event description:

Speaker: Nick Duffield, Texas A&M University

Title: The cost and benefit of reducing Big Data size and complexity

Host: Joan Feigenbaum

Abstract:

Sampling is a powerful approach for reducing Big Data to Small Data, easing storage requirements and enabling faster query responses when an approximate answer suffices. The first part of this talk describes a cost-based formulation for optimal data reduction that is used by a major ISP, and some new applications to subgraph counting in graph streaming. The second part of this talk focuses on the use of machine learning methods to model the complex dependence between Internet user experience and the systems that provide services, and how this knowledge can be used to improve those services. The talk also touches on the dependence between the foundations and applications of data science, and the costs and benefits of interdisciplinary data science research.