Dissertation Defense/Minh-Tam Le
Title: Feature Selection for Diffusion Methods Within a Supervised Context.
Committee members:
Steven Zucker (Advisor)
Ronald Coifman
Vladimir Rokhlin
Robert Zucker (University of Michigan)
Abstract: We apply diffusion geometry to sociopolitical and public health datasets. Our specific goal is to reveal hidden trends and narratives behind UN voting records and alcohol-use questionnaires. Importantly, seeking those hidden variables in a supervised context, e.g., alcohol abuse, can be problematic for diffusion geometry. We suggest two approaches to address this problem. First, we develop a correlation-based hierarchical clustering algorithm that exposes sub-patterns in the feature (response) space; this works in the UN voting context. Second, we introduce a feature selection algorithm based on a second-order correlation measure to guide diffusion embeddings; this significantly improves the performance of diffusion methods in the alcohol context. Together, the two approaches suggest how to structure embeddings when features that are irrelevant to a given labeling function are strongly correlated with one another.