Hierarchical Clustering as a Tool for Unsupervised Learning: An Optimization Viewpoint / On the Computational Power of Online Gradient Descent

Institute

Corelab, ECE NTUA

Speaker

Βαγγέλης Χατζηαφράτης

Date

18-03-2019, 17:00

Place

1.1.31, Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών, ΕΜΠ (παλιό κτίριο)

Abstract

Hierarchical Clustering as a Tool for Unsupervised Learning: An Optimization Viewpoint
Hierarchical Clustering (HC) is a widely studied problem in unsupervised learning and exploratory data analysis, usually tackled by simple agglomerative procedures like average-linkage, single-linkage or complete-linkage. Applications of HC include reasoning about text documents, understanding the Evolution of species and the Tree of life, decomposing social networks like Facebook, or even organizing large data centers efficiently. Surprisingly, despite the plethora of heuristics for tackling the problem, until recently there was no optimization objective associated with it; this is in stark contrast with flat clustering objectives like k-means, k-median and k-center. In this talk, we will give an overview of the optimization objectives for Hierarchical Clustering, we will analyze the popular Average-Linkage procedure and mention some of its drawbacks. Then, we will propose better algorithms with provably better approximation guarantees. (Full paper: https://arxiv.org/pdf/1808.02227.pdf). Most of the talk is based on joint works with Moses Charikar and Rad Niazadeh.

On the Computational Power of Online Gradient Descent
How efficiently can we compute the weight vector of online gradient descent after T steps? We prove that the evolution of weight vectors in online gradient descent can encode arbitrary polynomial-space computations, even in very simple learning settings. Our results imply that, under weak complexity-theoretic assumptions, it is impossible to reason efficiently about the fine-grained behavior of online gradient descent. Specifically, during the talk, we will see how even in the extremely simple learning setting of soft-margin SVMs (support vector machines), the weight updates can encode an instance of the PSPACE-complete C-Path problem. Our reduction and our results also extend to simple ReLU neural networks.
(Full paper: https://arxiv.org/pdf/1807.01280.pdf)
Based on joint work with Tim Roughgarden (Columbia) and Josh Wang (Google).