Professor Kim Chuan Toh (National University of Singapore, Singapore)
An Efficient Semismooth Newton Based Algorithm for Convex Clustering
Clustering is a fundamental problem in unsupervised learning. Popular methods like K-means, may suffer from instability as they are prone to get stuck at a local minimum. Recently, the sum-of-norms (SON) model (also known as clustering path), which is a convex relaxation of the hierarchical clustering model, has been proposed by Hocking et al and Lindsten et al. Although numerical
algorithms like ADMM and AMA (alternating minimization algorithm) have been proposed to solve the convex clustering model, it is still very challenging to solve large-scale problems. In this talk, we present a semismooth Newton based augmented Lagrangian algorithm for large-scale convex clustering problems wherein the second-order sparsity property of the generalized Hessians associated with the underlying subproblem in each iteration are carefully exploited to achieve high computational efficiency. Extensive numerical experiments on both simulated and real data demonstrate that our algorithm is highly efficient and robust for solving large-scale problems. Moreover, the numerical results also show the superior performance and scalability of our algorithm compared to existing first-order methods.
[Based on joint work with Defeng Sun and Yancheng Yuan]