TB-02: Nonlinear Composite and Constrained Optimization – Part II
Stream: Advances in mathematical optimization for machine learning
Chair(s): Cong Bang Vu, Dimitri Papadimitriou
A primal dual method for optimization with nonlinear constraints
Mehmet Fatih Sahin
We propose a practical inexact augmented Lagrangian method (iALM) for nonconvex problems with nonlinear constraints. We characterize the total computational complexity of our method subject to a verifiable geometric condition, which is closely related to the Polyak-Łojasiewicz and Mangasarian-Fromowitz conditions. We also provide numerical evidence on machine learning problems, including the Burer-Monteiro factorization of semidefinite programs.
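To fix ideas, the generic iALM template behind abstracts like this one alternates an inexact primal minimization of the augmented Lagrangian with a first-order multiplier update and penalty growth. The sketch below is a minimal, hypothetical illustration of that template on a one-dimensional toy problem, not the authors' scheme; the step-size rule and penalty schedule are assumptions chosen for the example.

```python
# Hedged sketch of a generic inexact augmented Lagrangian (iALM) loop.
# Toy problem: min (x-2)^2  s.t.  x^2 = 1, whose solution is x = 1
# with multiplier lam = 1 (from the KKT condition 2(x-2) + 2*lam*x = 0).
def ialm(x=1.5, lam=0.0, beta=1.0, outer=15, inner=100):
    for _ in range(outer):
        step = 1.0 / (6.0 + 12.0 * beta)       # crude curvature bound for GD
        for _ in range(inner):                  # inexact inner solve by gradient descent
            cx = x * x - 1.0                    # constraint residual c(x)
            grad = 2.0 * (x - 2.0) + (lam + beta * cx) * 2.0 * x
            x -= step * grad
        lam += beta * (x * x - 1.0)             # first-order dual (multiplier) update
        beta *= 2.0                             # grow the penalty parameter
    return x, lam

x, lam = ialm()
```

The inner loop is deliberately inexact (a fixed number of gradient steps); the analysis in the talk concerns how this inexactness, the penalty schedule, and the geometric condition interact to give a total complexity bound.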
Safe Screening for the Generalized Conditional Gradient Method
We explore the sparsity acquiring properties of a generalized conditional gradient method, where the constraint is replaced by a gauge penalty function. Without assuming bounded iterates, we show O(1/t) convergence of the function values and duality gap. We couple this with a safe screening rule, and show that at a rate O(1/(td^2)), the screened support matches the support at the solution, where d > 0 measures problem degeneracy. Numerical experiments support our theoretical findings.
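For readers unfamiliar with safe screening, the classical gap-based rule for the l1 special case conveys the idea: a duality gap certificate lets one discard features that are provably zero at the solution. The snippet below implements that classical lasso rule only as an illustration; the talk's rule applies to general gauge penalties and is coupled with the generalized conditional gradient method, which is not reproduced here.

```python
import numpy as np

# Classical gap-safe screening for the lasso: min_x 0.5*||Ax-b||^2 + lam*||x||_1.
# A feature j is certified zero at the optimum when its score falls below 1.
def gap_safe_screen(A, b, x, lam):
    r = b - A @ x                                   # primal residual
    scale = max(lam, np.max(np.abs(A.T @ r)))
    theta = r / scale                               # dual-feasible point: ||A^T theta||_inf <= 1
    primal = 0.5 * r @ r + lam * np.abs(x).sum()
    dual = 0.5 * b @ b - 0.5 * lam**2 * ((theta - b / lam) @ (theta - b / lam))
    gap = primal - dual                             # duality gap at (x, theta)
    radius = np.sqrt(2.0 * max(gap, 0.0)) / lam     # safe-ball radius around theta
    scores = np.abs(A.T @ theta) + radius * np.linalg.norm(A, axis=0)
    return scores < 1.0                             # True => x_j* = 0 is certified

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
b = rng.standard_normal(50)
lam = 0.9 * np.max(np.abs(A.T @ b))                 # large lam => sparse solution
screened = gap_safe_screen(A, b, np.zeros(20), lam)
```

The O(1/(t d^2)) support-identification rate in the abstract quantifies how quickly such a screened set stabilizes to the true support along the iterates.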
An Optimal Algorithm for Strongly Convex Minimization under Affine Constraints
Adil Salim, Laurent Condat, Dmitry Kovalev, Peter Richtarik
Optimization problems under affine constraints appear in various areas of machine learning. We consider the task of minimizing a smooth strongly convex function F under the affine constraint L x = b, with an oracle providing evaluations of the gradient of F and multiplications by L and its transpose. We provide lower bounds on the number of gradient computations and matrix multiplications to achieve a given accuracy. Then we propose an accelerated primal-dual algorithm achieving these lower bounds. Our algorithm is the first optimal algorithm for this class of problems.
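The oracle model in this abstract (gradients of F plus products with L and its transpose) is easy to make concrete. The sketch below is the plain, non-accelerated primal-dual baseline for this problem class, using only those oracles; it is not the authors' optimal algorithm, and the step sizes are assumptions chosen for the toy instance.

```python
import numpy as np

# Baseline (non-accelerated) primal-dual iteration for min F(x) s.t. Lx = b,
# using only the oracles the talk assumes: grad F, and products with L, L^T.
def primal_dual(grad_F, L, b, x, y, tau=0.5, sigma=0.3, iters=300):
    for _ in range(iters):
        x = x - tau * (grad_F(x) + L.T @ y)   # primal gradient step on the Lagrangian
        y = y + sigma * (L @ x - b)           # dual ascent on the constraint violation
    return x, y

# Toy instance: F(x) = 0.5*||x - c||^2 (1-strongly convex), constraint x1 + x2 = 1.
c = np.array([1.0, 2.0])
L = np.array([[1.0, 1.0]])
b = np.array([1.0])
x, y = primal_dual(lambda x: x - c, L, b, np.zeros(2), np.zeros(1))
# KKT solution of this instance: x* = (0, 1), y* = 1
```

The talk's contribution is precisely the gap this baseline leaves open: lower bounds on gradient calls and matrix multiplications, and an accelerated scheme matching them.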
Larger stepsizes for some primal-dual algorithms
Ming Yan, Zhi Li
Many primal-dual algorithms have been developed to solve optimization problems with a linear composition, and there is a common upper bound on the product of the primal and dual stepsizes. In this talk, I will present a larger upper bound on this product for some primal-dual algorithms. In addition, we provide examples showing that the new bound is tight. We then apply it to establish the convergence of several decentralized algorithms under weaker conditions.
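For context, the classical condition this talk improves on is the one for primal-dual hybrid gradient (Chambolle-Pock) iterations applied to $\min_x f(x) + g(Lx)$:

```latex
x^{k+1} = \operatorname{prox}_{\tau f}\!\big(x^k - \tau L^\top y^k\big), \qquad
y^{k+1} = \operatorname{prox}_{\sigma g^*}\!\big(y^k + \sigma L(2x^{k+1} - x^k)\big),
```

with convergence classically guaranteed when the stepsizes satisfy $\tau\sigma\|L\|^2 \le 1$. The talk's result enlarges this admissible product and shows by example that the enlarged bound cannot be improved further.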