Nonconvex optimization arises in many areas of computational science and engineering.
However, most nonconvex optimization algorithms are only known to have local
convergence or subsequence convergence properties. In this paper, we propose an algorithm
for nonconvex optimization and establish its global convergence (of the whole sequence) to
a critical point. In addition, we give its asymptotic convergence rate and numerically demonstrate
its efficiency. In our algorithm, the variables of the underlying problem are either treated
as one block or multiple disjoint blocks. It is assumed that each non-differentiable component
of the objective function, or each constraint, applies only to one block of variables. The
differentiable components of the objective function, however, can involve multiple blocks of
variables together. Our algorithm updates one block of variables at a time by minimizing a
certain prox-linear surrogate, along with an extrapolation to accelerate its convergence. The
order of update can be either deterministically cyclic or randomly shuffled for each cycle. In
fact, our convergence analysis only requires that each block be updated at least once in every
fixed number of iterations. We show its global convergence (of the whole sequence) to a
critical point under fairly loose conditions including, in particular, the Kurdyka–Łojasiewicz
condition, which is satisfied by a broad class of nonconvex/nonsmooth applications. These
results, of course, remain valid when the underlying problem is convex. We apply our convergence
results to the coordinate descent iteration for nonconvex regularized linear regression,
as well as a modified rank-one residue iteration for nonnegative matrix factorization. We show
that both applications have global convergence. Numerically, we tested our algorithm on
nonnegative matrix and tensor factorization problems, where random shuffling clearly improves
the chance to avoid low-quality local solutions.
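
For concreteness, the block update described in the abstract can be sketched in generic notation; the symbols below ($f$ for the differentiable part of the objective, $r_i$ for the nondifferentiable term restricted to block $i$, $\omega_i^k$ for the extrapolation weight, and $L_i^k$ for the step parameter) are ours and are meant only to illustrate a prox-linear step with extrapolation, not to reproduce the paper's exact formulation. With all other blocks held at their most recent values, block $i$ is updated at iteration $k$ via
\[
\hat{x}_i^k = x_i^k + \omega_i^k\big(x_i^k - x_i^{k-1}\big),
\qquad
x_i^{k+1} \in \operatorname*{arg\,min}_{x_i}\;
\big\langle \nabla_{x_i} f(\hat{x}_i^k),\, x_i - \hat{x}_i^k \big\rangle
+ \frac{L_i^k}{2}\,\big\lVert x_i - \hat{x}_i^k \big\rVert^2
+ r_i(x_i),
\]
where the gradient is taken with respect to block $i$ only. Setting $\omega_i^k = 0$ recovers a plain block proximal-gradient step, while $\omega_i^k > 0$ provides the extrapolation used to accelerate convergence.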