In this work, we give a geometric interpretation of Generative Adversarial Networks (GANs). The geometric view is based on the intrinsic relation between Optimal Mass Transportation (OMT) theory and convex geometry, and leads to a variational approach for solving the Alexandrov problem: constructing a convex polytope with prescribed face normals and volumes.
Using the optimal transportation view of the GAN model, we show that the discriminator computes the Wasserstein distance via the Kantorovich potential, while the generator calculates the transportation map. For a large class of transportation costs, the Kantorovich potential gives the optimal transportation map via a closed-form formula. Therefore, it suffices to optimize the discriminator alone. This shows that the adversarial competition can be avoided and the computational architecture simplified.
Preliminary experimental results show that the geometric method outperforms the traditional Wasserstein GAN in approximating probability measures with multiple clusters in low-dimensional spaces.
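To make the closed-form claim concrete, here is a minimal, hypothetical 1D illustration (not the paper's method): for the quadratic cost c(x, y) = |x - y|^2 / 2 between two 1D Gaussians N(m1, s1^2) and N(m2, s2^2), the optimal (Brenier/Monge) map is known in closed form, so the "generator" is obtained directly rather than by adversarial training. All names and parameter values below are illustrative.

```python
import numpy as np

def gaussian_transport_map(m1, s1, m2, s2):
    """Closed-form optimal map pushing N(m1, s1^2) onto N(m2, s2^2)
    under the quadratic cost: T(x) = m2 + (s2 / s1) * (x - m1)."""
    return lambda x: m2 + (s2 / s1) * (x - m1)

rng = np.random.default_rng(0)
m1, s1, m2, s2 = 0.0, 1.0, 3.0, 0.5
x = rng.normal(m1, s1, size=100_000)   # samples from the source measure
T = gaussian_transport_map(m1, s1, m2, s2)
y = T(x)                               # pushed-forward samples

# The transported samples match the target statistics (approx. 3.0 and 0.5).
print(y.mean(), y.std())
```

In higher dimensions and for general costs the map is recovered from the Kantorovich potential (e.g., as a gradient of a convex potential under quadratic cost), which is the structure the abstract exploits.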
We propose an effective framework for multi-phase image segmentation and semi-supervised data clustering by introducing a novel region force term into the Potts model. Assuming that the probability that a pixel or a data point belongs to each class is known a priori, we show that the corresponding indicator function obeys a Bernoulli distribution, and that the new region force function can be computed as the negative log-likelihood under this Bernoulli distribution. We solve the Potts model by the primal-dual hybrid gradient method and the augmented Lagrangian method, which are based on two different dual problems of the same primal problem. Empirical evaluations of the Potts model with the new region force function on benchmark problems show that it is competitive with existing variational methods in both image segmentation and semi-supervised data clustering.
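A simplified sketch of the region force idea, under the assumption that a prior class probability p_k(x) is given for every pixel or data point: since the indicator u_k(x) is Bernoulli(p_k(x)), the negative log-likelihood yields a per-class data cost f_k(x) = -log p_k(x). Ignoring the Potts regularization term entirely (which the actual model does not), minimizing this data term alone reduces to maximum-likelihood labeling. The array values are illustrative.

```python
import numpy as np

def region_force(priors, eps=1e-12):
    """Negative log-likelihood region force.

    priors: (N, K) array; priors[i, k] is the prior probability that
    point i belongs to class k. Clipping avoids log(0)."""
    return -np.log(np.clip(priors, eps, 1.0))

# Toy priors for 3 points and 3 classes.
priors = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.3, 0.4]])
f = region_force(priors)

# With no regularization, the data term picks the most likely class,
# since -log is monotone decreasing: argmin of f == argmax of priors.
labels = f.argmin(axis=1)
print(labels)   # [0 1 2]
```

In the full Potts model this data term is balanced against a boundary-length regularizer, which is what the primal-dual hybrid gradient and augmented Lagrangian solvers handle.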
The stochastic gradient (SG) method can quickly solve a problem with a large number
of components in the objective, or a stochastic optimization problem, to a moderate accuracy. The
block coordinate descent/update (BCD) method, on the other hand, can quickly solve problems with
multiple (blocks of) variables. This paper introduces a method that combines the great features of
SG and BCD for problems with many components in the objective and with multiple (blocks of)
variables. This paper proposes a block SG (BSG) method for both convex and nonconvex programs.
BSG generalizes SG by updating all the blocks of variables in a Gauss–Seidel fashion (the update of the
current block depends on the previously updated blocks), in either a fixed or randomly shuffled order.
Although BSG does slightly more work at each iteration, it typically outperforms SG because of
BSG’s Gauss–Seidel updates and larger step sizes, the latter of which are determined by the smaller
per-block Lipschitz constants. The convergence of BSG is established for both convex and nonconvex
cases. In the convex case, BSG has the same order of convergence rate as SG. In the nonconvex
case, its convergence is established in terms of the expected violation of a first-order optimality
condition. In both cases our analysis is nontrivial since the typical unbiasedness assumption no
longer holds. BSG is numerically evaluated on the following problems: stochastic least squares and
logistic regression, which are convex, and low-rank tensor recovery and bilinear logistic regression,
which are nonconvex. On the convex problems, BSG performed significantly better than SG. On the
nonconvex problems, BSG significantly outperformed the deterministic BCD method because the
latter tends to stagnate early near local minimizers. Overall, BSG inherits the benefits of both SG
approximation and block coordinate updates and is especially useful for solving large-scale nonconvex problems.
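A minimal sketch of one BSG-style iteration, assuming a stochastic least-squares objective f(x, y) = E||Ax + By - b||^2 with two variable blocks x and y: each iteration samples a minibatch and then updates the blocks in Gauss–Seidel order, so the gradient for y is evaluated at the freshly updated x. Step sizes, batch sizes, and problem dimensions are illustrative, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
n, px, py = 1000, 5, 5
A = rng.normal(size=(n, px))
B = rng.normal(size=(n, py))
x_true, y_true = rng.normal(size=px), rng.normal(size=py)
b = A @ x_true + B @ y_true + 0.01 * rng.normal(size=n)  # noisy observations

x, y = np.zeros(px), np.zeros(py)
batch, lr = 50, 0.05
for k in range(1000):
    idx = rng.choice(n, size=batch, replace=False)   # sample a minibatch
    Ai, Bi, bi = A[idx], B[idx], b[idx]
    # Block 1: stochastic gradient step in x (y held fixed).
    x -= lr * 2 * Ai.T @ (Ai @ x + Bi @ y - bi) / batch
    # Block 2: Gauss-Seidel — this gradient uses the *updated* x.
    y -= lr * 2 * Bi.T @ (Ai @ x + Bi @ y - bi) / batch

loss = np.mean((A @ x + B @ y - b) ** 2)
print(loss)   # small residual loss; x, y approach x_true, y_true
```

Per-block step sizes could instead be tuned to each block's (smaller) Lipschitz constant, which is one source of BSG's advantage over plain SG noted above.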