In this work, we give a geometric interpretation of Generative Adversarial Networks (GANs). The geometric view is based on the intrinsic relation between Optimal Mass Transportation (OMT) theory and convex geometry, and leads to a variational approach to solving the Alexandrov problem: constructing a convex polytope with prescribed face normals and volumes.
Using the optimal transportation view of the GAN model, we show that the discriminator computes the Wasserstein distance via the Kantorovich potential, while the generator computes the transportation map. For a large class of transportation costs, the Kantorovich potential yields the optimal transportation map through a closed-form formula. Therefore, it suffices to optimize the discriminator alone. This shows that the adversarial competition can be avoided and the computational architecture can be simplified.
Preliminary experimental results show that the geometric method outperforms the traditional Wasserstein GAN in approximating probability measures with multiple clusters in low-dimensional spaces.
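A minimal one-dimensional sketch (not the paper's geometric construction) illustrates the closed-form idea: for the quadratic cost c(x, y) = |x - y|^2 / 2, the optimal transport map between empirical measures is the monotone rearrangement, which is exactly the map T(x) = x - phi'(x) induced by the Kantorovich potential phi.

```python
import numpy as np

# Illustrative 1-D sketch: sorted-sample (monotone) rearrangement is the
# optimal transport map for the quadratic cost; distributions and sample
# sizes are arbitrary choices for the demonstration.
rng = np.random.default_rng(0)
x = np.sort(rng.normal(0.0, 1.0, 5000))   # source samples
y = np.sort(rng.normal(3.0, 0.5, 5000))   # target samples
T = y                                     # i-th quantile maps to i-th quantile
displacement = x - T                      # discretization of phi'(x)
W2_sq = np.mean((x - T) ** 2)             # squared 2-Wasserstein estimate
# For these two Gaussians the exact value is (3 - 0)^2 + (1 - 0.5)^2 = 9.25.
```

In higher dimensions the monotone rearrangement is replaced by the gradient of a convex potential, which is where the convex-geometry view of the Alexandrov problem enters.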
We propose a deep learning-based method for object detection in UAV-borne thermal images, which can capture scenes in both day and night. Compared with visible images, thermal images place lower demands on illumination conditions, but they typically have blurred edges and low contrast. Using a boundary-aware salient object detection network, we extract saliency maps of the thermal images to improve target distinguishability. The thermal images are augmented with the corresponding saliency maps through channel replacement and pixel-level weighted fusion. Considering the limited computing power of UAV platforms, a lightweight combinational neural network, ComNet, is used as the core object detector. A YOLOv3 model trained on the original images serves as the benchmark for comparison. In the experiments, we analyze the detection performance of ComNet models with different image fusion schemes. The results show that the average precisions (APs) for pedestrian and vehicle detection improve by 2%–5% over the benchmark without saliency-map fusion and MobileNetV2. The detection speed increases by over 50%, while the model size shrinks by 58%. These results demonstrate that the proposed method offers a practical compromise between accuracy and efficiency, with application potential in UAV-borne detection tasks.
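The two fusion schemes mentioned above can be sketched in a few lines; the function names, the weight alpha, and the choice of which channel to replace are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def weighted_fusion(thermal, saliency, alpha=0.7):
    """Pixel-level weighted fusion of a thermal image with its saliency map.

    alpha is a hypothetical blending weight; both inputs share one shape.
    """
    return alpha * thermal + (1.0 - alpha) * saliency

def channel_replacement(thermal_3ch, saliency):
    """Replace one channel of the 3-channel thermal input with the saliency map."""
    fused = thermal_3ch.copy()
    fused[..., 2] = saliency          # overwrite the last channel (a choice)
    return fused

# Tiny demonstration on constant images.
thermal = np.full((4, 4), 100.0)
saliency = np.full((4, 4), 200.0)
fused = weighted_fusion(thermal, saliency)   # 0.7 * 100 + 0.3 * 200 = 130
```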
We propose an effective framework for multi-phase image segmentation and semi-supervised data clustering by introducing a novel region force term into the Potts model. Assuming the probability that a pixel or a data point belongs to each class is known a priori, we show that the corresponding indicator function obeys the Bernoulli distribution, and that the new region force function can be computed as the negative log-likelihood under the Bernoulli distribution. We solve the Potts model by the primal-dual hybrid gradient method and the augmented Lagrangian method, which are based on two different dual problems of the same primal problem. Empirical evaluations of the Potts model with the new region force function on benchmark problems show that it is competitive with existing variational methods in both image segmentation and semi-supervised data clustering.
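The region force term itself is simple to compute; a sketch (array shapes and values are illustrative) of the negative log-likelihood data term:

```python
import numpy as np

# Region force under the Bernoulli model: given prior probabilities p[n, k]
# that data point n belongs to class k, the force for class k is -log p[n, k].
p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])            # per-point class probabilities
region_force = -np.log(p + 1e-12)          # small eps guards against log(0)
labels = region_force.argmin(axis=1)       # labeling by the data term alone
# Each point is assigned its most probable class: labels is [0, 1].
```

In the full Potts model this data term is balanced against the regularization (boundary-length) term, which is what the primal-dual hybrid gradient and augmented Lagrangian solvers handle.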
Finding a fixed point of a nonexpansive operator, i.e., x = Tx, abstracts many problems in numerical linear algebra, optimization, and other areas of data science. To solve fixed-point problems, we propose ARock, an algorithmic framework in which multiple agents (machines, processors, or cores) update x in an asynchronous parallel fashion. Asynchrony is crucial to parallel computing since it reduces synchronization waits, relaxes communication bottlenecks, and thus speeds up computing significantly. At each step of ARock, an agent updates a randomly selected coordinate x_i based on possibly out-of-date information on x. The agents share x through either global memory or communication. If writing x_i is atomic, the agents can read and write x without memory locks. We prove that if the nonexpansive operator T has a fixed point, then with probability one, ARock generates a sequence that converges to a fixed point of T. Our conditions on T and the step sizes are weaker than those in comparable work. Linear convergence is obtained under suitable assumptions. We derive special cases of ARock for linear systems, convex optimization, machine learning, as well as distributed and decentralized consensus problems. Numerical experiments on sparse logistic regression problems are presented.
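A serial sketch of the coordinate update at the heart of ARock (real asynchrony requires shared memory across agents and is omitted here; the operator, matrix, and step sizes are illustrative choices):

```python
import numpy as np

# To solve Ax = b with A symmetric positive definite, take the operator
# T(x) = x - gamma * (A @ x - b), which is nonexpansive for
# gamma <= 2 / lambda_max(A) and whose fixed point solves the system.
# Each step relaxes one randomly selected coordinate toward (Tx)_i.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
gamma = 0.3                 # lambda_max(A) = 3, so gamma <= 2/3 suffices
eta = 1.0                   # relaxation step size
rng = np.random.default_rng(0)
x = np.zeros(2)
for _ in range(2000):
    i = rng.integers(0, 2)                       # random coordinate
    Tx_i = x[i] - gamma * (A[i] @ x - b[i])      # in ARock, x may be stale here
    x[i] += eta * (Tx_i - x[i])
# x converges to the fixed point [1/3, 1/3], the solution of Ax = b.
```

In the asynchronous setting the read of x inside the loop may be out of date; the paper's contribution is proving convergence under exactly this kind of staleness.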
This paper focuses on coordinate update methods, which are useful for solving problems involving large or high-dimensional datasets. They decompose a problem into simple subproblems, each of which updates one variable, or a small block of variables, while fixing the others. These methods can handle linear and nonlinear mappings, smooth and nonsmooth functions, and convex as well as nonconvex problems. In addition, they are easy to parallelize.
The strong performance of coordinate update methods depends on access to simple subproblems. To derive simple subproblems for several new classes of applications, this paper systematically studies coordinate-friendly operators that perform low-cost coordinate updates.
Based on the discovered coordinate-friendly operators, as well as operator splitting techniques, we obtain new coordinate update algorithms for a variety of problems in machine learning, image processing, and subareas of optimization. Several problems are treated with coordinate updates for the first time. The obtained algorithms scale to large instances through parallel and even asynchronous computing. We present numerical examples to illustrate their effectiveness.
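A concrete sketch of what makes an operator coordinate-friendly (the problem and sizes are illustrative): for least squares f(x) = 0.5 * ||Ax - b||^2, caching the residual r = Ax - b lets each exact coordinate minimization cost O(m) instead of an O(mn) recomputation of Ax.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 10))
b = rng.normal(size=50)
x = np.zeros(10)
r = A @ x - b                       # cached residual, maintained incrementally
col_sq = (A ** 2).sum(axis=0)       # per-coordinate curvatures ||A[:, i]||^2
for _ in range(200):                # cyclic sweeps of coordinate updates
    for i in range(10):
        d = -(A[:, i] @ r) / col_sq[i]   # exact minimizer along coordinate i
        x[i] += d
        r += d * A[:, i]                 # O(m) cache maintenance
# At the least-squares solution, the gradient A.T @ r vanishes.
```

The cache update `r += d * A[:, i]` is the coordinate-friendly structure: without it, every coordinate step would be as expensive as a full gradient step.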
The modern financial industry is required to deal with large and diverse portfolios across a variety of asset classes, often with limited market data available. Financial Signal Processing and Machine Learning unifies a number of recent advances in signal processing and machine learning for the design and management of investment portfolios and financial engineering. This book bridges the gap between these disciplines, offering the latest information on key topics including characterizing statistical dependence and correlation in high dimensions, constructing effective and robust risk measures, and their use in portfolio optimization and rebalancing. The book focuses on signal processing approaches to modeling return, momentum, and mean reversion, addressing both theoretical and implementation aspects. It highlights the connections between portfolio theory, sparse learning and compressed sensing, sparse eigen-portfolios, robust optimization, non-Gaussian data-driven risk measures, graphical models, causal analysis through temporal-causal modeling, and large-scale copula-based approaches. Key features: Highlights signal processing and machine learning as key approaches to quantitative finance. Offers advanced mathematical tools for high-dimensional portfolio construction, monitoring, and post-trade analysis problems. Presents portfolio theory, sparse learning and compressed sensing, and sparsity methods for investment portfolios, including eigen-portfolios; models of return, momentum, and mean reversion; and non-Gaussian data-driven risk measures, with real-world applications of these techniques. Includes contributions from leading experts in the field.
The stochastic gradient (SG) method can quickly solve a problem with a large number of components in the objective, or a stochastic optimization problem, to moderate accuracy. The block coordinate descent/update (BCD) method, on the other hand, can quickly solve problems with multiple (blocks of) variables. This paper introduces a method that combines the strengths of SG and BCD for problems with many components in the objective and multiple (blocks of) variables: a block SG (BSG) method for both convex and nonconvex programs. BSG generalizes SG by updating all the blocks of variables in a Gauss–Seidel fashion (the update of the current block depends on the previously updated blocks), in either a fixed or randomly shuffled order. Although BSG does slightly more work per iteration, it typically outperforms SG because of its Gauss–Seidel updates and larger step sizes, the latter determined by the smaller per-block Lipschitz constants. The convergence of BSG is established for both convex and nonconvex cases. In the convex case, BSG has the same order of convergence rate as SG. In the nonconvex case, its convergence is established in terms of the expected violation of a first-order optimality condition. In both cases our analysis is nontrivial, since the typical unbiasedness assumption no longer holds. BSG is numerically evaluated on the following problems: stochastic least squares and logistic regression, which are convex, and low-rank tensor recovery and bilinear logistic regression, which are nonconvex. On the convex problems, BSG performed significantly better than SG. On the nonconvex problems, BSG significantly outperformed the deterministic BCD method, because the latter tends to stagnate early near local minimizers. Overall, BSG inherits the benefits of both the SG approximation and block coordinate updates, and is especially useful for solving large-scale nonconvex problems.
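A sketch of the BSG iteration on stochastic least squares with two blocks of variables (block sizes, minibatch size, and step schedule are illustrative choices, not the paper's tuned settings):

```python
import numpy as np

# Each iteration: sample a minibatch, then update the blocks in Gauss-Seidel
# order, so the second block's gradient already sees the updated first block.
rng = np.random.default_rng(2)
n, d = 2000, 6
A = rng.normal(size=(n, d))
x_true = np.ones(d)
b = A @ x_true + 0.01 * rng.normal(size=n)
blocks = [slice(0, 3), slice(3, 6)]
x = np.zeros(d)
for k in range(400):
    idx = rng.choice(n, size=64, replace=False)   # minibatch
    Ak, bk = A[idx], b[idx]
    step = 1.0 / (0.1 * k + 10.0)                 # diminishing step size
    for blk in blocks:                            # Gauss-Seidel over blocks:
        g = Ak[:, blk].T @ (Ak @ x - bk) / 64     # gradient uses the latest x
        x[blk] -= step * g
```

Recomputing `Ak @ x` inside the block loop is what makes the update Gauss–Seidel rather than Jacobi: the second block's partial gradient depends on the first block's fresh value.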
Nonconvex reformulations via low-rank factorization for stochastic convex semidefinite optimization problems have attracted increasing attention due to their empirical efficiency and scalability. Compared with the original convex formulations, the nonconvex ones typically involve far fewer variables, allowing them to scale to scenarios with millions of variables. However, despite their empirical success, this raises a new question: under what conditions can nonconvex stochastic algorithms find the population minimizer within the optimal statistical precision? In this paper, we show that the stochastic gradient descent (SGD) method can be adapted to solve the nonconvex reformulation of the original convex problem, with global linear convergence when using a fixed step size, i.e., it converges exponentially fast to the population minimizer within the optimal statistical precision in the restricted strongly convex case. If a diminishing step size is adopted, the adverse effect of gradient variance on the optimization error is eliminated, but the rate drops to sublinear. The core of our treatment relies on a novel second-order descent lemma, which is more general than the best existing result in the literature and improves the analysis of both online and batch algorithms. The theoretical results and the effectiveness of the proposed SGD method are also verified by a series of experiments.
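A toy sketch of SGD on a low-rank factorized reformulation (this is an illustrative entry-sampling instance, not the paper's general stochastic SDP setting): recover a PSD matrix M = U*U*^T by running SGD on the factor U. The near-optimal initialization mirrors the restricted-strongly-convex regime in which the linear convergence result holds; all sizes and step choices are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 8, 2
U_true = rng.normal(size=(n, r))
M = U_true @ U_true.T                        # noiseless, realizable target
U = U_true + 0.05 * rng.normal(size=(n, r))  # initialize near the minimizer
step = 0.1                                   # fixed step size
for _ in range(6000):
    I = rng.integers(0, n, size=32)          # minibatch of random entries
    J = rng.integers(0, n, size=32)
    G = np.zeros_like(U)
    for i, j in zip(I, J):
        res = U[i] @ U[j] - M[i, j]          # residual on entry (i, j)
        G[i] += res * U[j]                   # gradient of 0.5 * res**2
        G[j] += res * U[i]
    U -= step * G / 32
err = np.linalg.norm(U @ U.T - M)            # factorization error
```

Because the problem is realizable (no observation noise), the gradient variance vanishes at the minimizer, which is why a fixed step size can drive the error to zero here, consistent with the linear-convergence regime described above.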