Personal videos often contain visual distractors: objects that are captured accidentally and draw viewers' attention away from the main subjects. We propose a method to automatically detect and localize these distractors by learning from a manually labeled dataset. To achieve spatially and temporally coherent detection, we extract features at the Temporal-Superpixel (TSP) level within a traditional SVM-based learning framework. We also experiment with end-to-end learning using Convolutional Neural Networks (CNNs), which achieves slightly higher performance than other methods. The classification result is further refined in a post-processing step based on graph-cut optimization. Experimental results show that our method achieves an accuracy of 81% and a recall of 86%. We demonstrate several ways of removing the detected distractors to improve video quality, including video hole filling, video frame replacement, and camera path re-planning. User study results show that our method can significantly improve the aesthetic quality of videos.
We propose an effective framework for multi-phase image segmentation and semi-supervised data clustering by introducing a novel region force term into the Potts model. Assuming that the probability that a pixel or a data point belongs to each class is known a priori, we show that the corresponding indicator function follows a Bernoulli distribution and that the new region force function can be computed as the negative log-likelihood function under this distribution. We solve the Potts model by the primal-dual hybrid gradient method and the augmented Lagrangian method, which are based on two different dual problems of the same primal problem. Empirical evaluations of the Potts model with the new region force function on benchmark problems show that it is competitive with existing variational methods in both image segmentation and semi-supervised data clustering.
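As a rough illustration of the negative log-likelihood idea mentioned above, the following sketch evaluates the standard Bernoulli negative log-likelihood and uses it as a per-class assignment cost. The function names, the array layout of `prior`, and the omission of the Potts regularization term are all assumptions made for this example; the region force actually used in the paper may be defined differently.

```python
import numpy as np


def bernoulli_nll(u, p, eps=1e-12):
    """Standard Bernoulli negative log-likelihood of indicator u given probability p."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return -(u * np.log(p) + (1.0 - u) * np.log(1.0 - p))


def assignment_cost(prior, eps=1e-12):
    """Hypothetical data term: cost of setting the indicator of class k to 1.

    prior has shape (n_points, n_classes); entry [i, k] is the assumed
    a priori probability that point i belongs to class k.
    """
    return bernoulli_nll(1.0, prior, eps)  # reduces to -log(prior)


# Toy example: two points, two classes.
prior = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
cost = assignment_cost(prior)

# Without any regularization, minimizing the cost per point simply
# picks the most probable class.
labels = cost.argmin(axis=1)
```

In a full Potts model this data term would be combined with a boundary-length or graph-based regularizer, which is what the primal-dual and augmented Lagrangian solvers handle; that part is not sketched here.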
This paper focuses on coordinate update methods, which are useful for solving problems involving large or high-dimensional datasets. They decompose a problem into simple subproblems, each of which updates one variable, or a small block of variables, while fixing the others. These methods can deal with linear and nonlinear mappings, smooth and nonsmooth functions, as well as convex and nonconvex problems. In addition, they are easy to parallelize.
The strong performance of coordinate update methods depends on solving simple subproblems. To derive simple subproblems for several new classes of applications, this paper systematically studies coordinate-friendly operators, which perform low-cost coordinate updates.
Based on the discovered coordinate-friendly operators, as well as operator splitting techniques, we obtain new coordinate update algorithms for a variety of problems in machine learning, image processing, and sub-areas of optimization. Several problems are treated with coordinate update for the first time. The obtained algorithms are scalable to large instances through parallel and even asynchronous computing. We present numerical examples to illustrate how effective these algorithms are.
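To make the notion of a low-cost coordinate update concrete, here is a minimal sketch of classical cyclic coordinate descent for least squares, min ||Ax - b||^2. This is a textbook example chosen for illustration, not one of the new algorithms from the paper; the function name and parameters are hypothetical. Each subproblem updates a single variable exactly, and keeping the residual up to date makes one update cost O(m) rather than O(mn).

```python
import numpy as np


def coordinate_descent_least_squares(A, b, num_epochs=100):
    """Cyclic coordinate descent for min_x ||A x - b||^2 (illustrative sketch)."""
    m, n = A.shape
    x = np.zeros(n)
    r = b - A @ x                      # residual, maintained incrementally
    col_sq = (A ** 2).sum(axis=0)      # squared column norms
    for _ in range(num_epochs):
        for j in range(n):
            if col_sq[j] == 0.0:
                continue               # a zero column cannot change the objective
            # Exact minimizer over x[j] with all other variables fixed.
            delta = (A[:, j] @ r) / col_sq[j]
            x[j] += delta
            r -= delta * A[:, j]       # O(m) residual update
    return x


A = np.array([[3.0, 1.0],
              [1.0, 2.0],
              [0.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
x_cd = coordinate_descent_least_squares(A, b)
```

The cached residual is what makes the update cheap here: only one column of `A` is touched per step.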
Finding a fixed point of a nonexpansive operator, i.e., solving x = Tx, abstracts many
problems in numerical linear algebra, optimization, and other areas of data science. To solve
fixed-point problems, we propose ARock, an algorithmic framework in which multiple agents (machines,
processors, or cores) update x in an asynchronous parallel fashion. Asynchrony is crucial to parallel
computing since it reduces synchronization waits, relaxes communication bottlenecks, and thus speeds
up computing significantly. At each step of ARock, an agent updates a randomly selected coordinate
x_i based on possibly out-of-date information on x. The agents share x through either global memory
or communication. If writing x_i is atomic, the agents can read and write x without memory locks.
We prove that if the nonexpansive operator T has a fixed point, then with probability one, ARock
generates a sequence that converges to a fixed point of T. Our conditions on T and step sizes are
weaker than those in comparable work. Linear convergence is obtained under suitable assumptions.
We propose special cases of ARock for linear systems, convex optimization, machine learning, as
well as distributed and decentralized consensus problems. Numerical experiments on solving sparse
logistic regression problems are presented.
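The coordinate update at the heart of the scheme described above can be sketched in a simplified, single-agent form. The code below is a serial simulation under stated assumptions: there is only one agent, so there is no asynchrony and no out-of-date information, and the function name, step size, and test operator are hypothetical choices for illustration rather than the ARock implementation. At each step a random coordinate i is selected and x_i is moved toward the i-th coordinate of Tx.

```python
import numpy as np


def random_coordinate_fixed_point(T, x0, step=0.5, num_iters=5000, seed=0):
    """Serial random-coordinate iteration toward a fixed point of T (sketch).

    Update rule: x_i <- x_i - step * (x - T(x))_i for a random coordinate i.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(num_iters):
        i = rng.integers(x.size)       # uniformly random coordinate index
        # For simplicity the full T(x) is evaluated; a practical method
        # would compute only its i-th coordinate at low cost.
        x[i] -= step * (x[i] - T(x)[i])
    return x


# Assumed test operator: T(x) = M x + c, a contraction since ||M|| = 0.5.
M = np.array([[0.0, 0.5],
              [0.5, 0.0]])
c = np.array([1.0, 2.0])


def T(x):
    return M @ x + c


x_star = random_coordinate_fixed_point(T, np.zeros(2))
```

In the asynchronous setting, several agents would run this loop concurrently on a shared x, each possibly reading a stale copy; handling those delays is the part this serial sketch leaves out.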