Image retargeting techniques adjust images into different sizes and have attracted much attention recently. Objective quality assessment (OQA) of image retargeting results is often desired to automatically select the best results. Existing OQA methods train a model using some benchmarks (e.g., RetargetMe), in which subjective scores evaluated by users are provided. Observing that it is challenging even for human subjects to give consistent scores for retargeting results of different source images (diff-source-results), in this paper we propose a learning-based OQA method that trains a General Regression Neural Network (GRNN) model based on relative scores --- which preserve the ranking --- of retargeting results of the same source image (same-source-results). In particular, we develop a novel training scheme with provable convergence that learns a common base scalar for same-source-results. With this source specific offset, our computed scores not only preserve the ranking of subjective scores for same-source-results, but also provide a reference to compare the diff-source-results. We train and evaluate our GRNN model using human preference data collected in RetargetMe. We further introduce a subjective benchmark to evaluate the generalizability of different OQA methods. Experimental results demonstrate that our method outperforms ten representative OQA methods in ranking prediction and has better generalizability to different datasets.
In this work, we give a geometric interpretation to the Generative Adversarial Networks (GANs). The geometric view is based on the intrinsic relation between Optimal Mass Transportation (OMT) theory and convex geometry, and leads to a variational approach to solve the Alexandrov problem: constructing a convex polytope with prescribed face normals and volumes.
By using the optimal transportation view of GAN model, we show that the discriminator computes the Wasserstein distance via the Kantorovich potential, the generator calculates the transportation map. For a large class of transportation costs, the Kantorovich potential can give the optimal transportation map by a close-form formula. Therefore, it is sufficient to solely optimize the discriminator. This shows the adversarial competition can be avoided, and the computational architecture can be simplified.
Preliminary experimental results show the geometric method outperforms the traditional Wasserstein GAN for approximating probability measures with multiple clusters in low dimensional space.
In this paper, we present a new adaptive feature scaling scheme for ultrahigh-dimensional feature selection on Big Data, and then reformulate it as a convex semi-infinite programming (SIP) problem. To address the SIP, we propose an efficient feature generating paradigm. Different from traditional gradient-based approaches that conduct optimization on all input features, the proposed paradigm iteratively activates a group of features, and solves a sequence of multiple kernel learning (MKL) subproblems. To further speed up the training, we propose to solve the MKL subproblems in their primal forms through a modified accelerated proximal gradient approach. Due to such optimization scheme, some efficient cache techniques are also developed. The feature generating paradigm is guaranteed to converge globally under mild conditions, and can achieve lower feature selection bias. Moreover, the proposed method can tackle two challenging tasks in feature selection: 1) group-based feature selection with complex structures, and 2) nonlinear feature selection with explicit feature mappings. Comprehensive experiments on a wide range of synthetic and real-world data sets of tens of million data points with O(10^14) features demonstrate the competitive performance of the proposed method over state-of-the-art feature selection methods in terms of generalization performance and training efficiency.
In this paper, we aim at developing scalable neural network-type learning systems. Motivated by the idea of constructive neural networks in approximation theory, we focus on constructing rather than training feed-forward neural networks (FNNs) for learning, and propose a novel FNNs learning system called the constructive FNN (CFN). Theoretically, we prove that the proposed method not only overcomes the classical saturation problem for constructive FNN approximation, but also reaches the optimal learning rate when the regression function is smooth, while the state-of-the-art learning rates established for traditional FNNs are only near optimal (up to a logarithmic factor). A series of numerical simulations are provided to show the efficiency and feasibility of CFN.
Personal videos often contain visual distractors, which are objects that are accidentally captured that can distract viewers from focusing on the main subjects. We propose a method to automatically detect and localize these distractors through learning from a manually labeled dataset. To achieve spatially and temporally coherent detection, we propose extracting features at the Temporal-Superpixel (TSP) level using a traditional SVM-based learning framework. We also experiment with end-to-end learning using Convolutional Neural Networks (CNNs), which achieves slightly higher performance than other methods. The classification result is further refined in a post-processing step based on graph-cut optimization. Experimental results show that our method achieves an accuracy of 81% and a recall of 86%. We demonstrate several ways of removing the detected distractors to improve the video quality, including video hole filling; video frame replacement; and camera path re-planning. The user study results show that our method can significantly improve the aesthetic quality of videos.