
# An Alternating Direction Algorithm for Matrix Completion with Nonnegative Factors

**Article**

*in* Frontiers of Mathematics in China 7(2) · March 2011


DOI: 10.1007/s11464-012-0194-5 · Source: arXiv

**Abstract**

This paper introduces an algorithm for the nonnegative matrix
factorization-and-completion problem, which aims to find nonnegative low-rank
matrices X and Y so that the product XY approximates a nonnegative data matrix
M whose elements are partially known (to a certain accuracy). This problem
aggregates two existing problems: (i) nonnegative matrix factorization, where
all entries of M are given, and (ii) low-rank matrix completion, where
nonnegativity is not required. By taking advantage of both nonnegativity
and low-rankness, one can generally obtain results superior to those obtained
by exploiting only one of the two properties. We propose to solve this
non-convex constrained least-squares problem with an algorithm based on the
classic alternating direction augmented Lagrangian method. Preliminary
convergence properties of the algorithm and numerical simulation results are
presented. Compared to a recent algorithm for nonnegative matrix
factorization, the proposed algorithm produces factorizations of similar
quality using only about half of the matrix entries. On tasks of recovering
incomplete grayscale and hyperspectral images, the proposed algorithm yields
overall better reconstruction quality than two recent matrix-completion
algorithms that do not exploit nonnegativity.
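The model in the abstract, finding nonnegative X and Y whose product XY agrees with M on the observed entries, can be illustrated with a minimal NumPy sketch. This is only a simplification for intuition: it uses projected alternating least squares with a data-consistency step, not the paper's exact alternating direction augmented Lagrangian updates, and the function name and parameters are illustrative.

```python
import numpy as np

def nmfc_sketch(M, mask, r, iters=200, eps=1e-8, seed=0):
    """Find X, Y >= 0 with X @ Y close to M on the observed entries (mask).

    Simplified stand-in for the paper's alternating direction scheme:
    each factor is updated by a regularized least-squares solve followed
    by projection onto the nonnegative orthant, and the completed matrix
    Z is re-imputed with the observed data after every sweep."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    X = rng.random((m, r))
    Y = rng.random((r, n))
    Z = np.where(mask, M, 0.0)  # current estimate of the completed matrix
    for _ in range(iters):
        X = np.maximum(Z @ Y.T @ np.linalg.inv(Y @ Y.T + eps * np.eye(r)), 0.0)
        Y = np.maximum(np.linalg.inv(X.T @ X + eps * np.eye(r)) @ X.T @ Z, 0.0)
        Z = X @ Y
        Z[mask] = M[mask]  # keep the known entries fixed
    return X, Y

# usage: recover a nonnegative rank-2 matrix from roughly 60% of its entries
rng = np.random.default_rng(1)
M = rng.random((20, 2)) @ rng.random((2, 15))  # true nonnegative, rank 2
mask = rng.random(M.shape) < 0.6               # observed-entry pattern
X, Y = nmfc_sketch(M, mask, r=2)
rel_err = np.linalg.norm(X @ Y - M) / np.linalg.norm(M)
```

With nonnegative ground-truth factors and enough observed entries, even this simplified alternation typically drives the relative error on the full matrix down to a small value, which is the behavior the paper exploits with its more careful ADM updates.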

- ... We then obtain a (raw) matrix that is transformed into the incomplete H_exp, that is, into spatial coordinates (x, y) vs. spectral coordinates (ν) (see the missing boxes in Fig. 1.C). By using this random hyperspectrum fast sampling methodology, H_exp can be readily processed using off-the-shelf algorithms [21,23]. The algorithm then outputs a singular value decomposition (SVD) of H, which we use for post-processing in a conventional manner used in Raman imaging. ... Alternatively, we also use non-negative matrix completion algorithms [23] based on the factorization H = XY, where X and Y are non-negative matrices, motivated by the fact that Raman spectra are necessarily non-negative. Hence, this constraint may help to get a better reconstruction. ... Different algorithms were used for the analysis. They are based on either soft-threshold SVD [27] or non-negative matrix factorization [23]. In general, we chose reconstructions with ranks between 2-5, as previous results have suggested [3]. ... Preprint · Full-text available
- Nov 2018

Raman microscopy is a powerful method combining non-invasiveness with no special sample preparation. Because of this remarkable simplicity, it has been widely exploited in many fields, ranging from life and materials sciences, to engineering. Notoriously, due to the required imaging speeds for bio-imaging, it has remained a challenge how to use this technique for dynamic and large-scale imaging. Recently, compressive Raman has been put forward, allowing for fast imaging, therefore solving the issue of speed. Yet, due to the need of strong a priori information of the species forming the hyperspectrum, it has remained elusive how to apply this technique for microspectroscopy of (dynamic) biological tissues. Combining an original spectral under-sampling measurement technique with matrix completion framework for reconstruction, we demonstrate fast and inexpensive label-free molecular imaging of biological specimens (brain tissues and single cells). Therefore, our results open interesting perspectives for clinical and cell biology applications using the much faster compressive Raman framework. - ... NMF finds applications in many domains such as blind source separation [39], low-rank matrix completion [30], [40], recommendation system/collaborative filtering [41], classifi- cation/clustering [42], or dictionary learning [43]. It is also increasingly used in environmental monitoring [44]. ...... using an Alternating Direction Method (ADM) [40], multi- plicative updates [41], or a Nesterov gradient technique [30], respectively 3 . The product of the estimated matrices˜Amatrices˜ matrices˜A · ˜ S then provides an estimation X comp of the completed version of X which satisfies both the known rank and the nonnegative decomposition of X. ...... In the remainder of this paper, we investigate the perfor- mance of the low-rank-based approach using NMF/C [40], and of the IN-cal, ACIN-cal, and SpIN-cal methods with respect to different parameter values. 
...ArticleFull-text available
- Mar 2018

In this paper, we consider the problem of blindly calibrating a mobile sensor network, i.e., determining the gain and the offset of each sensor from heterogeneous observations on a defined spatial area over time. For that purpose, we propose to revisit blind sensor calibration as an informed Nonnegative Matrix Factorization (NMF) problem with missing entries. In the considered framework, one matrix factor contains the calibration structure of the sensors, and especially the values of the sensed phenomenon, while the other contains the calibration parameters of the whole sensor network. The available information is taken into account by using a specific parameterization of the NMF problem. Moreover, we also consider additional NMF constraints which can be independently taken into account, i.e., an approximate constraint over the mean calibration parameters and a sparse approximation of the sensed phenomenon over a known dictionary. The performance of our proposed approaches is investigated through more than 5000 simulations and is shown to be accurate for the considered application and to outperform a multi-hop micro-calibration technique as well as a method based on low-rank matrix completion and nonnegative least squares. - ... The matrix X can therefore be written as the product of two nonnegative matrices (4.3). It is then possible to solve the completion problem (4.11) by a matrix-completion approach based on nonnegative matrix factorization (NMF/C, for Nonnegative Matrix Factorization/Completion) [234,230,52], which consists in solving it ... using an Alternating Direction Method (ADM) [230], multiplicative updates [234], or a Nesterov gradient descent [52]. Multiplying the two resulting matrices A and S reconstructs a complete matrix X_comp that satisfies the low-rank property of X and its decomposition into a product of nonnegative matrices. ...
Construct the matrix X_comp by a low-rank matrix completion approach, for example one based on truncated SVD [7] or NMF/C [230]. ... Thesis · Full-text available
- Dec 2017

Mobile crowdsensing consists in acquiring geolocated and timestamped data from a crowd of mobile sensors (embedded in or connected to smartphones). In this thesis, we focus on processing data from environmental mobile crowdsensing. In particular, we propose to revisit the blind sensor calibration problem as an informed matrix factorization problem with missing entries, where the factors respectively contain the calibration model as a function of the observed physical phenomenon (we propose approaches for affine and nonlinear models) and the calibration parameters of each sensor. Moreover, in the air-quality monitoring application that we consider, we assume access to highly accurate measurements that are very sparsely distributed in time and space, which we couple with the many measurements from mobile sensors. Our approaches are said to be informed because (i) the matrix factors are structured by the nature of the problem, (ii) the observed phenomenon can be sparsely decomposed over a known dictionary or approximated by a physical/geostatistical model, and (iii) we know the mean calibration function of the sensors to be calibrated. The proposed approaches outperform methods based on completing the observed data matrix as well as the multi-hop techniques from the literature based on robust regressions. Finally, the informed matrix factorization formalism also allows us to reconstruct a fine-grained map of the observed physical phenomenon. - ... To avoid poor MC recovery due to choosing the "wrong" rank, we present a strategy that selects a range of ranks in an automated fashion and aggregates the recovered entries over ranks in a robust manner. Further, standard MC algorithms do not impose a non-negative constraint on matrix entries [11], but fiber counts are non-negative.
We thus devise a method that exploits neighbourhood information to interpolate the negative entries. ... For AC estimation, negative recovered entries are not biologically interpretable, since fiber counts cannot be negative. Explicitly imposing a non-negative constraint onto MC is possible [11], but we observe empirically that such a constraint tends to decrease the recovery accuracy (Fig. 3). Instead, we devise a simple method to exploit those few entries which are estimated as negative based on neighborhood information. ... For comparison, we examined interpolating AC entries with zero values by median filtering (MedFilter), with neighborhood information as described in Sect. 2.3 (NeighFill), and three other widely-used rank-based MC algorithms: OptSpace [10], GROUSE [16], and MCNF [11], which imposes a non-negative constraint. The vanilla LMaFit was also tested. ...
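The neighborhood interpolation described in this excerpt, replacing the few negative recovered entries using nearby nonnegative values, can be sketched as follows. The window size, the use of a median, and the function name are illustrative assumptions rather than the cited paper's exact procedure.

```python
import numpy as np

def fill_negatives_with_neighbors(Z, k=1):
    """Replace each negative entry of Z by the median of the nonnegative
    entries in its (2k+1) x (2k+1) neighborhood (fallback: 0.0)."""
    out = Z.copy()
    m, n = Z.shape
    for i, j in zip(*np.where(Z < 0)):
        win = Z[max(i - k, 0):min(i + k + 1, m), max(j - k, 0):min(j + k + 1, n)]
        vals = win[win >= 0]  # only nonnegative neighbors contribute
        out[i, j] = np.median(vals) if vals.size else 0.0
    return out

# usage: one negative entry surrounded by nonnegative neighbors
Z = np.array([[1.0, 2.0, 3.0],
              [4.0, -1.0, 6.0],
              [7.0, 8.0, 9.0]])
Zf = fill_negatives_with_neighbors(Z)  # the -1 is replaced by the window median
```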
- ... However, as directly optimizing problem (1) is NP-hard, many methods instead minimize the nuclear norm or use low-rank matrix factorization to efficiently enhance the low-rankness of the underlying results. In addition, once the nuclear norm or the low-rank matrix factorization is selected, this minimization problem can be efficiently solved by fixed point continuation with approximate singular value decomposition (FPCA) [29], the accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems (APGL) [38], the low-rank matrix fitting algorithm (LMaFit) [41], and the alternating direction method of multipliers (ADMM) [36,46]. ... can be easily solved by the alternating direction method (ADM) [6,9,32,36,46,49]. In the following, we solve the X_3-subproblem. ...
- ... Therefore, we solve the problem similarly to an alternating minimization approach where the cost function is minimized with respect to either Θ or B, one at a time, while fixing the other, and then enforce the relevant constraints after each minimization step. This method is similar to the alternating least squares (ALS) algorithm utilized in [170] that has been widely used for non-negative matrix factorization and similar literature [171]. We divide the problem into two minimization steps as below: ... Similar to the methods used in [159,168], we use an alternating minimization approach where the cost function is minimized with respect to either µ_k or H^SF_k, one at a time, while fixing the other. This method is similar to the alternating least squares (ALS) algorithm utilized in [170] that has been widely used for non-negative matrix factorization and similar literature [171]. ... Thesis · Full-text available
- Mar 2017

Smart grid cyber-security has come to the forefront of national security priorities due to the emergence of new cyber threats such as the False Data Injection (FDI) attack. Using FDI, an attacker can intelligently modify smart grid measurement data to produce wrong system states, which can directly affect the safe operation of the physical grid. The goal of this thesis is to investigate key research problems leading to the discovery of significant vulnerabilities and their impact on smart grid operation. The first problem investigates how a stealthy FDI attack can be constructed without knowledge of system parameters, e.g., line reactance, bus and line connectivity. We show how an attacker can successfully carry out an FDI attack by analysing subspace information of the measurement data without requiring system topological knowledge. In addition, we make a critical observation that existing subspace-based attacks would fail in the presence of gross errors and missing values in the observed data. Next, we show how an attacker can circumvent this problem by using a sparse matrix separation technique. Extensive evaluation on several benchmark systems demonstrates the effectiveness of this approach. The second problem addresses the scenario where an attacker may eavesdrop but only has access to a limited number of measurement devices to inject false data. We show how an attack can be constructed by first estimating the hidden system topology from measurement data only and then using it to identify a set of critical sensors for data injection. Extensive experiments using graph-theoretic and eigenvalue analyses demonstrate that the estimated power grid structure is very close to the original grid topology, and a stealthy FDI attack can be carried out using only a small fraction of all available sensors.
The third problem investigates a new type of stealthy Load Redistribution (LR) attack using FDI which can deliberately cause changes in the Locational Marginal Price (LMP) of smart grid nodes. To construct the LR-FDI attack, the shift factor is estimated from measurement and LMP data. Finally, the impact of the attacks on the state estimation and the nodal energy prices is thoroughly investigated. - ... Matrix factorization (MF) [7] is a most successful model-based method that achieves considerable performance when applied to real-world domains. The basic idea of MF is to map items and users into the same latent factor space. ... However, these methods are inappropriate for CF-based recommenders because of the incompleteness of rating matrices [14]. Thus, the authors of [7] proposed a matrix completion model (called NMC) to predict unknown ratings by using matrix completion algorithms. Furthermore, in industrial applications, we are mostly faced with sparse and large matrices. ... Conference Paper · Full-text available
- May 2018

- ... By taking advantage of both low-rankness and nonnegative factors, Xu et al. [17] showed that superior results can generally be obtained, compared with using just one of the two aspects, for nonnegative matrix completion problems. Following the spirit of [17], we, in this paper, consider the nonnegative tensor completion problem for nonnegative data, and we solve it by performing nonnegative Tucker decomposition, which decomposes the high-order tensor into a low-rank core tensor multiplied in every mode by a matrix, where the core tensor and the matrices are all nonnegative. A Tucker decomposition with a rank of (r_1, r_2, ...) ... Article
- Jul 2019

We consider the problem of low-rank tensor decomposition of incomplete tensors, which has applications in many data analysis problems such as recommender systems, signal processing, machine learning, and image inpainting. In this paper, we focus on nonnegative tensor completion via low-rank Tucker decomposition. The speciality of our model is that the ranks of the nonnegative Tucker decomposition are no longer constants; they all become part of the decisions to be optimized. Our solution approach is based on the penalty method and a block coordinate descent method with prox-linear updates for regularized multiconvex optimization. We demonstrate the convergence of our algorithm. Numerical results on three image datasets show that the proposed algorithm offers competitive performance compared with other existing algorithms even though the data is highly sparse. - ... A commonly used assumption in showing the convergence of BCD methods in the literature is that the gradient of f is globally Lipschitz-continuous. However, this can be a restrictive assumption, violated in diverse practical applications such as matrix factorization [12], [13], tensor decomposition [14], matrix/tensor completion [15], Poisson likelihood models [16], etc. Although this assumption may be relaxed by adopting conventional line search methods, the efficiency and computational complexity of the BCD methods are unavoidably affected, especially when the size of the problem is large. ... Preprint
- Dec 2019

In the applications of signal processing and data analytics, there is a wide class of non-convex problems whose objective function is free of the common global Lipschitz continuous gradient assumption (e.g., the nonnegative matrix factorization (NMF) problem). Recently, this type of problem with certain special structures has been solved by Bregman proximal gradient (BPG). This inspires us to propose a new Block-wise two-references Bregman proximal gradient (B2B) method, which adopts two reference functions so that a closed-form solution in the Bregman projection is obtained. Based on relative smoothness, we prove the global convergence of the proposed algorithms for various block selection rules. In particular, we establish the global convergence rate of $O(\frac{\sqrt{s}}{\sqrt{k}})$ for the greedy and randomized block updating rules for B2B, which is $O(\sqrt{s})$ times faster than the cyclic variant, i.e., $O(\frac{s}{\sqrt{k}})$, where $s$ is the number of blocks and $k$ is the number of iterations. Multiple numerical results are provided to illustrate the superiority of the proposed B2B compared to state-of-the-art works in solving NMF problems. - ... is a bounded box with ℓ_i < u_i for all i, and the matrix A has dimension R^{m×n}. Problems of the form (1.1) arise in many applications involving big data, including nonnegative matrix factorization [9,20,23], phase retrieval [22], distributed matrix factorization [16], polynomial optimization [14], asset allocation [21], and zero-variance discriminant analysis [1]. A popular approach to solve problem (1.1) is to dualize the linear equality constraint and apply a primal-dual type algorithm to the resulting augmented Lagrangian function. ... Preprint
- Dec 2018

Consider the minimization of a nonconvex differentiable function over a polyhedron. A popular primal-dual first-order method for this problem is to perform a gradient projection iteration for the augmented Lagrangian function and then update the dual multiplier vector using the constraint residual. However, numerical examples show that this approach can exhibit "oscillation" and may not converge. In this paper, we propose a proximal alternating direction method of multipliers for the multi-block version of this problem. A distinctive feature of this method is the introduction of a "smoothed" (i.e., exponentially weighted) sequence of primal iterates, and the inclusion, at each iteration, in the augmented Lagrangian function of a quadratic proximal term centered at the current smoothed primal iterate. The resulting proximal augmented Lagrangian function is inexactly minimized (via a gradient projection step) at each iteration while the dual multiplier vector is updated using the residual of the linear constraints. When the primal and dual stepsizes are chosen sufficiently small, we show that suitable "smoothing" can stabilize the "oscillation", and the iterates of the new proximal ADMM algorithm converge to a stationary point under some mild regularity conditions. Furthermore, when the objective function is quadratic, we establish the linear convergence of the algorithm. Our proof is based on a new potential function and a novel use of error bounds. - ... The alternating scheme used in Algorithm 1 and Algorithm 2 is a typical multi-block ADMM optimization problem, whose convergence is theoretically supported in [38], [39]. Moreover, similar work solving this sort of multi-block ADMM-based optimization problem has been successfully applied in [40], [41], [42]. We experimentally visualize the convergence results for ALMM-based SU and ALMM-based SVDL on the three datasets, where the objective function value is recorded in each iteration (see Fig. 5).
... Preprint · Full-text available
- Oct 2018

Hyperspectral imagery collected from airborne or satellite sources inevitably suffers from spectral variability, making it difficult for spectral unmixing to accurately estimate abundance maps. The classical unmixing model, the linear mixing model (LMM), generally fails to handle this sticky issue effectively. To this end, we propose a novel spectral mixture model, called the augmented linear mixing model (ALMM), to address spectral variability by applying a data-driven learning strategy in inverse problems of hyperspectral unmixing. The proposed approach models the main spectral variability (i.e., scaling factors) generated by variations in illumination or topography separately by means of the endmember dictionary. It then models other spectral variabilities caused by environmental conditions (e.g., local temperature and humidity, atmospheric effects) and instrumental configurations (e.g., sensor noise), as well as material nonlinear mixing effects, by introducing a spectral variability dictionary. To effectively run the data-driven learning strategy, we also propose a reasonable prior for the spectral variability dictionary, whose atoms are assumed to be low-coherent with spectral signatures of endmembers, which leads to a well-known low-coherence dictionary learning problem. Thus, a dictionary learning technique is embedded in the framework of spectral unmixing so that the algorithm can learn the spectral variability dictionary and estimate the abundance maps simultaneously. Extensive experiments on synthetic and real datasets are performed to demonstrate the superiority and effectiveness of the proposed method in comparison with previous state-of-the-art methods. - ... In this section, we discuss the convergence of the ADMM-Factorization algorithm to a KKT point as in Xu et al. (2012). The KKT conditions for (3) are as follows: ... Article · Full-text available
- Dec 2019

In this paper, we propose an Alternating Direction Method of Multipliers (ADMM)-based algorithm that takes advantage of factorization for the fixed-rank matrix completion problem. The convergence of the proposed algorithm to a KKT point is discussed. Finally, on several classes of test problems, its efficiency is compared with several efficient algorithms from the literature. - ... The ADMM algorithm is particularly suitable for handling matrix completion problems with additional constraints. Similar to general matrix completion, ADMM has been used to address a model equivalent to the non-negative matrix completion model (Formula (23)) by introducing a separation matrix variable [54,55]: ... Article
- Dec 2018

In recent years, recommendation systems have become increasingly popular and have been used in a broad variety of applications. Here, we investigate the matrix completion techniques for recommendation systems that are based on collaborative filtering. The collaborative filtering problem can be viewed as predicting the favorability of a user with respect to new items. When a rating matrix is constructed with users as rows, items as columns, and entries as ratings, the collaborative filtering problem can be modeled as a matrix completion problem by filling out the unknown elements in the rating matrix. This article presents a comprehensive survey of the matrix completion methods used in recommendation systems. We focus on the mathematical models for matrix completion and the corresponding computational algorithms as well as their characteristics and potential issues. Several applications other than the traditional user-item association prediction are also discussed. - ... For the ADMM-style optimization approach, it is difficult to prove a strong convergence property with more than two unknown variables. Fortunately, we can prove a weak convergence property of the proposed method based on the following theorem [42,49]. Theorem 1. ... Article · Full-text available
- Dec 2019

In this paper, we propose a general framework for incomplete multiview clustering. The proposed method is the first work that exploits graph learning and spectral clustering techniques to learn the common representation for incomplete multiview clustering. First, owing to the good performance of low-rank representation in discovering the intrinsic subspace structure of data, we adopt it to adaptively construct the graph of each view. Second, a spectral constraint is used to achieve the low-dimensional representation of each view based on spectral clustering. Third, we further introduce a co-regularization term to learn the common representation of samples for all views, and then use k-means to partition the data into their respective groups. An efficient iterative algorithm is provided to optimize the model. Experimental results conducted on seven incomplete multiview datasets show that the proposed method achieves the best performance in comparison with some state-of-the-art methods, which proves the effectiveness of the proposed method in incomplete multiview clustering. - ... Instead of the multiplicative updates, the Alternating Direction Method of Multipliers (ADMM) [15] was used, which considerably improves the convergence properties. ADMM, also known as Douglas-Rachford splitting, has recently become very popular in machine learning and image processing [16,17], including NMF problems [14,18-20], matrix completion [21,22], and low-rank matrix decomposition [23]. ... Chapter · Full-text available
- Jan 2018

- ... Xu et al. in [19] proposed nonnegative matrix completion (NMC) in order to solve the sparsity problem, where the aim is to introduce an intermediate, complete matrix to estimate a target rating matrix, and to use an NNMF process on this matrix instead of the incomplete rating matrix to avoid addressing its missing entries. Though these models deal well with the incomplete matrix in a CF problem, they have the drawback of high computational and storage costs, which grow linearly with the size of the target matrix. ... Article
- Apr 2018

Recent advancements in business strategies have marked the significance of e-commerce in marketing any service of an organization. Moreover, users are quite dependent on the average ratings of the products showcased in the marketing interface; in turn, these average ratings have a remarkable impact on the sales of the product. The average rating of a product is the aggregation of individual users' ratings, biased by the tendency of each user when publishing an opinion. An optimistic user tends to give a slightly higher rating than a neutral judgement, and vice versa for a pessimistic user. However, these biased ratings produce an aggregate value whose trustworthiness is degraded. This paper proposes a novel approach named the DBT (De-biased Tendency) Recommender to analyze the bias in product ratings, which recalculates the average ratings of the products by making user tendencies part of the process. The solution is implemented in a big-data environment owing to the high computational complexity involved in the process. Experimental results show a significant improvement in the trustworthiness of the product ratings with the proposed approach. - ... Here, convergence of Algorithm 1 is examined when all the measurements are available, {G_n^A = G_n}_n and X^A = X; the extension to the case with misses is straightforward [17]. ... Preprint · Full-text available
- Sep 2018

Joint analysis of data from multiple information repositories facilitates uncovering the underlying structure in heterogeneous datasets. Single and coupled matrix-tensor factorization (CMTF) has been widely used in this context for imputation-based recommendation from ratings, social network, and other user-item data. When this side information is in the form of item-item correlation matrices or graphs, existing CMTF algorithms may fall short. Alleviating current limitations, we introduce a novel model coined coupled graph-tensor factorization (CGTF) that judiciously accounts for graph-related side information. The CGTF model has the potential to overcome practical challenges, such as missing slabs from the tensor and/or missing rows/columns from the correlation matrices. A novel alternating direction method of multipliers (ADMM) is also developed that recovers the nonnegative factors of CGTF. Our algorithm enjoys closed-form updates that result in reduced computational complexity and allow for convergence claims. A novel direction is further explored by employing the interpretable factors to detect graph communities having the tensor as side information. The resulting community detection approach is successful even when some links in the graphs are missing. Results with real data sets corroborate the merits of the proposed methods relative to state-of-the-art competing factorization techniques in providing recommendations and detecting communities. - ... Thus, in practice we need to deal with missing values that are included in spatio-temporal data. Non-negative tensor completion (NTC) [14], which was proposed for missing value completion, is an extension of non-negative tensor factorization (NTF) that simulta- neously extracts latent factors from observed values and infers the missing values. Tensor is a generalization of a matrix that can represent a high-dimensional data array without any loss of data attributes. ...ArticleFull-text available
- Mar 2018

Machine learning is a promising technology for analyzing diverse types of big data. The Internet of Things era will feature the collection of real-world information linked to time and space (location) from all sorts of sensors. In this paper, we discuss spatio-temporal multidimensional collective data analysis to create innovative services from such spatio-temporal data and describe the core technologies for the analysis. We describe core technologies for smart data collection and spatio-temporal data analysis and prediction, as well as a novel approach for real-time, proactive navigation in crowded environments such as event spaces and urban areas. Our challenge is to develop a real-time navigation system that enables movements of entire groups to be efficiently guided without causing congestion by making near-future predictions of people flow. We show the effectiveness of our navigation approach by computer simulation using artificial people-flow data. - ... In this section, we propose two block coordinate descent methods for problems (P_m) and (P_c), respectively. Block coordinate descent methods, also known as alternating direction methods, have been successfully applied to convex programming and some non-convex optimization problems arising from image processing and matrix optimization (see, e.g., Goldstein 2009, Yin et al. 2008, He et al. 2012, Xu et al. 2012, Shen et al. 2014). The idea of the block coordinate descent method is to alternately fix some variables in the augmented Lagrangian formulation of (P_m) or (P_c) and solve the resulting more tractable subproblems at each iteration of the algorithm. ... Article
- Aug 2018
- INFORMS J COMPUT

In this paper, we investigate a portfolio optimization methodology using nonparametric value at risk (VaR). In particular, we adopt kernel VaR and quadratic VaR as risk measures. As the resulting models are nonconvex and nonsmooth optimization problems, albeit with some special structures, we propose some specially devised block coordinate descent (BCD) methods for finding approximate or local optimal solutions. Computational results show that the BCD methods are efficient for finding local solutions with good quality and they compare favorably with the branch-and-bound based global optimization procedures. From the simulation test and empirical analysis that we carry out, we are able to conclude that the mean-VaR models using kernel VaR and quadratic VaR are more robust compared to those using historical VaR or parametric VaR under the normal distribution assumption, especially when the information of the return distribution is limited. - ... Many methods have been proposed for NMF, to name a few: multiplicative updates [9], alternating non-negative least squares (ANLS) [7], accelerated alternating proximal gradient (AAPG) [23], projected gradient [13], projected quasi-Newton [6], the block principal pivoting method [8], and the alternating direction method of multipliers (ADMM) [25]. Among these, multiplicative updates is the first well-known algorithm for NMF, and it is often used as a baseline against which new algorithms are compared. ...Preprint
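Several snippets above invoke the same two-block alternating-direction template: fix one block of variables in the augmented Lagrangian, solve the resulting tractable subproblem in closed form, then update the multiplier. As a generic illustration of that template (a textbook instance, not the algorithm of any particular paper listed here), ADMM for nonnegative least squares, min ||Ax − b||² subject to x ≥ 0, alternates a closed-form least-squares step with a projection onto the nonnegative orthant:

```python
import numpy as np

def admm_nnls(A, b, rho=1.0, iters=500):
    """ADMM for min ||A x - b||^2 subject to x >= 0, via the splitting
    x = z with z constrained to the nonnegative orthant."""
    n = A.shape[1]
    AtA = A.T @ A
    Atb = A.T @ b
    # Factor (A^T A + rho I) once; every x-update reuses it.
    L = np.linalg.cholesky(AtA + rho * np.eye(n))
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable
    for _ in range(iters):
        # x-update: unconstrained least-squares subproblem (closed form)
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        # z-update: projection onto the nonnegative orthant
        z = np.maximum(x + u, 0.0)
        # dual (multiplier) update
        u = u + x - z
    return z
```

Both subproblems are cheap here, which is the property the abstracts above repeatedly exploit when they report closed-form updates.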
- Feb 2019

In this paper, we propose a right kernel NMF (RKNMF) method for matrix factorization. Different from kernel NMF (KNMF), which applies the kernel method on the left-hand side of the factorization, RKNMF applies the kernel method on the right-hand side. KNMF lifts the original features to a higher-dimensional space by a nonlinear map to excavate the structures hidden in the data, while RKNMF aims to excavate the nonlinear structures between the bases and the reduced-dimensional representation. Moreover, we present a double kernel NMF (DKNMF) method which is a combination of KNMF and RKNMF. In addition, practical algorithms are presented for RKNMF and DKNMF. The numerical experiments show that RKNMF outperforms NMF on the task of image reconstruction, and RKNMF and DKNMF outperform NMF and KNMF on the task of extracting features. - ... Apart from multiplicative updates, NMF algorithms based on the alternating direction method of multipliers (ADMM) were recently proposed [43] for their ability to perform distributed computations on large-scale data; in particular, Sun and Févotte introduced an approach based on the β-divergence [22], while Zhu and Honeine [24] proposed a correntropy-based approach for large deviations. Such fast approaches are not required for the considered chemical application, where the global computation time is not an issue. ...In this paper, we propose informed weighted non-negative matrix factorization (NMF) methods using an αβ-divergence cost function. The available information comes from the exact knowledge/boundedness of some components of the factorization, which are used to structure the NMF parameterization, together with the row sum-to-one property of one matrix factor. In this contribution, we extend our previous work, which partly involved some of these aspects, to αβ-divergence cost functions. We derive new update rules which extend the previous ones and take the available information into account.
Experiments conducted for several operating conditions on realistic simulated mixtures of particulate matter sources show the relevance of these approaches. Results from a real dataset campaign are also presented and validated with expert knowledge.
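For reference, the multiplicative-update rules mentioned repeatedly above can be stated compactly. The sketch below is the classic Lee-Seung iteration for the plain Frobenius-norm objective ||M − XY||²_F (an illustrative baseline, not the αβ-divergence update rules derived in the work above): each step multiplies the current factor entrywise by a nonnegative ratio, so nonnegativity is preserved automatically and the objective is nonincreasing.

```python
import numpy as np

def nmf_multiplicative(M, r, iters=500, eps=1e-10, seed=0):
    """Classic Lee-Seung multiplicative updates for min ||M - X @ Y||_F^2
    with X >= 0 of shape (m, r) and Y >= 0 of shape (r, n)."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    X = rng.random((m, r)) + eps
    Y = rng.random((r, n)) + eps
    for _ in range(iters):
        Y *= (X.T @ M) / (X.T @ X @ Y + eps)  # entrywise ratio keeps Y >= 0
        X *= (M @ Y.T) / (X @ Y @ Y.T + eps)  # entrywise ratio keeps X >= 0
    return X, Y
```

Because the update is a ratio of nonnegative matrix products, no projection step is needed; this simplicity is why multiplicative updates remain the standard baseline against which the newer ADMM- and BCD-based methods above are compared.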
- ... Hence, by Lemma 5.3 applied to f_y and the sequences {Y_s} and {v_s^y} ... Remark 6.10. The assumption that the successive differences U_i − U_i^+ converge to 0 is used in analyses of nonconvex ADMM such as [11,22]. Corollary 6.9 shows that this is a very strong assumption: it alone implies that every limit point of ADMM is a constrained stationary point, even when f and C only satisfy Assumptions 3 and 4. ...Article
- Feb 2018
- OPTIM METHOD SOFTW

We propose a significant expansion of the scope of the alternating direction method of multipliers (ADMM). Specifically, we show that ADMM, when employed to solve problems with multiaffine constraints that satisfy certain easily verifiable assumptions, converges to the set of constrained stationary points if the penalty parameter in the augmented Lagrangian is sufficiently large. Our analysis applies under assumptions that we have endeavored to make as weak as possible. It applies to problems that involve nonconvex and/or nonsmooth objective terms, in addition to the multiaffine constraints that can involve multiple (three or more) blocks of variables. To illustrate the applicability of our results, we describe examples including nonnegative matrix factorization, sparse learning, risk parity portfolio selection, nonconvex formulations of convex problems, and neural network training. In each case, our ADMM approach encounters only subproblems that have closed-form solutions. - ... Low rank decompositions designed for real matrices, on the other hand, tend to be SVD-like and orthogonal, with negative entries, and it is less clear how to interpret these. Low rank matrix completion with nonnegativity constraints, for example in [27], enforces non-negativity of the factors, but does not address the errors induced by rounding non-integer values. Matrix completion algorithms were proposed in [26,24] which obtain decompositions in which one factor is composed of rows of the matrix and is consequently binary, although the other factor is generally non-integer. ...Preprint, full-text available
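The snippet above points at the combination that the surveyed article itself targets: completing a partially observed nonnegative matrix with nonnegative low-rank factors X and Y. One simple way to realize this (a weighted variant of the multiplicative updates, offered as an illustrative sketch rather than the algorithm of [27] or of the surveyed article) is to restrict the least-squares fit to the observed entries via a binary mask:

```python
import numpy as np

def masked_nmf_complete(M, mask, r, iters=1500, eps=1e-10, seed=0):
    """Fit nonnegative factors X (m x r) and Y (r x n) so that X @ Y matches
    M on the observed entries (mask == 1), ignoring the missing ones."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    X = rng.random((m, r)) + eps
    Y = rng.random((r, n)) + eps
    Mo = mask * M  # observed data, zeros at missing positions
    for _ in range(iters):
        P = mask * (X @ Y)                  # current fit on observed entries
        Y *= (X.T @ Mo) / (X.T @ P + eps)   # weighted multiplicative update
        P = mask * (X @ Y)
        X *= (Mo @ Y.T) / (P @ Y.T + eps)
    return X, Y
```

The product X @ Y then provides estimates for the missing entries, which is exactly the "factorization-and-completion" use case: the mask supplies the completion aspect, the entrywise ratios supply the nonnegativity.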
- Jul 2019

We propose an algorithm for low rank matrix completion for matrices with binary entries which obtains explicit binary factors. Our algorithm, which we call TBMC (Tiling for Binary Matrix Completion), gives interpretable output in the form of binary factors which represent a decomposition of the matrix into tiles. Our approach is inspired by a popular algorithm from the data mining community called PROXIMUS: it adopts the same recursive partitioning approach while extending to missing data. The algorithm relies upon rank-one approximations of incomplete binary matrices, and we propose a linear programming (LP) approach for solving this subproblem. We also prove a 2-approximation result for the LP approach which holds for any level of subsampling and for any subsampling pattern. Our numerical experiments show that TBMC outperforms existing methods on recommender systems arising in the context of real datasets. - ... Thus, a practical algorithm usually converges to a local minimum. Many algorithms have been proposed to solve NMF, such as multiplicative updates (MU) (Lee and Seung 2001), hierarchical alternating least squares (HALS) (Cichocki, Zdunek, and Amari 2007; Li and Zhang 2009), the alternating direction method of multipliers (ADMM) (Zhang 2010), and alternating nonnegative least squares (ANLS) (Kim and Park 2011). Among those algorithms, ANLS has the largest reduction of objective value per iteration, since it exactly solves the nonnegative least squares (NNLS) subproblems using a block principal pivoting (BPP) method (Kim and Park 2011). ...Article, full-text available
- Feb 2018

Nonnegative matrix factorization (NMF) has attracted much attention in the last decade as a dimension reduction method in many applications. Due to the explosion in the size of data, the samples are naturally collected and stored distributively in local computational nodes. Thus, there is a growing need to develop algorithms for a distributed memory architecture. We propose a novel distributed algorithm, called distributed incremental block coordinate descent (DID), to solve the problem. By adapting the block coordinate descent framework, closed-form update rules are obtained in DID. Moreover, DID performs updates incrementally based on the most recently updated residual matrix. As a result, only one communication step per iteration is required. The correctness, efficiency, and scalability of the proposed algorithm are verified in a series of numerical experiments. - ... We introduce an alternating optimization algorithm to solve problem (10) (Shen, Wen, and Zhang 2014; Xu et al. 2012). According to Ky Fan's theorem (Nie, Cai, and Li 2017; Fan 1949) and the constraint U^(v)ᵀU^(v) = I, problem (10) is equivalent to the following optimization problem: ...Article
- Jul 2019

Multi-view clustering aims to partition data collected from diverse sources based on the assumption that all views are complete. However, such a prior assumption is hardly satisfied in many real-world applications, resulting in the incomplete multi-view learning problem. The existing attempts on this problem still have the following limitations: 1) the underlying semantic information of the missing views is commonly ignored; 2) the local structure of data is not well explored; 3) the importance of different views is not effectively evaluated. To address these issues, this paper proposes a Unified Embedding Alignment Framework (UEAF) for robust incomplete multi-view clustering. In particular, a locality-preserved reconstruction term is introduced to infer the missing views such that all views can be naturally aligned. A consensus graph is adaptively learned and embedded via the reverse graph regularization to guarantee the common local structure of multiple views and in turn can further align the incomplete views and inferred views. Moreover, an adaptive weighting strategy is designed to capture the importance of different views. Extensive experimental results show that the proposed method can significantly improve the clustering performance in comparison with some state-of-the-art methods. - ... Here, convergence of Algorithm 1 is examined when all the measurements are available, {G_n^A = G_n}_n and X^A = X; the extension to the case with misses is straightforward [23]. ...
- ... Baselines. Our model is compared with 8 state-of-the-art methods, including GROUSE [1], IALM [6], LMaFit [12], MC-NMF [13], OR1MP [11], RMAMR [14], ScGrassMC [9], and multiNMF [7]. Parameters are initialized as suggested in [10]. ...Conference Paper
- Nov 2019

Collective matrix completion refers to the problem of simultaneously predicting the missing entries in multiple matrices by leveraging the cross-matrix information. It finds abundant applications in various domains such as recommender system, dimensionality reduction, and image recovery. Most of the existing work represents the cross-matrix information in a shared latent structure constrained by the Euclidean-based pairwise similarity, which may fail to capture the nonlinear relationship of the data. To address this problem, in this paper, we propose a new collective matrix completion framework, named C4, which uses the graph spectral filters to capture the non-Euclidean cross-matrix information. To the best of our knowledge, this is the first effort to represent the cross-matrix information in the graph spectral domain. We benchmark our model against 8 recent models on 10 real-world data sets, and our model outperforms state-of-the-art methods in most tasks. - Article
- Feb 2018
- IEEE T IMAGE PROCESS

Low-light image enhancement methods based on the classic Retinex model attempt to manipulate the estimated illumination and to project it back to the corresponding reflectance. However, the model does not consider the noise, which inevitably exists in images captured in low-light conditions. In this paper, we propose the robust Retinex model, which additionally considers a noise map compared with the conventional Retinex model, to improve the performance of enhancing low-light images accompanied by intensive noise. Based on the robust Retinex model, we present an optimization function that includes novel regularization terms for the illumination and reflectance. Specifically, we use the ℓ1 norm to constrain the piece-wise smoothness of the illumination, adopt a fidelity term for gradients of the reflectance to reveal the structure details in low-light images, and make the first attempt to estimate a noise map out of the robust Retinex model. To effectively solve the optimization problem, we provide an augmented Lagrange multiplier based alternating direction minimization algorithm without logarithmic transformation. Experimental results demonstrate the effectiveness of the proposed method in low-light image enhancement. In addition, the proposed method can be generalized to handle a series of similar problems, such as image enhancement for underwater or remote sensing images and in hazy or dusty conditions. - Motivated by block partitioned problems arising from group sparsity representation and generalized noncooperative potential games, this paper presents a basic decomposition method for a broad class of multiblock nonsmooth optimization problems subject to coupled linear constraints on the variables that may additionally be individually constrained.
The objective of such an optimization problem is given by the sum of two nonseparable functions minus a sum of separable, pointwise maxima of finitely many convex differentiable functions. One of the former two nonseparable functions is of the class LC¹, i.e., differentiable with a Lipschitz gradient, while the other summand is multiconvex. The subtraction of the separable, pointwise maxima of convex functions induces a partial difference-of-convex (DC) structure in the overall objective; yet with all three terms together, the objective is nonsmooth and non-DC, but is blockwise directionally differentiable. By taking advantage of the (negative) pointwise maximum structure in the objective, the developed algorithm and its convergence result are aimed at the computation of a blockwise directional stationary solution, which arguably is the sharpest kind of stationary solutions for this class of nonsmooth problems. This aim is accomplished by combining the alternating direction method of multipliers (ADMM) with a semilinearized Gauss–Seidel scheme, resulting in a decomposition of the overall problem into subproblems each involving the individual blocks. To arrive at a stationary solution of the desired kind, our algorithm solves multiple convex subprograms at each iteration, one per convex function in each pointwise maximum. In order to lessen the potential computational burden in each iteration, a probabilistic version of the algorithm is presented and its almost sure convergence is established.
- Article
- Feb 2018

Preserving global and local structures during projection learning is very important for feature extraction. Although various methods have been proposed for this goal, they commonly introduce an extra graph regularization term and a corresponding regularization parameter that needs to be tuned. However, tuning the parameter manually is not only time-consuming, but also makes it difficult to find the optimal value that yields satisfactory performance. This greatly limits their applications. Besides, projections learned by many methods do not have good interpretability, and their performance is commonly sensitive to the value of the selected feature dimension. To solve the above problems, a novel method named low-rank preserving projection via graph regularized reconstruction (LRPP_GRR) is proposed. In particular, LRPP_GRR imposes the graph constraint on the reconstruction error of the data instead of introducing an extra regularization term to capture the local structure of the data, which can greatly reduce the complexity of the model. Meanwhile, a low-rank reconstruction term is exploited to preserve the global structure of the data. To improve the interpretability of the learned projection, a sparse term with the ℓ2,1 norm is imposed on the projection. Furthermore, we introduce an orthogonal reconstruction constraint to make the learned projection retain the main energy of the data, which enables LRPP_GRR to be more flexible in the selection of the feature dimension. Extensive experimental results show the proposed method can obtain competitive performance compared with other state-of-the-art methods. - Preprint, full-text available
- May 2018

We investigate a class of general combinatorial graph problems, including MAX-CUT and community detection, reformulated as quadratic objectives over nonconvex constraints and solved via the alternating direction method of multipliers (ADMM). We propose two reformulations: one using vector variables and a binary constraint, and the other further reformulating the Burer-Monteiro form for simpler subproblems. Despite the nonconvex constraint, we prove the ADMM iterates converge to a stationary point in both formulations, under mild assumptions. Additionally, recent work suggests that in this latter form, when the matrix factors are wide enough, a local optimum is with high probability also the global optimum. To demonstrate the scalability of our algorithm, we include results for MAX-CUT, community detection, and image segmentation on benchmark and simulated examples. - Preprint
- Jul 2018

The two-trust-region subproblem (TTRS), which is the minimization of a general quadratic function over the intersection of two full-dimensional ellipsoids, has been the subject of much recent research. In this paper, to solve TTRS, a hybrid of efficient algorithms for finding global and local-nonglobal minimizers of the trust-region subproblem and the alternating direction method of multipliers (ADMM) is proposed. The convergence of the ADMM steps to the first-order stationarity condition is proved under certain conditions. On several classes of test problems, we compare the new algorithm with the recent algorithm of Sakaue et al. [SakaueNakat:16] and the Snopt software. - Article
- Mar 2018
- MULTIDIM SYST SIGN P

It is widely known that total variation image restoration suffers from staircasing artifacts, which result in blocky restored images. In this paper, we address this problem by proposing a combined non-convex higher order total variation with an overlapping group sparse regularizer. The hybrid scheme of both the overlapping group sparse term and the non-convex higher order total variation for blocky artifact removal is complementary: the overlapping group sparse term tends to smoothen out blockiness in the restored image more globally, while the non-convex higher order term tends to smoothen parts that are more local to texture while preserving sharp edges. To solve the proposed image restoration model, we develop an iteratively re-weighted ℓ1 based alternating direction method of multipliers algorithm to deal with the constraints and subproblems. In this study, the images are degraded with different levels of Gaussian noise. A comparative analysis of the proposed method with the overlapping group sparse total variation, the Lysaker, Lundervold and Tai model, the total generalized variation, and the non-convex higher order total variation was carried out for image denoising. The results in terms of peak signal-to-noise ratio and structural similarity index measure show that the proposed method gave better performance than the compared algorithms. - Article
- Nov 2018

Multiangle total internal reflection fluorescence microscopy (TIRFM) has become one of the most important techniques for achieving axial superresolution. The key process in this technique is solving the inverse problem. This paper applies an improved alternating direction method of multipliers algorithm to solve the inverse problem and validates the accuracy of the algorithm by reconstructing simulated microtubule structures in multiangle TIRFM images. The reconstruction times for different algorithms and the convergence speeds of the improved and original algorithms are compared. Experimental results show that the improved algorithm can achieve an axial resolution of 40 nm, reduce the influence of the penalty parameter on convergence, and improve the convergence speed of the iterative process while ensuring image reconstruction quality. Based on the algorithm, a three-dimensional image with the depth information of microtubules and mitochondria is reconstructed. - Article
- Jul 2018
- APPL MAGN RESON

Multi-contrast magnetic resonance imaging can provide rich clinical and diagnostic information, but it requires long scanning time in data acquisition. In this paper, we propose a reference-guided joint reconstruction method to address this problem. The proposed method incorporates both the location and orientation priors on edge regions from a high-resolution reference image into joint sparsity constraints, enabling the effective reconstruction of high-quality multi-contrast images from under-sampled k-space data. The alternating direction method of multipliers is used to solve the joint sparsity-promoting optimization problem. In addition, a generalized frame with multiple reference images is developed to further improve the reconstruction performance, and the proposed method in combination with parallel imaging is also demonstrated to analyze the feasibility in the practical multi-channel acquisition of multi-contrast images. The experiments have demonstrated the superiority of our proposed method compared to existing reconstruction technologies in multi-contrast imaging. - Recently, the alternating direction method of multipliers (ADMM) and its variations have gained great popularity in large-scale optimization problems. This paper is concerned with the solution of the tensor equation A x^{m-1} = b, in which A is an mth-order, n-dimensional real tensor and b is an n-dimensional real vector. By introducing certain auxiliary variables, we transform this tensor equation equivalently into a consensus constrained optimization problem, and then propose an ADMM-type method for it. It turns out that each limit point of the sequences generated by this method satisfies the Karush-Kuhn-Tucker conditions.
Moreover, from the perspective of computational complexity, the proposed method may suffer from the curse of dimensionality if the size of the tensor equation is large, and thus we further present a modified version (as a variant of the former) based on the tensor-train decomposition of the tensor A, which is free from the curse. As applications, we establish the associated inverse iteration methods for solving tensor eigenvalue problems. The numerical examples performed illustrate that our methods are feasible and efficient.
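To make the notation A x^{m-1} = b above concrete: for a third-order tensor (m = 3), the left-hand side is the vector whose i-th entry is the sum over j, k of A[i, j, k] x[j] x[k]. A minimal sketch of this tensor-vector contraction (illustrative only; the ADMM and tensor-train machinery of the paper is not reproduced here):

```python
import numpy as np

def tensor_apply(A, x):
    """Evaluate A x^{m-1} for a third-order tensor A (m = 3):
    (A x^2)_i = sum over j, k of A[i, j, k] * x[j] * x[k]."""
    return np.einsum('ijk,j,k->i', A, x, x)
```

For the diagonal tensor with A[i, i, i] = 1, this reduces to squaring each entry of x, which gives a quick sanity check for any solver of A x^{m-1} = b.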
- This paper proposes an effective method for accurately recovering vessel structures and intensity information from the X-ray coronary angiography (XCA) images of moving organs or tissues. Specifically, a global logarithm transformation of XCA images is implemented to fit the X-ray attenuation sum model of vessel/background layers into a low-rank, sparse decomposition model for vessel/background separation. The contrast-filled vessel structures are extracted by distinguishing the vessels from the low-rank backgrounds using robust principal component analysis and by constructing a vessel mask via Radon-like feature filtering plus spatially adaptive thresholding. Subsequently, the low-rankness and inter-frame spatio-temporal connectivity in the complex and noisy backgrounds are used to recover the vessel-masked background regions via tensor completion of all other background regions, while the twist tensor nuclear norm is minimized to complete the background layers. Finally, the method is able to accurately extract vessels' intensities from the noisy XCA data by subtracting the completed background layers from the overall XCA images. We evaluated the vessel visibility of the resulting images on real X-ray angiography data and evaluated the accuracy of vessel intensity recovery on synthetic data. Experiment results show the superiority of the proposed method over the state-of-the-art methods.
- The code is available from the website https://sites.google.com/view/danfeng-hong. To support high-level analysis of spaceborne imaging spectroscopy (hyperspectral) imagery, spectral unmixing has been gaining significance in recent years. However, the inevitable spectral variability, caused by illumination and topography changes, atmospheric effects, and so on, makes it difficult to accurately estimate abundance maps in spectral unmixing. Classical unmixing methods, e.g. the linear mixing model (LMM) and the extended linear mixing model (ELMM), fail to handle this issue robustly, particularly in the face of complex spectral variability. To this end, we propose a subspace-based unmixing model using a low-rank learning strategy, called subspace unmixing with low-rank attribute embedding (SULoRA), which is robust against spectral variability in the inverse problem of hyperspectral unmixing. Unlike previous approaches that unmix the spectral signatures directly in the original space, SULoRA is a general subspace unmixing framework that jointly estimates subspace projections and abundance maps in order to find a ‘raw’ subspace which is more suitable for carrying out the unmixing procedure. More importantly, we model such a ‘raw’ subspace with low-rank attribute embedding. By projecting the original data into a low-rank subspace, SULoRA can effectively address various spectral variabilities in spectral unmixing. Furthermore, we adopt an alternating direction method of multipliers (ADMM) based solver for the resulting optimization problem. Extensive experiments on synthetic and real datasets are performed to demonstrate the superiority and effectiveness of the proposed method in comparison with previous state-of-the-art methods.
- Article
- Dec 2018
- IEEE T IMAGE PROCESS

Without any prior structure information, Nuclear Norm Minimization (NNM), a convex relaxation of Rank Minimization (RM), is a widespread tool for matrix completion and related low rank approximation problems. Nevertheless, the result derived by NNM generally deviates from the solution we desire, because NNM ignores the difference between different singular values. In this paper, we present a non-convex regularizer and utilize it to construct two matrix completion models. In order to solve the constructed models efficiently, we develop an efficient optimization method with a convergence guarantee, which can achieve faster convergence speed compared to conventional approaches. In particular, we show that the proposed regularizer as well as the optimization method are suitable for other RM problems, such as subspace clustering based on low rank representation. Extensive experimental results on real images demonstrate that the constructed models provide significant advantages over several state-of-the-art matrix completion algorithms. In addition, we conduct numerous experiments to investigate the convergence speed of the developed optimization method. - Article
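As a baseline for the NNM-style completion discussed above, the widely used Soft-Impute iteration alternates between filling the missing entries with the current estimate and shrinking all singular values (soft-thresholding at level tau, the proximal step for the nuclear norm). This is a sketch of that standard convex baseline, not the non-convex regularizer the paper proposes:

```python
import numpy as np

def soft_impute(M, mask, tau, iters=300):
    """Nuclear-norm-style matrix completion: entries of M with mask == 1 are
    observed; missing entries are iteratively imputed from the current
    low-rank estimate, then every singular value is soft-thresholded by tau."""
    X = np.zeros_like(M)
    for _ in range(iters):
        Z = mask * M + (1.0 - mask) * X              # impute missing entries
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt      # singular-value shrinkage
    return X
```

Because every singular value is reduced by the same tau, large and small singular values are penalized equally; this is exactly the uniform-shrinkage behavior that the non-convex regularizers surveyed here are designed to avoid.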
- Apr 2017
- IEEE T SIGNAL PROCES

The low-rank matrix completion problem is fundamental to a number of tasks in data mining, machine learning, and signal processing. This paper considers the problem of adaptive matrix completion in time-varying scenarios. Given a sequence of incomplete and noise-corrupted matrices, the goal is to recover and track the underlying low rank matrices. Motivated by the classical least-mean-squares (LMS) algorithms for adaptive filtering, three LMS-like algorithms are proposed for estimating and tracking low-rank matrices. Performance of the proposed algorithms is provided in the form of nonasymptotic bounds on the tracking mean-square error. Tracking performance of the algorithms is also studied via detailed simulations over real-world datasets. - Article
- Feb 2019
- OPTIM METHOD SOFTW

We consider the problem of minimizing a smooth nonconvex function over a structured convex feasible set, that is, one defined by two sets of constraints that are easy to treat when considered separately. In order to exploit the structure of the problem, we define an equivalent formulation by duplicating the variables, and we consider the augmented Lagrangian of this latter formulation. Following the idea of the Alternating Direction Method of Multipliers (ADMM), we propose an algorithm where a two-block decomposition method is embedded within an augmented Lagrangian framework. The peculiarities of the proposed algorithm are the following: (1) the computation of the exact solution of a possibly nonconvex subproblem is not required; (2) the penalty parameter is iteratively updated once an approximate stationary point of the augmented Lagrangian is determined. Global convergence results are stated under mild assumptions and without requiring convexity of the objective function. Although the primary aim of the paper is theoretical, we perform numerical experiments on a nonconvex problem arising in machine learning, and the obtained results show the practical advantages of the proposed approach with respect to classical ADMM. - Preprint, full-text available
- Feb 2019

This paper is motivated by the desire to develop distributed algorithms for nonconvex optimization problems with complicated constraints associated with a network. The network can be a physical one, such as an electric power network, where the constraints are nonlinear power flow equations, or an abstract one that represents constraint couplings between decision variables of different agents. Thus, this type of problem is ubiquitous in applications. Despite the recent development of distributed algorithms for nonconvex programs, highly complicated constraints still pose a significant challenge in theory and practice. We first identify some intrinsic difficulties with the existing algorithms based on the alternating direction method of multipliers (ADMM) for dealing with such problems. We then propose a reformulation for constrained nonconvex programs that enables us to design a two-level algorithm, which embeds a specially structured three-block ADMM at the inner level in an augmented Lagrangian method (ALM) framework. Furthermore, we prove the global convergence of this new scheme for both smooth and nonsmooth constrained nonconvex programs. The proof builds on and extends the classic and recent works on ALM and ADMM. Finally, we demonstrate with computation that the new scheme provides convergent and parallelizable algorithms for several classes of constrained nonconvex programs, for all of which existing algorithms may fail. To the best of our knowledge, this is the first distributed algorithm that extends the ADMM architecture to general nonlinear nonconvex constrained optimization. The proposed algorithmic framework also provides a new principled way for parallel computation of constrained nonconvex optimization. - Article
- Mar 2019

Raman microscopy is a powerful method combining non-invasiveness with no special sample preparation. Because of this remarkable simplicity, it has been widely exploited in many fields, ranging from life and materials sciences to engineering. However, due to the imaging speeds required for bio-imaging, using this technique for dynamic and large-scale imaging has remained a challenge. Recently, a supervised compressive Raman framework has been put forward, allowing for fast imaging and therefore alleviating the issue of speed. Yet, because it needs strong a priori information about the species forming the hyperspectrum, it has remained elusive how to apply this supervised method to microspectroscopy of (dynamic) biological tissues. Combining an original spectral under-sampling measurement technique with a matrix completion framework for reconstruction, we demonstrate fast and inexpensive label-free molecular imaging of biological specimens (brain tissues and single cells). Using the matrix completion outcome with the supervised method allows for large compressions (64×) and bio-imaging speeds surpassing current technology in spontaneous Raman microspectroscopy. Therefore, our results open interesting perspectives for clinical and cell biology applications using the much faster compressive Raman framework. © 2019 Optical Society of America. - Article
- Jul 2019

In this paper, we present an augmented Lagrangian alternating direction algorithm for symmetric nonnegative matrix factorization. The convergence of the algorithm is proved rigorously and in detail. We then present a modified overlapping community detection method based on the proposed symmetric nonnegative matrix factorization algorithm, and apply it to several real-world networks. The obtained results show the capability of our method in detecting overlapping communities, hubs, and outliers, and our experimental results have better quality than those of several competing methods for identifying communities. - Article
- Oct 2019

We establish local convergence results for a generic algorithmic framework for solving a wide class of equality constrained optimization problems. The framework is based on applying a splitting scheme to the augmented Lagrangian function that includes as a special case the well-known alternating direction method of multipliers (ADMM). Our local convergence analysis is free of the usual restrictions on ADMM-like methods, such as convexity, block separability or linearity of constraints. It offers a much-needed theoretical justification to the widespread practice of applying ADMM-like methods to nonconvex optimization problems.
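The ADMM-style splitting that these works analyze can be illustrated on a small convex instance. The sketch below applies the standard three-step cycle (x-update, z-update, multiplier update) of the augmented Lagrangian to an l1-regularized least-squares problem; the function and variable names are illustrative and not taken from any of the cited papers:

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, iters=200):
    """ADMM for min 0.5*||Ax - b||^2 + lam*||z||_1  subject to  x = z."""
    m, n = A.shape
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)  # u: scaled multiplier
    AtA = A.T @ A; Atb = A.T @ b
    L = np.linalg.cholesky(AtA + rho * np.eye(n))      # factor once, reuse
    for _ in range(iters):
        # x-update: quadratic subproblem, solved in closed form
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        # z-update: soft-thresholding, the prox of the l1 norm
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        # multiplier update on the residual x - z
        u = u + x - z
    return z
```

Each pass touches one block of variables at a time, which is what makes the scheme attractive when the two blocks are individually easy.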

- Article
- Apr 2000
- OPER RES LETT

We give new convergence results for the block Gauss–Seidel method for problems where the feasible set is the Cartesian product of m closed convex sets, under the assumption that the sequence generated by the method has limit points. We show that the method is globally convergent for m=2 and that for m>2 convergence can be established both when the objective function f is componentwise strictly quasiconvex with respect to m−2 components and when f is pseudoconvex. Finally, we consider a proximal point modification of the method and we state convergence results without any convexity assumption on the objective function. - Matrix Rank Minimization with Applications
- M Fazel

- Article · Full-text available
- Dec 2012

The matrix completion problem is to recover a low-rank matrix from a subset of its entries. The main solution strategy for this problem has been based on nuclear-norm minimization which requires computing singular value decompositions—a task that is increasingly costly as matrix sizes and ranks increase. To improve the capacity of solving large-scale problems, we propose a low-rank factorization model and construct a nonlinear successive over-relaxation (SOR) algorithm that only requires solving a linear least squares problem per iteration. Extensive numerical experiments show that the algorithm can reliably solve a wide range of problems at a speed at least several times faster than many nuclear-norm minimization algorithms. In addition, convergence of this nonlinear SOR algorithm to a stationary point is analyzed. - Article
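A minimal sketch of the low-rank factorization model M ≈ XY fitted on the observed entries only. This uses plain alternating least squares rather than the paper's nonlinear SOR scheme, and all names and parameters are illustrative:

```python
import numpy as np

def als_complete(M, mask, r, iters=100, reg=1e-6):
    """Alternating least squares for M ~ X @ Y on observed entries (mask == 1)."""
    m, n = M.shape
    rng = np.random.default_rng(0)
    X = rng.standard_normal((m, r))
    Y = rng.standard_normal((r, n))
    for _ in range(iters):
        # update each row of X from the columns observed in that row
        for i in range(m):
            cols = mask[i] > 0
            Yc = Y[:, cols]
            X[i] = np.linalg.solve(Yc @ Yc.T + reg * np.eye(r), Yc @ M[i, cols])
        # update each column of Y from the rows observed in that column
        for j in range(n):
            rows = mask[:, j] > 0
            Xr = X[rows]
            Y[:, j] = np.linalg.solve(Xr.T @ Xr + reg * np.eye(r), Xr.T @ M[rows, j])
    return X @ Y
```

Each subproblem is a small linear least-squares solve, which is the key cost advantage over SVD-based nuclear-norm methods that the abstract highlights.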
- Dec 1999
- J COMPUT GRAPH STAT

A technique for fitting multilinear and quasi-multilinear mathematical expressions or models to two-, three-, and many-dimensional data arrays is described. Principal component analysis and three-way PARAFAC factor analysis are examples of bilinear and trilinear least squares fits. This work presents a technique for specifying the problem in a structured way so that one program (the Multilinear Engine) may be used for solving widely different multilinear problems. The multilinear equations to be solved are specified as a large table of integer code values. The end user creates this table by using a small preprocessing program. For each different case, an individual structure table is needed. The solution is computed by using the conjugate gradient algorithm. Non-negativity constraints are implemented by using the well-known technique of preconditioning in the opposite way, slowing down changes of variables that are about to become negative. The iteration converges to a minimum that may be local or global. Local uniqueness of the solution may be determined by inspecting the singular values of the Jacobian matrix. A global solution may be searched for by starting the iteration from different pseudorandom starting points. Application examples are discussed, for example: n-way PARAFAC, PARAFAC2, linked-mode PARAFAC, blind deconvolution, and nonstandard variants of these. - We present a framework for solving large-scale ℓ1-regularized convex minimization problems: min ‖x‖₁ + µf(x). Our approach is based on two powerful algorithmic ideas: operator-splitting and continuation. Operator-splitting results in a fixed-point algorithm for any given scalar µ; continuation refers to approximately following the path traced by the optimal value of x as µ increases.
In this paper, we study the structure of optimal solution sets, prove finite convergence for important quantities, and establish q-linear convergence rates for the fixed-point algorithm applied to problems with f(x) convex, but not necessarily strictly convex. The continuation framework, motivated by our convergence results, is demonstrated to facilitate the construction of practical algorithms.
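The operator-splitting fixed-point iteration with continuation described above can be sketched for the concrete choice f(x) = ½‖Ax − b‖². The step size, the continuation schedule over the regularization weight, and all names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def shrink(v, t):
    """Soft-thresholding: the prox operator of t*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fpc(A, b, mu, iters=500):
    """Fixed-point (proximal-gradient) iteration for min mu*||x||_1 + 0.5*||Ax - b||^2,
    with a simple continuation loop that decreases the weight toward its target mu."""
    n = A.shape[1]
    x = np.zeros(n)
    tau = 1.0 / np.linalg.norm(A, 2) ** 2      # step size 1/L with L = ||A||^2
    for m in (mu * 100, mu * 10, mu):          # continuation over decreasing weights
        for _ in range(iters):
            grad = A.T @ (A @ x - b)           # gradient of the smooth part
            x = shrink(x - tau * grad, tau * m)
    return x
```

Solving a short sequence of easier (more heavily regularized) problems and warm-starting each stage is the continuation idea the abstract motivates.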
- Article
- Jun 1994
- ENVIRONMETRICS

A new variant ‘PMF’ of factor analysis is described. It is assumed that X is a matrix of observed data and σ is the known matrix of standard deviations of elements of X. Both X and σ are of dimensions n × m. The method solves the bilinear matrix problem X = GF + E where G is the unknown left hand factor matrix (scores) of dimensions n × p, F is the unknown right hand factor matrix (loadings) of dimensions p × m, and E is the matrix of residuals. The problem is solved in the weighted least squares sense: G and F are determined so that the Frobenius norm of E divided (element-by-element) by σ is minimized. Furthermore, the solution is constrained so that all the elements of G and F are required to be non-negative. It is shown that the solutions by PMF are usually different from any solutions produced by the customary factor analysis (FA, i.e. principal component analysis (PCA) followed by rotations). Usually PMF produces a better fit to the data than FA. Also, the result of PMF is guaranteed to be non-negative, while the result of FA often cannot be rotated so that all negative entries would be eliminated. Different possible application areas of the new method are briefly discussed. In environmental data, the error estimates of data can be widely varying and non-negativity is often an essential feature of the underlying models. Thus it is concluded that PMF is better suited than FA or PCA in many environmental applications. Examples of successful applications of PMF are shown in companion papers. - Article · Full-text available
- Jan 2008

We propose simple and extremely efficient methods for solving the Basis Pursuit problem min{‖u‖₁ : Au = f, u ∈ R^n}, which is used in compressed sensing. Our methods are based on Bregman iterative regularization, and they give a very accurate solution after solving only a very small number of instances of the unconstrained problem min_{u∈R^n} µ‖u‖₁ + (1/2)‖Au − f^k‖₂², for a given matrix A and vector f^k. We show analytically that this iterative approach yields exact solutions in a finite number of steps, and present numerical results that demonstrate that as few as two to six iterations are sufficient in most cases. Our approach is especially useful for many compressed sensing applications where matrix-vector operations involving A and Aᵀ can be computed by fast transforms. Utilizing a fast fixed-point continuation solver that is solely based on such operations for solving the above unconstrained sub-problem, we were able to solve huge instances of compressed sensing problems quickly on a standard PC. - The nuclear norm is widely used to induce low-rank solutions for many optimization problems with matrix variables. Recently, it has been shown that the augmented Lagrangian method (ALM) and the alternating direction method (ADM) are very efficient for many convex programming problems arising from various applications, provided that the resulting sub-problems are sufficiently simple to have closed-form solutions. In this paper, we are interested in the application of the ALM and the ADM to some minimization problems involving the nuclear norm. When the resulting subproblems do not have closed-form solutions, we propose to linearize these subproblems such that closed-form solutions of the linearized subproblems can be easily derived. Global convergence of the linearized ALM and ADM is established under standard assumptions. Finally, we verify the effectiveness and efficiency of these new methods by some numerical experiments.
- Article
- Mar 2007

This term paper outlines the basics of non-negative matrix factorization and is based on the work of Daniel D. Lee and H. Sebastian Seung [7], [8]. It was composed within the scope of the seminar Machine Learning in Computer Vision, held during the winter term '06/'07 by Prof. Dr. J. Schmidhuber and Dipl.-Inf. C. Osendorfer at the Technische Universität München. It describes fundamental concepts, provides an overview of similar methods such as principal component analysis (PCA), and gives some practical examples of possible applications in the field of computer vision. - Article
- Nov 2008
- FOUND COMPUT MATH

We consider a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. Suppose that we observe m entries selected uniformly at random from a matrix M. Can we complete the matrix and recover the entries that we have not seen? We show that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries. We prove that if the number m of sampled entries obeys m ≥ C n^(1.2)r log n for some positive numerical constant C, then with very high probability, most n×n matrices of rank r can be perfectly recovered by solving a simple convex optimization program. This program finds the matrix with minimum nuclear norm that fits the data. The condition above assumes that the rank is not too large. However, if one replaces the 1.2 exponent with 1.25, then the result holds for all values of the rank. Similar results hold for arbitrary rectangular matrices as well. Our results are connected with the recent literature on compressed sensing, and show that objects other than signals and images can be perfectly reconstructed from very limited information. - Article
- Nov 1973
- J OPTIMIZ THEORY APP

For nonlinear programming problems with equality constraints, Hestenes and Powell have independently proposed a dual method of solution in which squares of the constraint functions are added as penalties to the Lagrangian, and a certain simple rule is used for updating the Lagrange multipliers after each cycle. Powell has essentially shown that the rate of convergence is linear if one starts with a sufficiently high penalty factor and sufficiently near to a local solution satisfying the usual second-order sufficient conditions for optimality. This paper furnishes the corresponding method for inequality-constrained problems. Global convergence to an optimal solution is established in the convex case for an arbitrary penalty factor and without the requirement that an exact minimum be calculated at each cycle. Furthermore, the Lagrange multipliers are shown to converge, even though the optimal multipliers may not be unique. - Article
- Nov 1969
- J OPTIMIZ THEORY APP

The main purpose of this paper is to suggest a method for finding the minimum of a function f(x) subject to the constraint g(x) = 0. The method consists of replacing f by F = f + λg + (1/2)c·g², where c is a suitably large constant, and computing the appropriate value of the Lagrange multiplier. Only the simplest algorithm is presented. The remaining part of the paper is devoted to a survey of known methods for finding unconstrained minima, with special emphasis on the various gradient techniques that are available. This includes Newton's method and the method of conjugate gradients. - Article
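The construction in these two abstracts (replace f by the augmented function F = f + λg + ½c·g², then correct the multiplier after each cycle) can be worked through on a toy equality-constrained quadratic program, where the inner minimization has a closed form. All names are illustrative:

```python
import numpy as np

def method_of_multipliers(Q, a, bval, c=10.0, iters=30):
    """Minimize 0.5*x'Qx subject to a'x = bval via the augmented Lagrangian
    F = f + lam*g + 0.5*c*g^2, with the multiplier update lam <- lam + c*g(x)."""
    lam = 0.0
    for _ in range(iters):
        # inner minimization is a quadratic: (Q + c*a a') x = (c*bval - lam) a
        x = np.linalg.solve(Q + c * np.outer(a, a), (c * bval - lam) * a)
        lam += c * (a @ x - bval)   # Hestenes/Powell multiplier update
    return x, lam
```

For the instance f(x) = x₁² + 2x₂² with constraint x₁ + x₂ = 1, the iteration converges linearly to the KKT point x* = (2/3, 1/3), λ* = −4/3, without c having to go to infinity, which is the advantage over a pure penalty method.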
- Dec 2010

We present an alternating direction dual augmented Lagrangian method for solving semidefinite programming (SDP) problems in standard form. At each iteration, our basic algorithm minimizes the augmented Lagrangian function for the dual SDP problem sequentially, first with respect to the dual variables corresponding to the linear constraints, and then with respect to the dual slack variables, while in each minimization keeping the other variables fixed, and then finally it updates the Lagrange multipliers (i.e., primal variables). Convergence is proved by using a fixed-point argument. For SDPs with inequality constraints and positivity constraints, our algorithm is extended to separately minimize the dual augmented Lagrangian function over four sets of variables. Numerical results for frequency assignment, maximum stable set and binary integer quadratic programming problems demonstrate that our algorithms are robust and very efficient due to their ability to exploit special structures, such as sparsity and constraint orthogonality, in these problems. Keywords: Semidefinite programming · Alternating direction method · Augmented Lagrangian method. Mathematics Subject Classification (2000): 90C06 · 90C22 · 90C30 · 90C35 - Article
- Jun 2010
- IEEE T INFORM THEORY

This paper is concerned with the problem of recovering an unknown matrix from a small fraction of its entries. This is known as the matrix completion problem, and comes up in a great number of applications, including the famous Netflix Prize and other similar questions in collaborative filtering. In general, accurate recovery of a matrix from a small number of entries is impossible, but the knowledge that the unknown matrix has low rank radically changes this premise, making the search for solutions meaningful. This paper presents optimality results quantifying the minimum number of entries needed to recover a matrix of rank r exactly by any method whatsoever (the information theoretic limit). More importantly, the paper shows that, under certain incoherence assumptions on the singular vectors of the matrix, recovery is possible by solving a convenient convex program as soon as the number of entries is on the order of the information theoretic limit (up to logarithmic factors). This convex program simply finds, among all matrices consistent with the observed entries, the one with minimum nuclear norm. As an example, we show that on the order of nr log(n) samples are needed to recover a random n × n matrix of rank r by any method, and to be sure, nuclear norm minimization succeeds as soon as the number of entries is of the form nr polylog(n). - Conference Paper · Full-text available
- Nov 2009

We present several first-order algorithms for solving the low-rank matrix completion problem and the tightest convex relaxation of it obtained by minimizing the nuclear norm instead of the rank of the matrix. Our first algorithm is a fixed point continuation algorithm that incorporates an approximate singular value decomposition procedure (FPCA). FPCA can solve large matrix completion problems efficiently and attains high rates of recoverability. For example, FPCA can recover 1000 by 1000 matrices of rank 50 with a relative error of 10<sup>-5</sup> in about 3 minutes by sampling only 20% of the elements. We know of no other method that achieves as good recoverability. Our second algorithm is a row by row method for solving a semidefinite programming reformulation of the nuclear norm matrix completion problem. This method produces highly accurate solutions to fairly large nuclear norm matrix completion problems efficiently. Finally, we introduce an alternating direction approach based on the augmented Lagrangian framework. - An inexact alternating direction method for trace norm regularized least squares problem. Online at optimization online
- Jan 2010

- J Yang
- X Yuan

J. Yang and X. Yuan. An inexact alternating direction method for trace norm regularized least squares problem. Online at optimization online, 2010. - Article
- Dec 1976
- COMPUT MATH APPL

For variational problems of the form , we propose a dual method which decouples the difficulties relative to the functionals f and g from the possible ill-conditioning effects of the linear operator A. The approach is based on the use of an augmented Lagrangian functional and leads to an efficient and easily implementable algorithm. We also study the finite element approximation of such problems, compatible with the use of our algorithm. The method is finally applied to solve several problems of continuum mechanics. - Article
- Sep 2007
- COMPUT STAT DATA AN

The development and use of low-rank approximate nonnegative matrix factorization (NMF) algorithms for feature extraction and identification in the fields of text mining and spectral data analysis are presented. The evolution and convergence properties of hybrid methods based on both sparsity and smoothness constraints for the resulting nonnegative matrix factors are discussed. The interpretability of NMF outputs in specific contexts is discussed, along with opportunities for future work in the modification of NMF algorithms for large-scale and time-varying data sets. - Article
- May 1997
- CHEMOMETR INTELL LAB

Positive matrix factorization (PMF) is a recently published factor analytic technique where the left and right factor matrices (corresponding to scores and loadings) are constrained to non-negative values. The PMF model is a weighted least squares fit, with weights based on the known standard deviations of the elements of the data matrix. The following aspects of PMF are discussed in this work: (1) Robust factorization (based on the Huber influence function) is achieved by iterative reweighting of individual data values. This appears especially useful if individual data values may be in error. (2) Desired rotations may be obtained automatically with the help of suitably chosen regularization terms. (3) The algorithms for PMF are discussed. A synthetic spectroscopic example is shown, demonstrating both the robust processing and the automatic rotations. - Conference Paper
- Jan 2009

Principal component analysis is a fundamental operation in computational data analysis, with myriad applications ranging from web search to bioinformatics to computer vision and image analysis. However, its performance and applicability in real scenarios are limited by a lack of robustness to outlying or corrupted ob- servations. This paper considers the idealized "robust principal component anal- ysis" problem of recovering a low rank matrix A from corrupted observations D = A + E. Here, the corrupted entries E are unknown and the errors can be arbitrarily large (modeling grossly corrupted observations common in visual and bioinformatic data), but are assumed to be sparse. We prove that most matrices A can be efficiently and exactly recovered from most error sign-and-support pat- terns by solving a simple convex program, for which we give a fast and provably convergent algorithm. Our result holds even when the rank of A grows nearly proportionally (up to a logarithmic factor) to the dimensionality of the observa- tion space and the number of errors E grows in proportion to the total number of entries in the matrix. A by-product of our analysis is the first proportional growth results for the related problem of completing a low-rank matrix from a small frac- tion of its entries. Simulations and real-data examples corroborate the theoretical results, and suggest potential applications in computer vision. - In this paper we derive a family of new extended SMART (Simultaneous Multiplicative Algebraic Reconstruction Technique) al- gorithms for Non-negative Matrix Factorization (NMF). The proposed algorithms are characterized by improved e-ciency and convergence rate and can be applied for various distributions of data and additive noise. Information theory and information geometry play key roles in the derivation of new algorithms. 
We discuss several loss functions used in information theory which allow us to obtain generalized forms of mul- tiplicative NMF learning adaptive algorithms. We also provide ∞exible and relaxed forms of the NMF algorithms to increase convergence speed and impose an additional constraint of sparsity. The scope of these re- sults is vast since discussed generalized divergence functions include a large number of useful loss functions such as the Amari fi{ divergence, Relative entropy, Bose-Einstein divergence, Jensen-Shannon divergence, J-divergence, Arithmetic-Geometric (AG) Taneja divergence, etc. We applied the developed algorithms successfully to Blind (or semi blind) Source Separation (BSS) where sources may be generally statistically dependent, however are subject to additional constraints such as non- negativity and sparsity. Moreover, we applied a novel multilayer NMF strategy which improves performance of the most proposed algorithms.
- Book
- Oct 2009

Standard ALS Algorithm; Methods for Improving Performance and Convergence Speed of ALS Algorithms; ALS Algorithm with Flexible and Generalized Regularization Terms; Combined Generalized Regularized ALS Algorithms; Wang-Hancewicz Modified ALS Algorithm; Implementation of Regularized ALS Algorithms for NMF; HALS Algorithm and its Extensions; Simulation Results; Discussion and Conclusions; Appendix 4.A: MATLAB Source Code for ALS Algorithm; Appendix 4.B: MATLAB Source Code for Regularized ALS Algorithms; Appendix 4.C: MATLAB Source Code for Mixed ALS-HALS Algorithms; Appendix 4.D: MATLAB Source Code for HALS CS Algorithm; Appendix 4.E: Additional MATLAB Functions; References - Article
- Jan 2009
- SIAM J MATRIX ANAL A

The nuclear norm (sum of singular values) of a matrix is often used in convex heuristics for rank minimization problems in control, signal processing, and statistics. Such heuristics can be viewed as extensions of ℓ1-norm minimization techniques for cardinality minimization and sparse signal estimation. In this paper we consider the problem of minimizing the nuclear norm of an affine matrix valued function. This problem can be formulated as a semidefinite program, but the reformulation requires large auxiliary matrix variables, and is expensive to solve by general-purpose interior-point solvers. We show that problem structure in the semidefinite programming formulation can be exploited to develop more efficient implementations ofinterior-point methods. In the fast implementation, the cost per iteration is reduced to a quartic function of the problem dimensions, and is comparable to the cost of solving the approximation problem in Frobenius norm. In the second part of the paper, the nuclear norm approximation algorithm is applied to system identification. A variant of a simple subspace algorithm is presented, in which low-rank matrix approximations are computed via nuclear norm minimization instead of the singular value decomposition. This has the important advantage of preserving linear matrix structure in the low-rank approximation. The method is shown to perform well on publicly available benchmark data. - An SDP relaxation based method is developed to solve the localization problem in sensor networks using incomplete and inaccurate distance information. The problem is set up to find a set of sensor positions such that given distance constraints are satisfied. The nonconvex constraints in the formulation are then relaxed in order to yield a semidefinite program that can be solved efficiently.The basic model is extended in order to account for noisy distance information. 
In particular, a maximum likelihood based formulation and an interval based formulation are discussed. The SDP solution can then also be used as a starting point for steepest descent based local optimization techniques that can further refine the SDP solution.We also describe the extension of the basic method to develop an iterative distributed SDP method for solving very large scale semidefinite programs that arise out of localization problems for large dense networks and are intractable using centralized methods.The performance evaluation of the technique with regard to estimation accuracy and computation time is also presented by the means of extensive simulations.Our SDP scheme also seems to be applicable to solving other Euclidean geometry problems where points are locally connected.
- Article
- Dec 1992
- COMMUN ACM

predicated on the belief that information filtering can be more effective when humans are involved in the filtering process. Tapestry was designed to support both content-based filtering and collaborative filtering, which entails people collaborating to help each other perform filtering by recording their reactions to documents they read. The reactions are called annotations; they can be accessed by other people's filters. Tapestry is intended to handle any incoming stream of electronic documents and serves both as a mail filter and repository; its components are the indexer, document store, annotation store, filterer, little box, remailer, appraiser and reader/browser. Tapestry's client/server architecture, its various components, and the Tapestry query language are described. - We extend the alternating minimization algorithm recently proposed in [38, 39] to the case of recovering blurry multichannel (color) images corrupted by impulsive rather than Gaussian noise. The algorithm minimizes the sum of a multichannel extension of total variation (TV), either isotropic or anisotropic, and a data fidelity term measured in the L1-norm. We derive the algorithm by applying the well-known quadratic penalty function technique and prove attractive convergence properties, including finite convergence for some variables and global q-linear convergence. Under periodic boundary conditions, the main computational requirements of the algorithm are fast Fourier transforms and a low-complexity Gaussian elimination procedure. Numerical results on images with different blurs and impulsive noise are presented to demonstrate the efficiency of the algorithm. In addition, it is numerically compared to an algorithm recently proposed in [20] that uses a linear program and an interior point method for recovering grayscale images.
- Article
- Mar 2010
- SIAM J OPTIMIZ

This paper introduces a novel algorithm to approximate the matrix with minimum nuclear norm among all matrices obeying a set of convex constraints. This problem may be understood as the convex relaxation of a rank minimization problem, and arises in many important applications, as in the task of recovering a large matrix from a small subset of its entries (the famous Netflix problem). Off-the-shelf algorithms such as interior point methods are not directly amenable to large problems of this kind with over a million unknown entries. This paper develops a simple first-order and easy-to-implement algorithm that is extremely efficient at addressing problems in which the optimal solution has low rank. The algorithm is iterative and produces a sequence of matrices {X^k, Y^k}; at each step, it mainly performs a soft-thresholding operation on the singular values of the matrix Y^k. The soft-thresholding operation is applied to a sparse matrix, and the rank of the iterates {X^k} is empirically nondecreasing. Both these facts allow the algorithm to make use of very minimal storage space and keep the computational cost of each iteration low. On the theoretical side, we provide a convergence analysis showing that the sequence of iterates converges. On the practical side, we provide numerical examples in which 1,000 × 1,000 matrices are recovered in less than a minute on a modest desktop computer. We also demonstrate that our approach is amenable to very large scale problems by recovering matrices of rank about 10 with nearly a billion unknowns from just about 0.4% of their sampled entries. Our methods are connected with the recent literature on linearized Bregman iterations for ℓ1 minimization, and we develop a framework in which one can understand these algorithms in terms of well-known Lagrange multiplier algorithms. Keywords: Nuclear norm minimization, matrix completion, singular value thresholding, Lagrange multipliers. - Article · Full-text available
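A simplified sketch of the singular value thresholding iteration described above: shrink the singular values of the dual matrix, then correct it on the observed entries. The parameter choices and names below are heuristic assumptions for a small example, not the paper's recommendations:

```python
import numpy as np

def svt(M, mask, tau, delta, iters=1000):
    """Singular value thresholding for matrix completion: soft-threshold the
    singular values of Y, then take a step on the observed entries (mask == 1)."""
    Y = np.zeros_like(M)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt     # singular value shrinkage
        Y = Y + delta * mask * (M - X)              # dual/multiplier update
    return X
```

Because the shrinkage zeroes most singular values, the iterates stay low-rank throughout, which is the storage advantage the abstract emphasizes.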
- Jan 2008

We propose, analyze and test an alternating minimization algorithm for recovering images from blurry and noisy observations with total variation (TV) regularization. This algorithm arises from a new half-quadratic model applicable to not only the anisotropic but also isotropic forms of total variation discretizations. The per-iteration computational complexity of the algorithm is three Fast Fourier Transforms (FFTs). We establish strong convergence properties for the algorithm, including finite convergence for some variables and relatively fast exponential (or q-linear in optimization terminology) convergence for the others. Furthermore, we propose a continuation scheme to accelerate the practical convergence of the algorithm. Extensive numerical results show that our algorithm performs favorably in comparison to several state-of-the-art algorithms. In particular, it runs orders of magnitude faster than the Lagged Diffusivity algorithm for total-variation-based deblurring. Some extensions of our algorithm are also discussed. - Article
- Dec 2009
- J ACM

This paper is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a low-rank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the low-rank and the sparse components exactly by solving a very convenient convex program called Principal Component Pursuit; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the L1 norm. This suggests the possibility of a principled approach to robust principal component analysis since our methodology and results assert that one can recover the principal components of a data matrix even though a positive fraction of its entries are arbitrarily corrupted. This extends to the situation where a fraction of the entries are missing as well. We discuss an algorithm for solving this optimization problem, and present applications in the area of video surveillance, where our methodology allows for the detection of objects in a cluttered background, and in the area of face recognition, where it offers a principled way of removing shadows and specularities in images of faces.
- The linearly constrained matrix rank minimization problem is widely applicable in many fields such as control, signal processing and system identification. The tightest convex relaxation of this problem is the linearly constrained nuclear norm minimization. Although the latter can be cast as a semidefinite programming problem, such an approach is computationally expensive to solve when the matrices are large. In this paper, we propose fixed point and Bregman iterative algorithms for solving the nuclear norm minimization problem and prove convergence of the first of these algorithms. By using a homotopy approach together with an approximate singular value decomposition procedure, we get a very fast, robust and powerful algorithm, which we call FPCA (Fixed Point Continuation with Approximate SVD), that can solve very large matrix rank minimization problems. Our numerical results on randomly generated and real matrix completion problems demonstrate that this algorithm is much faster and provides much better recoverability than semidefinite programming solvers such as SDPT3. For example, our algorithm can recover 1000 x 1000 matrices of rank 50 with a relative error of 1e-5 in about 3 minutes by sampling only 20 percent of the elements. We know of no other method that achieves as good recoverability. Numerical experiments on online recommendation, DNA microarray data set and image inpainting problems demonstrate the effectiveness of our algorithms.
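In symbols, the Principal Component Pursuit program described in the robust-PCA abstract above minimizes exactly the weighted combination it names; the particular weight $\lambda = 1/\sqrt{\max(m,n)}$ is the standard choice from the RPCA literature and is an assumption here, not stated in the abstract:

```latex
% Decompose the data matrix M into low-rank L plus sparse S
\min_{L,\,S} \; \|L\|_{*} + \lambda \,\|S\|_{1}
\quad \text{subject to} \quad L + S = M,
\qquad \lambda = \frac{1}{\sqrt{\max(m,n)}}
```

Here $\|L\|_{*}$ (the nuclear norm, the sum of singular values) promotes low rank, while $\|S\|_{1}$ (the entrywise L1 norm) promotes sparsity of the corruption.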
- Article · Jan 1989

Includes bibliography and index.
- Is perception of the whole based on perception of its parts? There is psychological and physiological evidence for parts-based representations in the brain, and certain computational theories of object recognition rely on such representations. But little is known about how brains or computers might learn the parts of objects. Here we demonstrate an algorithm for non-negative matrix factorization that is able to learn parts of faces and semantic features of text. This is in contrast to other methods, such as principal components analysis and vector quantization, that learn holistic, not parts-based, representations. Non-negative matrix factorization is distinguished from the other methods by its use of non-negativity constraints. These constraints lead to a parts-based representation because they allow only additive, not subtractive, combinations. When non-negative matrix factorization is implemented as a neural network, parts-based representations emerge by virtue of two properties: the firing rates of neurons are never negative and synaptic strengths do not change sign.
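The factorization in the abstract above can be stated compactly; the symbols $V$, $W$, $H$ and the inner dimension $r$ are the conventional NMF notation, assumed here:

```latex
% Nonnegative matrix factorization: approximate nonnegative data V
V \approx W H, \qquad
V \in \mathbb{R}_{\ge 0}^{m \times n},\;
W \in \mathbb{R}_{\ge 0}^{m \times r},\;
H \in \mathbb{R}_{\ge 0}^{r \times n},\;
r \ll \min(m, n)

% Each data column is an additive combination of the columns of W:
v_j \approx \sum_{k=1}^{r} w_k \, h_{kj}, \qquad h_{kj} \ge 0
```

Because every coefficient $h_{kj}$ is nonnegative, columns of $W$ can only be added, never cancelled against each other — which is the mechanism behind the parts-based representations the abstract describes.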
- The guest editors of this special issue are extremely grateful to all the reviewers who took time to carefully read the submitted manuscripts and to provide critical comments which helped to ensure the high quality of this issue. The guest editors are also much indebted to the authors for their important contributions. All these tremendous efforts and dedication have contributed to make this issue a reality.
- Non-negative matrix factorization (NMF) has previously been shown to be a useful decomposition for multivariate data. Two different multiplicative algorithms for NMF are analyzed. They differ only slightly in the multiplicative factor used in the update rules. One algorithm can be shown to minimize the conventional least squares error while the other minimizes the generalized Kullback-Leibler divergence. The monotonic convergence of both algorithms can be proven using an auxiliary function analogous to that used for proving convergence of the Expectation-Maximization algorithm. The algorithms can also be interpreted as diagonally rescaled gradient descent, where the rescaling factor is optimally chosen to ensure convergence.
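The least-squares variant of the multiplicative updates analyzed in the abstract above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code; the random initialization, iteration count, and the small `eps` safeguard in the denominators are arbitrary choices of this sketch:

```python
import numpy as np

def nmf_multiplicative(V, r, iters=200, eps=1e-9, seed=0):
    """Factor V ~= W @ H with W, H >= 0 via least-squares
    multiplicative updates (Lee-Seung style)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    # Positive random initialization keeps all updates well-defined.
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H): monotone in ||V - WH||_F
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # W <- W * (V H^T) / (W H H^T)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Because each update multiplies by a nonnegative factor, nonnegativity of `W` and `H` is preserved automatically — no projection step is needed, which is the "diagonally rescaled gradient descent" interpretation the abstract mentions.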
- The affine rank minimization problem consists of finding a matrix of minimum rank that satisfies a given system of linear equality constraints. Such problems have appeared in the literature of a diverse set of fields including system identification and control, Euclidean embedding, and collaborative filtering. Although specific instances can often be solved with specialized algorithms, the general affine rank minimization problem is NP-hard because it contains vector cardinality minimization as a special case. In this paper, we show that if a certain restricted isometry property holds for the linear transformation defining the constraints, the minimum-rank solution can be recovered by solving a convex optimization problem, namely, the minimization of the nuclear norm over the given affine space. We present several random ensembles of equations where the restricted isometry property holds with overwhelming probability, provided the codimension of the subspace is sufficiently large. The techniques used in our analysis have strong parallels in the compressed sensing framework. We discuss how affine rank minimization generalizes this preexisting concept and outline a dictionary relating concepts from cardinality minimization to those of rank minimization. We also discuss several algorithmic approaches to minimizing the nuclear norm and illustrate our results with numerical examples.
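In formulas, the relaxation described in the abstract above replaces the NP-hard rank objective with the nuclear norm; the restricted isometry constant $\delta_r$ below is the standard notation from this literature, assumed here:

```latex
% NP-hard affine rank minimization ...
\min_{X} \; \operatorname{rank}(X)
  \quad \text{s.t.} \quad \mathcal{A}(X) = b
% ... is relaxed to the convex nuclear norm program
\min_{X} \; \|X\|_{*}
  \quad \text{s.t.} \quad \mathcal{A}(X) = b

% Restricted isometry property of the linear map A on rank-r matrices:
(1 - \delta_r)\,\|X\|_F \;\le\; \|\mathcal{A}(X)\|_2 \;\le\; (1 + \delta_r)\,\|X\|_F
\qquad \text{for all } X \text{ with } \operatorname{rank}(X) \le r
```

When $\delta_r$ is sufficiently small, the convex program recovers the minimum-rank solution exactly — the matrix analogue of the compressed-sensing guarantee the abstract points to.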