In this paper, we give an almost complete classification of toric surface codes of dimension less than or equal to 7, up to monomial equivalence. This is a natural extension of our previous work [YZ], [LYZZ]. More pairs of monomially equivalent toric codes constructed from non-equivalent lattice polytopes are discovered. A new phenomenon appears: the monomial non-equivalence of two toric codes can be discerned over $\mathbb{F}_q$ for all $q \geq 8$, except $q = 29$. This sudden break seems strange and interesting. Moreover, parameters such as the numbers of codewords of different weights depend heavily on $q$. More meticulous analyses are carried out to obtain the possible distinct families of reducible polynomials.
Penghang Yin (University of California at Los Angeles), Shuai Zhang (University of California at Irvine), Jiancheng Lyu (University of California at Los Angeles), Stanley Osher (University of California at Irvine), Yingyong Qi (University of California at Irvine), Jack Xin (University of California at Irvine)
We propose BinaryRelax, a simple two-phase algorithm for training deep neural networks with quantized weights. The set constraint that characterizes the quantization of weights is not imposed until the late stage of training, and a sequence of pseudo-quantized weights is maintained. Specifically, we relax the hard constraint into a continuous regularizer via the Moreau envelope, which turns out to be the squared Euclidean distance to the set of quantized weights. The pseudo-quantized weights are obtained by linearly interpolating between the float weights and their quantizations. A continuation strategy is adopted to push the weights towards the quantized state by gradually increasing the regularization parameter. In the second phase, an exact quantization scheme with a small learning rate is invoked to guarantee fully quantized weights. We test BinaryRelax on the benchmark CIFAR-10 and CIFAR-100 color image datasets to demonstrate the superiority of the relaxed quantization approach and the improved accuracy over state-of-the-art training methods. Finally, we prove the convergence of BinaryRelax under an approximate orthogonality condition.
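As a hedged illustration of the phase-one update described above (a minimal sketch; the uniform codebook in `quantize`, the learning rate, and the regularization parameter are our assumptions, not the paper's exact choices):

```python
import numpy as np

def quantize(w, delta=0.05):
    # Hypothetical uniform codebook {0, +/-delta, +/-2*delta, ...};
    # the paper's quantizer (e.g. binary weights) may differ.
    return delta * np.round(w / delta)

def binary_relax_step(w, grad, lr=0.1, lam=1.0):
    """One relaxed-phase update. The hard constraint w in Q is replaced
    by its Moreau envelope (lam/2) * dist(w, Q)^2, whose gradient is
    lam * (w - quantize(w)) since quantize is the projection onto Q."""
    # gradient step on the loss plus the squared-distance regularizer
    w = w - lr * (grad + lam * (w - quantize(w)))
    # pseudo-quantized weights: linear interpolation between the float
    # weights and their quantization; increasing lam over training
    # (continuation) pushes them toward the fully quantized state
    w_pseudo = (w + lam * quantize(w)) / (1.0 + lam)
    return w, w_pseudo
```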
Bao Wang (Department of Mathematics, UCLA), Penghang Yin (Department of Mathematics, UCLA), Andrea L. Bertozzi (Department of Mathematics, UCLA), P. Jeffrey Brantingham (Department of Anthropology, UCLA), Stanley J. Osher (Department of Mathematics, UCLA), Jack Xin (Department of Mathematics, UCLA & UCI)
Real-time crime forecasting is important. However, accurate prediction of when and where the next crime will happen is difficult. No known physical model provides a reasonable approximation to such a complex system. Historical crime data are sparse in both space and time, and the signal of interest is weak. In this work, we first present a proper representation of crime data. We then adapt the spatial-temporal residual network to the well-represented data to predict the distribution of crime in Los Angeles at the scale of hours in neighborhood-sized parcels. These experiments, as well as comparisons with several existing approaches to prediction, demonstrate the superiority of the proposed model in terms of accuracy. Finally, we present a ternarization technique to address the resource consumption issue for deployment in the real world. This work is an extension of our short conference proceedings paper [Wang et al., arXiv:1707.03340].
We present LBW-Net, an efficient optimization-based method for quantization and training of low bit-width convolutional neural networks (CNNs). Specifically, we quantize the weights to zero or powers of two by minimizing the Euclidean distance between full-precision weights and quantized weights during backpropagation. We characterize the combinatorial nature of the low bit-width quantization problem. For 2-bit (ternary) CNNs, the quantization of N weights can be done by an exact formula in O(N log N) complexity. When the bit-width is three and above, we further propose a semi-analytical thresholding scheme with a single free parameter for quantization that is computationally inexpensive. The free parameter is further determined by network retraining and object detection tests. LBW-Net has several desirable advantages over full-precision CNNs, including considerable memory savings, energy efficiency, and faster deployment. Our experiments on the PASCAL VOC dataset show that, compared with its 32-bit floating-point counterpart, the performance of the 6-bit LBW-Net is nearly lossless in object detection tasks, and can even do better in some real-world visual scenes, while empirically enjoying more than 4× faster deployment.
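For intuition, here is a sketch of the exact ternary solution referred to above, reconstructed from the standard argument rather than taken from the paper: minimizing ||w - alpha*t||^2 over alpha >= 0 and t in {-1, 0, 1}^N reduces, after sorting the magnitudes (hence the O(N log N) cost), to choosing how many of the largest-magnitude weights to keep.

```python
import numpy as np

def ternarize(w):
    """Exact 2-bit quantization w ~ alpha * t with t in {-1, 0, 1}^N.

    For a fixed support the optimal alpha is the mean magnitude over
    that support, so the objective reduces to maximizing s_k^2 / k,
    where s_k is the sum of the k largest magnitudes.
    """
    a = np.sort(np.abs(w))[::-1]      # magnitudes in decreasing order
    k = np.arange(1, len(a) + 1)
    s = np.cumsum(a)                  # s_k = sum of top-k magnitudes
    best = np.argmax(s**2 / k)        # optimal support size is best + 1
    alpha = s[best] / (best + 1)
    # keep the top weights (ties at the threshold kept for simplicity)
    t = np.where(np.abs(w) >= a[best], np.sign(w), 0.0)
    return alpha, t
```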
We study the minimization problem of a non-convex sparsity-promoting penalty function, the transformed l1 (TL1), and its application in compressed sensing (CS). The TL1 penalty interpolates l0 and l1 norms through a nonnegative parameter a ∈ (0, +∞), similar to lp with p ∈ (0, 1], and is known to satisfy unbiasedness, sparsity and Lipschitz continuity properties. We first consider the constrained minimization problem, and discuss the exact recovery of the l0 norm minimal solution based on the null space property (NSP). We then prove the stable recovery of the l0 norm minimal solution if the sensing matrix A satisfies a restricted isometry property (RIP). We formulate a normalized problem to overcome the lack of scaling property of the TL1 penalty function. For a general sensing matrix A, we show that the support set of a local minimizer corresponds to linearly independent columns of A. Next, we present difference of convex algorithms for TL1 (DCATL1) for computing TL1-regularized constrained and unconstrained problems in CS. The DCATL1 algorithm involves outer and inner loops of iterations, a one-time matrix inversion, repeated shrinkage operations and matrix-vector multiplications. The inner loop concerns an l1 minimization problem, on which we employ the Alternating Direction Method of Multipliers (ADMM). For the unconstrained problem, we prove convergence of DCATL1 to a stationary point satisfying the first-order optimality condition. In numerical experiments, we identify the optimal value a = 1, and compare DCATL1 with other CS algorithms on two classes of sensing matrices: Gaussian random matrices and over-sampled discrete cosine transform (DCT) matrices. Among existing algorithms, the iterated reweighted least squares method based on the l1/2 norm is the best in sparse recovery for Gaussian matrices, and the DCA algorithm based on the l1 minus l2 penalty is the best for over-sampled DCT matrices. We find that for both classes of sensing matrices, the performance of the DCATL1 algorithm (initiated with l1 minimization) always ranks near the top (if not at the top), and it is the most robust choice, insensitive to the conditioning of the sensing matrix A. DCATL1 is also competitive in comparison with DCA on other non-convex penalty functions commonly used in statistics with two hyperparameters.
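For concreteness, the TL1 penalty in its standard form (a small sketch; the default a = 1 below is the optimal value identified in the experiments):

```python
import numpy as np

def tl1(x, a=1.0):
    """Transformed l1 penalty: sum of (a+1)|x_i| / (a + |x_i|).

    As a -> 0+ each term tends to the 0/1 indicator, recovering the
    l0 count; as a -> infinity the penalty tends to the l1 norm.
    """
    x = np.abs(np.asarray(x, dtype=float))
    return float(np.sum((a + 1.0) * x / (a + x)))
```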
We study a non-convex low-rank promoting penalty function, the transformed Schatten-1 (TS1), and its applications in matrix completion. The TS1 penalty, as a matrix quasi-norm defined on its singular values, interpolates the rank and the nuclear norm through a nonnegative parameter a ∈ (0, +∞). We consider the unconstrained TS1-regularized low-rank matrix recovery problem and develop a fixed point representation for its global minimizer. The TS1 thresholding functions are in closed analytical form for all parameter values. The TS1 threshold values differ in the subcritical (supercritical) parameter regime, where the TS1 thresholding functions are continuous (discontinuous). We propose TS1 iterative thresholding algorithms and compare them with some state-of-the-art algorithms on matrix completion test problems. For problems with known rank, a fully adaptive TS1 iterative thresholding algorithm consistently performs the best under different conditions, where ground truth matrices are generated by multivariate Gaussian, (0,1) uniform and Chi-square distributions. For problems with unknown rank, TS1 algorithms with an additional rank estimation procedure approach the level of IRucL-q, an iterative reweighted algorithm, non-convex in nature and best in performance.
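Schematically, a TS1 iterative thresholding step for matrix completion looks as follows (a sketch under our assumptions: the scalar TS1 thresholding function, given in closed form in the paper, is abstracted as a user-supplied `thresh_fn`, and the step size and iteration count are illustrative):

```python
import numpy as np

def ts1_complete(M_obs, mask, thresh_fn, mu=1.0, iters=200):
    """Iterative singular-value thresholding for matrix completion.

    M_obs: observed matrix (zeros at missing entries); mask: 0/1 array
    of observed positions; thresh_fn: scalar TS1 thresholding function
    applied entrywise to the singular values.
    """
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        # gradient step on the data-fit term over observed entries
        Y = X + mu * mask * (M_obs - X)
        # threshold the singular values
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U * thresh_fn(s)) @ Vt
    return X
```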
We study the problem of constructing sparse and fast mean reverting portfolios. The problem is motivated by convergence trading and formulated as a generalized eigenvalue problem with a cardinality constraint. We use a proxy of the mean reversion coefficient, the direct Ornstein-Uhlenbeck (OU) estimator, which can be applied to both stationary and nonstationary data. In addition, we introduce three different methods to enforce the sparsity of the solutions. One method uses the ratio of the l1 and l2 norms and the other two use the l1 norm. We analyze various formulations of the resulting non-convex optimization problems and develop efficient algorithms to solve them for portfolio sizes as large as hundreds. By adopting a simple convergence trading strategy, we test the performance of our sparse mean reverting portfolios on both synthetic and historical real market data. In particular, the l1 regularization method, in combination with a quadratic program formulation as well as difference of convex functions and least angle regression treatment, gives fast and robust performance on large out-of-sample data sets.
To appear in Journal of Scientific Computing (JOMP).
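To fix ideas, a naive baseline for the sparse generalized eigenvalue formulation above (for illustration only, with B assumed symmetric positive definite; the paper's l1-based and l1/l2-based algorithms are considerably more refined than this truncation heuristic):

```python
import numpy as np
from scipy.linalg import eigh

def sparse_gen_eig(A, B, k):
    """Truncation heuristic for min_x (x'Ax)/(x'Bx) with at most k
    nonzero portfolio weights (B symmetric positive definite)."""
    # dense minimizer: generalized eigenvector of the smallest eigenvalue
    vals, vecs = eigh(A, B)
    x = vecs[:, 0].copy()
    # zero out all but the k largest-magnitude entries, then renormalize
    x[np.argsort(np.abs(x))[:-k]] = 0.0
    return x / np.linalg.norm(x)
```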
A computational method, based on l1-minimization, is proposed for the problem of link flow correction, when the available traffic flow data on many links in a road network are inconsistent with respect to the flow conservation law. Without extra information, the problem is generally ill-posed when a large portion of the link sensors are unhealthy. It is possible, however, to correct the corrupted link flows accurately with the proposed method under a recoverability condition if there are only a few bad sensors, which are located at certain links. We analytically identify the links that are robust to miscounts and relate them to the geometric structure of the traffic network by introducing the recoverability concept and an algorithm for computing it. The recoverability condition for corrupted links is simply that the associated recoverability is greater than 1. In a more realistic setting, besides the unhealthy link sensors, small measurement noises may be present at the other sensors. Under the same recoverability condition, our method is guaranteed to give an estimated traffic flow fairly close to the ground-truth data and leads to a bound for the correction error. Both synthetic and real-world examples are provided to demonstrate the effectiveness of the proposed method.
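A minimal sketch of the l1 correction as a linear program (our formulation for illustration; the paper's recoverability analysis and exact program may differ): with A a node-link incidence matrix encoding flow conservation A f = 0, correct observed flows f_obs by solving min ||f - f_obs||_1 subject to A f = 0.

```python
import numpy as np
from scipy.optimize import linprog

def correct_flows(A, f_obs):
    """l1 flow correction: min ||f - f_obs||_1 s.t. A f = 0,
    linearized with auxiliary variables t >= |f - f_obs|."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])   # minimize sum(t)
    I = np.eye(n)
    # f - f_obs <= t  and  -(f - f_obs) <= t
    A_ub = np.block([[I, -I], [-I, -I]])
    b_ub = np.concatenate([f_obs, -f_obs])
    A_eq = np.hstack([A, np.zeros((m, n))])         # conservation on f only
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=np.zeros(m),
                  bounds=[(None, None)] * n + [(0, None)] * n)
    return res.x[:n]                                # corrected flows
```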
We present a new perspective on graph-based methods for collaborative ranking for recommender systems. Unlike user-based or item-based methods that compute a weighted average of ratings given by the nearest neighbors, or low-rank approximation methods using convex optimization and the nuclear norm, we formulate matrix completion as a series of semi-supervised learning problems, and propagate the known ratings to the missing ones on the user-user or item-item graph globally. The semi-supervised learning problems are expressed as Laplace-Beltrami equations on a manifold, namely as harmonic extension, and can be discretized by a point integral method. We show that our approach does not impose a low-rank Euclidean subspace on the data points, but instead minimizes the dimension of the underlying manifold. Our method, named LDM (low dimensional manifold), turns out to be particularly effective in generating rankings of items, showing decent computational efficiency and robust ranking quality compared to state-of-the-art methods.
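For orientation, the graph-Laplacian version of harmonic extension reads as below (standard semi-supervised label propagation, shown for intuition; the paper's point integral method replaces this discretization):

```python
import numpy as np

def harmonic_extension(W, labeled, values):
    """Propagate known ratings over a similarity graph.

    W: symmetric nonnegative weight matrix; labeled: indices with known
    ratings; values: those ratings. Solves (L u)_i = 0 at unlabeled
    nodes with u fixed on labeled ones, L being the graph Laplacian.
    """
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W
    unlabeled = np.setdiff1d(np.arange(n), labeled)
    u = np.zeros(n)
    u[labeled] = values
    # harmonic condition on the unlabeled block: L_uu u_u = -L_ul u_l
    u[unlabeled] = np.linalg.solve(L[np.ix_(unlabeled, unlabeled)],
                                   -L[np.ix_(unlabeled, labeled)] @ u[labeled])
    return u
```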
The Poisson equation on a point cloud with Dirichlet boundary condition plays an important role in many problems. In this paper, we use the volume constraint proposed by Du et al. to handle the Dirichlet boundary condition in the point integral method for the Poisson equation on a point cloud. We prove that the solution given by the volume constraint converges to the true solution as the point cloud converges to the underlying smooth manifold.
The Laplace-Beltrami operator (LBO) is a fundamental object associated to Riemannian manifolds, which encodes all the intrinsic geometry of the manifolds and has many desirable properties. Recently, we proposed a novel numerical method, the Point Integral Method (PIM), to discretize the Laplace-Beltrami operator on point clouds. In this paper, we analyze the convergence of the Point Integral Method (PIM) for the Poisson equation with Neumann boundary condition on submanifolds isometrically embedded in Euclidean spaces.
In this paper, we propose a novel low dimensional manifold model (LDMM) and apply it to some image processing problems. LDMM is based on the fact that the patch manifolds of many natural images have low dimensional structure. Based on this fact, the dimension of the patch manifold is used as a regularization to recover the image. The key step in LDMM is to solve a Laplace-Beltrami equation over a point cloud, which we do with the point integral method. The point integral method enforces the sample point constraints correctly and gives better results than the standard graph Laplacian. Numerical simulations in image denoising, inpainting and super-resolution problems show that LDMM is a powerful method in image processing.
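To make the patch manifold concrete, the point cloud LDMM works with is just the set of all image patches viewed as vectors (standard patch extraction; the patch size and stride below are illustrative choices, not the paper's):

```python
import numpy as np

def patch_set(img, s=8, stride=4):
    """Collect s-by-s patches of a grayscale image as points in R^(s*s);
    LDMM regularizes the dimension of the manifold these points sample."""
    H, W = img.shape
    return np.stack([img[i:i + s, j:j + s].ravel()
                     for i in range(0, H - s + 1, stride)
                     for j in range(0, W - s + 1, stride)])
```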
We present the application of a low dimensional manifold model (LDMM) to hyperspectral image (HSI) reconstruction. An important property of hyperspectral images is that the patch manifold, which is sampled by the three-dimensional blocks in the data cube, is generally of a low dimensional nature. This is a generalization of low-rank models, in that hyperspectral images with nonlinear mixing terms also fit in this framework. The point integral method (PIM) is used to solve a Laplace-Beltrami equation over a point cloud sampling the patch manifold in LDMM. Both numerical simulations and theoretical analysis show that the sample point constraint is correctly enforced by PIM. The framework is demonstrated by experiments on the reconstruction of both linearly and nonlinearly mixed hyperspectral images with a significant number of missing voxels and several entirely missing spectral bands.
In this paper, we formulate the deep residual network (ResNet) as a control problem for a transport equation. In ResNet, the transport equation is solved along the characteristics. Based on this observation, deep neural networks are closely related to control problems for PDEs on manifolds. We propose several models based on the transport equation, the Hamilton-Jacobi equation and the Fokker-Planck equation. The discretization of these PDEs on point clouds is also discussed.
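A minimal sketch of the correspondence (our reading of the observation above, not the authors' code): a stack of residual blocks is a forward-Euler discretization of the characteristic ODE dx/dt = f(x, t) along which the transport equation is solved.

```python
def resnet_forward(x, blocks, dt=1.0):
    """Residual network as forward Euler: each block f plays the role
    of a velocity field, and x <- x + dt * f(x) is one Euler step
    along a characteristic curve of the transport equation."""
    for f in blocks:
        x = x + dt * f(x)
    return x
```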
The p-th moment matrix is defined for a real random vector, generalizing the classical covariance matrix. Sharp inequalities relating the p-th moment and Renyi entropy are established, generalizing the classical inequality relating the second moment and the Shannon entropy. The extremal distributions for these inequalities are completely characterized.
The moment-entropy inequality shows that a continuous random variable with given second moment and maximal Shannon entropy must be Gaussian. Stam's inequality shows that a continuous random variable with given Fisher information and minimal Shannon entropy must also be Gaussian. The Cramér-Rao inequality is a direct consequence of these two inequalities.
In this paper the inequalities above are extended to Renyi entropy, p-th moment, and generalized Fisher information. Generalized Gaussian random densities are introduced and shown to be the extremal densities for the new inequalities. An extension of the Cramér-Rao inequality is derived as a consequence of these moment and Fisher information inequalities.
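For reference, the classical one-dimensional statements being generalized are (standard facts, stated in our notation with h the Shannon entropy, J the Fisher information, and N the entropy power):

```latex
\sigma^2(X)\,J(X) \;\ge\; 1 \quad\text{(Cram\'er--Rao)},
\qquad
N(X)\,J(X) \;\ge\; 1 \quad\text{(Stam)},
\qquad
N(X) \;\le\; \sigma^2(X) \quad\text{(moment--entropy)},
\qquad
N(X) = \frac{1}{2\pi e}\,e^{2h(X)}.
```

Equality holds in all three exactly for Gaussian X, and combining Stam with the moment-entropy bound immediately yields Cramér-Rao, which is the logical dependence described above.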
We explain how the classical notions of Fisher information of a random variable and Fisher information matrix of a random vector can be extended to a much broader setting. We also show that Stam's inequality for Fisher information and Shannon entropy, as well as the more generalized versions proved earlier by the authors, are all special cases of more general sharp inequalities satisfied by random vectors. The extremal random vectors, which we call generalized Gaussians, contain Gaussians as a limiting case but are noteworthy because they are heavy-tailed.
A unified approach is presented for establishing a broad class of Cram\'er-Rao inequalities for the location parameter, including, as special cases, the original inequality of Cram\'er and Rao, as well as an $L^p$ version recently established by the authors. The new approach allows for generalized moments and Fisher information measures to be defined by convex functions that are not necessarily homogeneous. In particular, it is shown that associated with any log-concave random variable whose density satisfies certain boundary conditions is a Cram\'er-Rao inequality for which the given log-concave random variable is the extremal. Applications to specific instances are also given.
We demonstrate how path integrals, often used in problems of theoretical physics, can be adapted to provide a machinery for performing Bayesian inference in function spaces. Such inference comes about naturally in the study of inverse problems of recovering continuous (infinite dimensional) coefficient functions from ordinary or partial differential equations, a problem which is typically ill-posed. Regularization of these problems using L2 function spaces (Tikhonov regularization) is equivalent to Bayesian probabilistic inference using a Gaussian prior. The Bayesian interpretation of inverse problem regularization is useful since it allows one to quantify and characterize error and degree of precision in the solution of inverse problems, as well as to examine assumptions made in solving the problem, namely whether the subjective choice of regularization is compatible with prior knowledge. Using the path-integral formalism, Bayesian inference can be explored through various perturbative techniques, such as the semiclassical approximation, which we use in this manuscript. Perturbative path-integral approaches, while offering alternatives to computational approaches like Markov chain Monte Carlo (MCMC), also provide natural starting points for MCMC methods that can be used to refine approximations. In this manuscript, we illustrate a path-integral formulation for inverse problems and demonstrate it on an inverse problem in membrane biophysics as well as inverse problems in potential theories involving the Poisson equation.
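As a one-line illustration of the equivalence invoked above (standard, in our notation): for a forward map F, data y with Gaussian noise of variance σ², and a Gaussian prior, the Tikhonov-regularized solution is exactly the maximum a posteriori estimate,

```latex
\hat{u} \;=\; \arg\min_u \Big\{ \tfrac{1}{2\sigma^2}\|F(u)-y\|^2
          + \tfrac{\lambda}{2}\|u\|^2 \Big\}
       \;=\; \arg\max_u \; p(y \mid u)\, p(u),
\qquad p(u) \propto e^{-\lambda \|u\|^2 / 2}.
```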
We study inhomogeneous random graphs in the subcritical case. Among other results, we derive an exact formula for the size of the largest connected component scaled by $\log n$, with $n$ being the size of the graph. This generalizes a result for the “rank-1 case”. We also investigate branching processes associated with these graphs. In particular, we discover that the same well-known equation for the survival probability, whose positive solution determines the asymptotics of the size of the largest component in the supercritical case, also plays a crucial role in the subcritical case. However, now it is the negative solutions that come into play. We disclose their relationship to the distribution of the progeny of the branching process.
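For context, the well-known equation referred to above is, in its classical Galton-Watson form (with φ the offspring probability generating function and s the survival probability):

```latex
s \;=\; 1 - \varphi(1 - s),
```

whose positive root governs the giant component in the supercritical case; the point above is that in the subcritical case it is the negative solutions of the same equation that come into play.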
Zhi-Chao Zhang (State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications), Keqin Feng (Department of Mathematical Sciences, Tsinghua University), Fei Gao (State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications), Qiao-Yan Wen (State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications)
We study the one-way local distinguishability of maximally entangled states. We construct sets of fewer than d orthogonal maximally entangled states which are not distinguishable by one-way local operations and classical communication (LOCC) in the Hilbert space of d ⊗ d. The proof, based on the Fourier transform of an additive group, is very simple but quite effective. Simultaneously, our results give a general unified upper bound for the minimum number of one-way LOCC indistinguishable maximally entangled states. This improves previous results, which only showed sets of N ≥ d − 2 such states. Finally, our results also show that previous conjectures in Zhang et al. [Z.-C. Zhang, Q.-Y. Wen, F. Gao, G.-J. Tian, and T.-Q. Cao, Quant. Inf. Process. 13, 795 (2014)] are indeed correct.
A paper was published (Harsha and Subrahamanian Moosath, 2014) in which the authors claimed to have discovered an extension of Amari's \(\alpha\)-geometry through a general monotone embedding function. It is pointed out here that this so-called \((F, G)\)-geometry (which includes \(F\)-geometry as a special case) is identical to Zhang's (2004) extension of the \(\alpha\)-geometry, where the pair of monotone embedding functions is named \(\rho\) and \(\tau\) instead of the \(F\) and \(H\) used in Harsha and Subrahamanian Moosath (2014). Their weighting function \(G\) for the Riemannian metric arises merely from rewriting the score function in log-representation as opposed to the \((\rho, \tau)\)-representation of Zhang (2004). It is further shown here that the metric and \(\alpha\)-connections obtained by Zhang (2004) through arbitrary monotone embeddings are a unique extension of the \(\alpha\)-geometric structure. As a special case, Naudts' (2004) \(\phi\)-logarithm embedding (using the so-called \(\log_\phi\) function) is recovered with the identification \(\rho=\phi,\ \tau=\log_\phi\), with the \(\phi\)-exponential \(\exp_\phi\) given by the associated convex function linking the two representations.