Ting-Li Chen (Institute of Statistical Science, Academia Sinica); Dai-Ni Hsieh (Institute of Statistical Science, Academia Sinica); Hung Hung (Institute of Epidemiology and Preventive Medicine); I-Ping Tu (Institute of Statistical Science, Academia Sinica); Pei-Shien Wu (Dept. of Biostatistics, Duke University); Yi-Ming Wu (Institute of Chemistry, Academia Sinica); Wei-Hau Chang (Institute of Chemistry, Academia Sinica); Su-Yun Huang (Institute of Statistical Science, Academia Sinica)
Statistics Theory and Methods; Data Analysis, Bio-Statistics, Bio-Mathematics; mathscidoc:2004.33002
The Annals of Applied Statistics, 8(1), 259-285, 2014
Cryo-electron microscopy (cryo-EM) has recently emerged as a powerful
tool for obtaining three-dimensional (3D) structures of biological macromolecules
in native states. A minimum cryo-EM image data set for deriving a
meaningful reconstruction comprises thousands of randomly oriented
projections of identical particles photographed with a small number of electrons.
The computation of 3D structure from 2D projections requires clustering,
which aims to enhance the signal-to-noise ratio in each view by grouping
similarly oriented images. Nevertheless, the prevailing clustering techniques
are often compromised by three characteristics of cryo-EM data: high noise
content, high dimensionality and a large number of clusters. Moreover, since
clustering requires registering images of similar orientation into the same
pixel coordinates by 2D alignment, it is desirable for the clustering algorithm
to label misaligned images as outliers. Herein, we introduce a clustering algorithm,
γ-SUP, which models the data with a q-Gaussian mixture, adopts the
minimum γ-divergence for estimation, and uses a self-updating procedure
to obtain the numerical solution. We apply γ-SUP to the cryo-EM images
of two benchmark macromolecules, RNA polymerase II and the ribosome.
In the former case, simulated images were chosen to decouple clustering from
alignment to demonstrate that γ-SUP is more robust to misalignment outliers than
the existing clustering methods used in the cryo-EM community. In the latter
case, the clustering of real cryo-EM data by our γ-SUP method eliminates
noise in many views to reveal true structural features of the ribosome at the projection
We treat all the bivariate lack-of-memory (BLM) distributions in a unified approach and develop some new general properties of the BLM distributions, including joint moment generating function, product moments, and dependence structure. Necessary and sufficient conditions for the survival functions of BLM distributions to be totally positive of order two are given. Some previous results about specific BLM distributions are improved. In particular, we show that both the Marshall–Olkin survival copula and survival function are totally positive of all orders, regardless of parameters. Besides, we point out that Slepian’s inequality also holds true for BLM distributions.
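As a small numerical illustration of two properties named in this abstract, the sketch below uses the standard Marshall–Olkin survival function S(x, y) = exp(-λ1 x - λ2 y - λ12 max(x, y)) and checks the lack-of-memory identity S(x + t, y + t) = S(x, y) S(t, t) and the TP2 inequality on a finite grid. The function names, parameter values, and tolerances are illustrative assumptions, and a grid check is not a proof.

```python
import math
from itertools import combinations


def mo_survival(x, y, lam1, lam2, lam12):
    """Marshall-Olkin bivariate exponential survival function P(X > x, Y > y)."""
    return math.exp(-lam1 * x - lam2 * y - lam12 * max(x, y))


def has_lack_of_memory(surv, x, y, t, rel_tol=1e-12):
    """Check S(x + t, y + t) == S(x, y) * S(t, t) up to rounding."""
    return math.isclose(surv(x + t, y + t), surv(x, y) * surv(t, t), rel_tol=rel_tol)


def is_tp2_on_grid(surv, grid, rel_tol=1e-12):
    """Check S(x1,y1) S(x2,y2) >= S(x1,y2) S(x2,y1) for all x1 < x2, y1 < y2 in grid."""
    points = sorted(grid)
    for x1, x2 in combinations(points, 2):
        for y1, y2 in combinations(points, 2):
            lhs = surv(x1, y1) * surv(x2, y2)
            rhs = surv(x1, y2) * surv(x2, y1)
            if lhs < rhs * (1.0 - rel_tol):  # allow for floating-point rounding
                return False
    return True


def example_surv(x, y):
    """Hypothetical parameter choice: lam1=1.0, lam2=0.5, lam12=2.0."""
    return mo_survival(x, y, 1.0, 0.5, 2.0)
```

For instance, `is_tp2_on_grid(example_surv, [0.0, 0.5, 1.0, 2.0])` scans every pair of ordered grid points; changing the three rates should not change the outcome, in line with the "regardless of parameters" statement above.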
This paper is about the propagation of the singularities in the solutions to the Cauchy problem of the spatially inhomogeneous Boltzmann equation with angular cutoff assumption. It is motivated by the work of Boudin and Desvillettes on the propagation of singularities in solutions near vacuum. It shows that for the solution near a global Maxwellian, singularities in the initial data propagate as in free transport. Precisely, the solution is the sum of two parts, one of which keeps the singularities of the initial data while the other is regular, with locally bounded derivatives of fractional order in some Sobolev space. In addition, the dependence of the regularity on the cross-section is also given.
In this paper, we provide the O(ε) corrections to the hydrodynamic model derived by Degond and Motsch from a kinetic version of the model by Vicsek and co-authors describing flocking biological agents. The parameter ε stands for the ratio of the microscopic to the macroscopic scales. The O(ε) corrected model involves diffusion terms in both the mass and velocity equations as well as terms which are quadratic functions of the first-order derivatives of the density and velocity. The derivation method is based on the standard Chapman–Enskog theory, but is significantly more complex than usual due to both the non-isotropy of the fluid and the lack of momentum conservation.
The approach combines second- and fourth-order statistics to perform blind source separation (BSS) of instantaneous mixtures. It applies to any number of receivers, provided there are as many receivers as sources. It is a batch algorithm that exploits the non-Gaussianity and stationarity of the source signals. It is a direct method based on linear algebra, reliable and robust, though large source dimensions may slow down the computation significantly. It is, however, limited to instantaneous mixtures.
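The abstract does not spell out the algorithm, so the sketch below uses a well-known stand-in with the same ingredients: FOBI (fourth-order blind identification). It whitens the data with the covariance (second-order statistics) and then diagonalizes a fourth-order weighted covariance to find the remaining rotation. It is a batch, direct, linear-algebra method for square instantaneous mixtures, but it is not claimed to be the authors' method; it also assumes the sources have distinct kurtoses. The name `fobi_separate` is hypothetical and NumPy is assumed to be available.

```python
import numpy as np


def fobi_separate(x):
    """FOBI sketch: x has shape (n_signals, n_samples); returns (sources, unmixing)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean(axis=1, keepdims=True)          # remove the mean of each channel
    n = x.shape[1]

    # Second-order step: whiten with the inverse square root of the covariance.
    cov = x @ x.T / n
    evals, evecs = np.linalg.eigh(cov)
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T
    z = whitener @ x

    # Fourth-order step: eigenvectors of E[|z|^2 z z^T] give the residual rotation
    # (identifiable only when the source kurtoses differ).
    sq_norm = np.sum(z ** 2, axis=0)
    kmat = (z * sq_norm) @ z.T / n
    _, rotation = np.linalg.eigh(kmat)

    unmixing = rotation.T @ whitener
    return unmixing @ x, unmixing
```

The recovered sources are determined only up to order, sign, and scale, which is the usual ambiguity in BSS. The cost is dominated by two symmetric eigendecompositions of size equal to the number of sources, which is consistent with the remark that large source dimensions slow the computation.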
We present a novel variation of the well-known infomax algorithm of blind source separation. Under natural gradient descent, the infomax algorithm converges to a stationary point of a limiting ordinary differential equation. However, due to the presence of saddle points or local minima of the corresponding likelihood function, the algorithm may be trapped around these bad stationary points for a long time, especially if the initial data are near them. To speed up convergence, we propose to add a sequence of random perturbations to the infomax algorithm to shake the iterating sequence so that it is captured by a path descending to a more stable stationary point. We analyze the convergence of the randomly perturbed algorithm, and illustrate its fast convergence through numerical examples on blind demixing of stochastic signals. The examples have analytical structures so that saddle points or local minima of the likelihood functions are explicit. The results may have implications for online learning algorithms in dissimilar problems.
We consider diffusivity of random walks with transition probabilities depending on the number of consecutive traversals of the last traversed edge, the so-called senile reinforced random walk (SeRW). In one dimension, the walk is known to be sub-diffusive with identity reinforcement function. We perturb the model by introducing a small probability \delta of escaping the last traversed edge at each step. The perturbed SeRW model is diffusive for any \delta , with enhanced diffusivity (\delta ) in the small \delta regime. We further study stochastically perturbed SeRW models by having the last edge escape probability of the form \delta with \delta 's being independent random variables. Enhanced diffusivity in such models is logarithmically close to the so-called residual diffusivity (positive in the zero \delta limit), with diffusivity between \delta and \delta . Finally, we generalize our results to higher dimensions where the unperturbed model is already diffusive. The enhanced diffusivity can be as much as \delta .
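The perturbed walk described here can be simulated in a few lines. The sketch below is a one-dimensional illustration under stated assumptions, since the abstract does not give the exact transition rule: with identity reinforcement, after m consecutive traversals of the last edge the walker is assumed to re-traverse it with probability (1 + m)/(2 + m), and the perturbation is assumed to force a move onto the other edge with probability `delta` at each step. The function name and signature are hypothetical.

```python
import random


def perturbed_serw_1d(n_steps, delta, seed=None):
    """Simulate n_steps >= 1 steps of a delta-perturbed SeRW on the integers.

    Returns the list of visited positions, starting from 0.
    """
    rng = random.Random(seed)
    last_step = rng.choice((-1, 1))   # first step is an unbiased coin flip
    pos = last_step
    path = [0, pos]
    m = 1                             # consecutive traversals of the last edge

    for _ in range(n_steps - 1):
        if rng.random() < delta:
            retraverse = False        # assumed perturbation: escape the last edge
        else:
            # assumed identity reinforcement: last edge weight 1 + m, other edge 1
            retraverse = rng.random() < (1 + m) / (2 + m)

        if retraverse:
            step = -last_step         # go back across the same edge
            m += 1
        else:
            step = last_step          # move on to the other edge at this vertex
            m = 1

        pos += step
        path.append(pos)
        last_step = step
    return path
```

Averaging `path[-1] ** 2` over many seeds for increasing `n_steps` gives a rough empirical view of how the mean squared displacement grows for a given `delta`.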
We study a system of semilinear hyperbolic equations passively advected by smooth, white-noise-in-time random velocity fields. Such a system arises in modelling non-premixed isothermal turbulent flames under single-step kinetics of fuel and oxidizer. We derive closed equations for one-point and multi-point probability distribution functions (PDFs) and closed-form analytical formulae for the one-point PDF, as well as the two-point PDF under homogeneity and isotropy. Exact solution formulae allow us to analyse the ensemble-averaged fuel/oxidizer concentrations and the motion of their level curves. We recover the empirical formulae of combustion in the thin reaction zone limit and show that these approximate formulae can either underestimate or overestimate average concentrations when the reaction zone is not tending to zero. We show that the averaged reaction rate slows down locally in
We study the enhanced diffusivity in the so-called elephant random walk model with stops (ERWS) by including symmetric random walk steps at small probability \epsilon . At any \epsilon , the large time behavior transitions from sub-diffusive at \epsilon to diffusive in a wedge shaped parameter regime where the diffusivity is strictly above that in the unperturbed ERWS model in the \epsilon limit. The perturbed ERWS model is shown to be solvable with the first two moments and their asymptotics calculated exactly in both one and two space dimensions. The model provides a discrete analytical setting of the residual diffusion phenomenon known for the passive scalar transport in chaotic flows (e.g., generated by time periodic cellular flows and statistically sub-diffusive) as molecular diffusivity tends to zero.
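A one-dimensional simulation of the perturbed model can be sketched as follows. It assumes the usual ERWS memory rule: the walker picks one of its past steps uniformly at random and repeats it with probability `p`, reverses it with probability `q`, or stays put with probability 1 - p - q. The perturbation is assumed to replace that rule by a fair ±1 step with probability `eps`. The function name, the fair first step, and the parameter names are illustrative assumptions, not taken from the paper.

```python
import random


def perturbed_erws_1d(n_steps, p, q, eps, seed=None):
    """Simulate n_steps >= 1 steps of an eps-perturbed ERWS; returns the step list."""
    if p < 0 or q < 0 or p + q > 1:
        raise ValueError("need p >= 0, q >= 0 and p + q <= 1")
    rng = random.Random(seed)
    steps = [rng.choice((-1, 1))]     # assumed unbiased first step

    for _ in range(n_steps - 1):
        if rng.random() < eps:
            # perturbation: an ordinary symmetric random walk step
            steps.append(rng.choice((-1, 1)))
            continue

        remembered = rng.choice(steps)  # uniform draw from the whole history
        u = rng.random()
        if u < p:
            steps.append(remembered)    # repeat the remembered step
        elif u < p + q:
            steps.append(-remembered)   # do the opposite
        else:
            steps.append(0)             # stop (no move)
    return steps
```

The position after n steps is `sum(steps[:n])`; averaging its square over many seeds gives an empirical second moment that can be compared against the exact asymptotics mentioned in the abstract.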