We establish necessary and sufficient conditions for consistent root reconstruction in continuous-time Markov models with countable state space on bounded-height trees. Here a root state estimator is said to be consistent if the probability that it returns the true root state converges to 1 as the number of leaves tends to infinity. We also derive quantitative bounds on the reconstruction error. Our results answer a question of Gascuel and Steel [GS10] and have implications for ancestral sequence reconstruction in a classical evolutionary model of nucleotide insertion and deletion [TKF91].
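To make the consistency notion concrete, the following sketch simulates the simplest instance of such a model: a two-state symmetric chain on a star tree, with the root estimated by majority vote over the leaves. All function names and parameter values are illustrative choices for this example, not taken from the paper (which treats general countable state spaces and bounded-height trees).

```python
import random

def simulate_star_tree(n_leaves, flip_prob, rng):
    """Two-state symmetric Markov model on a star tree: the root state
    propagates independently to each leaf, flipping with prob. flip_prob."""
    root = rng.randint(0, 1)
    leaves = [root ^ (rng.random() < flip_prob) for _ in range(n_leaves)]
    return root, leaves

def majority_root_estimator(leaves):
    """Estimate the root state by majority vote over the leaf states."""
    return int(2 * sum(leaves) > len(leaves))

def reconstruction_probability(n_leaves, flip_prob, trials=500, seed=0):
    """Empirical probability that the estimator returns the true root."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        root, leaves = simulate_star_tree(n_leaves, flip_prob, rng)
        hits += (majority_root_estimator(leaves) == root)
    return hits / trials
```

In this toy setting, consistency corresponds to `reconstruction_probability` approaching 1 as `n_leaves` grows for any fixed flip probability below 1/2.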
This paper deals with the estimation of a high-dimensional covariance with a conditional sparsity structure and fast-diverging eigenvalues. By assuming a sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure with sparsity. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan, and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights into when factor analysis is approximately the same as principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented.
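The principal-orthogonal-complement idea can be sketched as follows: estimate the factor part of the covariance by the top-K principal components of the sample covariance, then threshold the residual (orthogonal complement) covariance to enforce sparsity. This is a minimal illustration with hypothetical names (`poet`, `tau`), using plain hard thresholding rather than the adaptive thresholding rules the abstract also mentions.

```python
import numpy as np

def poet(X, K, tau):
    """Minimal POET-style covariance sketch.

    X   : (n, p) data matrix
    K   : number of factors (top principal components to keep)
    tau : hard-threshold level for the residual covariance
    """
    S = np.cov(X, rowvar=False)                 # sample covariance (p x p)
    vals, vecs = np.linalg.eigh(S)              # eigh returns ascending order
    idx = np.argsort(vals)[::-1][:K]            # top-K principal directions
    low_rank = (vecs[:, idx] * vals[idx]) @ vecs[:, idx].T
    R = S - low_rank                            # principal orthogonal complement
    R_thr = np.where(np.abs(R) >= tau, R, 0.0)  # hard-threshold small entries
    np.fill_diagonal(R_thr, np.diag(R))         # leave variances untouched
    return low_rank + R_thr
```

With `K = 0` and `tau = 0` the estimator reduces to the sample covariance matrix, consistent with the abstract's remark that POET nests simpler estimators as special cases.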
The purpose of this paper is to propose methodologies for statistical inference of low-dimensional parameters with high-dimensional data. We focus on constructing confidence intervals for individual coefficients and linear combinations of several of them in a linear regression model, although our ideas are applicable in a much broader context. The theoretical results presented here provide sufficient conditions for the asymptotic normality of the proposed estimators, along with a consistent estimator for their finite-dimensional covariance matrices. These sufficient conditions allow the number of variables to exceed the sample size and permit the presence of many small non-zero coefficients. Our methods and theory apply to interval estimation of a preconceived regression coefficient or contrast, as well as to simultaneous interval estimation of many regression coefficients. Moreover, the proposed method turns the regression data into an approximately Gaussian sequence of point estimators of individual regression coefficients, which can be used to select variables after proper thresholding. The simulation results presented here demonstrate the accuracy of the coverage probabilities of the proposed confidence intervals, as well as other desirable properties, strongly supporting the theoretical results.
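One standard route to such approximately Gaussian coordinate-wise estimators is a de-sparsified (debiased) lasso: correct a lasso fit for one coefficient using a score vector built from a nodewise regression of that column on the others. The sketch below is a hedged illustration, not the paper's exact procedure: `lasso_cd`, the penalty levels, and the assumption of a known noise level `sigma` are all simplifying choices made for this example.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Plain coordinate-descent lasso for (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float).copy()              # running residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += X[:, j] * (b[j] - new)     # keep residual in sync
            b[j] = new
    return b

def debiased_ci(X, y, j, lam, lam_node, sigma=1.0):
    """De-sparsified lasso 95% confidence interval for coefficient j
    (a sketch; the noise level sigma is assumed known here)."""
    n, p = X.shape
    b = lasso_cd(X, y, lam)
    # Nodewise lasso: regress x_j on the other columns to build the score z_j.
    others = np.delete(np.arange(p), j)
    gamma = lasso_cd(X[:, others], X[:, j], lam_node)
    z = X[:, j] - X[:, others] @ gamma
    denom = z @ X[:, j]
    center = b[j] + z @ (y - X @ b) / denom   # bias-corrected point estimate
    se = sigma * np.linalg.norm(z) / abs(denom)
    zcrit = 1.959963984540054                  # 97.5% standard normal quantile
    return center - zcrit * se, center + zcrit * se
```

The bias-correction step is exactly what produces an approximately Gaussian point estimator for each coordinate, which can then be thresholded for variable selection as the abstract describes.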