Processing streaming data as they arrive is often necessary for high-dimensional data analysis. In this paper, we analyze the convergence of a subspace online PCA iteration, as a follow-up to the recent work of Li, Wang, Liu, and Zhang [Math. Program., Ser. B, DOI 10.1007/s10107-017-1182-z], who considered only the most significant principal component, i.e., a single vector. Under the sub-Gaussian assumption, we obtain a finite-sample error bound that closely matches the minimax information lower bound of Vu and Lei [Ann. Statist. 41:6 (2013), 2905-2947].
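To make the notion of a subspace online PCA iteration concrete, the following is a minimal sketch of one common form: an Oja-type stochastic update followed by QR re-orthonormalization. Whether this matches the exact iteration analyzed in the paper is an assumption; the function names, the constant step size, and the synthetic Gaussian data are all hypothetical choices made for illustration.

```python
import numpy as np

def subspace_oja(samples, k, step=0.01, seed=0):
    """One pass of an Oja-type subspace update over zero-mean samples (n x d).

    Illustrative sketch only: the constant step size and QR re-orthonormalization
    are assumptions, not necessarily the scheme analyzed in the paper.
    """
    samples = np.asarray(samples, dtype=float)
    d = samples.shape[1]
    rng = np.random.default_rng(seed)
    # Start from a random d x k matrix with orthonormal columns.
    U, _ = np.linalg.qr(rng.standard_normal((d, k)))
    for x in samples:
        # Stochastic step using the rank-one covariance estimate x x^T.
        U = U + step * np.outer(x, x @ U)
        # Restore orthonormal columns after each update.
        U, _ = np.linalg.qr(U)
    return U

def demo_cosines():
    """Cosines of the principal angles between the estimate and the true subspace."""
    rng = np.random.default_rng(1)
    d, k, n = 10, 2, 5000
    # Gaussian (hence sub-Gaussian) data whose top-2 principal subspace is span(e1, e2).
    variances = np.array([10.0, 8.0] + [0.1] * (d - k))
    X = rng.standard_normal((n, d)) * np.sqrt(variances)
    U = subspace_oja(X, k)
    # The true basis is the first k coordinate vectors, so E^T U is just U[:k, :].
    return np.linalg.svd(U[:k, :], compute_uv=False)
```

Calling `demo_cosines()` returns the cosines of the principal angles between the estimated and true subspaces; values near 1 indicate that the streamed estimate has aligned with the top-2 principal subspace.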
We propose to combine cepstrum and nonlinear time–frequency (TF) analysis
to study multi-component oscillatory signals with time-varying frequency and
amplitude and with time-varying non-sinusoidal oscillatory patterns. The concept of
cepstrum is applied to eliminate the influence of the wave-shape function on the TF analysis,
and we propose a new algorithm, named the de-shape synchrosqueezing transform (de-shape
SST). A mathematical model, the adaptive non-harmonic model, is introduced,
and the de-shape SST algorithm is theoretically analyzed. In addition to simulated
signals, several physiological, musical, and biological signals are analyzed to
illustrate the proposed algorithm.
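As a rough illustration of why the cepstrum is useful for non-sinusoidal oscillations, the sketch below computes the ordinary real cepstrum (inverse FFT of the log magnitude spectrum) of a signal with many harmonics and reads off the fundamental period from the dominant quefrency peak. This is only the classical cepstrum, not the de-shape SST itself; the function names, the test signal, and the search range are hypothetical.

```python
import numpy as np

def real_cepstrum(x, eps=1e-12):
    """Classical real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.rfft(x)
    # eps guards against log(0) at empty frequency bins.
    log_mag = np.log(np.abs(spectrum) + eps)
    return np.fft.irfft(log_mag, n=len(x))

def estimate_period(x, fs, min_period_s, max_period_s):
    """Return the quefrency (in seconds) of the largest cepstral peak in a range."""
    c = real_cepstrum(x)
    lo = int(round(min_period_s * fs))
    hi = int(round(max_period_s * fs))
    q = lo + int(np.argmax(c[lo:hi + 1]))
    return q / fs

def demo_period():
    """Non-sinusoidal 10 Hz oscillation: 30 harmonics with 1/h amplitudes."""
    fs = 1000.0
    t = np.arange(2000) / fs
    rng = np.random.default_rng(0)
    x = sum(np.cos(2 * np.pi * 10.0 * h * t) / h for h in range(1, 31))
    # A small amount of noise gives the log spectrum a well-defined floor.
    x = x + 1e-3 * rng.standard_normal(t.size)
    return estimate_period(x, fs, 0.02, 0.18)
```

Because the harmonics are evenly spaced in frequency, they collapse into a single quefrency peak at the fundamental period (about 0.1 s for this test signal). That is the sense in which the cepstrum separates the wave shape from the underlying oscillation rate.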
Chenglong Yu (South Australian Health and Medical Research Institute), Bernhard T. Baune (University of Adelaide), Julio Licinio (South Australian Health and Medical Research Institute), Ma-Li Wong (South Australian Health and Medical Research Institute)
Data Analysis, Bio-Statistics, Bio-Mathematics
mathscidoc:1703.42005
Major depressive disorder (MDD) is highly prevalent, resulting in an exceedingly high disease burden. The identification of generic risk factors could advance prevention and therapeutics. Current approaches examine genotyping data to identify specific variations between cases and controls. Compared to genotyping, whole-genome sequencing (WGS) allows for the detection of private mutations. In this proof-of-concept study, we establish a conceptually novel computational approach that clusters subjects based on the entirety of their WGS data. Those clusters predicted MDD diagnosis. This strategy yielded encouraging results, showing that depressed Mexican-American participants were grouped closer together; in contrast, ethnically matched controls were grouped away from MDD patients. This implies that, within the same ancestry, the WGS data of an individual can be used to check whether this individual is closer to MDD subjects or to controls. We propose a novel strategy to apply WGS data to clinical medicine by facilitating diagnosis through genetic clustering. Further studies utilising our method should examine larger WGS datasets from other ethnic groups.