Ultrahigh dimensional feature selection: beyond the linear model

Jianqing Fan Richard Samworth Yichao Wu

Statistics Theory and Methods mathscidoc:1912.43273

Journal of Machine Learning Research, 10, 2013-2038, 2009
Variable selection in high-dimensional space characterizes many contemporary problems in scientific discovery and decision making. Many frequently used techniques are based on independence screening; examples include correlation ranking (Fan & Lv, 2008) and feature selection using a two-sample t-test in high-dimensional classification (Tibshirani et al., 2003). Within the context of the linear model, Fan & Lv (2008) showed that this simple correlation ranking possesses a sure independence screening property under certain conditions and that its revision, called iterative sure independence screening (ISIS), is needed when the features are marginally unrelated but jointly related to the response variable. In this paper, we extend ISIS, without explicit definition of residuals, to a general pseudo-likelihood framework, which includes generalized linear models as a special case. Even in the least-squares setting, the new method improves ISIS by allowing feature deletion in the iterative process. Our technique allows us to select important features in high-dimensional classification where the popularly used two-sample t-method fails. A new technique is introduced to reduce the false selection rate in the feature screening stage. Several simulated and two real data examples are presented to illustrate the methodology.
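As a rough illustration of the correlation ranking that the abstract attributes to Fan & Lv (2008), the following sketch ranks features by absolute marginal correlation with the response and keeps the top d. It is not the authors' implementation and does not cover the iterative (ISIS) or pseudo-likelihood extensions. The function name `sis_rank`, the simulated data, and the choices n = 200, p = 1000, d = 20 are assumptions made only for this example, and the code assumes no feature column is constant.

```python
import numpy as np


def sis_rank(X, y, d):
    """Return indices of the d features with the largest absolute
    marginal (Pearson) correlation with y.

    Hypothetical helper for illustration; assumes no column of X is constant.
    """
    # Standardize each feature and the response.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    # Marginal correlation of every feature with the response.
    corr = Xs.T @ ys / len(y)
    # Sort by decreasing absolute correlation and keep the first d.
    order = np.argsort(-np.abs(corr))
    return order[:d]


# Simulated example (assumed setup): only features 0 and 1 are active.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.standard_normal(n)

selected = sis_rank(X, y, d=20)
```

Because this ranking looks at one feature at a time, it can miss a feature that is marginally uncorrelated with the response but jointly important, which is the situation the abstract gives as the motivation for the iterative variant.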
@article{jianqing2009ultrahigh,
  title={Ultrahigh dimensional feature selection: beyond the linear model},
  author={Jianqing Fan and Richard Samworth and Yichao Wu},
  url={http://archive.ymsc.tsinghua.edu.cn/pacm_paperurl/20191221113419968101833},
  journal={Journal of Machine Learning Research},
  volume={10},
  pages={2013--2038},
  year={2009},
}
Jianqing Fan, Richard Samworth, and Yichao Wu. Ultrahigh dimensional feature selection: beyond the linear model. Journal of Machine Learning Research, 10:2013-2038, 2009. http://archive.ymsc.tsinghua.edu.cn/pacm_paperurl/20191221113419968101833.