Sufficient Dimension Reduction for High Dimensional Data in Regression
Li-Ping Zhu

East China Normal University

Thursday, January 14, 2010 - 3:30pm

Dimension reduction in ultrahigh-dimensional feature spaces characterizes various contemporary problems in scientific discovery. In this talk, we propose a model-free independence screening procedure that selects the subset of active predictors using the diagonal elements of an average partial mean estimation matrix. The new proposal possesses the sure independence screening property for a wide range of semiparametric regressions; that is, it is guaranteed to select the subset of active predictors with probability approaching one as the sample size diverges. In addition, it is computationally efficient: it is free of tuning parameters and completely avoids iterative algorithms. By adding a series of auxiliary variables to set up a benchmark for screening, a new technique is introduced to reduce the false discovery rate at the feature screening stage. Numerical studies on several synthetic examples and a real data example illustrate the methodology. These empirical investigations show that the new proposal tolerates strong correlations within the group of inactive features and works properly even when the number of active predictors is fairly large.
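The abstract does not spell out the estimator, so the following is only a rough, hypothetical sketch of the two general ideas it mentions: marginal feature screening, and appending pure-noise auxiliary variables as a benchmark to limit false discoveries. Here absolute marginal correlation stands in for the diagonal of the average partial mean estimation matrix described in the talk; all function names and parameters are illustrative assumptions.

```python
import numpy as np


def marginal_utilities(X, y):
    """Illustrative marginal utility: |corr(X_j, y)| for each predictor j.

    This is a stand-in utility, not the average-partial-mean-estimation
    diagonal from the talk (the abstract gives no formula)."""
    return np.abs(np.array([np.corrcoef(X[:, j], y)[0, 1]
                            for j in range(X.shape[1])]))


def screen_with_benchmark(X, y, n_aux=50, rng=None):
    """Keep predictors whose utility exceeds a noise-variable benchmark.

    Sketch of the auxiliary-variable idea: append n_aux columns of pure
    noise (known to be inactive), and use the largest utility they attain
    as the screening threshold for the real predictors."""
    if rng is None:
        rng = np.random.default_rng(0)
    n, p = X.shape
    # Auxiliary variables: independent noise, known to be inactive.
    aux = rng.standard_normal((n, n_aux))
    u = marginal_utilities(np.hstack([X, aux]), y)
    # Benchmark: the largest utility achieved by any auxiliary variable.
    threshold = u[p:].max()
    # Screen: retain real predictors that beat the benchmark.
    return np.flatnonzero(u[:p] > threshold)


# Synthetic check: p = 200 predictors, only the first three active.
rng = np.random.default_rng(1)
n, p = 400, 200
X = rng.standard_normal((n, p))
y = 2 * X[:, 0] - 3 * X[:, 1] + 1.5 * X[:, 2] + rng.standard_normal(n)
selected = screen_with_benchmark(X, y, rng=rng)
print(sorted(selected))
```

In this toy setup the three active predictors have marginal correlations well above what any of the noise benchmarks can reach at this sample size, so they survive the screen while most inactive predictors are discarded without any tuning parameter.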