Cross-validation in survey estimation: model selection and variance estimation

Colorado State University

Thursday, March 25, 2010 - 3:30pm

We propose a cross-validated version of the design-based variance estimator for survey estimators, and describe its use in several survey applications. The estimator is based on the same "leave-one-out" principle as traditional cross-validation, but takes the effect of the sampling design on the variance into account. We apply the cross-validated estimator as a design-based model selection tool for regression estimators, and show that it is effective in minimizing the asymptotic design mean squared error of regression estimators, whether they use parametric or nonparametric models. The cross-validated variance estimator is also proposed as an alternative to the commonly used linearized variance estimators. In both applications, we show that the new criterion has good theoretical and practical properties, and offer suggestions on how it could be implemented in an actual survey environment.
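As a rough illustration of the leave-one-out idea in a design-based setting, the following Python sketch compares two working models for a regression estimator of a population total. It is not the estimator presented in the talk, whose exact form the abstract does not give. It assumes simple random sampling without replacement, takes the textbook variance formula for that design, and plugs in leave-one-out residuals in place of ordinary residuals. The function names `loo_residuals` and `cv_variance_srs`, the population size, and the simulated data are all hypothetical.

```python
import numpy as np


def loo_residuals(X, y, w):
    """Leave-one-out residuals from weighted least squares fits.

    For each sampled unit i, the model is refit without unit i and the
    held-out observation is predicted from that fit.
    """
    n = len(y)
    res = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        sw = np.sqrt(w[keep])  # weighted LS via row scaling
        beta, *_ = np.linalg.lstsq(X[keep] * sw[:, None], y[keep] * sw, rcond=None)
        res[i] = y[i] - X[i] @ beta
    return res


def cv_variance_srs(X, y, N):
    """Assumed criterion: SRSWOR variance formula with leave-one-out residuals.

    N^2 * (1 - n/N) * s_e^2 / n, where s_e^2 is the sample variance of the
    leave-one-out residuals. This form is an illustrative assumption.
    """
    n = len(y)
    w = np.full(n, N / n)  # equal design weights under SRSWOR
    e = loo_residuals(X, y, w)
    return N**2 * (1 - n / N) * np.var(e, ddof=1) / n


# Simulated sample (hypothetical data): y is roughly linear in x.
rng = np.random.default_rng(0)
N, n = 1000, 50
x = rng.uniform(0, 1, n)
y = 2.0 + 3.0 * x + rng.normal(0, 0.1, n)

X_mean = np.ones((n, 1))                  # intercept-only working model
X_lin = np.column_stack([np.ones(n), x])  # linear working model

cv_mean = cv_variance_srs(X_mean, y, N)
cv_lin = cv_variance_srs(X_lin, y, N)
best = "linear" if cv_lin < cv_mean else "intercept-only"
print(f"intercept-only: {cv_mean:.1f}, linear: {cv_lin:.1f}, selected: {best}")
```

The model with the smaller criterion value would be selected. Under an unequal-probability design, both the weights and the variance formula would need to change to reflect the inclusion probabilities.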