Jean Opsomer

Colorado State University

Cross-validation in survey estimation: model selection and variance estimation

We propose a cross-validated version of the design-based variance estimator of survey estimators, and describe its use in several survey applications. The estimator is based on the same "leave-one-out" principle as traditional cross-validation, but takes the design effects on the variance into account. We apply the cross-validated estimator as a design-based model selection tool for regression estimators, and show that it is effective in minimizing the asymptotic design mean squared error of regression estimators based on either parametric or nonparametric models.

Thursday, March 25, 2010 - 3:30pm

Lan Xue

Oregon State University

Consistent variable selection in additive models

A penalized polynomial spline method will be introduced for simultaneous model estimation and variable selection in additive models. The proposed method approximates the nonparametric functions by polynomial splines, and minimizes the sum of squared errors subject to an additive penalty on norms of spline functions. This approach sets estimators of certain function components to exactly zero, thus performing variable selection.

Thursday, March 18, 2010 - 3:30pm

Tianwei Yu

Emory University

High-Resolution LC/MS Data Pre-processing

Liquid chromatography-mass spectrometry (LC/MS) is one of the major techniques in metabolomic studies. It is widely used to identify disease biomarkers and drug targets and to unravel complex metabolic networks. Due to the high-noise nature of the technology, especially when measuring low-abundance components in complex samples, reliable pre-processing is critical to maximize information retrieval from LC/MS data. We develop a set of algorithms for the processing of high-resolution LC/MS data.

Thursday, March 4, 2010 - 3:30pm

Snigdhansu Chatterjee

University of Minnesota

Statistical evidence of climate change: an analysis of global seawater data

We analyze a dataset on seawater patterns over the last few decades. For specificity, we restrict attention to temperature measurements in the Arctic Ocean region for this talk. Our goal is to investigate whether there is a significant change of pattern in the Arctic Ocean seawater temperature, thus detecting climate change, after accounting for systematic factors such as location, depth, and season, and for the temporal and spatial dependence pattern of the observations.

Thursday, February 25, 2010 - 3:30pm

Yichao Wu

North Carolina State University

Robust Model-free Multiclass Probability Estimation

Classical statistical approaches for multiclass probability estimation are typically based on regression techniques such as multiple logistic regression, or density estimation approaches such as linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA). These methods often make certain assumptions on the form of probability functions or on the underlying distributions of subclasses. In this paper, we develop a model-free procedure to estimate multiclass probabilities based on large-margin classifiers.

Thursday, February 18, 2010 - 3:30pm

Dan Cooley

Colorado State University

Spatial Hierarchical Models for Extremes: Modeling Both Climate and Weather Effects

Weather data are characterized by two types of spatial effects. Climate effects occur on a regional scale and characterize how climate varies by location. In terms of a statistical model, one can view climate effects as how the marginal distribution varies by location. Weather effects occur on a more local scale and characterize how different locations can be jointly affected by individual storms. One can think of weather effects as characterizing the joint behavior. We aim to characterize both the climate and weather effects of extreme weather, in particular extreme precipitation.

Thursday, February 11, 2010 - 3:30pm

Jay Breidt

Colorado State University

Penalized Balanced Sampling

Linear mixed models are flexible and extensible models that cover a wide range of statistical methods. They have found many uses in estimation for complex surveys, particularly in small area estimation and in extensions of generalized regression estimation. They have also been used as a means of relaxing constraints in calibration estimation. The purpose of this work is to consider methods by which linear mixed models may be used at the design stage of a survey. This paper reviews the ideas of balanced sampling and the cube algorithm, and proposes an implementation of the cube algorithm by which penalized balanced samples can be selected.

Thursday, February 4, 2010 - 3:30pm

Daya Dayananda

University of St. Thomas, St. Paul, Minnesota

Using Stochastic Price Process of a Stock to Fair Pricing of Multiple Financial Warrants

The purpose of this presentation is to find a closed-form formula to value a financial instrument called a “warrant”. The holder of a financial warrant has the right, but not the obligation, to purchase a share of the company concerned by paying $K at a specific time T in the future. The holder pays now to own the warrant, and we are interested in calculating the fair value of this payment. The possibility of multiple warrants in a company will also be examined.

The use of economic and statistical principles has been instrumental in developing many quantitative methodologies in finance, for example the famous Black-Scholes formula, which led to a Nobel Prize in economics. To do research in mathematical finance, it is essential to understand both economic principles and the ever-changing financial activities in the market.

Friday, January 29, 2010 - 4:30pm

Yixin Fang

Georgia State University

Variable Selection in Canonical Discriminant Analysis for Family Studies

In family studies, canonical discriminant analysis can be used to find linear combinations of phenotypes that exhibit high ratios of between-family to within-family variability. But with large numbers of phenotypes, canonical discriminant analysis may over-fit. To estimate the predicted ratios associated with the coefficients obtained from canonical discriminant analysis, two methods are developed: one based on bias correction and the other on cross-validation. Because cross-validation is computationally intensive, an approximation to it is also developed.

Thursday, January 28, 2010 - 3:30pm

Li-Ping Zhu

East China Normal University

Sufficient Dimension Reduction for High Dimensional Data in Regression

Dimension reduction in ultrahigh dimensional feature space characterizes various contemporary problems in scientific discovery. In this talk, we propose a model-free independence screening procedure to select the subset of active predictors by using the diagonal elements of an average partial mean estimation matrix. The new proposal possesses the sure independence screening property for a wide range of semi-parametric regressions, i.e., it is guaranteed to select the subset of active predictors with probability approaching one as the sample size diverges.

Thursday, January 14, 2010 - 3:30pm

