The Statistics Department hosts weekly colloquia on a variety of statistical subjects, bringing in speakers from around the world.

Understanding the Effects of Predictor Variables in Black Box Supervised Learning Models

For many supervised learning applications, understanding and visualizing the effects of the predictor variables on the predicted response is of paramount importance. A shortcoming of black box supervised learning models (e.g., complex trees, neural networks, boosted trees, random forests, nearest neighbors, local kernel-weighted methods, and support vector regression) in this regard is their lack of interpretability or transparency.

Computerized Adaptive Testing with Response Revision: Design and Asymptotic Theory

In Computerized Adaptive Testing (CAT), questions are selected in real time and are adjusted to the test-taker’s latent ability. While CAT has become popular for many measurement tasks, such as educational testing and patient reported outcomes, it has been criticized for not allowing examinees to review and revise their answers. Two main concerns regarding response revision in CAT are the deterioration of estimation efficiency, due to suboptimal item selection, and the compromise of test validity, due to the potential adoption of deceptive test-taking strategies by the examinees.

Fog Computing in Cyber-physical Systems and Security

In this talk, we will discuss research challenges and opportunities of Fog Computing in Cyber-physical Systems and Security and present several case studies. We will first present an innovative Real-time In-situ Seismic Imaging (RISI) system design with fog computing. It is a smart sensor network that continuously senses and computes 3D subsurface images in real time.

Parameter estimation for linear Gaussian covariance models

Linear Gaussian covariance models are Gaussian models with linear constraints on the covariance matrix. Such models arise in stochastic processes from repeated time series data, Brownian motion tree models of phylogenetic data and network tomography models used for analyzing connections in the Internet. Maximum likelihood estimation in this class of models leads to a non-convex optimization problem that typically has many local maxima.

A simple way to incorporate prior information on margins in Bayesian latent class models

I present an approach to incorporating informative prior beliefs about marginal probabilities into Bayesian latent class models for categorical data. The basic idea is to append synthetic observations to the original data such that (i) the empirical distributions of the desired margins match those of the prior beliefs, and (ii) the values of the remaining variables are left missing. The degree of prior uncertainty is controlled by the number of augmented records.
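The augmentation idea in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the speaker's implementation: the function name `augment_with_prior_margin`, the dictionary-per-record data layout, and the use of `None` to mark missing values are all hypothetical, and only a single one-way margin is handled.

```python
def augment_with_prior_margin(records, variable, prior_probs, n_synthetic):
    """Append synthetic records whose margin on `variable` matches `prior_probs`.

    records      : list of dicts, one per observation (hypothetical layout)
    variable     : name of the variable carrying the prior belief
    prior_probs  : dict mapping each level of `variable` to its prior probability
    n_synthetic  : number of augmented records; more records = stronger prior
    """
    if not records:
        raise ValueError("records must be non-empty")
    if abs(sum(prior_probs.values()) - 1.0) > 1e-9:
        raise ValueError("prior probabilities must sum to 1")

    columns = list(records[0].keys())

    # Turn probabilities into integer counts that sum to n_synthetic
    # (largest-remainder rounding).
    raw = {level: p * n_synthetic for level, p in prior_probs.items()}
    counts = {level: int(value) for level, value in raw.items()}
    leftover = n_synthetic - sum(counts.values())
    by_remainder = sorted(raw, key=lambda k: raw[k] - counts[k], reverse=True)
    for level in by_remainder[:leftover]:
        counts[level] += 1

    synthetic = []
    for level, count in counts.items():
        for _ in range(count):
            # (ii) every other variable is left missing (None here)
            row = {col: None for col in columns}
            # (i) the chosen margin follows the prior belief
            row[variable] = level
            synthetic.append(row)

    return records + synthetic


if __name__ == "__main__":
    data = [
        {"smoker": "yes", "region": "north"},
        {"smoker": "no", "region": "south"},
    ]
    augmented = augment_with_prior_margin(
        data, "smoker", {"yes": 0.25, "no": 0.75}, n_synthetic=8
    )
    for row in augmented:
        print(row)
```

The augmented data set would then be passed to whatever latent class sampler is in use, which treats the `None` entries as missing values to be imputed.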

Small Area Estimation with Uncertain Random Effects

Random effects models play an important role in model-based small area estimation. Random effects account for any lack of fit of a regression model for the population means of small areas on a set of explanatory variables. In Datta, Hall and Mandal (2011, JASA), we showed that if the random effects can be dispensed with through a statistical test, then the model parameters and the small area means can be estimated substantially more accurately. This work is most useful when the number of small areas, m, is moderately large.

Prediction Summary Measures for a Nonlinear Model and for Right-Censored Time-to-Event Data

The R-squared statistic, or coefficient of determination, is commonly used to measure the predictive power of a linear model. It is interpreted as the fraction of variation in the response explained by the predictors. Despite its popularity, a direct equivalent measure is not available for nonlinear regression models and for right-censored time-to-event data. In this talk, I will show that in addition to a measure of explained variation, another measure of explained prediction error is required to assess the predictive power of a nonlinear model.
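For reference, the familiar R-squared can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below shows only this standard definition; it does not implement the new measures proposed in the talk, and the function name `r_squared` is an illustrative choice.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(y) != len(y_hat) or len(y) == 0:
        raise ValueError("y and y_hat must be non-empty and of equal length")

    mean_y = sum(y) / len(y)
    # Total variation of the response around its mean
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    # Variation left unexplained by the predictions
    ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))

    if ss_total == 0:
        raise ValueError("R-squared is undefined when the response is constant")
    return 1.0 - ss_residual / ss_total


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]
    print(r_squared(observed, observed))    # perfect predictions
    print(r_squared(observed, [2.5] * 4))   # predicting the mean everywhere
```

For a least-squares linear fit with an intercept this quantity lies between 0 and 1. For an arbitrary nonlinear model it can be negative, which is one reason the fraction-of-variation interpretation does not carry over directly.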

Regression with Covariate Subject to Limit of Detection

We consider generalized linear regression with a left-censored covariate arising from a lower limit of detection. Complete-case analysis, which discards observations with values below the limit of detection, yields valid estimates of the regression coefficients but loses efficiency. Substitution methods are biased, and the maximum likelihood method relies on parametric models for the unobservable tail probability and thus may suffer from model misspecification.
