In this talk, I will present some new contributions to the area of high-dimensional statistical learning. The focus will be on both classification and clustering. Classification is one of the central research topics in the field of statistical learning. For binary classification, we propose the Bi-Directional Discrimination (BDD) method, which generalizes linear classifiers from one hyperplane to two or more hyperplanes. BDD provides a compromise between linear and general nonlinear methods.
The Statistics Department hosts weekly colloquia on a variety of statistical subjects, bringing in speakers from around the world.
In this talk, I will begin by providing a brief overview of my mathematics and statistics education research. This work has focused on understanding how teaching practices influence student achievement. More specifically, I am interested in exploring how technology can be used to improve learning in mathematics and statistics, how teacher knowledge affects student achievement, and how curriculum influences practice.
Words that are part of everyday English and used differently in a technical domain possess lexical ambiguity. The use of such words may encourage students to make incorrect associations between words they know and words that sound similar but have specific meanings in statistics that are different from the common usage definitions. This talk will present results from parts of a sequence of studies designed to understand the effects of and develop techniques for exploiting lexical ambiguities in the statistics classroom.
Reproducibility is essential to reliable scientific discovery in large-scale high-throughput biological studies. In this talk, I will present a unified approach to measure reproducibility of findings identified from replicate experiments and select discoveries using reproducibility between replicates.
Diagnosis of student mastery or non-mastery of a set of skills (or attributes) can be done using cognitive diagnosis models. Before diagnosing the students, the skills need to be chosen and the appropriate model needs to be selected. In this talk, I will first introduce the process of deconstructing the domain of an introductory statistics course into a hierarchical arrangement of cognitive attributes.
Inferring the evolutionary history of a group of species (the species tree) should be more viable now that a considerable amount of multilocus molecular data is available. In this talk, I will introduce three statistical methods for reconstructing species trees under the multispecies coalescent model. The Bayesian method can estimate the topology, species divergence times, and population sizes of the species tree, but involves intensive computation.
In this talk, I will describe flexible new Bayesian methods to analyze functional and quantitative image data. The methods are based on functional mixed models, a framework that can simultaneously model multiple factors and account for correlation within and between the functions. I use an isomorphic basis-space approach to fitting the model, which leads to efficient calculations and adaptive smoothing yet flexibly accommodates the complex features characterizing these data.
Salivary glands are important for producing salivary proteins, which contribute to host defense, lubrication, and digestion. However, salivary glands are often damaged or destroyed by radiation therapy or surgery for head and neck cancers, or by advanced Sjögren's syndrome. In order to engineer or replace salivary glands, it is important to define the major intracellular pathways of the nuclear program that causes terminal differentiation of the parotid acinar cells. Gene network discovery is a critical part of this effort.
Stochastic models, diffusion models in particular, are widely used in science, engineering and economics. Inferring the parameter values from data is often complicated by the fact that the underlying stochastic processes are only partially observed. Examples include inference of discretely observed diffusion processes, stochastic volatility models, and doubly stochastic Poisson (Cox) processes. Likelihood-based inference faces the difficulty that the likelihood is usually not available, even numerically. The conventional approach discretizes the stochastic model to approximate the likelihood.
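To illustrate what discretizing a model to approximate the likelihood can look like, here is a minimal sketch that is not taken from the talk. It assumes an Ornstein-Uhlenbeck diffusion dX = -theta * X dt + sigma dW as a stand-in example, treats each observed increment as Gaussian (the Euler approximation), and recovers theta by a crude grid search. The function names, parameter values, and the grid are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_ou(theta, sigma, x0, dt, n_steps, rng):
    """Simulate dX = -theta * X dt + sigma dW at spacing dt,
    using the exact Gaussian transition of the OU process."""
    decay = math.exp(-theta * dt)
    sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * theta))
    path = [x0]
    for _ in range(n_steps):
        path.append(path[-1] * decay + sd * rng.gauss(0.0, 1.0))
    return path


def euler_log_likelihood(path, dt, theta, sigma):
    """Euler pseudo-log-likelihood: each transition is approximated by
    Normal(mean = x - theta * x * dt, variance = sigma^2 * dt)."""
    var = sigma ** 2 * dt
    total = 0.0
    for x_prev, x_next in zip(path, path[1:]):
        mean = x_prev - theta * x_prev * dt
        total += (-0.5 * math.log(2.0 * math.pi * var)
                  - (x_next - mean) ** 2 / (2.0 * var))
    return total


# Hypothetical settings: true theta = 1.0, sigma = 0.5, observed every 0.01.
rng = random.Random(0)
dt = 0.01
path = simulate_ou(theta=1.0, sigma=0.5, x0=1.0, dt=dt, n_steps=5000, rng=rng)

# Crude grid search over theta, with sigma assumed known for simplicity.
grid = [0.25 * k for k in range(1, 17)]  # 0.25, 0.50, ..., 4.00
theta_hat = max(grid, key=lambda th: euler_log_likelihood(path, dt, th, 0.5))
print("grid estimate of theta:", theta_hat)
```

The Gaussian approximation is only accurate when the observation spacing is small relative to the dynamics; with coarse sampling the discretization bias grows, which is one reason alternatives to this conventional approach are of interest.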
If X_1,...,X_n are a random sample from a density f in R^d, then with probability one there exists a unique log-concave maximum likelihood estimator of f. The use of this estimator is attractive because, unlike kernel density estimation, the estimator is fully automatic, with no smoothing parameters to choose. We exhibit an iterative algorithm for computing the estimator and show how the method can be combined with the EM algorithm to fit finite mixtures of log-concave densities.