The Statistics Department hosts weekly colloquia on a variety of statistical subjects, bringing in speakers from around the world.
With the rapid development of biotechnology, increasingly complex data sets are being generated to address extremely complex biological problems. Developing new statistical methods to analyze such data is challenging. In this thesis, I propose a nonparametric hypothesis test and two statistical learning methods to solve biological problems arising from epigenomics, metagenomics, and neuroimaging. First, the proposed test assesses the significance of the interaction term in a bivariate smoothing spline ANOVA model.
Nonparametric Methods for Big and Complex Datasets Under a Reproducing Kernel Hilbert Space Framework
Large and complex data have been generated routinely from various sources, for instance, time course biological studies and social media. Classic nonparametric models, such as smoothing spline ANOVA models, are not well equipped to analyze such large and complex data.
With the rapid development of technology, an increasing amount of data is being produced in many fields of science, such as biology, neuroscience, and engineering. Inadequate sample size is no longer a bottleneck of modern statistical research. More often, we face data that are of extremely high dimensionality or that come from remarkably different sources. Effectively extracting information from large-scale, high-dimensional data, or from data of various types and formats, poses new statistical challenges.
We discuss optimal designs for the panel mixed logit model, which is commonly used for the analysis of discrete choice experiments. The information matrix used in design criteria does not have a closed-form expression, and it is computationally difficult to evaluate numerically. We derive the information matrix and use the resulting form to propose three methods for approximating it.