Tags: Colloquium Series

The Statistics Department hosts weekly colloquia on a variety of statistical subjects, bringing in speakers from around the world.

Experimentation is an integral part of science. Exploring or analyzing data leads to new insights, hypotheses, and questions that fuel further investigation. Before data can be explored or analyzed, it needs to be collected. Where possible, this is ideally done through a designed experiment. In a designed experiment, conditions under which observations are made can be controlled, and causal relationships can be studied. A central question in…
A data center is a facility that houses computer systems and associated components, such as telecommunications and storage systems. It generally includes power supply equipment, communication connections, and cooling equipment. A large data center can use as much electricity as a small town. The emergence of data-center-based computing services necessitates the examination of the costs associated with such data centers and the evolution of such…
In fMRI and fcMRI, the original data measured by the MRI machine are complex-valued, and a complex-valued inverse Fourier transform is applied to reconstruct them into a complex-valued image. Almost exclusively, the Cartesian real and imaginary images are converted to magnitude and phase images, then the phase half of the data is discarded before statistical analysis. A description of potential biological information in the phase will be provided along…
A population enrichment strategy offers a specific adaptive design methodology to study the effect of experimental treatments in various sub-populations of patients under investigation. Instead of limiting the enrollment only to the enriched population, these designs enable the data-driven selection of one or more pre-specified subpopulations at an interim analysis and the confirmatory proof of efficacy in the selected subset at the end of…
Motivation: Advances in chromosome conformation capture and next-generation sequencing technologies are enabling genome-wide investigation of dynamic chromatin interactions. For example, Hi-C experiments generate genome-wide contact frequencies between pairs of loci by sequencing DNA segments ligated from loci in close spatial proximity. One essential task in such studies is peak calling, that is, detecting non-random interactions between loci…
Multiple types of (epi)genetic measurements are involved in the development and progression of complex diseases. Different types of (epi)genetic measurements are interconnected, and modeling their associations can lead to a better understanding of disease biology and facilitate building clinically useful models. Such analysis is challenging in multiple aspects. To fix notation, we use gene expression (GE) and copy number variation (CNV) as an…
In this talk, I discuss two current projects tangentially related under the umbrella of regression. The first part of the talk investigates informative missingness in the framework of recommender systems. For example, in 2009, Netflix ran a $1M prize competition to improve their algorithm to recommend movies to their viewers. In this setting, we can imagine a potential rating for every object-user pair. For Netflix, the object would be the movie…
Functional data arise frequently, especially in today’s big data regime, in diverse contexts including patient monitoring in medical treatments, weather analysis, and, in general, anything that produces observations nearly continuous in time. Clustering of data is a fundamental tool in understanding similarities and dissimilarities between units in the data. Bayesian methods for clustering of functional data use models which imply the…
Accurately forecasting solar power using a statistical method from multiple sources is an important but challenging problem. Our goal is to combine two different physics model forecasting outputs with real measurements from an automated monitoring network so as to better predict solar power in a timely manner. To this end, we propose a bottom-up approach of analyzing large-scale multilevel models with great computational efficiency requiring…