Harvard School of Public Health
With the advance of biotechnology, massive "omics" data, such as genomic and proteomic data, have rapidly become available in population-based studies of the interplay of genes and environment in causing human diseases. A growing challenge is how to analyze such high-throughput "omics" data, interpret the results, and make the findings reproducible. We discuss several statistical issues in the analysis of high-dimensional data from population-based "omics" studies. We present statistical methods for the analysis of several types of "omics" data, including the incorporation of biological structures in the analysis of data from genome-wide association studies, the analysis of genetic pathway data and gene selection, and the analysis of genome-wide DNA methylation data and the study of genes and environment. Data examples are presented to illustrate the methods. Strategies for interdisciplinary training in statistical genetics, computational biology, and genetic epidemiology will also be discussed.