PhD Candidate, Statistics
This dissertation consists of two parts on the topic of sample integrity in high-dimensional data. The first part focuses on batch effects in gene expression data. Batch bias has been found in many microarray studies that involve multiple batches of samples. Currently available methods for batch effect removal are mainly based on gene-by-gene analysis. Multivariate approaches to batch adjustment have seen relatively little development, mainly because of the analytical difficulty that arises from the high dimensionality of gene expression data.