Statistics and Data Science Seminar
Prof. Min Zhang
Purdue University
Bayesian variable selection for high dimensional models with applications in genomics
Abstract: High-throughput biotechnologies such as microarrays and next-generation sequencing permit simultaneous measurement of enormous bodies of expression and sequence information. However, the number of biological samples is much smaller than the number of available predictors. Statistically, we are challenged by a large number of parameters but a small number of observations. To tackle this issue, we proposed a two-step variable selection procedure. The first stage reduces the dimension, using a Gibbs sampler developed to search stochastically through low-dimensional subspaces. With the reduced number of variables, either Bayesian variable selection or traditional approaches can be employed in the second stage. The methods are evaluated via simulation studies and applied to real data sets, including QTL mapping data and gene expression data.
Wednesday, April 22, 2009 at 4:15 PM in SEO 612