Statistics and Data Science Seminar
Prof. Min Zhang
Purdue University
Cancelled
Abstract: High-throughput biotechnologies such as microarrays and next-generation
sequencing permit simultaneous measurements of enormous bodies of
expression and sequence information. However, the number of biological
samples is much smaller than the number of available predictors.
Statistically, we are challenged by a large number of parameters but a
small number of observations. To tackle this issue, we proposed a
two-step variable selection procedure. In the first stage, the dimension
is reduced by a Gibbs sampler developed to stochastically search through
low-dimensional subspaces. With the reduced number of variables,
either Bayesian variable selection or traditional approaches can be
employed in the second stage. The methods are evaluated via simulation
studies and applied to real data sets, including QTL mapping data and
gene expression data.
Wednesday, December 3, 2008 at 4:15 PM in SEO 612