Seminar Announcement - MSCS supplemental public website

Statistics and Data Science Seminar

Jun Xie

Purdue University

High dimensional classification and its application in pharmacogenomics research

Abstract: Many statistical classification methods, e.g., Fisher's linear discriminant analysis, cannot be directly applied to high dimensional data, where the number of variables is larger than the sample size. While high dimensional data analysis has been broadly discussed in the statistics community, the impact of dimensionality on classification is poorly understood. We examine and compare high dimensional classification methods through an application in pharmacogenomics research, where high dimensional gene expression microarray data are used to predict patients' responses to a drug. Most gene expression classification studies aim to detect strong signals, for instance tumor versus normal. By comparison, a classifier that separates patients who respond from those who do not is more challenging and may be nonlinear. We introduce several new classification methods, including a sparse linear discriminant method, random projection, and a distribution-based classification involving second-order interactions, as potential tools to deal with high dimensionality. We also want to call attention to the theory of high dimensional classification, where only a few results are available.

Wednesday October 27, 2010 at 3:00 PM in SEO 636
