Statistics and Data Science Seminar
Hsin-Hsiung Huang
University of Central Florida
Bayesian Ultrahigh Dimensional Variable Selection for Mixed-type Multivariate Generalized Linear Models
Abstract: Inspired by our recent work on the NSF ATD challenges for spatiotemporal data analysis and modeling and on Bayesian clustering, we investigate whether Bayesian methods can consistently estimate the model parameters when the responses are multivariate and of mixed type. To this end, shrinkage priors are useful for identifying relevant signals in high-dimensional data. We extend the multivariate Bayesian model with shrinkage priors (MBSP) to mixed-type response generalized linear models (MRGLMs) by considering a latent multivariate linear regression model that is associated with the observable mixed-type response vector through its link function. Under the proposed model (MBSP-GLM), multiple responses belonging to the exponential family are modeled simultaneously, and mixed-type responses are allowed. We show that the MBSP-GLM model achieves strong posterior consistency when the number of predictors $p$ grows at a subexponential rate with the sample size $n$. Furthermore, we quantify the posterior contraction rate at which the posterior shrinks around the true regression coefficients, and we allow the dimension of the responses $q$ to grow as $n$ grows. This greatly expands the scope of the MBSP model to include response variables of many data types, including binary and count data. To address the non-conjugacy concern, we propose an adaptive sampling algorithm based on a Pólya-gamma data augmentation scheme for MRGLM estimation. We provide simulation studies and real data examples.
Wednesday, November 9, 2022 at 4:00 PM on Zoom