A Bayesian Variable Selection Model for Semi-Continuous Response Using Gaussian Process

Date
2023-09-06
Abstract
To my knowledge, no existing statistical method performs Bayesian variable selection when the response is semi-continuous and has a non-linear relationship to the predictor variables. I have developed a two-part model that accommodates a semi-continuous response and uses Gaussian processes to capture the non-linear relationship between input variables and outcomes. Bayesian variable selection is induced in both parts of the model through the construction of the kernel matrices. I have employed the Nyström approximation to reduce the computational cost of working with kernel matrices at large sample sizes. In simulation studies, I find that my method is competitive in prediction and variable selection with elastic net and with methods that capture non-linearity, such as random forests and gradient boosted trees. In addition, I apply my method to a coronary artery disease (CAD) dataset from the Duke Database for Cardiovascular Disease (DDCD) to determine key gene expression features associated with the CAD index, a measure of CAD severity.
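As a rough illustration of two ideas named in the abstract, the sketch below builds a kernel in which a binary inclusion vector switches predictors on or off, and then forms a Nyström low-rank factor of the resulting kernel matrix. It is not the thesis implementation: the squared-exponential kernel, the way `gamma` enters it, uniform landmark sampling, and the names `rbf_kernel` and `nystrom` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, Z, gamma, length_scale=1.0):
    """Squared-exponential kernel; gamma is a 0/1 vector (assumed form)
    that removes excluded predictors from the distance calculation."""
    Xs = X * gamma / length_scale
    Zs = Z * gamma / length_scale
    # Pairwise squared distances, clipped at zero to absorb rounding error.
    sq = (Xs**2).sum(1)[:, None] + (Zs**2).sum(1)[None, :] - 2.0 * Xs @ Zs.T
    return np.exp(-0.5 * np.maximum(sq, 0.0))

def nystrom(X, gamma, m, rng, jitter=1e-6):
    """Return F (n x m) such that F @ F.T approximates the n x n kernel matrix.

    Uses m landmark rows sampled uniformly without replacement (an assumption),
    so the cost is O(n m^2) instead of the O(n^3) of a full factorization.
    """
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    K_nm = rbf_kernel(X, X[idx], gamma)                       # n x m cross-kernel
    K_mm = rbf_kernel(X[idx], X[idx], gamma) + jitter * np.eye(m)
    L = np.linalg.cholesky(K_mm)                              # K_mm = L L^T
    # K ≈ K_nm K_mm^{-1} K_nm^T = F F^T with F = K_nm L^{-T}.
    return np.linalg.solve(L, K_nm.T).T
```

In a sampler, `gamma` would be updated at each iteration and the factor rebuilt, so that only m-dimensional linear algebra is needed downstream; how the thesis actually does this is not stated in the abstract.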
Keywords
Bayesian, Gaussian Process, Variable Selection
Citation
Lipman, D. (2023). A Bayesian variable selection model for semi-continuous response using Gaussian process (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.