A Bayesian Variable Selection Model for Semi-Continuous Response Using Gaussian Process
dc.contributor.advisor | Chekouo, Thierry | |
dc.contributor.advisor | Deardon, Rob | |
dc.contributor.author | Lipman, Danika | |
dc.contributor.committeemember | Wu, Jingjing | |
dc.contributor.committeemember | Lu, Xuewen | |
dc.contributor.committeemember | Safo, Sandra | |
dc.contributor.committeemember | Chekouo, Thierry | |
dc.contributor.committeemember | Deardon, Rob | |
dc.date | 2023-11 | |
dc.date.accessioned | 2023-09-12T21:38:07Z | |
dc.date.available | 2023-09-12T21:38:07Z | |
dc.date.issued | 2023-09-06 | |
dc.description.abstract | To my knowledge, no existing statistical method performs Bayesian variable selection when the response is semi-continuous and has a non-linear relationship with the predictor variables. I have developed a two-part model that accommodates a semi-continuous response and uses Gaussian processes to capture the non-linear relationship between input variables and outcomes. Bayesian variable selection is induced in both parts of the model through the construction of the kernel matrices. I have employed the Nyström approximation to reduce the computational complexity that arises when working with kernel matrices and large sample sizes. I perform simulation studies and find that my method is competitive in prediction and variable selection with elastic net and with methods that capture non-linearity, such as random forests and gradient boosted trees. In addition, I apply my method to a coronary artery disease (CAD) dataset from the Duke Database for Cardiovascular Disease (DDCD) to determine key gene expression features associated with the CAD index, a measure of CAD severity. | |
dc.identifier.citation | Lipman, D. (2023). A Bayesian variable selection model for semi-continuous response using Gaussian process (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca. | |
dc.identifier.uri | https://hdl.handle.net/1880/117001 | |
dc.identifier.uri | https://doi.org/10.11575/PRISM/41845 | |
dc.language.iso | en | |
dc.publisher.faculty | Graduate Studies | |
dc.publisher.institution | University of Calgary | |
dc.rights | University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission. | |
dc.subject | Bayesian | |
dc.subject | Gaussian Process | |
dc.subject | Variable Selection | |
dc.subject.classification | Statistics | |
dc.title | A Bayesian Variable Selection Model for Semi-Continuous Response Using Gaussian Process | |
dc.type | master thesis | |
thesis.degree.discipline | Mathematics & Statistics | |
thesis.degree.grantor | University of Calgary | |
thesis.degree.name | Master of Science (MSc) | |
ucalgary.thesis.accesssetbystudent | I do not require a thesis withhold – my thesis will have open access and can be viewed and downloaded publicly as soon as possible. |