Cluster Analysis of Gene Expression Profiles via Flexible Count Models for RNA-seq Data

atmire.migration.oldid: 3301
dc.contributor.advisor: de Leon, Alexander
dc.contributor.author: Ruan, Ji
dc.date.accessioned: 2015-06-10T15:21:17Z
dc.date.available: 2015-11-20T08:00:30Z
dc.date.issued: 2015-06-10
dc.date.submitted: 2015 [en]
dc.description.abstract: Clustering RNA-seq data is used to characterize environment-induced (e.g., treatment-induced) differences in gene expression profiles by separating genes into clusters based on their expression patterns. Wang et al. [2013] recently adopted the bi-Poisson distribution, obtained via the trivariate reduction method, as a model for clustering bivariate RNA-seq data. We discuss the inadequacy of the bi-Poisson distribution in modelling the correlation between dependent Poisson counts, and its impact on clustering such data. We introduce an alternative Gaussian copula model that incorporates a flexible dependence structure for the counts, report simulation results comparing the performance of the Gaussian copula and bi-Poisson models, and investigate how misspecified dependence structures affect the clustering of Poisson counts. We illustrate our methodology on a lung cancer RNA-seq data set. [en_US]
dc.identifier.citation: Ruan, J. (2015). Cluster Analysis of Gene Expression Profiles via Flexible Count Models for RNA-seq Data (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca. doi:10.11575/PRISM/25338 [en_US]
dc.identifier.doi: http://dx.doi.org/10.11575/PRISM/25338
dc.identifier.uri: http://hdl.handle.net/11023/2293
dc.language.iso: eng
dc.publisher.faculty: Graduate Studies
dc.publisher.institution: University of Calgary [en]
dc.publisher.place: Calgary [en]
dc.rights: University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.
dc.subject: Statistics
dc.subject.classification: RNA-seq Data [en_US]
dc.subject.classification: Clustering [en_US]
dc.subject.classification: EM algorithm [en_US]
dc.title: Cluster Analysis of Gene Expression Profiles via Flexible Count Models for RNA-seq Data
dc.type: master thesis
thesis.degree.discipline: Mathematics and Statistics
thesis.degree.grantor: University of Calgary
thesis.degree.name: Master of Science (MSc)
ucalgary.item.requestcopy: true
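
The abstract contrasts two ways of building dependent Poisson counts. The sketch below illustrates both, as a rough aid to reading the abstract and not as the thesis's own code. It assumes the standard textbook forms: trivariate reduction as Y1 = X1 + X0, Y2 = X2 + X0 with independent Poisson components, and a Gaussian copula that pushes correlated normal draws through the Poisson quantile function. All function names and parameter values are hypothetical. A well-known property of the trivariate reduction construction is that it can only produce non-negative correlation; whether that is the specific inadequacy the thesis examines is an assumption here.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (suitable for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def poisson_quantile(u, lam):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(lam)."""
    u = min(u, 1.0 - 1e-12)  # guard against u rounding to exactly 1
    k, pmf = 0, math.exp(-lam)
    cdf = pmf
    while cdf < u:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bi_poisson_pairs(rng, lam1, lam2, lam0, n):
    """Trivariate reduction: a shared Poisson(lam0) term induces the dependence."""
    pairs = []
    for _ in range(n):
        shared = poisson_sample(rng, lam0)
        y1 = poisson_sample(rng, lam1 - lam0) + shared
        y2 = poisson_sample(rng, lam2 - lam0) + shared
        pairs.append((y1, y2))
    return pairs


def copula_pairs(rng, lam1, lam2, rho, n):
    """Gaussian copula with Poisson margins; rho may be negative."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        y1 = poisson_quantile(normal_cdf(z1), lam1)
        y2 = poisson_quantile(normal_cdf(z2), lam2)
        pairs.append((y1, y2))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    rng = random.Random(0)
    print("bi-Poisson corr:", pearson(bi_poisson_pairs(rng, 4.0, 6.0, 2.0, 5000)))
    print("copula corr:", pearson(copula_pairs(rng, 4.0, 6.0, -0.7, 5000)))
```

Under trivariate reduction the theoretical correlation is lam0 / sqrt(lam1 * lam2), which cannot be negative, whereas the copula version accepts any rho in (-1, 1) while keeping the same Poisson margins.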
Files
Original bundle
Name: ucalgary_2015_ruan_ji.pdf
Size: 1.04 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.65 KB
Format: Item-specific license agreed upon to submission
Description: