Cluster analysis of correlated non-Gaussian continuous data via finite mixtures of Gaussian copula distributions
Date
2019-06-12
Abstract
Model-based cluster analysis in non-Gaussian settings is not straightforward because standard models for non-Gaussian data are lacking. In this thesis, we adopt the class of Gaussian copula distributions (GCDs) to develop a flexible model-based clustering methodology that can accommodate a variety of correlated, non-Gaussian continuous data, in which variables may have different marginal distributions drawn from different parametric families. Unlike conventional model-based approaches that rely on the assumption of conditional independence, GCDs model conditional dependence among the disparate variables through the matrix of so-called normal correlations. We outline a hybrid approach to cluster analysis that combines the method of inference functions for margins (IFM) with the parameter-expanded EM (PX-EM) algorithm. We then report simulation results that assess the performance of our methodology. Finally, we illustrate the methodology by applying it to a dataset of purchases made by clients of a wholesale distributor.
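To make the idea of normal correlations and the two-stage IFM estimate concrete, here is a minimal sketch for a single cluster with two variables. It is an illustration under stated assumptions, not the thesis's implementation: the exponential and Rayleigh margins, the function names `sample_pairs` and `fit_ifm`, and all parameter values are hypothetical, and the mixture structure and the PX-EM step are omitted entirely. Stage 1 fits each margin by maximum likelihood; stage 2 maps the data to normal scores and takes their sample correlation, a common simple estimate of the normal correlation.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, provides cdf() and inv_cdf()
EPS = 1e-12                # keeps probabilities strictly inside (0, 1)


def clip(u):
    """Clamp a probability away from 0 and 1 so quantile functions stay finite."""
    return min(max(u, EPS), 1.0 - EPS)


def sample_pairs(n, rho, rate, sigma, rng):
    """Draw n pairs from a bivariate Gaussian copula with normal correlation rho,
    an Exponential(rate) first margin and a Rayleigh(sigma) second margin."""
    pairs = []
    for _ in range(n):
        # Correlated standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Probability integral transform to uniforms.
        u1, u2 = clip(STD_NORMAL.cdf(z1)), clip(STD_NORMAL.cdf(z2))
        # Inverse marginal CDFs give the non-Gaussian observations.
        x = -math.log1p(-u1) / rate
        y = sigma * math.sqrt(-2.0 * math.log1p(-u2))
        pairs.append((x, y))
    return pairs


def fit_ifm(pairs):
    """Two-stage (IFM-style) fit: marginal MLEs first, then the normal correlation."""
    n = len(pairs)
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]

    # Stage 1: closed-form maximum likelihood estimates for each margin.
    rate_hat = n / sum(xs)
    sigma_hat = math.sqrt(sum(y * y for y in ys) / (2.0 * n))

    # Stage 2: normal scores q = Phi^{-1}(F(x)) under the fitted margins.
    q1 = [STD_NORMAL.inv_cdf(clip(-math.expm1(-rate_hat * x))) for x in xs]
    q2 = [STD_NORMAL.inv_cdf(clip(-math.expm1(-y * y / (2.0 * sigma_hat ** 2))))
          for y in ys]

    # Pearson correlation of the normal scores.
    m1, m2 = sum(q1) / n, sum(q2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(q1, q2))
    var1 = sum((a - m1) ** 2 for a in q1)
    var2 = sum((b - m2) ** 2 for b in q2)
    rho_hat = cov / math.sqrt(var1 * var2)
    return rate_hat, sigma_hat, rho_hat


if __name__ == "__main__":
    rng = random.Random(2019)
    data = sample_pairs(5000, rho=0.6, rate=2.0, sigma=1.5, rng=rng)
    print(fit_ifm(data))
```

In a finite mixture, each cluster would carry its own marginal parameters and its own normal correlation matrix, and these estimates would be weighted by cluster membership probabilities rather than computed on the full sample.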
Keywords
Cluster analysis, Copula
Citation
Burak, K. L. (2019). Cluster analysis of correlated non-Gaussian continuous data via finite mixtures of Gaussian copula distributions (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.