Browsing by Author "Li, Jia"
Item, Open Access
Correlated Data Analysis via Variants of EM Algorithm: Application to Data on Physical Activity and Maternal Health
(2024-09-13) Li, Jia; De Leon, Alexander; Li, Haocheng; Wu, Jingjing; Lu, Xuewen; Chu, Man-Wai; Sheng, Xiaoming
The thesis concerns the analysis of correlated data on multiple variables via the EM algorithm and its variants. Specifically, we focus on (cross-sectional) multivariate iid data comprising a disparate mix of binary and non-Gaussian variables (including the special case of multivariate binary data), and on longitudinal data on multiple Gaussian responses in a regression setting. For correlated data on multiple binary variables, and for mixed data on binary and non-Gaussian continuous variables, we introduced the classes of meta-probit models (MPMs) and extended meta-probit models (XMPMs). These generalize to non-Gaussian settings the grouped continuous model (GCM), also known as the multivariate probit model (MVPM), and its extension to mixed data, the conditional GCM (CGCM). Constructed from Gaussian copula distributions (GCDs), a class of meta-Gaussian distributions based on the Gaussian copula, MPMs and XMPMs broaden the sphere of applications of joint models to settings that involve complex non-standard data on variables with different measurement scales and with marginal distributions, latent and otherwise, from different parametric families. To avoid the computational challenges of maximum likelihood (ML) estimation in MPMs/XMPMs, we adopted the method of inference functions for margins, a two-stage estimation method. The first stage estimates the marginal parameters separately via marginal ML estimation; the second estimates the joint parameters (i.e., normal correlations) via profile ML estimation based on the full joint likelihood function, with the marginal parameters fixed at their first-stage estimates.
The method is especially appropriate for copula models in general, and for MPMs/XMPMs in particular, because in copula models the marginal distributions are specified completely independently of the dependence structure. For joint estimation of the normal correlations, we adopted a parameter-expanded EM (PX-EM) algorithm to simplify the E-step calculations, all of which are carried out numerically exactly using freely available R packages, and to make possible a closed-form M-step update, which avoids the complications of having to estimate a correlation matrix directly. We used the standard theory of inference functions to obtain the (joint) asymptotic Gaussian distribution of the resulting maximum pseudo-likelihood estimates (MPLEs). Results of Monte Carlo simulations confirmed the consistency and asymptotic unbiasedness of the MPLEs, with SEs that generally reflected the estimates' true sampling variability. Finally, we generalized the ECME algorithm to the multiple-outcomes setting to implement ML estimation for joint Gaussian linear mixed models (LMMs) with atypically large numbers of random effects. Monte Carlo simulations show that the resulting estimates are consistent, with efficiencies comparable to those obtained by pairwise methods. We further illustrate our methodology with longitudinal survey data on physical activity collected by ActivPAL™ (www.paltech.plus.com).
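
The two-stage idea behind inference functions for margins can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the thesis's procedure: it uses two continuous margins (an exponential and a gamma, both chosen here purely for the example) rather than binary or mixed data, and it maximizes the bivariate Gaussian copula likelihood with a one-dimensional numerical search instead of the PX-EM algorithm described above. The function name `fit_ifm` and the simulated data are hypothetical.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize_scalar


def fit_ifm(x, y):
    """Two-stage IFM sketch for a bivariate Gaussian copula (illustrative only)."""
    # Stage 1: fit each margin separately by ML.
    # Assumed margins for this example: exponential for x, gamma for y (loc fixed at 0).
    _, x_scale = stats.expon.fit(x, floc=0)
    y_shape, _, y_scale = stats.gamma.fit(y, floc=0)

    # Plug the marginal estimates into the CDFs and map to normal scores.
    eps = 1e-10
    u = np.clip(stats.expon.cdf(x, scale=x_scale), eps, 1 - eps)
    v = np.clip(stats.gamma.cdf(y, y_shape, scale=y_scale), eps, 1 - eps)
    z1, z2 = stats.norm.ppf(u), stats.norm.ppf(v)

    # Stage 2: maximize the Gaussian copula log-likelihood over the normal
    # correlation rho, holding the marginal estimates fixed.
    def neg_loglik(rho):
        d = 1.0 - rho**2
        ll = -0.5 * np.log(d) - (rho**2 * (z1**2 + z2**2) - 2.0 * rho * z1 * z2) / (2.0 * d)
        return -np.sum(ll)

    res = minimize_scalar(neg_loglik, bounds=(-0.99, 0.99), method="bounded")
    return {"x_scale": x_scale, "y_shape": y_shape, "y_scale": y_scale, "rho": res.x}


# Simulated data from a Gaussian copula with normal correlation 0.6 (hypothetical values).
rng = np.random.default_rng(0)
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=2000)
x = stats.expon.ppf(stats.norm.cdf(z[:, 0]), scale=2.0)
y = stats.gamma.ppf(stats.norm.cdf(z[:, 1]), 3.0, scale=1.5)

est = fit_ifm(x, y)
print(est)
```

Because the margins are estimated before the dependence parameter, the second stage reduces to a search over a single correlation. In higher dimensions that search has to range over a full correlation matrix, which is the difficulty the PX-EM approach in the abstract is meant to address.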