Browsing by Author "Chekouo, Thierry T."
Item Open Access
Bias and Bias-Correction for Individual-Level Models of Infectious Disease (2020-01-30)
Jafari, Behnaz; Deardon, Rob; Chekouo, Thierry T.; Kopciuk, Karen Arlene
Accurate infectious disease models can help scientists understand how an ongoing disease epidemic spreads and help forecast the course of epidemics more effectively (e.g. O'Neill, 2010; Jewell et al., 2009; Deardon et al., 2010). The main purpose of infectious disease modeling is to capture the main risk factors that affect the spread of a disease and make a prediction based on these factors. In real life, we do not generally have homogeneous and homogeneously mixing populations, and various factors affect the spread of a disease (e.g. geographical, social, domestic, and employment networks, genetic factors). Individual-level models (ILMs) (Deardon et al., 2010) can help researchers incorporate population heterogeneity. In these models, inference is made within a Bayesian Markov chain Monte Carlo (MCMC) framework (e.g. Gamerman and Lopes, 2006), obtaining posterior estimates of model parameters. However, parameter estimation and bias of estimates go hand in hand. The issue of bias of parameter estimates, and methods for bias correction, have been widely studied in the context of many of the most established and commonly used statistical models, and associated methods of parameter estimation. However, these methods are not directly applicable to individual-level infectious disease data. The focus of this thesis is to investigate circumstances in which ILM parameter estimates may be biased in some simple disease system scenarios. Further, we aim to find bias-corrected estimates of ILM parameters using simulation and compare them with the posterior estimates of the model parameters.
We also discuss the factors that affect the performance of these estimators.

Item Open Access
Covariate Balancing Using Statistical Learning Methods in the Presence of Missingness in Confounders (2019-09-20)
Mason, Levi James; Shen, Hua; Chekouo, Thierry T.; Deardon, Rob
In observational studies, researchers do not have control over treatment assignment. A consequence is that the observed covariates may be imbalanced between the treatment and control groups. This imbalance can arise because treatment assignment is frequently influenced by observed covariates (Austin, 2011a). As a result, directly comparing the outcomes between these two groups could lead to a biased estimate of the treatment effect (d’Agostino, 1998). The propensity score, defined as the probability of treatment assignment conditional on observed covariates, can be used in matching, stratification, and weighting to balance the observed covariates between the treatment and control groups in order to more accurately estimate the treatment effect (Rosenbaum and Rubin, 1983). This study looked at using statistical learning techniques to estimate the propensity score. The techniques included in this study were: logistic regression, classification and regression trees, pruned classification and regression trees, bagging classification and regression trees, boosted classification and regression trees, and random forests. These estimated propensity scores were then used in linearized propensity score matching, stratification, and inverse probability of treatment weighting using stabilized weights to estimate the treatment effect. Comparisons among these methods were made in a simulation study setting. Both a binary and a continuous outcome were analyzed. In addition, a simulation was performed to assess the use of multiple imputation using predictive mean matching when a confounder had data missing at random.
The simulation studies demonstrated that the most accurate treatment effect estimates came from inverse probability of treatment weighting using stabilized weights where the propensity scores were estimated by logistic regression, random forests, or bagging classification and regression trees. These results were then applied in a retrospective cohort data set with a missing confounder to determine the treatment effect of adjuvant radiation on individuals with breast cancer.

Item Open Access
Efficient Estimation of Partly Linear Transformation Model with Interval-censored Competing Risks Data (2019-09-19)
Wang, Yan; Lu, Xuewen; Shen, Hua; Chekouo, Thierry T.
We consider the class of semiparametric generalized odds rate transformation models to estimate the cause-specific cumulative incidence function, which is an important quantity under the competing risks framework, and assess the contribution of covariates with interval-censored competing risks data. The model is able to handle both linear and non-linear components. The baseline cumulative incidence functions and non-linear components of different competing risks are approximated with B-spline basis functions or Bernstein polynomials, and the parameter estimates are obtained by sieve maximum likelihood estimation. We designed two examples in the simulation studies, and the simulation results show that the method performs well.
We used the proposed method to analyze the HIV data obtained from patients in a large cohort study in sub-Saharan Africa.

Item Open Access
Efficient Estimation of the Additive Hazards Model with Bivariate Current Status Data (2020-08-14)
Zhang, Ce; Lu, Xuewen; Chekouo, Thierry T.; Shen, Hua
In this thesis, we present sieve maximum likelihood estimators of both the finite- and infinite-dimensional parameters in the marginal additive hazards model with bivariate current status data, where the joint distribution of the bivariate survival times is modeled by a copula. We assume the two baseline hazard functions and the copula are unknown functions, and use constrained Bernstein polynomials to approximate these functions. Compared with the existing methods for estimation of the copula models for bivariate survival data, the proposed new method has two main advantages. First, our method does not need to specify the form of the copula model and is more flexible. Second, the proposed estimators are strongly consistent and attain the optimal rate of convergence, and the regression parameter estimator is asymptotically normal and semiparametrically efficient. Simulation studies reveal that the proposed estimators have good finite-sample properties. Finally, a real data application is provided for illustration.

Item Open Access
Power Analysis of Transcriptome-Wide Association Studies (TWAS) (2020-08)
Ding, Bowei; Wu, Jingjing; Long, Quan; Lu, Xuewen; Chekouo, Thierry T.
Association studies between genetic variants and complex traits are popular and valuable in both genetic and clinical fields. Among all kinds of studies proposed, transcriptome-wide association studies (TWAS) have become influential and widely used. In my thesis, I focus on identifying the settings of genetic parameters and architectures under which TWAS is more powerful in detecting contributing genes than other analytical methods, including genome-wide association studies (GWAS) and eQTL-based mediated GWAS (emGWAS).
We first derive a novel closed form of the non-centrality parameter (NCP) of the non-central distribution under the alternative hypothesis. Then we estimate the power based on the estimated NCP. Through simulation studies, we compare the power of the three methods, i.e. TWAS, GWAS and emGWAS. Our numerical results show that while the number of significant genes, the level of trait heritability and the phenotypic variance component explained by expression (PVX) all influence the power of the three analytical models according to the corresponding genetic architecture, expression heritability is the most influential factor that makes TWAS stand out.
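
As an illustration of the step from an NCP to a power estimate, the following is a minimal sketch and not the thesis's own derivation. It assumes a two-sided test whose statistic follows a non-central chi-square distribution with one degree of freedom under the alternative hypothesis; the abstract does not state which non-central distribution is used. The function name `power_from_ncp` and the example NCP values are hypothetical.

```python
from statistics import NormalDist


def power_from_ncp(ncp: float, alpha: float = 0.05) -> float:
    """Power of a 1-df chi-square test given a non-centrality parameter.

    Assumption (not from the source): the test statistic is (Z + sqrt(ncp))^2
    with Z standard normal, i.e. non-central chi-square with 1 df.
    """
    if ncp < 0:
        raise ValueError("ncp must be non-negative")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")

    std_normal = NormalDist()
    # Two-sided critical value on the z scale; its square is the
    # chi-square(1) critical value at significance level alpha.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = ncp ** 0.5

    # P((Z + shift)^2 > z_crit^2) = P(Z > z_crit - shift) + P(Z < -z_crit - shift)
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Hypothetical NCP values, chosen only to show that power grows with the NCP.
for example_ncp in (0.0, 5.0, 10.0, 30.0):
    print(example_ncp, round(power_from_ncp(example_ncp), 4))
```

With an NCP of zero the function returns the significance level itself, which is a quick sanity check on the formula. In a genome-wide setting, alpha would typically be replaced by a multiple-testing-corrected threshold.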