Browsing by Author "He, Jingni"
Item Embargo
Characterizing genetic basis of complex diseases by integrating data-bridge and genomics (2023-06)
He, Jingni; Long, Quan; de Koning, Jason; Tekougang, Thierry Chekouo
With the advancement of high-throughput sequencing and genotyping technology, genomic projects now generate many multi-omics datasets. Such data lie between genotype and phenotype and may therefore serve as data-bridges to aid statistical genetic analyses. Integrating these data-bridges effectively brings both challenges and opportunities for statistical geneticists: for instance, the problem of statistical overfitting, the question of seamlessly integrating biological priors with high-dimensional data, and the interpretation of statistical results in the context of biology. The work in this thesis focuses on integrating such data-bridges to characterize the genetic basis of complex diseases while addressing these challenges. I have developed novel statistical models for analyzing multi-omics data from four perspectives: (Q1) how to integrate biological priors such as transcription factors with statistical models; (Q2) how to utilize trans-regulatory variants while keeping the model robust despite the large number of possible candidates; (Q3) how to utilize data-bridges to improve the modeling of rare genetic variants; and (Q4) how to utilize brain imaging data in genetic association mapping. These efforts led to four novel statistical models and their implementations: (M1) sTF-TWAS, which integrates prior knowledge of transcription factors (TFs) with association studies; (M2) transTF-TWAS, which uses Group Lasso to incorporate TF-linked trans-located variants; (M3) rvTWAS, which leverages transcriptome-directed feature selection for rare variants; and (M4) IMAS, which uses borrowed brain images to conduct image-directed feature selection and aggregation.
All four methods were verified by comprehensive simulations based on known genetic architectures and heritability models. Using large-scale omics data accessed through dbGaP and the UK Biobank, as well as large cohorts from our collaborator, I applied the methods to cancers and neuropsychiatric disorders, which led to the discovery of additional genes underlying complex traits. I also validated the methods by checking these discoveries against the existing biological literature and databases. The development of these methods opens the door to integrating data-bridges such as transcriptomes and imaging data in genetic mapping. The novel findings provide additional insights into the genetic basis of cancers and brain disorders.