Browsing by Author "Sajobi, Tolulope T."
Item Open Access A systematic approach to using regression modelling and ‘big data’ to derive a meaningful clinical decision rule for epilepsy (2018-08-22) Josephson, Colin Bruce; Wiebe, Samuel; Jetté, Nathalie; Sajobi, Tolulope T.; Marshall, Deborah A.
Introduction: clinical decision rules (CDRs) have been developed in a number of medical fields, resulting in improved patient outcomes, quality of care, and health economics. Aims: to identify all CDRs developed for epilepsy and to derive one that guides the prescription of the antiepileptic drug (AED) levetiracetam according to its risk of a psychiatric adverse effect. Methods: a systematic review and meta-analysis was first performed to determine the state of the literature with respect to CDRs in epilepsy. The Health Improvement Network (THIN) electronic medical records register was used to identify patients with epilepsy by employing a modified validated case definition with a 5-year washout. Analyses were restricted to patients receiving AED monotherapy, and the association between levetiracetam use and psychiatric adverse effects was explored using Cox proportional hazards regression with time-varying covariates. Finally, logistic regression with parameter regularisation and k=5 fold cross-validation was used to derive the CDR that predicts the development of psychiatric adverse effects following levetiracetam prescription. Results: the systematic review identified four epilepsy-specific CDRs, none of which guided AED prescription. A total of 9595 presumed incident cases of epilepsy (85.7 cases per 100,000 persons) were identified in THIN. Both carbamazepine (hazard ratio [HR]: 0.84, 95% confidence interval [95% CI]: 0.73–0.97; p = 0.02) and lamotrigine (HR: 0.83, 95% CI: 0.70–0.99; p = 0.03) were associated with reduced hazards of a psychiatric sign, symptom, or disorder compared to no AED treatment. Levetiracetam was not associated with psychiatric adverse effects, but the analyses were underpowered (n=202; 3%).
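The k=5 fold cross-validation mentioned in the Methods can be illustrated with a small sketch. This is not the authors' code: the function names (`stratified_k_folds`, `train_test_splits`) and the round-robin assignment are assumptions chosen for illustration, and the stratification by outcome is one common way to keep the proportion of a rare outcome similar in every fold.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, balancing classes across folds.

    Hypothetical helper for illustration; `labels` is a list of class labels
    (e.g. 1 = psychiatric adverse effect, 0 = none).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a near-equal share of the class.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def train_test_splits(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs, holding out one fold at a time."""
    folds = stratified_k_folds(labels, k, seed)
    for held_out in range(k):
        test = sorted(folds[held_out])
        train = sorted(
            idx
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for idx in fold
        )
        yield train, test


if __name__ == "__main__":
    # Toy data: 10% of samples have the outcome.
    toy_labels = [1] * 10 + [0] * 90
    for train, test in train_test_splits(toy_labels):
        positives = sum(toy_labels[i] for i in test)
        print(len(train), len(test), positives)
```

A model would be refitted on each `train` set and evaluated on the matching `test` set, with the five performance estimates then averaged.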
All patients receiving levetiracetam (1173/7400; 16%) were included for CDR derivation. Prediction variables were incorporated into multiple logistic regression models with parameter regularisation. Odds of reporting a psychiatric complaint were elevated for females and those with a pre-exposure history of depression, anxiety, recreational drug use, or higher social deprivation. The prediction model performed well (area under the curve [AUC] 0.68; 95% confidence interval 0.58–0.79 after stratified k=5 fold cross-validation). Using a cut-off threshold of 0.1, the CDR had a specificity of 83%. Conclusion: If externally validated and properly implemented, this CDR could be used to guide prescription in clinical practice.

Item Open Access Analysis of Metabolomics Data via Mixed Models (2020-08) Ren, Austin Mu Qing; de Leon, Alexander R.; Kopciuk, Karen Arlene; Vogel, Hans J.; Sajobi, Tolulope T.
Generalized linear mixed models have been widely studied and used in many different disciplines, yet they have rarely been applied to metabolomics data analysis. Traditional methods of cancer classification used to determine disease severity, such as biopsies, can be harmful to the health of the patients. Classification based on metabolomics data analysis has the main advantage that it requires only non-invasive procedures, such as drawing a small amount of blood from patients. However, data analysis in cancer research often requires the handling of multiple correlated measurements of disease severity. The methods most commonly used with metabolomics data, such as partial least squares discriminant analysis, were traditionally designed to handle univariate data only, and can be very challenging to work with when applied to data with multiple correlated outcomes. Therefore, different methods should be considered for metabolomics data analysis in cancer classification.
In this thesis, we proposed bivariate generalized linear mixed models with binary outcomes using the probit link function for the analysis of metabolomics data. The models were specifically designed to handle multiple correlated outcomes via the inclusion of subject-specific random intercepts. Random slopes were not included in the models to reduce complexity. We specifically designed three settings for the random intercept models: shared, independent, and correlated between the outcomes. An extensive number of simulations were carried out to test our models' parameters, including: standard deviation and correlation of the distribution of the random intercepts, correlation between the covariates as well as correlation between the covariates and the outcomes, the proportion of data missing among the covariates, misspecified distribution of the random intercepts, and misspecified conditional correlation between the outcomes. In addition, we also incorporated the nearest neighbors algorithm as a missing values imputation method and LASSO as a feature selection method to our mixed models in order to handle the common issues of high dimensional covariates and missing values in metabolomics data. 
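One way to write down the three random-intercept settings described above is sketched below. The notation ($Y_{ij}$, $\mathbf{x}_i$, $\boldsymbol{\beta}_j$, $b_{ij}$, $\sigma_j$, $\rho$) is assumed here for illustration and is not taken from the thesis; it only restates a bivariate probit model with subject-specific random intercepts and no random slopes.

```latex
% Subject i = 1, ..., n; binary outcome j = 1, 2; \Phi is the standard normal CDF.
\[
  \Pr\left(Y_{ij} = 1 \mid b_{ij}\right)
    = \Phi\left(\mathbf{x}_i^{\top}\boldsymbol{\beta}_j + b_{ij}\right),
  \qquad j = 1, 2 .
\]
% Shared random intercept: one intercept per subject, common to both outcomes.
\[
  b_{i1} = b_{i2} = b_i, \qquad b_i \sim N\left(0, \sigma^2\right).
\]
% Independent random intercepts: one per outcome, mutually independent.
\[
  b_{i1} \sim N\left(0, \sigma_1^2\right), \quad
  b_{i2} \sim N\left(0, \sigma_2^2\right), \quad
  b_{i1} \perp b_{i2} .
\]
% Correlated random intercepts: bivariate normal with correlation \rho.
\[
  \begin{pmatrix} b_{i1} \\ b_{i2} \end{pmatrix}
  \sim N_2\left(
    \begin{pmatrix} 0 \\ 0 \end{pmatrix},
    \begin{pmatrix}
      \sigma_1^2 & \rho\,\sigma_1\sigma_2 \\
      \rho\,\sigma_1\sigma_2 & \sigma_2^2
    \end{pmatrix}
  \right).
\]
```

Under this reading, the simulation factors listed above (standard deviation and correlation of the random intercepts) correspond to $\sigma_j$ and $\rho$.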
Finally, our proposed mixed models were applied to a real dataset of prostate cancer patients to evaluate our models' performance on outcome predictions.

Item Open Access APPROACH e-PROM system: a user-centered development and evaluation of an electronic patient-reported outcomes measurement system for management of coronary artery disease (2024-08-28) Roberts, Andrew; Benterud, Eleanor; Santana, Maria J.; Engbers, Jordan; Lorenz, Christine; Verdin, Nancy; Pearson, Winnie; Edgar, Peter; Adekanye, Joel; Javaheri, Pantea; MacDonald, Courtney E.; Simmons, Sarah; Zelinsky, Sandra; Caird, Jeff; Sawatzky, Rick; Har, Bryan; Ghali, William A.; Norris, Colleen M.; Graham, Michelle M.; James, Matthew T.; Wilton, Stephen B.; Sajobi, Tolulope T.
Background: Coronary artery disease (CAD) confers increased risks of premature mortality, non-fatal morbidity, and significant impairment in functional status and health-related quality of life. Routine administration of electronic patient-reported outcome measures (PROMs) and their real-time delivery to care providers is known to have the potential to inform routine cardiac care and to improve quality of care and patient outcomes. This study describes a user-centered development and evaluation of the Alberta Provincial Project for Outcomes Assessment (APPROACH) electronic Patient Reported Outcomes Measurement (e-PROM) system. This e-PROM system is an electronic system for the administration of PROMs to patients with CAD and the delivery of the summarized information to their care providers to facilitate patient-physician communication and shared decision-making. This electronic platform was designed to be accessible via web-based and hand-held devices. Heuristic and user acceptance evaluations were conducted with patients and attending care providers. Results: The APPROACH e-PROM system was co-developed with patients and care providers, research investigators, informaticians and information technology experts.
Five PROMs were selected for inclusion in the online platform after consultations with patient partners, care providers, and PROMs experts: the Seattle Angina Questionnaire, Patient Health Questionnaire, EuroQOL, Medical Outcomes Study Social Support Survey, and Self-Care of Coronary Heart Disease Inventory. The heuristic evaluation was completed by four design experts who examined the usability of the prototype interfaces. User acceptance testing was completed with 13 patients and 10 cardiologists who evaluated prototype user interfaces of the e-PROM system. Conclusion: Both patients and physicians found the APPROACH e-PROM system to be easy to use, understandable, and acceptable. The APPROACH e-PROM system provides a user-informed electronic platform designed to incorporate PROMs into the delivery of individualized cardiac care for persons with CAD.

Item Open Access Classification Models for Multivariate Non-normal Repeated Measures Data (2021-01-08) Brobbey, Anita; Sajobi, Tolulope T.; Wiebe, Samuel; Williamson, Tyler S.; Nettel-Aguirre, Alberto
Multivariate repeated measures data, in which multiple outcomes are repeatedly measured on two or more occasions, are commonly collected in several disciplines (e.g., medicine, ecology, environmental sciences), where investigators seek to discriminate between population groups or make predictions based on changes in multiple correlated outcomes over time. Repeated measures discriminant analysis models have been developed and applied to address these research questions. These classification models, which have been mostly developed based on growth curve models, covariance pattern models, and mixed-effects models, are advantageous in that they can account for complex correlation structures in multivariate repeated measures data (e.g., within-outcome and between-outcome correlations) to improve their predictive accuracy.
However, they largely rely on the assumption of multivariate normality, which is rarely satisfied in multivariate repeated measures data. To our knowledge, there has been limited investigation of the behavior of these existing models in multivariate non-normal repeated measures data. The overarching goal of this research was to develop robust repeated measures discriminant analysis classifiers for multivariate non-normal repeated measures data. Specifically, we developed repeated measures discriminant analysis based on maximum trimmed likelihood estimators (MTLE) and generalized estimating equations (GEE) estimators and examined their accuracy in comparison to classifiers based on maximum likelihood estimation (MLE) using Monte Carlo methods. The simulation conditions examined included population distribution, sample size, covariance structure (between-outcomes and within-outcome), covariance heterogeneity, number of repeated occasions, and number of outcome variables. The Monte Carlo study results indicated that the proposed methods increased overall mean classification accuracy by 2% to 15% in multivariate non-normal repeated measures data compared to repeated measures discriminant analysis based on MLE under most scenarios. Data from two cohort studies were used to illustrate the implementation of the proposed repeated measures discriminant analysis methods. The outcomes of this research include novel multivariate classifiers for predicting group membership in multivariate normal and non-normal repeated measures data.
This research contributes to the advancement of statistical science on methods for analyzing multivariate repeated measures data.

Item Open Access Dementia risk prediction in individuals with mild cognitive impairment: a comparison of Cox regression and machine learning models (2022-11-02) Wang, Meng; Greenberg, Matthew; Forkert, Nils D.; Chekouo, Thierry; Afriyie, Gabriel; Ismail, Zahinoor; Smith, Eric E.; Sajobi, Tolulope T.
Background: Cox proportional hazards regression models and machine learning models are widely used for predicting the risk of dementia. Existing comparisons of these models have mostly been based on empirical datasets and have yielded mixed results. This study examines the accuracy of various machine learning models and of Cox regression models for predicting time-to-event outcomes using Monte Carlo simulation in people with mild cognitive impairment (MCI). Methods: The predictive accuracy of nine time-to-event regression and machine learning models was investigated. These models include Cox regression, penalized Cox regression (with Ridge, LASSO, and elastic net penalties), survival trees, random survival forests, survival support vector machines, artificial neural networks, and extreme gradient boosting. Simulation data were generated using study design and data characteristics of a clinical registry and a large community-based registry of patients with MCI. The predictive performance of these models was evaluated based on three-fold cross-validation via Harrell’s concordance index (c-index), integrated calibration index (ICI), and integrated Brier score (IBS). Results: Cox regression and machine learning models had comparable predictive accuracy across three different performance metrics and data-analytic conditions.
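For readers unfamiliar with Harrell's concordance index used above, the following is a minimal pure-Python sketch of the usual pairwise definition for right-censored data. It is an illustration only, not the study's code: the function name `harrell_c_index` and the convention of counting tied risk scores as half-concordant are assumptions, and pairs with tied event times are simply skipped.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's c-index for right-censored data (illustrative sketch).

    times       : observed follow-up times
    events      : 1 if the event (e.g. dementia) was observed, 0 if censored
    risk_scores : model output where a higher score means higher predicted risk

    A pair (i, j) is comparable when subject i had an observed event strictly
    before subject j's follow-up time. The pair is concordant when the model
    gives subject i the higher risk score; ties in score count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; c-index is undefined")
    return concordant / comparable


if __name__ == "__main__":
    toy_times = [1, 2, 3, 4]
    toy_events = [1, 1, 0, 1]
    toy_scores = [0.9, 0.7, 0.4, 0.1]
    print(harrell_c_index(toy_times, toy_events, toy_scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the values around 0.64 to 0.70 reported in this abstract.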
The estimated c-index values for Cox regression, random survival forests, and extreme gradient boosting were 0.70, 0.69 and 0.70, respectively, when the data were generated from a Cox regression model under large sample-size conditions. In contrast, the estimated c-index values for these models were 0.64, 0.64, and 0.65 when the data were generated from a random survival forest under large sample-size conditions. Both Cox regression and random survival forest had the lowest ICI values (0.12 for a large sample size and 0.18 for a small sample size) among all the investigated models regardless of sample size and data generating model. Conclusion: Cox regression models have predictive performance comparable to, and sometimes better than, that of more complex machine learning models. We recommend that the choice among these models should be guided by important considerations for research hypotheses, model interpretability, and type of data.

Item Open Access Development of a Clinical Care Pathway for Patients with Suspected Acute Coronary Syndromes in the Emergency Department (2020-04-30) O'Rielly, Connor M.; McRae, Andrew D.; Ronksley, Paul Everett; Andruchow, James E.; Sajobi, Tolulope T.
Chest pain is a predominant reason for emergency department (ED) visits and hospitalizations in Canada. ED physicians use diagnostic tools (e.g., biomarkers) to identify patients with myocardial infarction (MI) requiring intervention, and prognostic tools (e.g., risk scores) to determine which patients without MI are eligible for discharge. While clinical guidelines recommend that these two portions of the assessment occur sequentially, the evidence for each has emerged in isolation. There is also a paucity of evidence on risk score use in the era of high-sensitivity cardiac troponin (hs-cTn) assays, adverse event risk factors for patients without MI, and appropriate timelines for follow-up.
This project had three complementary objectives: (1) Synthesize available evidence on prognostic prediction score performance when hs-cTn assays are incorporated; (2) Quantify the time course of major adverse cardiac events (MACE) in patients without index MI and identify characteristics with potential predictive value for MACE; and (3) Develop a sequential clinical pathway for the assessment of chest pain in the ED and measure the impacts on diagnostic and prognostic accuracy as well as ED patient flow. A systematic review was conducted to synthesize evidence on the chest pain risk scores to be prioritized for integration into the clinical pathway. A time-to-event analysis was then conducted to measure timing of MACE in patients without index MI, as well as a stratified analysis to identify characteristics with predictive value for 30-day MACE to be used in the pathway for clinical stratification. Trial clinical pathways were developed and quantitatively compared. Pathways combined a validated 2-hour hs-cTn diagnostic algorithm with variable clinical pre-stratification, risk score types, and low-risk cut-offs. A sequential clinical pathway using a validated hs-cTn algorithm and the HEART score can identify nearly 40% of ED chest pain patients as eligible for discharge without the need for further testing, with no missed MI or 30-day MACE.
This thesis project contributed evidence necessary for updating and advancing the ED chest pain assessment and presents an evidence-based sequential clinical pathway that maximizes the efficiency of the ED chest pain assessment.

Item Open Access Exploring the Relationship Between Diabetes and Physical Activity Behaviours: Results from the Canadian Health Measures Survey (2007-2017) (2020-05-12) Booth, Jane; Sigal, Ronald J.; Rabi, Doreen M.; Goldfield, Gary S.; Sajobi, Tolulope T.
Background: Diabetes Canada clinical practice guidelines recommend that individuals with type 2 diabetes accumulate a minimum of 150 minutes per week of moderate- to vigorous-intensity physical activity (MVPA) and reduce the amount of time spent sedentary. To our knowledge, there are no nationally representative studies in Canada that have used objectively measured physical activity data to assess the associations between physical activity and sociodemographic characteristics or cardiometabolic measures in people with type 2 diabetes. Thus, the objectives of this thesis were to (1) evaluate the associations between physical activity, sedentary time and cardiometabolic health and (2) evaluate the associations between physical activity, sedentary time and sociodemographic characteristics in adults with type 2 diabetes in a representative sample of the Canadian population. Methods: Cycles 1 to 5 of the Canadian Health Measures Survey (CHMS) were used. Participants with type 2 diabetes between 20 and 79 years of age who had at least four days of valid activity monitor wear were included. Means, medians and interquartile ranges were used to present estimates of physical activity and sedentary time. Physical activity was stratified by MVPA tertile and cardiometabolic mean values and/or proportions with 95% confidence intervals were compared.
Median regression was used to evaluate the associations of each 60-minute per week increment in total MVPA with hemoglobin A1c (A1C) and body mass index (BMI). Ordinal logistic regression was used to estimate the odds of achieving lower amounts of MVPA based on sociodemographic factors. Results: Only 21.5% of adults with type 2 diabetes met clinical practice guideline recommendations for physical activity. Higher amounts of MVPA and daily steps were associated with lower BMI, waist circumference and cardiometabolic risk composite score. Female sex, lower income, BMI ≥ 25 kg/m², and being a current or former smoker were associated with lower levels of physical activity. Conclusions: Less than one quarter of adults with type 2 diabetes met physical activity recommendations. We identified important sociodemographic characteristics that were determinants of low levels of physical activity, which should be considered by healthcare providers and policy-makers in order to inform and deliver effective physical activity interventions.

Item Open Access Machine learning models for functional impairment risk prediction in ischemic stroke patients (2020-09-03) Alaka, Shakiru Ayomide; Sajobi, Tolulope T.; Menon, Bijoy K.; Hill, Michael D.; Williamson, Tyler S.
Background: Stroke-related functional impairment risk scores are commonly used to estimate the patient-specific risk of functional impairment in acute care settings. However, these models have been primarily developed based on regression models, which might not provide optimal predictive accuracy, especially when validated in an external cohort. Purpose: First, to evaluate the predictive accuracy of machine-learning (ML) models for predicting functional impairment risk in acute ischemic stroke patients. Second, to compare the predictive accuracy of machine-learning models and regression-based models using computer simulations.
Methods: We used data from the Precise and Rapid Assessment of Collaterals with Multi-phase CT Angiography (PROVE-IT). The Modified Rankin Scale (mRS) score was used to assess the 90-day functional impairment status. Machine-learning models, namely random forest (RF), classification and regression tree (CART), support vector machine (SVM), C5.0 decision tree (DT), adaptive boost machine (ABM), and least absolute shrinkage and selection operator (LASSO) logistic regression, together with logistic regression (LR), were used to predict the patient-specific risk of 90-day functional impairment. Area under the receiver operating characteristic curve (AUC), sensitivity, specificity, Matthews correlation coefficient (MCC), and Brier score were used to assess the predictive accuracy of these models via internal cross-validation and external validation in the Identifying New Approaches to Optimize Thrombus Characterization for Predicting Early Recanalization and Reperfusion with IVtPA Using Serial CT Angiography (INTERSSeCT) cohort study. Monte Carlo methods were used to develop recommendations for selecting machine-learning models under a variety of data characteristics. Results: Both logistic regression and machine-learning models had comparable predictive accuracy when validated internally (AUC range = [0.65–0.72]; MCC range = [0.29–0.42]) and externally (AUC range = [0.66–0.71]; MCC range = [0.34–0.42]). However, regression-based models had somewhat better calibration than the ML models. Our simulation study showed that ML and regression-based models are not equally robust to a variety of data analytic characteristics. LR models exhibited higher AUC in studies with a small/moderate set of predictors, while RF had about 15% higher discrimination in studies with a high-dimensional set of predictors. ML models may be less accurate for predicting outcomes in studies with a small set of predictors or when there is a large class imbalance in the data sets.
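Two of the accuracy measures named above, the Matthews correlation coefficient and the Brier score, have short closed-form definitions. The sketch below computes both in plain Python for a binary outcome. It is illustrative only and not the thesis code: the function names are assumptions, and returning 0.0 when the MCC denominator is zero is a common convention rather than something stated in the abstract.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels coded 0/1 (illustrative sketch).

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    Ranges from -1 (total disagreement) through 0 (chance) to 1 (perfect).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when a row or column of the table is empty
    return (tp * tn - fp * fn) / denominator


def brier_score(y_true, predicted_probabilities):
    """Mean squared difference between predicted probability and 0/1 outcome.

    Lower is better; 0 means perfectly confident and correct predictions.
    """
    squared_errors = [
        (p - t) ** 2 for t, p in zip(y_true, predicted_probabilities)
    ]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    outcomes = [1, 1, 0, 0]
    probabilities = [0.9, 0.4, 0.2, 0.1]
    # Classify as impaired when the predicted probability is at least 0.5.
    predictions = [1 if p >= 0.5 else 0 for p in probabilities]
    print(matthews_corrcoef(outcomes, predictions))
    print(brier_score(outcomes, probabilities))
```

Unlike sensitivity or specificity alone, MCC uses all four cells of the confusion matrix, which is why it is often reported when classes are imbalanced.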
Conclusions: ML and regression-based algorithms are not equally sensitive to data analytic conditions, even though our data analysis revealed no significant differences between the former and the latter. ML might offer some discriminative advantages over regression-based models depending on the size and type of study predictors. We recommend that the choice between these classes of models should be guided by data characteristics, study design, and the purpose for which the models are being developed.

Item Open Access Predicting Early Discontinuation of Adjuvant Chemotherapy and its Impact on Survival among Individuals with Stage III Colon Cancer (2020-08-05) Boyne, Devon J; Brenner, Darren R.; Friedenreich, Christine M.; Cheung, Winson Y.; Hilsden, Robert J.; Sajobi, Tolulope T.
Background: Approximately one in three patients with stage III colon cancer fail to complete the entirety of their adjuvant chemotherapy prescription. Two questions arise from this observation: 1) Can we predict which patients will discontinue adjuvant chemotherapy? and 2) Does a shortened duration of adjuvant chemotherapy impact overall survival? Evidence pertaining to the first question is limited. While question two was recently addressed within a large randomized trial, results from this trial have been controversial. Methods: To address question one, we conducted a systematic review and survey of medical oncologists to identify factors that predict non-completion of adjuvant chemotherapy. Building upon the results of this investigation, we developed an online calculator to predict the risk of discontinuation at the individual level. For question two, a systematic review and meta-analysis was performed.
In addition, we emulated a target trial that examined the effect of a shortened duration of adjuvant chemotherapy on overall survival using real-world data. Results: According to a systematic review of 18 studies and survey of 14 medical oncologists, there was evidence that increased comorbidity, worse performance status, higher T stage, and adjuvant CAPOX chemotherapy or poor oxaliplatin candidacy were associated with an increased risk of discontinuation. Using information from 1,378 patients, an online risk calculator was developed. Internal validation suggested that this calculator accurately predicted and classified patients with respect to their risk of discontinuation (optimism-adjusted C-statistic=0.80; 95% CI: 0.79-0.82; calibration plots were within acceptable limits). A meta-analysis of 22 studies suggested that a shortened duration of adjuvant chemotherapy was harmful among patients prescribed a monotherapy (HR: 0.59; 95% CI: 0.52-0.68) but not among those prescribed FOLFOX or CAPOX (HR: 0.80; 95% CI: 0.58-1.09). In a target trial analysis of 485 colon cancer patients, both the overall and subgroup-specific hazard ratios were consistent with those from a randomized trial. Conclusions: Results from this investigation can help assess and communicate the risk of early discontinuation within this study population. Results from our meta-analysis and target trial emulation suggest that a shortened duration of adjuvant chemotherapy may be appropriate for some patients, which supports findings from a recent randomized trial.

Item Open Access Prehabilitation for Enhanced Recovery After Colorectal Surgery (2020-06-08) Gillis, Chelsia; Fenton, Tanis R.; Gramlich, Leah M.; Culos-Reed, Susan Nicole; Sajobi, Tolulope T.
Background: Postoperative morbidity is largely the product of the preoperative condition of the patient, the quality of surgical care provided, and the degree of surgical stress elicited.
Enhanced Recovery After Surgery (ERAS) minimizes surgical stress with standardized evidence-based perioperative care; yet the ERAS care elements focus mainly on the intra- and postoperative periods, which may not sufficiently enhance recovery if preoperative patient-related factors have not been modified before surgery. Prehabilitation programs aim to enhance recovery by targeting the preoperative condition of the patient. Methods: This dissertation includes four manuscripts that broadly contribute to the evidence that supports the hypothesis that the patient’s preoperative status modifies outcomes in colorectal surgery. Results: First, intermediately frail and frail patients with poor functional walking capacity before surgery suffer more postoperative complications than patients with better functional walking capacity. Second, nutrition prehabilitation, with and without exercise, reduces mean length of hospital stay by two days. Third, patient interviews suggest that patients support the idea of using prehabilitation to enhance their preoperative condition. Finally, the last manuscript offers methodological suggestions to measure and analyze external variables as a means of advancing the prehabilitation literature and further enhancing patient outcomes. Conclusion: The findings of this doctoral dissertation add to the growing body of evidence that the process of surgical recovery begins before surgery. Prehabilitation interventions can be applied to support better postoperative recoveries.

Item Open Access Testing ASPECTS Reliability Using Color Coded Algorithm Enhanced Gray-White Matter Non Contrast CT (2018-07-09) Hafeez, Moiz; Menon, Bijoy K.; Qiu, Wu; Federico, Paolo; Demchuk, Andrew M.; Krupinski, Elizabeth A.; Sajobi, Tolulope T.
The Alberta Stroke Program Early CT Score (ASPECTS) is widely used to assess and diagnose Acute Ischemic Stroke (AIS) patients.
Inter-rater reliability for ASPECTS, however, is very poor even amongst physicians with extensive expertise. Much of this limitation has to do with the lack of agreement amongst physicians in identifying Early Ischemic Changes (EIC) on Non-Contrast Computed Tomography (NCCT) scans. This lack of agreement is due to the extremely subtle findings that the human eye is exposed to on gray scale NCCT scans during the acute period of ischemia. We therefore sought to use post-processing algorithms to develop Color-Coded Algorithm Enhanced Gray-White Matter (AEGWM) NCCT scans. Increased differentiation between gray and white matter on AEGWM NCCT scans was developed to act as a powerful imaging tool allowing for better delineation of EIC for AIS patients. In this thesis I investigated the utility of AEGWM NCCT scans for the purposes of detecting EIC in AIS patients. Overall, we found that AEGWM scans performed better than gray scale NCCT scans when using DWI as ground truth. In addition, inter-rater agreement increased consistently across raters of all levels of expertise while using AEGWM scans. Despite some limitations, the use of AEGWM scans may be a promising research direction to pursue for future work.

Item Open Access Treatment Effect Models for Subgroup Analysis with Missing Data (2018-08-31) Fu, Yunting; Shen, Hua; Kopciuk, Karen Arlene; De Leon, Alexander R.; Sajobi, Tolulope T.
The need for subgroup analysis in clinical trials in various contexts is increasing, and data-driven approaches for subgroup identification based on statistical principles are desired. Among all subgroup identification methods, we focus on the treatment effect models that estimate the treatment contrast, since these models are intuitive and useful for interpretation.
We evaluate and address the consequences of having missing data when using the Interaction Trees (IT), Qualitative Interaction Trees (QUINT) and Subgroup Identification based on Differential Effect Search (SIDES) methods. Simulation studies are used to demonstrate the accuracy of variable selection and bias in treatment effects when using complete, incomplete and imputed data across various scenarios in which the sample size, proportion of missingness and imputation methods differ. We also applied these methods to a non-small cell lung cancer (NSCLC) dataset obtained from a retrospective study. Our results indicate that both IT and QUINT methods work equivalently well in most situations, while the SIDES results are, in general, less comparable due to the different mechanisms of the methods. The treatment effect models should be chosen based on the objective of the study, the sample size, the number of variables containing missing data, and the data structure. In terms of the methods for addressing missing data, an assumption about the data structure needs to be made during the method selection. MissForest is an excellent choice for a dataset with a tree-based structure, while MI methods would be a good fit for the other situations.

Item Open Access Using machine learning methods to improve chronic disease case definitions in primary care electronic medical records (2018-04-23) Lethebe, Brendan Cord; Williamson, Tyler S.; Sajobi, Tolulope T.; Quan, Hude; Ronksley, Paul Everett
Background: Chronic disease surveillance at the primary care level is becoming more feasible with the increased use of electronic medical records (EMRs). However, the quality of surveillance information is directly dependent on the quality of the case definitions that identify the conditions of interest. Purpose: To determine whether machine learning algorithms can produce chronic disease case definitions comparable to committee-created case definitions in a primary care EMR setting.
Methods: A chart review was conducted for the presence of hypertension, diabetes, osteoarthritis, and depression in a cohort of 1920 patients from the Canadian Primary Care Sentinel Surveillance Network database. The results of this chart review were used as training data. The C5.0, Classification and Regression Tree, and Chi-Squared Automated Interaction Detection decision trees, Forward Stepwise logistic regression, and Least Absolute Shrinkage and Selection Operator penalized logistic regression were compared using 10-fold cross-validation. Sensitivity, specificity, positive predictive value and negative predictive value were estimated and compared for the four chronic conditions of interest. Results: Validity measures were similar across algorithms. For hypertension, sensitivity ranged from 93.1% to 96.7%, while specificity ranged from 88.8% to 93.2%. For diabetes, sensitivity ranged from 93.5% to 96.3% and specificity from 97.1% to 99.0%. For osteoarthritis, sensitivity ranged from 82.0% to 84.4% and specificity from 92.7% to 94.0%. For depression, sensitivity ranged from 81.4% to 88.3% and specificity from 93.4% to 94.9%. Compared with the committee-created case definitions, these metrics were equivalent or better using the machine learning method. Conclusions: Machine learning algorithms produced accurate case definitions comparable to committee-created case definitions. It is possible to use machine learning techniques to develop high quality case definitions from EMR data.

Item Open Access Whole-Brain Atrophy Rates, Regional Cerebral Blood Flow, and Cognitive Profiles of Transient Ischemic Attack Patients and Controls (2019-06-18) Reid, Meaghan; Barber, Philip A.; Sajobi, Tolulope T.; Coutts, Shelagh B.; Longman, Richard Stewart
Dementia is one of the most common causes of disability amongst the old and the prevalence is expected to double within the next twenty years.
Recent prevention trials have failed to find a cure, likely due to inappropriate trial selection and a lack of reliable outcome measurements. Standardized clinical, demographic, imaging and neuropsychological biomarkers will improve selection criteria and therapeutic interventions. Transient ischemic attack (TIA) patients are at an increased risk of late-life cognitive decline due to the vascular risk factors they share with dementia and underlying cerebrovascular pathology. We hypothesized that TIA patients would have increased longitudinal rates of cerebral atrophy as measured by T1 magnetic resonance (MR) imaging compared to non-TIA controls over 1 year and that increased cerebral atrophy rates would be associated with poorer cognitive outcomes. Secondly, we hypothesized that at baseline TIA patients would have lower regional cerebral blood flow (CBF) as measured by arterial spin labelled (ASL) MR imaging compared to non-TIA controls, and that CBF would be associated with cognition. Our results suggest that TIA patients show almost double the cerebral atrophy rates of non-TIA controls over 1 year, which, in the absence of demonstrated change in cognition, supports the view that these subjects with TIA are in a preclinical stage of cognitive decline. Our results also show that TIA patients have reduced CBF in the left entorhinal cortex, the posterior cingulate bilaterally and the right precuneus, which was associated with poorer memory outcomes. These predictors of early neurodegeneration and vascular changes show that TIA patients are a high-risk population for dementia and could improve inclusion criteria for clinical trials to prevent dementia in the future.