Browsing by Author "Kopciuk, Karen A"
Now showing 1 - 3 of 3
Item Open Access
A quantitative multimodal metabolomic assay for colorectal cancer (2018-01-04)
Farshidfar, Farshad; Kopciuk, Karen A; Hilsden, Robert; McGregor, S. E; Mazurak, Vera C; Buie, W. D; MacLean, Anthony; Vogel, Hans J; Bathe, Oliver F
Abstract
Background: Early diagnosis of colorectal cancer (CRC) simplifies treatment and improves treatment outcomes. We previously described a diagnostic metabolomic biomarker derived from semi-quantitative gas chromatography-mass spectrometry. Our objective was to determine whether a quantitative assay of additional metabolomic features, including parts of the lipidome, could enhance diagnostic power, and whether there was an advantage to deriving a combined diagnostic signature with a broader metabolomic representation.
Methods: The well-characterized Biocrates P150 kit was used to quantify 163 metabolites in patients with CRC (N = 62), adenoma (N = 31), and age- and gender-matched disease-free controls (N = 81). The metabolites analyzed included phosphatidylcholines, sphingomyelins, acylcarnitines, and amino acids. Using a training set of 32 CRC cases and 21 disease-free controls, a multivariate metabolomic orthogonal partial least squares (OPLS) classifier was developed. An independent set of 28 CRC cases and 20 matched healthy controls was used for validation. Features distinguishing 31 colorectal adenomas from their healthy matched controls were also explored, and a multivariate OPLS classifier for colorectal adenoma could be proposed.
Results: The metabolomic profile that distinguished CRC from controls consisted of 48 metabolites (R2Y = 0.83, Q2Y = 0.75, CV-ANOVA p-value < 0.00001). In this quantitative assay, the coefficient of variation for each metabolite was <10%, and this dramatically enhanced the separation of these groups. Independent validation resulted in an AUROC of 0.98 (95% CI, 0.93–1.00) and sensitivity and specificity of 93% and 95%, respectively.
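As a rough illustration of how validation metrics of this kind are computed, the sketch below evaluates sensitivity, specificity, and a rank-based AUROC. The confusion counts (26 of 28 CRC cases and 19 of 20 controls correctly classified) are an assumption: they are the integer counts that round to the reported 93% and 95% for a validation set of this size, and they are not stated in the abstract. The function names and the toy scores are hypothetical.

```python
def sensitivity(tp, fn):
    """Fraction of true cases that the classifier calls positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true controls that the classifier calls negative."""
    return tn / (tn + fp)


def auroc(case_scores, control_scores):
    """Rank-based AUROC: probability that a randomly chosen case scores
    higher than a randomly chosen control (ties count as one half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Assumed counts for a validation set of 28 cases and 20 controls
# (inferred from the rounded percentages, not reported directly).
sens = sensitivity(tp=26, fn=2)
spec = specificity(tn=19, fp=1)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")

# Toy classifier scores, purely illustrative (not study data).
toy_cases = [0.9, 0.8, 0.7, 0.4]
toy_controls = [0.6, 0.3, 0.2, 0.1]
toy_auc = auroc(toy_cases, toy_controls)
print(f"toy AUROC = {toy_auc}")
```

The pairwise AUROC above is quadratic in the sample size, which is fine for cohorts of this scale; a sort-based rank formula would be the usual choice for larger data.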
Similarly, we were able to distinguish adenoma from controls (R2Y = 0.30, Q2Y = 0.20, CV-ANOVA p-value = 0.01; internal AUROC = 0.82 (95% CI, 0.72–0.93)). When combined with the previously generated GC-MS signatures for CRC and adenoma, the candidate biomarker performance improved slightly.
Conclusion: The diagnostic power of metabolomic tests for colorectal neoplasia can be improved by utilizing a multimodal approach and combining metabolites from diverse chemical classes. In addition, quantification of metabolites enhances separation of disease-specific metabolomic profiles. Our future efforts will be focused on developing a quantitative assay for the metabolites comprising the optimal diagnostic biomarker.

Item Open Access
Application of Machine Learning Algorithms to Actuarial Ratemaking within Property and Casualty Insurance (2023-09-19)
Arumugam, MohanaGowri; Ambagaspitiya, Rohana Shantha; Lu, Xuewen; Scollnik, David Peter Michael; Kopciuk, Karen A; Bae, Taehan
A scientific pricing assessment is essential for maintaining viable customer relationship management (CRM) solutions for various stakeholders, including consumers, insurance intermediaries, and insurers. The thesis aims to examine research problems surrounding the ratemaking process, including relaxing the conventional loss model assumptions of homogeneity and independence. The thesis identified three major research areas within multiperil insurance settings: heterogeneity in consumer behaviour on pricing decisions, loss trending under non-linearity and temporal dependencies, and loss modelling in the presence of inflationary pressure. Consumer heterogeneity in pricing decisions was examined using demand- and loyalty-based strategies. A hybrid decision tree classification framework is implemented that includes a semi-supervised learning model, a variable selection technique, and a partitioning approach with different treatment effects in order to achieve adequate risk profiling.
The thesis also explored a supervised tree learning mechanism for highly imbalanced, overlapping classes with a non-linear response-predictor relationship. The two-phase classification framework is applied to an owner-occupied property portfolio from a personal insurance brokerage powered by a digital platform within the Canadian market. A hybrid three-phase tree algorithm, which includes conditional inference trees, random forest wrapped by the Boruta algorithm, and model-based recursive partitioning under a multinomial generalized linear model, is proposed to study the price sensitivity ranking of digital consumers. The empirical results suggest a well-defined segmentation of digital consumers with differential price sensitivity. Further, with highly imbalanced and overlapping classes, the resampling technique was modelled together with the decision tree algorithm, providing a more scientific approach to overcoming classification problems than traditional multinomial regression. The resulting segmentation identified the high-sensitivity consumer group, for which premium rate reductions are recommended to reduce the churn rate. Other consumers are classified as an insensitive group, for which a pricing strategy of increasing the premium rate is expected to have only a slight impact on the closing ratio and retention rate. Incurred insurance losses exhibit abnormal characteristics such as temporal dependence, nonlinear relationships between dependent and independent variables, seasonal variation, and a mixture distribution resulting from the implicit claim inflation component. With such characteristics, the severity and frequency components may exhibit an altered trending pattern that changes over time and never repeats. This could have a profound impact on the experience rating model, where the estimates of the pure premium and the rate relativities of tariff classes are likely to be under- or over-estimated.
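The effect of a misestimated trend on the pure premium can be illustrated with the standard decomposition pure premium = frequency × severity. The sketch below is a minimal example under stated assumptions: the claim figures, the two trend rates, and the helper names are hypothetical, and the compound annual trend factor is a conventional simplification, not the modelling approach of the thesis.

```python
def pure_premium(claim_count, total_loss, exposure):
    """Expected loss per exposure unit, as frequency times severity."""
    frequency = claim_count / exposure    # claims per exposure unit
    severity = total_loss / claim_count   # average cost per claim
    return frequency * severity


def trended(value, annual_rate, years):
    """Project a value forward with a compound annual trend factor."""
    return value * (1.0 + annual_rate) ** years


# Hypothetical experience period: 50 claims, 400,000 in losses,
# 1,000 exposure units (illustrative numbers only).
base = pure_premium(claim_count=50, total_loss=400_000.0, exposure=1_000.0)

# Compare an assumed 3% trend with an assumed 5% trend over two years.
low_trend = trended(base, annual_rate=0.03, years=2)
high_trend = trended(base, annual_rate=0.05, years=2)
understatement = high_trend - low_trend

print(f"base pure premium      = {base:.2f}")
print(f"trended at 3% for 2 yr = {low_trend:.2f}")
print(f"trended at 5% for 2 yr = {high_trend:.2f}")
print(f"understatement         = {understatement:.2f}")
```

Even a two-point difference in the assumed annual trend shifts the projected pure premium by several percent over two years, which is why the choice of trending model matters for rate adequacy.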
A discussion of the pros and cons of the conventional loss trending approach leads to an alternative framework for the loss cost structure. The conventional pure premium is further split into base severity and severity deflator random variables using a do(·) operator within causal inference. The components are modelled separately on predictors with different time bases using the semiparametric generalized additive model (GAM) with a spline curve. To maximize the claim inflation calendar year effect and improve the efficiency of severity trending, this thesis refines the claim inflation estimation by adapting Taylor’s [86] separation method, which estimates the inflation index from a loss development triangle. In the second phase of developing the severity trend model, we integrated both the base severity and severity deflator under a new generalized mechanism known as Discount, Model, and Trend (DMT). The two-phase modelling was built to overcome the mixture distribution effect on final trend estimates. A simulation study constructed using the claims paid development triangle from a Canadian Insurtech broker’s houseowners/householders portfolio was used in a severity trend movement prediction analysis. We discovered that the conventional framework understated the severity trends more than the separation-cum-DMT framework did. GAM provides a flexible and effective mechanism for modelling nonlinear time series in studies of the frequency loss trend. However, GAM assumes that residuals are independent and identically distributed (iid), while frequency loss time series can be correlated at adjacent time points. This thesis introduces a new model called the Generalized Additive Model with Seasonal Autoregressive term (GAMSAR), which accounts for temporal dependency and seasonal variation in order to improve prediction confidence intervals. Parameters of the GAMSAR model are estimated by maximum partial likelihood using a modified Newton’s method developed by Yang et al.
[97], and the goodness-of-fit of GAM and GAMSAR is compared using a simulation study. Simulation results show that the mean estimates from GAM differ greatly from their true values. The proposed GAMSAR model is shown to be superior, especially in the presence of seasonal variation. Further, a comparison study is conducted between GAMSAR and the Generalized Additive Model with Autoregressive term (GAMAR) developed by Yang et al. [97], and the coverage rate of the 95% confidence interval confirms that the GAMSAR model can incorporate the nonlinear trend effects as well as capture the serial correlation between the observations. In the empirical analysis, a claim dataset of personal property insurance obtained from digital brokers in Canada is used to show that the GAMSAR(1)12 captures the periodic dependence structure of the data more precisely than standard regression models. The proposed frequency and severity trend models support the thesis’s goal of establishing a scientific approach to pricing that is robust under different trending processes.

Item Open Access
Metabolomic Biomarkers for Colorectal Cancer (2016)
Farshidfar, Farshad; Bathe, Oliver F.; Vogel, Hans J; Kopciuk, Karen A; Hilsden, Robert; Buie, W. Donald
Colorectal cancer (CRC) is the second most common cancer in North America. It is also a huge burden for society. Remarkable efforts have been and are being made to improve CRC diagnosis, to enhance the effectiveness of treatments, and eventually to improve the outcomes of these patients. Metabolomic profiling, a method for describing metabolic state and alterations in molecular constituents that is capable of yielding unique and invaluable information about tumor biology, has been employed. Using a range of spectroscopy and mass spectrometry techniques, we have sought to characterize the changes in the serum metabolome that appear as a result of malignant and pre-malignant lesions in the colon and rectum.
In Chapter 2, the application of gas chromatography-mass spectrometry (GC-MS) and nuclear magnetic resonance (NMR) spectroscopy for staging CRC is described. Chapter 3 describes a larger study of 320 CRC and 31 colorectal adenoma cases, as well as their matching controls, by GC-MS, which led to the identification of a validated metabolomic signature for identification of CRC and a proposed signature for identification of colorectal adenoma. In Chapter 4, an effort at quantitative profiling of 62 CRC cases and 31 colorectal adenomas and their matching controls by tandem mass spectrometry is described, and a validated quantitative signature for diagnosis of CRC is reported. Chapter 5 is dedicated to studying the prognostic value of metabolomic profiling in patients with colorectal liver metastases, and a novel workflow for estimation of recurrence risk using high-dimensional data is proposed. Challenges and pitfalls encountered at different steps of the project were addressed, when possible, using available methods. Where no reliable method was available, we made an effort to develop one. This thesis, therefore, is focused on the metabolomic characterization of CRC and the adaptation of this knowledge for the development of clinically valuable biomarkers.