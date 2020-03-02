Individual ceramides and CAD. We quantified 32 sphingolipids including the major ceramides [cer(d18:1)], dihydroceramides [dihydro-cer(d18:0)], glucosylceramides [glucosyl-cer(d18:1)], dihydrosphingomyelins [dihydro-SM(d18:0)], sphingomyelins [SM(d18:1)], sphinganine, and sphingosine (Figure 2). All sphingolipids measured, except for 2 glucosylceramides, were elevated in patients with CAD compared with levels in control subjects (Table 2). Sphingosine (P < 2 × 10⁻¹⁶), dihydro-cer(d18:0/16:0) (P < 2 × 10⁻¹⁶), dihydro-cer(d18:0/18:0) (P < 2 × 10⁻¹⁶), and cer(d18:1/24:1) (P < 2 × 10⁻¹⁶) were most strongly associated with CAD (OR per SD 3.47, 95% CI: 2.63–4.69; OR per SD 2.54, 95% CI: 2.06–3.18; OR per SD 2.82, 95% CI: 2.24–3.60; OR per SD 2.30, 95% CI: 2.24–3.60; OR per SD 2.29, 95% CI: 1.86–2.85, respectively). Figure 3 depicts the ORs for CAD for all sphingolipid species measured, including the unadjusted model, a parsimonious model (i.e., a minimally adjusted model that includes the covariates age, sex, and BMI), and a fully adjusted model (i.e., a model that includes the covariates age, sex, BMI, total cholesterol [total-C], LDL cholesterol [LDL-C], HDL cholesterol [HDL-C], VLDL cholesterol [VLDL-C], TGs, hypertension, diabetes, and smoking).
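As a reading aid for the "OR per SD" values reported above, the following minimal sketch shows how a logistic-regression coefficient for a standardized (z-scored) predictor can be converted into an odds ratio with a confidence interval. It assumes a Wald interval; the text does not state which interval type was used, and the coefficient, standard error, and function name are hypothetical.

```python
import math

def odds_ratio_per_sd(beta, se, z=1.96):
    """Convert a log-odds coefficient for a z-scored predictor into an
    odds ratio per SD with a Wald confidence interval (an assumption)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error (illustration only,
# not estimates from the Utah CAD study).
or_, lo, hi = odds_ratio_per_sd(beta=0.80, se=0.10)
print(f"OR per SD {or_:.2f}, 95% CI: {lo:.2f}-{hi:.2f}")
```

Because the predictor is standardized before fitting, the resulting OR describes the change in odds of CAD for a 1-SD increase in the lipid, which makes species with very different concentration ranges directly comparable.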

Figure 2 Schematic of the Utah CAD study design and the subset of available biospecimens used for LC-MS/MS sphingolipid analysis. Machine learning was applied to the sphingolipidomic data to develop novel scores that associated with CAD beyond conventional lipid markers, such as cholesterol (created with BioRender).

Figure 3 Forest plot of OR (95% CI) for CAD per SD of sphingolipid species in the Utah CAD study. (A) Unadjusted OR. (B) Fully adjusted OR (age, sex, BMI, total-C, LDL-C, VLDL-C, TGs, hypertension, diabetes, smoking). (C) Minimally adjusted OR (age, sex, BMI) model. The numerically presented ORs (95% CI) represent the minimally adjusted age, sex, and BMI model.

Table 2 Means and interquartile ranges for LC-MS/MS measured sphingolipids in the CAD and control groups of the Utah CAD study

Ceramide risk score and CAD. For each subject, we calculated the ceramide risk score (i.e., cardiac event risk test 1 [CERT1]) that was developed by Zora Biosciences and is in operation at the Mayo Clinic as a means of predicting 5-year risk of CV mortality (4, 21, 22). CERT1 performed well in this cohort, as subjects with CAD had significantly higher CERT1 risk scores than did the control participants (OR per SD 2.18, 95% CI: 1.77–2.71) (Figure 3). Interestingly, the CERT1 score, which comprises the individual ceramide species cer(d18:1/16:0), cer(d18:1/18:0), and cer(d18:1/24:1) as well as the ratio of these lipids to cer(d18:1/24:0), did not provide better predictive power than the individual ceramide species included in the score [cer(d18:1/16:0), OR per SD 2.30, 95% CI: 1.87–2.6; cer(d18:1/18:0), OR per SD 2.30, 95% CI: 1.87–2.85; cer(d18:1/24:1), OR per SD 2.29, 95% CI: 1.86–2.85] (Figure 3). Since cer(d18:1/24:0) was also elevated in individuals with CAD (OR per SD 2.12, 95% CI: 1.73–2.61), its inclusion in the denominator of CERT1 diminished the score’s predictive power in our sample (Figure 4).

Figure 4 OR (95% CI) of CAD per SD of previously reported lipid markers of CVD in the Utah CAD study. (A) Unadjusted OR. (B) Fully adjusted OR (age, sex, BMI, hypertension, diabetes, smoking). (C) Minimally adjusted OR (age, sex, BMI). The numerically presented ORs (95% CI) represent the minimally adjusted age, sex, and BMI model. Since we compared clinical lipid markers (LDL, VLDL, HDL, TGs) with ceramide ratios and scores, they were not included in the fully adjusted model. CERT1, cardiac event risk test (12-point scale). HDL-C, LDL-C, VLDL-C, and TG values are given in mg/dL.

Probing the role of specific ceramide species in CAD. To discern how the chemical composition of sphingolipids influenced their association with CAD, we grouped them into 2 different categories. In 1 category, we summed all species within a sphingolipid class (e.g., ceramides, dihydroceramides, sphingomyelins, etc.), independent of acyl chain length. In a second category, we summed all sphingolipids that had certain acyl chains attached to the sphingoid base (e.g., all species with C16:0, C18:0, C20:0, C24:1, or C24:0 acyl chains), independent of sphingolipid class. We found that total C24:1-containing sphingolipids (OR per SD 2.66, 95% CI: 2.12–3.38) and total dihydroceramides, independent of chain length (OR per SD 2.46, 95% CI: 1.99–3.10), were most strongly associated with CAD (Figure 5).
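The two groupings described above can be illustrated with a minimal sketch that sums lipid levels once by class and once by acyl chain. The name pattern, the function name, and the concentrations are assumptions made for illustration; they are not taken from the study's analysis code or data.

```python
import re
from collections import defaultdict

# Matches names such as "cer(d18:1/24:1)" or "dihydro-SM(d18:0/16:0)".
NAME = re.compile(r"^(?P<cls>[\w-]+)\((?P<base>d\d+:\d+)/(?P<acyl>\d+:\d+)\)$")

def summed_variables(levels):
    """Sum lipid levels by sphingolipid class and, separately, by acyl chain."""
    by_class, by_acyl = defaultdict(float), defaultdict(float)
    for name, value in levels.items():
        m = NAME.match(name)
        if m is None:  # e.g., sphingosine carries no acyl chain
            continue
        by_class[m["cls"]] += value
        by_acyl["C" + m["acyl"]] += value
    return dict(by_class), dict(by_acyl)

# Hypothetical concentrations in arbitrary units (illustration only).
sample = {
    "cer(d18:1/16:0)": 1.0,
    "cer(d18:1/24:1)": 2.0,
    "dihydro-cer(d18:0/24:1)": 0.5,
    "SM(d18:1/16:0)": 10.0,
    "sphingosine": 0.1,
}
by_class, by_acyl = summed_variables(sample)
print(by_class)
print(by_acyl)
```

In this toy input, the C24:1 total draws on both a ceramide and a dihydroceramide, which is the sense in which the acyl-chain sums are independent of sphingolipid class.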

Figure 5 OR (95% CI) of CAD per SD of summed sphingolipid variables in the Utah CAD study. (A) Unadjusted OR. (B) Fully adjusted OR (age, sex, BMI, total-C, LDL-C, VLDL-C, TGs, hypertension, diabetes, smoking). (C) Minimally adjusted OR (age, sex, BMI). The numerically presented ORs (95% CI) represent the minimally adjusted age, sex, and BMI model. Total SM, total sphingomyelin; Total C16, sum of all C16 acyl chains; Total C18, sum of all C18 acyl chains; Total C20, sum of all C20 acyl chains; Total C22, sum of all C22 acyl chains; Total C24, sum of all C24 acyl chains; Total C24:1, sum of all C24:1 acyl chains.

Ceramide correlations with cholesterol and other conventional biomarkers. In order to explore the relationship between ceramides and other common biomarkers of CVD risk, we generated a Gaussian graphical model (GGM) between ceramides, TGs, LDL-C, HDL-C, and VLDL-C (Figure 6). The GGM measured the correlation of sphingolipids with each other and with traditional lipid biomarkers. All correlations were conditioned on the presence of the other analytes (r ≥ 0.20), thus representing direct relationships that are uninfluenced by other components. The GGM demonstrated that ceramide species correlated with each other in a single, interconnected network but that their associations with classic CVD risk biomarkers were weak (i.e., r < 0.20). In Figure 6, the strength of the correlations is depicted by the thickness of the lines connecting lipid nodes. The strongest positive correlations (red lines) were between cer(d18:1/20:0) and cer(d18:1/18:0); dihydro-cer(d18:0/24:0) and dihydro-cer(d18:0/22:0); and dihydro-SM(d18:0/24:0) and dihydro-SM(d18:0/22:0) (Figure 6). As expected, VLDL-C positively correlated with TGs (23). Ceramides did not correlate with VLDL-C, TGs, or other lipid markers of CVD risk. These findings indicate that sphingolipids are largely independent of traditional CVD lipid biomarkers and therefore provide new information about disease status, a critical consideration when developing novel biomarkers.
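To illustrate what conditioning on other analytes means, the sketch below computes a first-order partial correlation for three variables. This is a simplification: a full GGM conditions each pair on all remaining analytes at once, and the pairwise correlations used here are hypothetical rather than values from the study.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y given a third variable z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical pairwise Pearson correlations (illustration only):
# x and y each track z closely, which inflates their raw correlation.
r = partial_corr(r_xy=0.50, r_xz=0.70, r_yz=0.70)
edge_drawn = abs(r) >= 0.20  # threshold quoted in the text
print(round(r, 4), edge_drawn)
```

In this toy case, a raw correlation of 0.50 shrinks to roughly 0.02 once the shared dependence on z is removed, so no edge would be drawn at the 0.20 threshold. This is why edges in such a network are read as direct relationships.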

Figure 6 GGM of correlations between ceramide species and conventional lipid markers in patients with CAD. Conditioned on the presence of all other analytes (r ≥ 0.20). Analytes are represented by nodes (gray hexagons) and conditional correlations by edges (lines). Pink lines indicate positive correlations and blue lines indicate the inverse. Line width represents the strength of the conditional correlation, and the lack of a line indicates no detectable relationship above the threshold. GlcCer, glucosylceramide.

Generating novel CAD predictive ceramide risk scores using machine learning. We used machine learning, a branch of artificial intelligence, to reduce our large set of sphingolipids to a small set of predictive biomarkers. Machine learning incorporates pattern recognition within complex data sets and has been used previously to develop CVD risk prediction models. In comparison with classical statistical methods, machine-learning techniques can identify algorithms that predict health outcomes, even when relationships are complex and nonlinear (24, 25). Moreover, machine-learning–generated models tend to be more generalizable (24, 25).

We created these new sphingolipid-based risk scores using random forest (RF) and least absolute shrinkage and selection operator (LASSO) regression approaches for variable reduction and selection (Table 3). The RF method develops algorithms that can precisely classify observations into groups (i.e., CAD patients versus controls). With this method, the number of variables incorporated has a strong impact on model accuracy: if variables improve the model fit, RF accuracy improves; if not, accuracy is diluted by meaningless variables. We therefore ran 2 RF models. For the first, our input included sphingolipid variables only. For the second, our input included sphingolipid variables in concert with classical CVD risk markers (LDL-C, HDL-C, VLDL-C, and TGs). For our LASSO approach, the input included all sphingolipids and the aforementioned conventional CVD lipid markers. We evaluated the biomarker score classification using both ORs and receiver operator characteristic–area under the curve (ROC-AUC) analysis (Figure 7 and Table 4). For both RF and LASSO approaches, the 5 lipids most positively associated with CAD were used to generate a score.
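For readers less familiar with ROC-AUC, the sketch below computes it in its rank-based (Mann-Whitney) form: the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half. The scores are hypothetical and the function name is ours; the software used for the reported AUC values is not stated here.

```python
def roc_auc(scores_cases, scores_controls):
    """AUC as the probability that a random case outscores a random
    control, counting ties as one half (Mann-Whitney formulation)."""
    wins = 0.0
    for c in scores_cases:
        for k in scores_controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))

# Hypothetical risk scores (illustration only).
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.4, 0.2, 0.1]
auc = roc_auc(cases, controls)
print(auc)  # 10.5 of 12 case-control pairs favor the case -> 0.875
```

An AUC of 0.5 corresponds to chance-level classification and 1.0 to perfect separation of cases from controls.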

Figure 7 OR (95% CI) of CAD per SD of novel scores generated through the application of machine-learning approaches in the Utah CAD study. (A) Unadjusted OR. (B) Multivariable-adjusted OR (age, sex, BMI, diabetes, hypertension, smoking). (C) Minimally adjusted OR (age, sex, BMI). The multivariable models for this analysis do not include HDL-C, LDL-C, VLDL-C, total-C, or TGs, as they were included as input variables.

Table 3 Novel sphingolipid scores for CAD generated through the application of machine-learning techniques

Table 4 AUC of ROC plots for lipid-based clinical indices

An RF-generated sphingolipid-inclusive CAD (RF-SIC) risk score (AUC = 0.75) outperformed CERT1 (AUC = 0.67) and conventional CVD risk biomarkers including LDL-C (AUC = 0.69) and total-C (AUC = 0.63) (Figure 7 and Supplemental Figure 1; supplemental material available online with this article; https://doi.org/10.1172/JCI131838DS1). An RF model generated from the sphingolipids plus CVD risk markers (denoted with a superscript plus sign, RF-SIC+, AUC = 0.78) included LDL-C and displayed a more precise classification of CAD patients versus controls, as compared with the RF-SIC score that excluded LDL-C (Figure 7). When evaluated by OR, RF-SIC+ (OR 5.03, 95% CI: 3.69–7.07) outperformed the RF-SIC score (OR 3.49, 95% CI: 2.71–4.58) (Figure 7).

The LASSO-generated SIC (LASSO-SIC) performed similarly to the RF-generated score (AUC for LASSO-SIC = 0.74; OR 2.86, 95% CI: 2.67–3.66). We conducted an exploratory analysis, adding a term that was the ratio of the lipid with the highest positive CAD association versus the lipid that had the most negative association. This resulted in a slight increase in predictability (LASSO-SIC2, AUC = 0.75; OR 3.06, 95% CI: 2.42–3.94) (Figure 7, Table 4, and Supplemental Figure 2). Adding in another ratio (i.e., the second-highest, positively associated lipid versus the second-highest, negatively associated lipid variables) enhanced performance further (LASSO-SIC3, AUC = 0.77; OR 3.91, 95% CI: 2.98–5.24) (Figure 7, Table 4, and Supplemental Figure 2).

On the basis of this information, we generated a final SIC score that included the highest-performing sphingolipid RF- and LASSO-generated components and yielded increased discriminatory ability (AUC = 0.79; OR 4.67, 95% CI: 3.47–6.43) (Figure 7 and Table 4). Only sphingolipids, and not LDL-C, were included in the final SIC score, so that comparisons of an inclusive sphingolipid measurement with conventional CVD lipid markers could be performed. A list of the lipid components in each novel score is provided in Table 3.

Comparison of machine-learning–generated scores with conventional markers of CAD. We next compared the ability of SIC, CERT1, and standard clinical biomarkers (TGs, LDL-C, etc.) to classify CAD patients compared with controls (Figure 8A and Table 4). We provide the following ROC curves (with the AUC) for comparison (Figure 8, B–G): clinical factors alone (age, sex, BMI, diabetes, hypertension, smoking; AUC = 0.63); clinical factors plus CERT1 (AUC = 0.66); clinical factors plus SIC (AUC = 0.72); clinical factors plus standard clinical lipids (AUC = 0.64); CERT1 plus clinical factors and clinical lipids (AUC = 0.64); and SIC plus clinical factors and clinical lipids (AUC = 0.65). Since the AUC can be an insensitive measure of model performance, particularly when the initial model (i.e., American Heart Association/American College of Cardiology [AHA/ACC] risk factors) performs strongly, we also calculated a continuous net reclassification index (NRI) and an integrated discrimination index (IDI) (26). These scores provide a more comprehensive picture of model performance and a means to assess the value of including SIC or CERT1 in addition to standard clinical biomarkers. For SIC, the NRI was 0.67 (95% CI: 0.52–0.81, P < 0.0001) and the IDI was 0.10 (95% CI: 0.08–0.11, P < 0.0001) (Supplemental Table 1). As a frame of reference, an NRI exceeding 0.6 is considered strong and 0.4 is considered intermediate (27). The SIC was superior to CERT1, which had an NRI of 0.48 (95% CI: 0.32–0.64, P < 0.0001) and an IDI of 0.04 (95% CI: 0.03–0.06, P < 0.0001) (Supplemental Table 1). The SIC improved the ROC C-statistic, NRI, and IDI compared with AHA/ACC guideline risk factors alone, underscoring the power of including sphingolipids as biomarkers of CAD.
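The continuous NRI and the IDI referred to above can be sketched as follows, using their standard definitions. The predicted risks are hypothetical, the function names are ours, and the study's own software and any confidence-interval procedure are not reproduced here.

```python
def _mean(xs):
    return sum(xs) / len(xs)

def continuous_nri(p_old, p_new, y):
    """Category-free NRI: net upward moves in predicted risk among events
    plus net downward moves among non-events."""
    def net_up(label):
        diffs = [n - o for o, n, t in zip(p_old, p_new, y) if t == label]
        up = sum(d > 0 for d in diffs)
        down = sum(d < 0 for d in diffs)
        return (up - down) / len(diffs)
    return net_up(1) - net_up(0)

def idi(p_old, p_new, y):
    """Gain in mean predicted risk among events minus the gain among non-events."""
    def gain(label):
        return (_mean([n for n, t in zip(p_new, y) if t == label])
                - _mean([o for o, t in zip(p_old, y) if t == label]))
    return gain(1) - gain(0)

# Hypothetical predicted risks from a baseline and an extended model
# (illustration only); y = 1 marks a CAD case, y = 0 a control.
y     = [1, 1, 1, 0, 0, 0]
p_old = [0.50, 0.50, 0.50, 0.50, 0.50, 0.50]
p_new = [0.75, 0.75, 0.25, 0.25, 0.25, 0.75]
print(continuous_nri(p_old, p_new, y), idi(p_old, p_new, y))
```

In this toy example, two of three cases move up and two of three controls move down, giving an NRI of 2/3. The continuous NRI ranges from −2 to 2, and both indices are positive when the extended model shifts predicted risk in the correct direction more often than not.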

Figure 8 Comparison of conventional CAD risk markers with novel sphingolipid scores in the Utah CAD study. (A) ROC curve for novel SIC score and conventional risk markers. (B) ROC curve for AHA/ACC-based clinical risk factors (age, sex, BMI, diabetes, hypertension, smoking). (C) The same AHA/ACC guidelines in addition to the CERT1 score and (D) the SIC score. ROC curves for (E) the aforementioned AHA/ACC clinical markers in addition to lipid markers (total-C, HDL-C, LDL-C, VLDL-C, TGs), (F) the clinical and lipid markers in addition to CERT1, and (G) the SIC score. For B–G, the C-statistics are indicated on the respective graphs by AUC.

Many of the lipids extracted by our variable reduction techniques [i.e., SM(d18:0/24:1), SM(d18:0/22:0), SM(d18:0/18:0), sphingosine, cer(d18:0/18:0), and cer(d18:0/16:0)] are transient intermediate lipid species and therefore reflect pathway activity and flux (for a full list of selected lipids, see Table 3). This finding suggests that although abundant ceramide species are implicated in driving disease states, these causal lipid species may not be the most sensitive clinical markers.

Stratification by CAD presentation. To further probe the clinical utility of the SIC score, we evaluated it in patients with CAD who were stratified into 3 subgroups: (a) patients who had an MI only; (b) patients who had a surgical intervention only (coronary artery bypass grafting [CABG] or percutaneous transluminal coronary angioplasty [PTCA]); or (c) patients who had an MI in combination with a surgical intervention. Patients undergoing a surgical intervention only are considered to have a more tightly controlled disease state, whereas those with both surgical intervention and MI are likely to be in a more severe or uncontrolled disease state (28). Patients with an MI only are considered intermediate. As compared with the control population (i.e., all non-cases), the CERT1 and SIC scores were highest in the individuals with the more severe disease presentation (OR per SD > 1.80, P < 5 × 10⁻¹¹; P heterogeneity < 2 × 10⁻¹⁶) (Figure 9). By comparison, standard clinical markers including LDL-C, total-C, and TGs did not show a preferential increase for individuals in this, as opposed to any other, category (Table 5). These findings suggest that ceramide-based scores may have utility for risk stratification, which is in line with previous studies that demonstrated the capacity of ceramides, but not LDL-C, to predict secondary cardiac events (17).

Figure 9 Association of sphingolipid scores with CAD, stratified by disease presentation (MI alone, surgery alone, MI plus surgery). OR (95% CI) for CAD per SD of sphingolipid species in the Utah CAD study, adjusted for age, sex, and BMI.