Date: 2023-09-15

Additional details can be found in the Supplemental Methods.

This study is reported according to the STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) guidelines (61) for cohort studies and the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) guidelines (62).

Study design and setting. The EVERREST Prospective Study was a multicenter prospective cohort study recruiting pregnant women from 4 European tertiary referral centers: University College London Hospital, United Kingdom; University Medical Centre Hamburg-Eppendorf, Germany; Maternal-Fetal Unit Hospital Clinic, Barcelona, Spain; and Skane University Hospital, Lund, Sweden.

Study population. Full details on the protocol have been published previously (26). In brief, pregnant women were eligible if they had a singleton fetus with an ultrasound EFW below 600 g and below the third centile according to local criteria between 20+0 and 26+6 weeks of gestation. Exclusion criteria included a known abnormal karyotype or a major fetal structural abnormality at enrollment (63); indication for immediate delivery; preterm rupture of membranes before enrollment; maternal HIV or hepatitis B or C infection (because of the impact on processing and storage of biological samples); maternal age under 18 years; any medical or psychiatric condition that compromised the woman’s ability to participate; and a lack of capacity to consent. Pregnant women with a known congenital infection were not recruited, and for the purposes of this analysis, pregnancies that were terminated were excluded. Decisions to terminate (n = 7) were based on parental concerns about the short- and long-term prognosis for the fetus and maternal health risks.

Outcomes. The primary outcome was fetal or neonatal death (≤28 days of life). Secondary outcomes were: fetal death or delivery at or before 28+0 weeks of gestation; slow fetal growth, defined as a worsening of weight deviation of 10 or more percentage points over a 2-week interval (including before and after enrollment) or an equivalent trajectory over a longer period (29); and the development of abnormal UmA Dopplers, defined as an increase of the UmA PI to above the 95th centile in pregnancies in which the UmA PI was at or below the 95th centile at enrollment (28). Slow fetal growth was selected as a secondary outcome because it showed an association with fetal or neonatal death in the discovery set but could only be assessed with serial scans. A surrogate biomarker could potentially give the same information at the time of diagnosis and provide pathophysiological insights. Ascertainment for outcomes of this study was possible, at the latest, by 29 days of life. Follow-up for neonatal morbidity and infant health and neurodevelopment to the age of 2 years continues.
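As an illustration only, the slow fetal growth definition could be expressed as the following Python sketch. The function name is hypothetical, and reading an "equivalent trajectory over a longer period" as the same average rate (10 percentage points per 14 days) is an assumption; the cited definition (29) should be consulted for the exact rule.

```python
def is_slow_growth(dev_start_pct, dev_end_pct, interval_days):
    """Flag slow fetal growth between two scans.

    dev_start_pct, dev_end_pct: weight deviation from the expected weight,
    in percent (negative values mean smaller than expected).
    Assumption: over intervals longer than 2 weeks, the threshold scales
    linearly, i.e. at least 10 percentage points per 14 days on average.
    """
    if interval_days < 14:
        raise ValueError("interval must be at least 2 weeks (14 days)")
    worsening = dev_start_pct - dev_end_pct  # positive when the deviation worsens
    # Equivalent to worsening / interval_days >= 10 / 14, without division.
    return worsening * 14 >= 10 * interval_days
```

For example, a fall from -30% to -40% over 14 days meets the threshold, whereas a fall from -30% to -38% over the same interval does not.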

All pregnancies were managed according to the local fetal medicine unit protocols. This included ultrasound assessment of biometry every 2 weeks and Doppler velocimetry every week, increasing to Doppler velocimetry twice a week or more with absent or reversed UmA EDF. Preeclampsia was defined according to International Society for the Study of Hypertension in Pregnancy (ISSHP) criteria (64), meaning that, given the presence of FGR, any woman developing new-onset hypertension after 20+0 weeks of gestation was classified as having preeclampsia rather than pregnancy-induced hypertension. Formalin-fixed placental samples were classified according to Amsterdam consensus criteria by a single assessor (65). To minimize bias, study placental samples were mixed with placental samples from healthy term pregnancies and pregnant women who delivered spontaneously preterm, with the investigator blinded to pregnancy phenotype and outcome during the assessment.

Ultrasound measurements. All ultrasound examinations were performed by staff trained and validated in the common EVERREST Prospective Study protocol (26). At each ultrasound scan, Doppler velocimetry of the UmA, UtA, middle cerebral artery (MCA), DV, and umbilical vein was performed (66). Local EFW formulas and centile charts were used to determine study eligibility, but for consistency, all EFWs were recalculated using the Hadlock 3 formula (incorporating head circumference, abdominal circumference, and femur length), with z scores recalculated using the Marsal chart for descriptive data (Supplemental Equations 1–3) (67, 68). EFWs and z scores were also recalculated using Intergrowth formulas for analysis (Supplemental Equations 4 and 5) (69). The effect of alternative Doppler reference charts was explored, with results similar to those presented previously (70–74).
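The EFW recalculation can be sketched as follows. This is a minimal Python illustration, not the study code: the coefficients are those commonly quoted for the Hadlock formula based on head circumference, abdominal circumference, and femur length, and should be checked against Supplemental Equation 1. The reference mean and SD passed to `z_score` are placeholders to be taken from the chosen chart; no chart values are reproduced here.

```python
def hadlock3_efw_grams(hc_cm, ac_cm, fl_cm):
    """Estimated fetal weight (g) from HC, AC and FL, all in centimeters.

    Coefficients as commonly quoted for the Hadlock HC/AC/FL formula
    (assumption: identical to the study's Supplemental Equation 1).
    """
    log10_efw = (1.326
                 - 0.00326 * ac_cm * fl_cm
                 + 0.0107 * hc_cm
                 + 0.0438 * ac_cm
                 + 0.158 * fl_cm)
    return 10 ** log10_efw


def z_score(value, reference_mean, reference_sd):
    """Standard z score; reference values must come from the chosen chart."""
    return (value - reference_mean) / reference_sd


# Hypothetical measurements for a single scan.
efw = hadlock3_efw_grams(hc_cm=20.0, ac_cm=17.0, fl_cm=3.8)
```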

Demographic data. Maternal ethnicity was self-reported according to the UK 2021 census list of ethnic groups, with the following options: White; Asian, including Indian, Pakistani, Bangladeshi, Chinese and any other Asian background; Black, including Caribbean, African, and any other Black background; multiethnic; and other (75).

Blinding. Maternal serum protein concentrations were not available to clinicians, participants, or researchers during the pregnancy, as all samples were analyzed after complete primary outcome data had been ascertained. Serum PlGF and sFLT1 concentrations were not used as part of clinical care at any of the study centers during the recruitment period.

Sample collection. Maternal blood was collected at study enrollment in BD Vacutainer serum-separating tubes and processed according to the manufacturer’s instructions. Serum aliquots (500 μL) were frozen and stored at –80°C. Placental samples for Amsterdam criteria categorization were collected from 2 areas of each placenta, midway between the cord insertion and margin in areas free from macroscopic infarcts or lesions. Samples were rinsed in PBS, formalin fixed, wax embedded, sectioned, and stained with H&E.

Measurement of a priori candidate biomarkers in maternal serum. PlGF and sFLT1 concentrations were measured using Elecsys electrochemiluminescence immunoassays on a Cobas e411 analyzer (Roche Diagnostics). The NPX of 90 additional proteins associated with cardiovascular disease was measured using the Olink Cardiovascular II proximity extension assay (full list of proteins in Supplemental Table 17). In the discovery set, but not the validation set, VEGFA, VEGFD, VEGFR2, neuropilin 1 (NRP1), and endoglin were measured in triplicate using Quantikine colorimetric sandwich ELISAs (R&D Systems).

Identification of novel candidate biomarkers in maternal serum using liquid chromatography and tandem mass spectrometry. Five pooled serum samples were created for the following pregnancy outcomes: (a) pregnancies ending in fetal or neonatal death; (b) pregnancies ending in neonatal survival with delivery before 37+0 weeks of gestation; (c) pregnancies ending in neonatal survival with delivery at 37+0 weeks of gestation or later; (d) slow fetal growth trajectory; and (e) normal fetal growth trajectory. Pooled serum samples were depleted of 12 high-abundance proteins using Proteome Purify 12 resin (R&D Systems), according to the manufacturer’s instructions, concentrated using Vivaspin 500 5 kDa Molecular Weight Cut-Off columns (GE Healthcare), reduced with 10 mM tris(2-carboxyethyl)phosphine hydrochloride, and then alkylated with 7.5 mM iodoacetamide. Pooled samples were digested using a trypsin/Lys-C mix, labeled with Tandem Mass Tags (Thermo Fisher Scientific), and combined (76). The combined sample underwent 2D high-performance reversed-phase liquid chromatography and tandem mass spectrometry. In the first dimension, samples were fractionated into 30 parts at high pH using a Poroshell 300 Extend C18 column (Agilent Technologies), following which fractions 1 to 4 were combined with fractions 27 to 30, respectively, given the low abundance in the first 4 fractions. The second fractionation was performed on the Ultimate 3000 nano-liquid chromatography system using Acclaim PepMap 100 C18 precolumns and Acclaim PepMap 100 C18 Nano-LC columns run in tandem with analysis on the linear trap quadrupole (LTQ) Orbitrap XL 2.5.5 (all from Thermo Fisher Scientific). A blank calibration sample was run after every 3 fractions, and a standard sample of known mass was run after every 6 fractions for quality control.

Proteins were identified using Proteome Discoverer version 1.4 software (Thermo Fisher Scientific) to search the human Swiss-Prot database with the Mascot search engine (Matrix Science). Proteins were scored on variability, peptide count, ubiquity, ratio between pools, and consistent trend across pools (Supplemental Tables 18 and 19). Expression pattern clusters, based on standardized and raw quantification ratios, were generated using the Graphical Proteomics Data Explorer (GProX) platform. On the basis of their scores and expression clusters, 5 candidate proteins were selected and measured in individual samples using ELISAs. Fibronectin, PSG1 (both from R&D Systems), and CSH (DRG International, measuring CSH1 and CSH2) were measured in the discovery and validation sets, while SAA (R&D Systems) and LNPEP (Cloud-Clone) were measured in the discovery set only. See Supplemental Table 20 for a summary of proteins analyzed for each study component.

Priority survey and model selection. An online survey was sent to patients and clinicians asking their opinion on the importance of different pregnancy outcomes and, for each outcome, whether they would prioritize sensitivity or specificity (see Supplemental Table 21 for full wording of the questions). Models were selected on the basis of the survey results and the model performance metrics described below. Protein models were published online prior to the validation data analysis.

Sample size. Because this work involved the discovery of novel biomarkers, a formal a priori sample size calculation was not possible. Before the discovery set was analyzed, it was determined that this sample of 63 with 21 fetal or neonatal deaths gave 80% power to detect a standardized effect size of 0.9 (large) at a significance level of 0.05 (77).

Model development. Two-protein models for the development of abnormal UmA Dopplers and 2- and 3-protein models for the other 3 pregnancy outcomes, with internal validation using LOOCV, were compared on the basis of AUC, specificity for 90% sensitivity, sensitivity for 90% specificity, F1 score, Matthews correlation coefficient (MCC), and area under the precision-recall curve (PRROC AUC). ROC curves were generated with the pROC R package (version 1.18.0, https://cran.r-project.org/web/packages/pROC/index.html). Ninety-five percent CIs for AUCs were determined by stratified bootstrapping. PRROC curves were generated with the MLmetrics R package (version 1.1.1, https://cran.r-project.org/web/packages/MLmetrics/index.html). Models with variance inflation factors of 5 or more, calculated with the car R package (version 3.1-0, https://cran.r-project.org/web/packages/car/index.html), were excluded. Two-variable models predicting fetal or neonatal death and death or delivery at or before 28+0 weeks’ gestation containing ultrasound parameters with or without PlGF or CSH (as the proteins showing the strongest associations with these outcomes) were compared in the same way. Outcomes and protein models to be validated were published on the study registry prior to analysis of the validation data.
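For readers less familiar with these metrics, the following Python sketch shows how AUC, F1 score, MCC, and specificity for a target sensitivity can be computed from predicted probabilities (for instance, those obtained by LOOCV). It is an illustration only: the study used the R packages named above, and all function names and example values here are hypothetical.

```python
import math


def auc(scores, labels):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def confusion(scores, labels, threshold):
    """Counts (tp, fp, tn, fn) when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return tp, fp, tn, fn


def f1_and_mcc(tp, fp, tn, fn):
    """F1 score and Matthews correlation coefficient from confusion counts."""
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc


def specificity_at_sensitivity(scores, labels, target=0.90):
    """Highest specificity among thresholds whose sensitivity is >= target."""
    best = 0.0
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion(scores, labels, t)
        if tp / (tp + fn) >= target:
            best = max(best, tn / (tn + fp))
    return best


# Hypothetical predicted probabilities and outcomes (1 = outcome occurred).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]
model_auc = auc(scores, labels)
f1, mcc = f1_and_mcc(*confusion(scores, labels, threshold=0.5))
```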

Parenclitic network analysis. Parenclitic networks of the 102 proteins were generated for each of the 4 pregnancy outcomes. For each outcome, 2D kernel density estimations were generated for every pair-combination of variables in “controls” (pregnancies without the outcome). Individual networks were then generated for each “case,” with linkages created if a pair-wise relationship of variables differed from the control distribution by more than a given threshold (80). These individual case networks were then combined. VEGFA, BNP, PARP1, and melusin were included as binary variables of “detectable” or “not detectable.” Booking BMI and fetal sex were included as variables in the networks, except for “development of abnormal UmA Dopplers,” where the sample was not large enough to accommodate them.
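The general idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the published method (80): it uses a fixed-bandwidth Gaussian kernel and creates a link when the control density at the case's position falls below a fixed floor. The bandwidth, the density floor, and all function names are hypothetical.

```python
import math
from itertools import combinations


def kde2d(points, bandwidth):
    """Return a 2D Gaussian kernel density estimate built from control points."""
    norm = 1.0 / (len(points) * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        return norm * sum(
            math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * bandwidth ** 2))
            for px, py in points
        )

    return density


def case_network(case, controls, variables, bandwidth=0.5, density_floor=0.01):
    """Edges for one case: variable pairs where the case sits in a
    low-density region of the control distribution (assumed threshold rule)."""
    edges = set()
    for a, b in combinations(variables, 2):
        density = kde2d([(c[a], c[b]) for c in controls], bandwidth)
        if density(case[a], case[b]) < density_floor:
            edges.add((a, b))
    return edges


def combined_network(cases, controls, variables, **kwargs):
    """Count, for each variable pair, how many case networks contain that edge."""
    counts = {}
    for case in cases:
        for edge in case_network(case, controls, variables, **kwargs):
            counts[edge] = counts.get(edge, 0) + 1
    return counts


# Hypothetical standardized values: A and B move together in controls.
controls = [{"A": v, "B": v, "C": 0.0} for v in (-1.0, -0.5, 0.0, 0.5, 1.0)]
cases = [{"A": 2.0, "B": -2.0, "C": 0.0}, {"A": 0.2, "B": 0.2, "C": 0.0}]
network = combined_network(cases, controls, ["A", "B", "C"])
```

In this toy example only the first case breaks the A-B relationship seen in controls, so the combined network contains a single A-B edge with a count of 1.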

Model validation. Concentrations of CSH and PSG1, as measured by ELISA, and NPX values for the Olink multiplex proteins showed substantial variation in centrality and spread between the discovery and validation sets. To account for this, values of each protein were centered to a mean of 0 and scaled to a SD of 1 in the discovery set and validation set separately. These centered and scaled values were used for subsequent analyses, including model validation. Concentrations of PlGF, sFLT1, and fibronectin did not require transformation.
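The centering and scaling step amounts to a within-set z transformation, sketched below in Python. The function name and example values are hypothetical, and use of the sample SD (n - 1 denominator) is an assumption.

```python
from statistics import mean, stdev


def center_and_scale(values):
    """Center to a mean of 0 and scale to an SD of 1 (sample SD assumed)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Each set is transformed separately, so a difference in location and
# spread between sets does not carry over into the transformed values.
discovery = center_and_scale([1.0, 2.0, 3.0])
validation = center_and_scale([10.0, 20.0, 30.0])
```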

Models generated from the discovery set were run on data from the validation set and were considered validated if the 95% CI for the validation estimate of the AUC included the LOOCV AUC estimate from the discovery set. For validated models, data from both sets were combined to give final test characteristics. LR tests were used to determine whether the addition of pregnancy characteristics (maternal BMI, maternal age, maternal ethnicity, fetal sex, gestational age at enrollment, and preeclampsia at enrollment) significantly improved the validated models. Model calibration was assessed by plotting the predicted probability against the observed frequency of outcome.
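The validation criterion can be illustrated with the following Python sketch, which computes a percentile CI for the AUC by stratified bootstrapping (resampling cases and controls separately) and then checks whether it contains the discovery LOOCV estimate. The study used the pROC R package; the function names, the percentile method, and the number of resamples here are assumptions.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def stratified_bootstrap_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for the AUC, resampling within cases and within controls."""
    rng = random.Random(seed)
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    stats = []
    for _ in range(n_boot):
        boot_cases = rng.choices(cases, k=len(cases))
        boot_controls = rng.choices(controls, k=len(controls))
        stats.append(auc(boot_cases + boot_controls,
                         [1] * len(boot_cases) + [0] * len(boot_controls)))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def is_validated(loocv_auc_discovery, validation_ci):
    """True if the validation CI includes the discovery LOOCV AUC."""
    lower, upper = validation_ci
    return lower <= loocv_auc_discovery <= upper
```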

Functional interactions. Centered and scaled data from the discovery and validation sets were combined to retest univariate associations with the primary and secondary outcomes. Proteins showing a significant association at a 5% Benjamini-Hochberg FDR were explored for physical and functional interactions and for enrichment of GO biological processes, relative to the background of all proteins measured, using STRING (Swiss Institute of Bioinformatics) (81). Where no enrichment was detected, shared GO biological processes were identified through comparison with the whole genome.
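For reference, the Benjamini-Hochberg step-up procedure can be written in a few lines of Python. This is a generic sketch of the standard procedure, not the study code; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[i] = True
    return rejected


# Hypothetical P values from 8 univariate tests.
flags = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205])
```

Because the procedure is step-up, a P value that misses its own rank threshold is still rejected when a larger P value meets its threshold.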

Modeling pregnancy duration. Protein and ultrasound measurements from the combined discovery and validation sets were tested for their association with gestational age at live birth or diagnosis of fetal death and interval from enrollment to live birth or diagnosis of fetal death using linear regression. Variables showing a significant association at a 1% Benjamini-Hochberg FDR were used to create linear models predicting these outcomes in a stepwise fashion. Maternal age, BMI, ethnicity, gestational age at enrollment, preeclampsia at enrollment, and fetal sex were tested for model improvement. Model fit was tested by assessing variance inflation factors for multicollinearity, assessing the distribution of the residuals for heteroscedasticity and outliers, and looking for observations with high leverage. UmA and UtA Doppler velocimetry, PlGF concentration, CSH concentration, and PAPPA NPX were tested for their associations with placental histological classification using logistic regression.

Statistics. Data analysis was performed using STATA/MP 16.1 software (StataCorp) unless otherwise specified. Descriptive and investigative variables were tested for skew and kurtosis (78, 79) and handled as symmetrical if there was no evidence of either. PlGF, sFLT1, endoglin, VEGFD, NRP1, CSH, SAA, and LNPEP were transformed to their natural logs, and multiplex data were analyzed as provided, on a log2 scale. Characteristics of the discovery and validation sets were compared using χ² tests (categorical data), Fisher’s exact tests (binary data with sparse outcomes), 2-sided t tests (symmetrical continuous data), and Mann-Whitney U tests (skewed continuous data).

Missing data for BMI (n = 5) were imputed using chained equations. UmA PI at enrollment was systematically missing (n = 16), with most missing cases having absent or reversed EDF (n = 15). UmA Doppler velocimetry was therefore handled as an interval variable, “UmA PI category,” where 0 = UmA PI at or below the 95th centile, 1 = UmA PI above the 95th centile with positive EDF, 2 = absent EDF, and 3 = reversed EDF. UtA PI at enrollment was missing in 10 cases. In 9 of these, a mean UtA PI below or above the 95th centile could be inferred because the mean UtA PI was consistently normal (n = 1) or abnormal (n = 8), respectively, at scans prior to and after enrollment. UtA PI values were imputed using multiple imputation. Associations between ultrasound measurements and both fetal or neonatal death and death or delivery at or before 28+0 weeks of gestation were analyzed using logistic regression. Univariate associations between protein concentrations or NPX and outcomes were assessed using 2-sided t tests, Mann-Whitney U tests, and logistic regression, with Benjamini-Hochberg procedures to account for multiple comparisons.
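The UmA PI category coding can be sketched in Python as follows. The function name and the string labels for EDF status are hypothetical. The sketch shows why a missing PI value does not prevent categorization when EDF is absent or reversed.

```python
def uma_pi_category(edf, pi_above_95th=None):
    """Code UmA Doppler findings as the 0-3 "UmA PI category".

    edf: "positive", "absent" or "reversed" end-diastolic flow.
    pi_above_95th: True/False, or None when the PI value is missing.
    """
    if edf == "reversed":
        return 3
    if edf == "absent":
        return 2
    if edf != "positive":
        raise ValueError(f"unknown EDF status: {edf!r}")
    if pi_above_95th is None:
        # With positive EDF the category cannot be assigned without the PI.
        raise ValueError("PI centile status is needed when EDF is positive")
    return 1 if pi_above_95th else 0
```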

Study approval. Ethics approval was provided by the National Research Ethics Service Committee London – Stanmore in the UK (REC reference: 13/LO/1254); the Hospital Clinic of Barcelona’s Clinical Research Ethics Committee in Spain (Reg: HCB/2014/0091); the Regional Ethical Review Board in Lund in Sweden (DNr 2014/147); and the Ethics Committee of the Hamburg Board of Physicians in Germany (PV4809). This study was conducted according to Declaration of Helsinki principles, and written informed consent was given by all participants before enrollment.

Data availability. The full data set will not be made publicly available because the degree of detailed phenotyping could allow individual patient identification. Limited data sharing may be possible, with the agreement of the EVERREST Consortium, upon transfer agreement request, directed to the corresponding author. Values for all data points in the figures can be found in the Supplemental Supporting Data Values file.