Development and validation of a risk prediction model for premenopausal breast cancer in 19 cohorts

Posted on 1 May 2025
by Kristen D Brantley

Breast Cancer Res. 2025 May 1;27(1):67. doi: 10.1186/s13058-025-02031-8.

ABSTRACT

BACKGROUND: Incidence of premenopausal breast cancer (BC) has risen in recent years, though most existing BC prediction models are not generalizable to young women due to underrepresentation of this age group in model development.

METHODS: Using questionnaire-based data from 19 prospective studies harmonized within the Premenopausal Breast Cancer Collaborative Group (PBCCG), representing 783,830 women, we developed a premenopausal BC risk prediction model. The data were split into training (2/3) and validation (1/3) datasets with equal distribution of cohorts in each. In the training dataset, variables were chosen from known and hypothesized risk factors: age, age at menarche, age at first birth, parity, breastfeeding, height, BMI, young adulthood BMI, recent weight change, alcohol consumption, first-degree family history of BC, and personal history of benign breast disease (BBD). Hazard ratios (HR) and 95% confidence intervals (CI) were estimated by Cox proportional hazards regression using age as the time scale, stratified by cohort. Given that complete information on all risk factors was not available in all cohorts, coefficients were estimated separately in groups of cohorts with the same available covariate information, adjusted to account for the correlation between missing and non-missing variables, and meta-analyzed. Absolute risk of BC (in situ or invasive) within 5 years was determined using country-, age-, and birth cohort-specific incidence rates. Discrimination (area under the curve, AUC) and calibration (Expected/Observed, E/O) were evaluated in the validation dataset. We compared our model with a literature-based model for women < 50 years (iCARE-Lit).
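To illustrate how a relative risk and age-specific incidence rates can combine into a 5-year absolute risk, the following is a minimal sketch and not the authors' implementation. It assumes the supplied rates are already baseline rates for a woman at reference level of all risk factors, and it ignores competing mortality; the function name, the rates, and the relative risk value are all hypothetical.

```python
import math


def five_year_absolute_risk(baseline_rates, relative_risk):
    """Approximate absolute risk over the years covered by baseline_rates.

    baseline_rates: annual baseline incidence rates (events per person-year),
        one per year of follow-up; assumed to apply to a woman with
        relative risk 1 (an assumption of this sketch).
    relative_risk: the woman's hazard ratio versus that reference woman.

    Competing mortality is ignored here (simplifying assumption).
    """
    # Cumulative hazard under proportional hazards: RR times summed baseline rates.
    cumulative_hazard = relative_risk * sum(baseline_rates)
    # Convert cumulative hazard to a probability of at least one event.
    return 1.0 - math.exp(-cumulative_hazard)


# Hypothetical inputs: 1 case per 1,000 person-years for each of 5 years.
rates = [0.001] * 5
reference_risk = five_year_absolute_risk(rates, relative_risk=1.0)
elevated_risk = five_year_absolute_risk(rates, relative_risk=2.0)
print(f"reference: {reference_risk:.4%}, RR=2: {elevated_risk:.4%}")
```

In practice, population incidence rates would first need to be converted to baseline rates using the risk factor distribution, and competing risks would be taken into account; both steps are omitted here for brevity.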

RESULTS: Selected model risk factors were age at menarche, parity, height, current and young adulthood BMI, family history of BC, and personal BBD history. Predicted absolute 5-year risk ranged from 0% to 5.7%. The model overestimated risk on average [E/O risk = 1.18 (1.14-1.23)], with underestimation of risk in lower absolute risk deciles and overestimation in upper absolute risk deciles [E/O 1st decile = 0.59 (0.58-0.60); E/O 10th decile = 1.48 (1.48-1.49)]. The AUC was 59.1% (58.1-60.1%). Performance was similar to the iCARE-Lit model.
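The two performance measures reported above can be sketched as follows, under simplifying assumptions: binary 5-year outcomes with no censoring, and toy data that are entirely hypothetical. This is an illustration of the general definitions, not the authors' evaluation code. An E/O ratio above 1 indicates that the model predicts more cases than were observed (overestimation), and the AUC is the probability that a randomly chosen case has a higher predicted risk than a randomly chosen non-case.

```python
def expected_over_observed(predicted_risks, outcomes):
    """Calibration ratio: sum of predicted risks over the count of observed cases."""
    expected = sum(predicted_risks)
    observed = sum(outcomes)
    return expected / observed


def auc(predicted_risks, outcomes):
    """Discrimination: share of case/non-case pairs ranked correctly (ties count 0.5)."""
    cases = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    controls = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical toy data: four women, two of whom developed BC.
risks = [0.1, 0.4, 0.35, 0.8]
events = [0, 0, 1, 1]
print(expected_over_observed(risks, events))
print(auc(risks, events))
```

The pairwise loop is quadratic in sample size and is used here only for readability; a rank-based computation would be preferable for data on the scale of the cohorts described.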

CONCLUSION: In this prediction model for premenopausal BC, the relative contribution of risk factors to absolute risk was similar to existing models for overall BC. The discriminatory ability was nearly identical (< 1% difference in AUC) to the existing iCARE-Lit model developed in women under 50 years. The inability to improve discrimination highlights the need to investigate additional predictors to better understand premenopausal BC risk.

PMID:40312753 | DOI:10.1186/s13058-025-02031-8