CNet-Cox for interpretable network biomarker discovery and survival risk scoring in precise breast cancer prognosis

Image credit: Lingyu Li

Abstract

Biomarker discovery in biomedicine is often cast as feature selection, yet most methods overlook gene co-localization within regulatory interaction networks, yielding isolated biomarkers with limited biological interpretability and clinical translatability. Here, we propose CNet-Cox, a disease-agnostic, Connected Network-regularized Cox proportional hazards framework that incorporates prior network connectivity into sparse feature selection to identify connected prognostic modules. Applied to breast cancer, CNet-Cox revealed the network structure of 68 prognostic biomarkers associated with survival on the discovery dataset (TCGA, n = 1080) and achieved a concordance index of 0.913 on the internal test dataset, outperforming conventional regularized Cox methods. From these network biomarkers, we derived a six-gene prognostic risk score (PRS) and validated its robustness across seven independent bulk transcriptomic datasets (GEO; n = 1602) and a spatial transcriptomics dataset (Visium; 4992 spots). The PRS consistently improved risk stratification (log-rank p < 0.05) and produced predictions concordant with MammaPrint in spatial prognostics (Pearson r = 0.993). Although evaluated in breast cancer, CNet-Cox is readily extensible to other diseases, molecular interaction networks and time-to-event endpoints, providing a generalizable tool for digital pathology and precision oncology. Overall, our comprehensive downstream analyses highlight that CNet-Cox offers a novel network-aware survival model for systematically discovering connected biomarkers and delivering scalable, precise and interpretable risk prediction.
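For readers unfamiliar with the concordance index cited in the abstract, the following is a minimal sketch of Harrell's C-index for right-censored survival data. It is not taken from the CNet-Cox code: the function name `concordance_index`, the toy follow-up times, event flags and risk scores are all illustrative assumptions, and pairs with tied survival times are simply skipped for brevity.

```python
import numpy as np


def concordance_index(time, event, risk):
    """Harrell's C: share of comparable pairs in which the subject who
    failed earlier was assigned the higher risk (risk ties count 0.5)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)

    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a pair must be anchored on an observed event
        for j in range(n):
            if time[j] > time[i]:  # subject j was still at risk after i failed
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical toy cohort (not data from the paper).
toy_time = [5, 8, 12, 20, 25]          # follow-up time
toy_event = [1, 1, 0, 1, 0]            # 1 = event observed, 0 = censored
toy_risk = [2.1, 1.4, 1.6, 0.3, 0.5]   # higher score = higher predicted risk

print(concordance_index(toy_time, toy_event, toy_risk))  # 0.75
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking of patients by risk, which is the scale on which the reported 0.913 should be read.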

Publication
In npj Digital Medicine


Lingyu Li
Postdoctoral Fellow

Research focuses on bioinformatics, including spatial transcriptomics analysis, sparse statistical learning and biomarker identification.