Prognostic biomarker discovery via a connected network-constrained Cox proportional hazards model

Image credit: Lingyu Li

Abstract

Biomarker discovery in biomedical sciences can be framed as feature selection in machine learning. However, existing methods often overlook gene co-localization within regulatory interaction networks, leading to the identification of isolated biomarkers with limited biological interpretability. Here, we present the Connected Network-regularized Cox proportional hazards model (CNet-Cox), which incorporates network connectivity constraints into sparse regularization to identify prognostic biomarkers for breast cancer (BRCA) on the discovery dataset from TCGA (1,092 patients), while explicitly accounting for patient survival time. CNet-Cox reveals the network structures of prognostic genes and, evaluated on the internal validation dataset, achieves a concordance index of 0.913, surpassing traditional regularized Cox methods. CNet-Cox shifts biomarker recognition from isolated to connected features within biomolecular networks and offers new biological insights. Furthermore, we established a six-gene BRCA prognostic risk scoring (PRS) metric and validated its robustness across six independent external validation datasets comprising 1,829 patients, and one spatial transcriptomic dataset containing 4,992 spots. The PRS consistently demonstrated superior performance in patient/sample stratification across extensive and diverse validation datasets. Overall, our comprehensive downstream analyses underscore that CNet-Cox offers a novel approach for embedding network topology into feature selection, enabling the systematic discovery of key connected prognostic biomarkers. This significantly advances early detection and prognosis prediction, facilitating precision medicine for BRCA.
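To make the ingredients named in the abstract concrete, the following is a minimal sketch of a network-regularized Cox objective and of the concordance index used for evaluation. It is an illustration under stated assumptions, not the CNet-Cox implementation: the abstract does not specify the form of the connectivity constraint, so a graph-Laplacian smoothness penalty is swapped in as a stand-in, and all function names, the parameters `lam` and `alpha`, and the assumption of untied event times are hypothetical choices made for this example.

```python
import numpy as np


def cox_neg_log_partial_likelihood(beta, X, time, event):
    """Negative Cox partial log-likelihood (assumes no tied event times)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event)
    eta = X @ beta                       # linear risk score per patient
    order = np.argsort(-time)            # longest follow-up first
    eta_sorted, event_sorted = eta[order], event[order]
    # Running log-sum-exp: entry i covers every patient still at risk
    # at the i-th sorted time (all patients with time >= that time).
    log_risk_set = np.logaddexp.accumulate(eta_sorted)
    return -np.sum((eta_sorted - log_risk_set)[event_sorted == 1])


def network_penalized_objective(beta, X, time, event, laplacian,
                                lam=0.1, alpha=0.5):
    """Cox loss plus L1 sparsity and a Laplacian penalty (illustrative stand-in).

    `laplacian` is the graph Laplacian of a gene network; the quadratic term
    beta^T L beta encourages linked genes to receive similar coefficients.
    This is NOT the connectivity constraint of CNet-Cox, only an assumption.
    """
    n = X.shape[0]
    fit = cox_neg_log_partial_likelihood(beta, X, time, event) / n
    sparsity = np.sum(np.abs(beta))
    smoothness = beta @ laplacian @ beta
    return fit + lam * (alpha * sparsity + (1.0 - alpha) * smoothness)


def concordance_index(risk, time, event):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk."""
    concordant, comparable = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue                     # a pair needs an observed earlier event
        for j in range(n):
            if time[j] > time[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0    # earlier event has higher predicted risk
                elif risk[i] == risk[j]:
                    concordant += 0.5    # ties in risk count as half
    return concordant / comparable


# Toy data (fabricated for illustration only): 4 patients, 2 linked genes.
X_toy = np.array([[2.0, 1.0], [1.0, 1.0], [0.5, 0.0], [0.0, 0.0]])
time_toy = np.array([1.0, 2.0, 3.0, 4.0])
event_toy = np.array([1, 1, 0, 1])
laplacian_toy = np.array([[1.0, -1.0], [-1.0, 1.0]])  # single edge between genes

beta_toy = np.array([0.5, 0.5])
objective_toy = network_penalized_objective(
    beta_toy, X_toy, time_toy, event_toy, laplacian_toy)
risk_score_toy = X_toy @ beta_toy        # a PRS-like linear risk score
c_index_toy = concordance_index(risk_score_toy, time_toy, event_toy)
```

In this sketch, a risk score of the kind the abstract calls PRS is simply the linear predictor `X @ beta` restricted to the selected genes; a concordance index near 1 means that patients with earlier events tend to receive higher scores, and 0.5 corresponds to random ranking.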

Publication
In Briefings in Bioinformatics

Lingyu Li
Postdoctoral Fellow

Research focuses on bioinformatics, including but not limited to spatial transcriptomics analysis, sparse statistical learning, and biomarker identification.