Prognostic biomarker discovery via a connected network-constrained Cox proportional hazards model

Image credit: [Lingyu Li]

Abstract

Biomarker discovery in biomedical sciences can be framed as feature selection in machine learning. However, existing methods often overlook gene co-localization within regulatory interaction networks, which leads to isolated biomarkers with limited biological interpretability. Here, we present the Connected Network-regularized Cox proportional hazards model (CNet-Cox), which incorporates network connectivity constraints into sparse regularization to identify prognostic biomarkers for breast cancer (BRCA) in the TCGA discovery dataset (1,092 patients), while explicitly accounting for patient survival time. CNet-Cox reveals the network structures of prognostic genes and achieves a concordance index of 0.913 on the internal validation dataset, surpassing traditional regularized Cox methods. CNet-Cox shifts biomarker recognition from isolated to connected features within biomolecular networks and offers new biological insights. Furthermore, we established a six-gene BRCA prognostic risk scoring (PRS) metric and validated its robustness across six independent external validation datasets comprising 1,829 patients, and one spatial transcriptomic dataset containing 4,992 spots. The PRS consistently demonstrated superior performance in patient and sample stratification across extensive and diverse validation datasets. Overall, our comprehensive downstream analyses underscore that CNet-Cox offers a novel approach for embedding network topology into feature selection, enabling the systematic discovery of key connected prognostic biomarkers. This advances early detection and prognosis prediction, facilitating precision medicine for BRCA.
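The abstract reports a concordance index and a gene-based risk score without showing how such quantities are typically computed. The following is a minimal sketch of Harrell's concordance index applied to a linear risk score. It is not the CNet-Cox implementation: the gene names, coefficients, and patient values are hypothetical placeholders, and treating the score as a weighted sum of expression values is an assumption about how a score of this kind is usually built.

```python
from itertools import combinations


def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def concordance_index(times, events, scores):
    """Harrell's C-index for right-censored survival data.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event. It is concordant when that patient also has the
    higher risk score; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped to keep the sketch simple
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier time is censored, so the true order is unknown
        comparable += 1
        if scores[first] > scores[second]:
            concordant += 1.0
        elif scores[first] == scores[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical two-gene signature and toy cohort (illustrative values only).
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5}
cohort = [
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 1.0, "GENE_B": 1.0},
    {"GENE_A": 0.5, "GENE_B": 2.0},
    {"GENE_A": 1.5, "GENE_B": 0.0},
]
times = [5, 10, 15, 8]   # follow-up time per patient
events = [1, 1, 0, 0]    # 1 = event observed, 0 = censored

scores = [risk_score(patient, coefficients) for patient in cohort]
c_index = concordance_index(times, events, scores)
print(f"C-index on toy cohort: {c_index:.3f}")
```

A value of 1.0 means the score ranks every comparable pair correctly, and 0.5 corresponds to random ranking, which gives context for the 0.913 reported on the internal validation dataset.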

Publication
In Briefings in Bioinformatics

Lingyu Li (李苓玉)
Postdoctoral Researcher

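Research interests lie in bioinformatics, including but not limited to spatial transcriptomics analysis, sparse statistical learning, and biomarker identification.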