UKINETS2024 22nd Annual Meeting of the UK and Ireland Neuroendocrine Tumour Society 2024 Oral Communications (5 abstracts)
Clustering of gastroenteropancreatic neuroendocrine neoplasms (GEP-NEN) using machine learning (ML) and comparison with Tumour, Node, Metastasis (TNM) staging: a retrospective, population-based study using Surveillance, Epidemiology, and End Results (SEER)
1Hampshire Hospitals NHS Foundation Trust, Winchester, United Kingdom. 2St. George University School of Medicine, West Indies, Grenada. 3Hampshire Hospitals NHS Foundation Trust, Basingstoke, United Kingdom. 4King’s College Hospital NHS Foundation Trust, London, United Kingdom. 5Winchester University, Winchester, United Kingdom
Introduction: TNM 8 is the staging system for GEP-NEN, guiding prognosis and treatment. However, it ignores important prognostic factors including age, sex, race, tumour site, and morphology.
Methods: 35,347 adults who were diagnosed with GEP-NEN between 2011 and 2021 and had no missing data in any variable were extracted from SEER. Age, sex, race, tumour site, tumour size, morphology, number of lymph nodes and metastasis site were used to create three clusters with a K-means ML clustering model. Univariable Cox regression, Kaplan-Meier (KM) plots and overall survival (OS) estimates were produced for TNM stage (model-1) and for the clusters (model-2). The concordance index (CI) for predicting OS was compared between the two models using a survival XGBoost ML model. A decision tree was developed to assign patients to clusters.
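The clustering step described in the Methods can be illustrated with a minimal pure-Python sketch of K-means (Lloyd's algorithm). This is not the authors' implementation: the toy rows are synthetic and are not SEER data, only four numeric features are used (categorical variables such as sex, race, site and morphology would first need encoding), and the deterministic farthest-point seeding in `init_centroids` is an assumption made here for reproducibility.

```python
def standardise(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid division by zero.
    sds = [(sum((v - m) ** 2 for v in c) / n) ** 0.5 or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from its nearest
    already-chosen centroid."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points,
                       key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k=3, n_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until the
    labels stop changing or n_iter is reached."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Synthetic toy rows: [age, tumour size (mm), positive nodes, metastasis 0/1].
toy = (
    [[45 + i, 10 + i, 0, 0] for i in range(5)]    # younger, small, node-negative
    + [[60 + i, 25 + i, 2, 0] for i in range(5)]  # intermediate
    + [[75 + i, 50 + i, 6, 1] for i in range(5)]  # older, large, metastatic
)

labels, centroids = kmeans(standardise(toy), k=3)
print(labels)
```

Standardising first matters because K-means relies on Euclidean distance, so a feature measured on a large scale (such as tumour size in millimetres) would otherwise dominate the clustering.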
Results: KM plots and OS estimates for model-1 showed overlap between stages 0 and 1 and between stages 2 and 3; only stage 4 had distinct OS. Cox regression showed that only stage 4 had OS different from stage 0. The K-means ML model formed three clusters: high, intermediate and poor survival clusters. KM plots, OS estimates and Cox regression for model-2 showed no overlap between clusters, indicating distinct OS for each. The poor survival cluster was characterised by advanced age, male sex (56%), advanced stage, larger tumour size, higher number of regionally positive lymph nodes, metastatic disease, NEC morphology, and cecal, colon, pancreas and small intestine NEN. The CI was 68.8% for TNM stage and 65.9% for the clusters. Combining TNM stage and clusters gave the highest CI (73.2%). The decision tree assigned patients to clusters with an accuracy of 91.6% (F1 score 93.1%). The most important factors for OS according to ML were, in decreasing order of importance, age, stage, metastasis, site, size, NEC, T stage, number of regionally positive lymph nodes, race, and sex.
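To make the concordance index values above easier to interpret, the following is a minimal pure-Python sketch of Harrell's concordance index for right-censored survival data. It is not the authors' pipeline (which used a survival XGBoost model), and the function name `concordance_index` and the toy follow-up times, event flags and risk scores are hypothetical values chosen purely for illustration.

```python
def concordance_index(times, events, risk):
    """Harrell's C: the share of comparable patient pairs in which the
    patient with the shorter observed survival has the higher risk score.
    Tied risk scores count as half-concordant."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only if patient i had an observed event
            # (not censored) strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, 1 = death observed, 0 = censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]

perfect = concordance_index(times, events, [4, 3, 2, 1])      # risk falls as survival rises
uninformative = concordance_index(times, events, [1, 1, 1, 1])  # same score for everyone
coarse = concordance_index(times, events, [2, 2, 1, 1])       # two-level grouping with ties

print(perfect, uninformative, coarse)
```

A value of 0.5 corresponds to an uninformative score and 1.0 to perfect ranking, so the reported values of 65.9% to 73.2% sit between the two.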
Conclusion: ML can be used to improve TNM staging for better prognostication of GEP-NEN patients by clinicians.