SFEIES24 Poster Presentations Diabetes & Metabolism (68 abstracts)
1Dublin City University, Dublin, Ireland; 2SFI Centre for Research Training in Machine Learning, Dublin, Ireland; 3University College Dublin, Dublin, Ireland; 4Florida Institute for Human and Machine Cognition, Pensacola, USA
Objective: To assess the accuracy of gestational diabetes mellitus (GDM) diagnoses in electronic health records (EHRs) by comparing them to a real-time clinical team database maintained by the hospital.
Methods: The study used a retrospective validation design to evaluate the accuracy of GDM diagnoses in the EHRs of The Coombe Hospital, Dublin. Patient IDs were matched between the EHR system and a real-time clinical team database (GDM Val), which recorded all GDM patients. Data were collected from 2018 to 2022 and included medical histories recorded by midwives. An EHR was labelled GDM-positive if "Diabetes developed during pregnancy" was noted, and these labels were then matched against the GDM diagnoses in GDM Val. The comparison yielded true positives, false positives, true negatives, and false negatives, which were used to assess the EHRs' reporting accuracy against GDM Val.
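The matching step described in the Methods can be sketched in Python as follows. This is a minimal illustration, not the study's actual pipeline: the function name `confusion_counts`, the dictionary layout, and the toy IDs are all hypothetical, and the sketch assumes one label per record ID with GDM Val treated as the reference standard.

```python
def confusion_counts(ehr_labels, gdm_val_ids):
    """Cross-tabulate EHR GDM labels against a reference set of IDs.

    ehr_labels  : dict mapping ID -> True if the EHR notes
                  "Diabetes developed during pregnancy" (hypothetical layout)
    gdm_val_ids : set of IDs recorded in the reference database (GDM Val)
    """
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for record_id, ehr_positive in ehr_labels.items():
        in_reference = record_id in gdm_val_ids
        if ehr_positive and in_reference:
            counts["TP"] += 1   # GDM in both sources
        elif ehr_positive:
            counts["FP"] += 1   # GDM in EHR only
        elif in_reference:
            counts["FN"] += 1   # GDM in reference only
        else:
            counts["TN"] += 1   # GDM in neither source
    return counts


# Toy data, invented purely for illustration.
toy_ehr = {"a": True, "b": True, "c": False, "d": False, "e": False}
toy_reference = {"a", "c"}
toy_counts = confusion_counts(toy_ehr, toy_reference)
```

The sketch only counts IDs that appear in the EHR extract; reference IDs with no EHR record at all would need separate handling.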
Results: The dataset included 37,651 EHRs from 31,100 patients (mean±SD: age, 32±5 y; BMI, 26.2±5.5 kg/m²); 20.7% had a BMI over 30 kg/m². GDM prevalence was 11.0% using EHRs and 10.5% using GDM Val. Of 3,952 patients with matching IDs in GDM Val (ground truth), 3,388 were also labelled with GDM in the EHRs, while 564 lacked a corresponding GDM label in the EHRs. A further 771 patients were labelled with GDM in the EHRs but had no matching ID in GDM Val. Overall, there were 32,928 true negatives (87.5%), 3,388 true positives (9.0%), 771 false positives (2.0%), and 564 false negatives (1.5%). GDM prevalence in both databases was notably lower in 2020 (EHR, 10.0%; GDM Val, 7.7%), deviating from the trend observed in other years.
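The reported counts can be checked for internal consistency with a few lines of arithmetic. The sketch below uses only the four counts given in the Results. Sensitivity, specificity, and predictive values are standard measures derived from those counts. They are not reported in the abstract, and they assume GDM Val is an error-free reference.

```python
# Counts as reported in the Results.
tp, fp, tn, fn = 3388, 771, 32928, 564

total = tp + fp + tn + fn            # should equal the 37,651 EHRs

# Prevalence according to each source, as a percentage.
prev_ehr = 100 * (tp + fp) / total   # EHR-labelled GDM
prev_ref = 100 * (tp + fn) / total   # GDM Val (reference)

# Derived accuracy measures (not reported in the abstract).
sensitivity = tp / (tp + fn)         # share of reference GDM cases the EHR captured
specificity = tn / (tn + fp)         # share of non-GDM cases the EHR left unlabelled
ppv = tp / (tp + fp)                 # share of EHR GDM labels confirmed by the reference
npv = tn / (tn + fn)                 # share of EHR non-GDM labels confirmed by the reference

print(f"total={total}, prevalence EHR={prev_ehr:.1f}%, GDM Val={prev_ref:.1f}%")
print(f"sensitivity={sensitivity:.3f}, specificity={specificity:.3f}, "
      f"PPV={ppv:.3f}, NPV={npv:.3f}")
```

The totals and both prevalence figures reproduce the values stated in the Results, which suggests that the percentages are computed over EHR records rather than over unique patients.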
Conclusions: The analyses revealed clinically meaningful discrepancies between EHR-recorded GDM diagnoses and the clinical team's database, highlighting a need for improved reporting accuracy in EHR systems if their data are to be used to train machine learning models.