A Digital Twin–Driven Machine Learning Framework for Diabetes Risk Prediction and Short-Term Health Trajectory Simulation

Authors

DOI:

https://doi.org/10.33022/ijcs.v15i2.5110

Keywords:

Digital Twin, Machine Learning, Diabetes Prediction, Predictive Healthcare, Simulation Modelling

Abstract

Diabetes remains a major global health challenge, requiring early risk detection and proactive management to reduce long-term complications. However, existing approaches are predominantly reactive and rely on static clinical indicators, limiting their ability to support personalized and forward-looking care. This study proposes an integrated framework that combines machine learning (ML) and digital twin (DT) technologies to enable both diabetes risk prediction and short-term health trajectory simulation. Using the CDC Diabetes Health Indicators dataset, a structured CRISP-DM methodology was applied to guide data preprocessing, feature selection, model development, and evaluation. Class imbalance (13.9% minority class) was addressed using the Synthetic Minority Over-sampling Technique (SMOTE). Five machine learning models were evaluated, with Gradient Boosting achieving the best performance (ROC-AUC = 0.797; F1-score = 0.415), indicating acceptable discriminative capability under imbalanced conditions. Building on this predictive layer, a digital twin framework was developed to simulate individual risk trajectories over a 90-day period. The system was operationalized through a web-based architecture that integrates prediction, simulation, and visualization into a unified interface. The results indicate that combining machine learning with digital twin modelling links point-in-time risk estimation with short-term trajectory exploration. While the simulation is based on model-driven assumptions rather than real-time physiological data, it provides an additional analytical layer that supports anticipatory decision-making. This study contributes a scalable, modular framework that bridges predictive analytics and simulation, offering a practical step towards more proactive, data-driven approaches in digital health.
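To make the class-imbalance step more concrete, the following is a minimal sketch of the interpolation idea behind SMOTE: each synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. It is a simplified illustration and not the authors' implementation, which presumably relied on an established library. The function name `smote_sketch`, its parameters, and the toy data are hypothetical and are not drawn from the CDC Diabetes Health Indicators dataset.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Simplified illustration only: the base sample is drawn at random, and no
    majority-class data is involved.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        u = rng.random()  # interpolation weight in [0, 1)
        # New point lies on the segment between base and the chosen neighbour.
        synthetic.append(
            tuple(b + u * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority-class points in two dimensions (hypothetical values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority points, the oversampled class stays within the region the minority class already occupies. In practice, oversampling of this kind is applied to the training split only, so that evaluation metrics such as ROC-AUC and F1-score are computed on data with the original class distribution.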

Published

15-04-2026