Using Machine Learning Applied to Real-World Healthcare Data for Predictive Analytics: An Applied Example in Bariatric Surgery

May, 2023


Objectives: Laparoscopic metabolic surgery (MxS) can lead to remission of type 2 diabetes (T2D); however, treatment
response to MxS can be heterogeneous. Here, we demonstrate an open-source predictive analytics platform that applies
machine-learning techniques to a common data model; we develop and validate a predictive model of antihyperglycemic
medication cessation (a validated proxy for A1c control) in patients with treated T2D who underwent MxS.

Methods: We selected patients meeting the following criteria in 2 large US healthcare claims databases (Truven Health
MarketScan Commercial [CCAE]; Optum Clinformatics [Optum]): underwent MxS between January 1, 2007, and October 1, 2013
(first procedure = index date); aged ≥18 years; continuous enrollment from 180 days pre-index (baseline) to 730 days post-index; baseline T2D
diagnosis and treatment. The outcome was no antihyperglycemic medication treatment from 365 to 730 days after MxS. A
regularized logistic regression model was trained using the following candidate predictor categories measured at baseline:
demographics, conditions, medications, measurements, and procedures. A 75% to 25% split of the CCAE group was used
for model training and testing; the Optum group was used for external validation.
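
The outcome definition and the 75%/25% split described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names `ceased_medication` and `split_train_test`, the inclusive window boundaries, and the seeded random shuffle are all assumptions made here for the example.

```python
import random
from datetime import date, timedelta


def ceased_medication(index_date, fill_dates, start_day=365, end_day=730):
    """Return True when no antihyperglycemic fill falls in the outcome window.

    The window runs from `start_day` to `end_day` days after the index date.
    Treating both boundaries as inclusive is an assumption of this sketch.
    """
    window_start = index_date + timedelta(days=start_day)
    window_end = index_date + timedelta(days=end_day)
    return not any(window_start <= d <= window_end for d in fill_dates)


def split_train_test(patient_ids, test_fraction=0.25, seed=0):
    """Shuffle patient IDs reproducibly and return (train_ids, test_ids).

    A simple seeded random split is assumed; the source does not state
    how patients were assigned to the training and test sets.
    """
    ids = sorted(patient_ids)
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


if __name__ == "__main__":
    index = date(2010, 1, 1)
    # A fill in the first post-index year does not count against cessation.
    print(ceased_medication(index, [date(2010, 6, 1)]))
    # A fill inside the 365 to 730 day window means no cessation.
    print(ceased_medication(index, [date(2011, 3, 1)]))
    train, test = split_train_test(range(100))
    print(len(train), len(test))
```

Each patient contributes one binary label, so the same labeling function could be applied unchanged to an external validation cohort.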

Results: 13 050 (CCAE) and 3477 (Optum) patients met the study inclusion criteria. Antihyperglycemic medication cessation
rates were 72.9% (CCAE) and 70.8% (Optum). The model possessed good internal discriminative accuracy (area under the
curve [AUC] = 0.778 [95% CI = 0.761-0.795] in CCAE test set N = 3527) and transportability (external AUC = 0.759 [95% CI =
0.741-0.777] in Optum N = 3477).
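
For readers unfamiliar with the metric, the sketch below shows one way an AUC and an approximate 95% CI can be computed from predicted scores. The abstract does not state how its intervals were obtained; the Hanley-McNeil approximation used here is an assumption chosen for illustration, and the function names are hypothetical.

```python
import math


def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def hanley_mcneil_ci(a, n_pos, n_neg, z=1.96):
    """Approximate CI for an AUC `a` using the Hanley-McNeil standard error.

    This is an assumed method for the sketch, not the one reported in the source.
    """
    q1 = a / (2.0 - a)
    q2 = 2.0 * a * a / (1.0 + a)
    var = (a * (1.0 - a)
           + (n_pos - 1) * (q1 - a * a)
           + (n_neg - 1) * (q2 - a * a)) / (n_pos * n_neg)
    se = math.sqrt(max(var, 0.0))  # guard against tiny negative rounding error
    return max(0.0, a - z * se), min(1.0, a + z * se)


if __name__ == "__main__":
    # Toy scores: patients who ceased medication vs. those who did not.
    ceased = [0.9, 0.3]
    not_ceased = [0.5, 0.1]
    a = auc(ceased, not_ceased)
    print(a, hanley_mcneil_ci(a, len(ceased), len(not_ceased)))
```

The pairwise loop is quadratic in cohort size; for cohorts of several thousand patients a rank-based computation would be the more practical choice.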

Conclusion: The application of machine learning techniques to real-world healthcare data can yield useful predictive models
to assist patient selection. In future practice, establishment of prerequisite technological infrastructure will be needed to
implement such models for real-world decision support.
