Subjects
This retrospective cohort study used the electronic health record data of consecutive patients admitted to the ICU at Chiba University Hospital, Japan, from November 2010 to March 2019. The annual number of patients admitted to the 22-bed surgical/medical ICU ranged from 1,541 to 1,832. We excluded patients on maintenance dialysis and those without a documented body weight. This study was approved by the Ethical Review Board of Chiba University Graduate School of Medicine (approval number: 3380) in accordance with the Declaration of Helsinki. The Ethical Review Board of Chiba University Graduate School of Medicine waived the requirement for written informed consent in accordance with the Ethical Guidelines for Medical and Health Research Involving Human Subjects in Japan.
Definition of oliguria and AKI
We defined oliguria as a urine output of less than 0.5 mL/kg/h, in accordance with the Kidney Disease: Improving Global Outcomes (KDIGO) stage I criteria. AKI was diagnosed on the basis of either an increase in serum creatinine of at least 0.3 mg/dL from baseline or oliguria [38].
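The two definitions above can be written as a pair of small predicates. This is a minimal sketch rather than the study's code: the function names, the rounding of the creatinine difference to two decimals, and the one-hour default interval are assumptions made for illustration.

```python
OLIGURIA_THRESHOLD_ML_PER_KG_H = 0.5  # threshold stated in the text
CREATININE_RISE_MG_DL = 0.3           # threshold stated in the text


def is_oliguria(urine_ml: float, weight_kg: float, hours: float = 1.0) -> bool:
    """Return True when urine output over the interval is below 0.5 mL/kg/h."""
    if weight_kg <= 0 or hours <= 0:
        raise ValueError("weight_kg and hours must be positive")
    return urine_ml / weight_kg / hours < OLIGURIA_THRESHOLD_ML_PER_KG_H


def is_aki(creatinine_mg_dl: float, baseline_mg_dl: float, oliguria: bool) -> bool:
    """Return True for a creatinine rise of at least 0.3 mg/dL from baseline, or oliguria."""
    # Assumption: creatinine is reported to two decimals, so the difference is
    # rounded to avoid floating-point artifacts such as 1.2 - 0.9 != 0.3.
    rise = round(creatinine_mg_dl - baseline_mg_dl, 2)
    return rise >= CREATININE_RISE_MG_DL or oliguria
```

Note that a patient without a documented body weight cannot be evaluated by `is_oliguria`, which is consistent with the exclusion criterion described under Subjects.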
Data collection
Patient records from the ICU data system contained 1,031 input variables, including (A) physiological measurements acquired every minute (heart rate, blood pressure, respiratory rate, peripheral oxygen saturation, and body temperature), (B) blood tests (complete blood count, biochemistry, coagulation, and blood gas analysis), (C) name and dosage of medications, (D) type and amount of blood transfusion, (E) patient observation record, and (F) patient care record. The minute-by-minute time-series tables were aggregated into hourly time-series tables. During aggregation, physiological measurements were summarized by their hourly median, and blood test values were taken from the most recent test. Patient excretion values (urine and stool volumes) were summed over each hour. The following six calculated variables were added to the dataset: hourly intake, hourly output, hourly total balance, hourly urine volume (mL/kg), oliguria (urine volume of less than 0.5 mL/kg/h), and oliguria for six consecutive hours. A total of 222 background information variables, including age, sex, and admission diagnosis, were also added to the dataset. Consequently, the dataset contained 1,127 variables. Missing values were either treated as a separate group or excluded from the analysis. To limit collinearity, we performed a multicollinearity test and analyzed the data without the variables identified as collinear.
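The hourly aggregation and the derived oliguria variables can be sketched as follows. This is an illustrative sketch using only the Python standard library, not the study's pipeline: the row layout (`minute`, `heart_rate`, `creatinine`, `urine_ml`), the function names, and the choice of heart rate and creatinine as representative variables are assumptions. It also assumes that every hour of the stay has at least one record, so that consecutive list entries are consecutive hours.

```python
from collections import defaultdict
from statistics import median


def aggregate_hourly(minute_rows, weight_kg):
    """Collapse minute-level rows (dicts) into one row per hour.

    Physiological values use the hourly median, blood tests carry the most
    recent result forward, and urine volume is summed over the hour.
    """
    buckets = defaultdict(list)
    for row in minute_rows:
        buckets[row["minute"] // 60].append(row)

    hourly = []
    last_creatinine = None  # most recent blood test value seen so far
    for hour in sorted(buckets):
        rows = sorted(buckets[hour], key=lambda r: r["minute"])
        hr_values = [r["heart_rate"] for r in rows if r.get("heart_rate") is not None]
        for r in rows:
            if r.get("creatinine") is not None:
                last_creatinine = r["creatinine"]
        urine_ml = sum(r.get("urine_ml") or 0.0 for r in rows)
        urine_ml_per_kg = urine_ml / weight_kg
        hourly.append({
            "hour": hour,
            "heart_rate": median(hr_values) if hr_values else None,
            "creatinine": last_creatinine,
            "urine_ml": urine_ml,
            "urine_ml_per_kg": urine_ml_per_kg,
            "oliguria": urine_ml_per_kg < 0.5,
        })
    return hourly


def flag_six_consecutive(hourly, run_length=6):
    """Add 'oliguria_6h': True once the current hour ends a run of six oliguric hours."""
    streak = 0
    for row in hourly:
        streak = streak + 1 if row["oliguria"] else 0
        row["oliguria_6h"] = streak >= run_length
    return hourly
```

In practice a dataframe library would handle the same steps with a resample or group-by, but the explicit loop makes the three aggregation rules (median, most recent, sum) easy to see side by side.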
Machine learning algorithms and statistical analyses
The dataset was randomly divided into training (80%) and test (20%) sets. We developed a sequential machine-learning model to predict oliguria at any given time during the ICU stay using hourly variables and baseline information (Fig. 1). For values that were not obtained continuously, we used the most recent ones for model development. The input variables were updated to encompass a 1-h window of the preceding values for the physiological measurements, blood tests, and medications. The primary and secondary outcome variables were oliguria at 6 and 72 h, respectively, after an arbitrary time point between ICU admission and discharge. Accordingly, we used variables recorded until 6 or 72 h before ICU discharge, corresponding to each outcome variable. The outcome variable was not incorporated as a predictor in the final model. After constructing the algorithm with the training data, we validated the model predictions using the test data. We also validated model performance with fivefold cross-validation. To check that the estimated model probabilities aligned with the actual probabilities of oliguria occurrence, we plotted the calibration curve of the model. The curve indicated that our model was well calibrated (Supplementary File 1: Fig. S4).
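The relationship between a prediction time point and its outcome can be illustrated with a short sketch. It pairs each hour t with the oliguria status observed at t + 6 h (or t + 72 h) and drops the hours for which that outcome would fall after ICU discharge. The function name and the list-of-booleans input are assumptions for illustration, not the study's implementation.

```python
def make_horizon_labels(oliguria_by_hour, horizon_h):
    """Pair each hour index t with the oliguria status at t + horizon_h.

    Hours within the final horizon_h hours of the stay have no observable
    outcome, so they are excluded, which mirrors using only variables
    recorded until horizon_h hours before ICU discharge.
    """
    n_hours = len(oliguria_by_hour)
    return [
        (t, oliguria_by_hour[t + horizon_h])
        for t in range(n_hours - horizon_h)
    ]
```

A stay shorter than the horizon yields an empty list, so short admissions contribute samples to the 6-h outcome but not to the 72-h outcome.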
We selected four representative machine learning classifiers: LightGBM, category boosting (CatBoost), random forest, and extreme gradient boosting (XGBoost). Before developing the prediction model, we compared the computational performance and accuracy of the four classifiers (Supplementary File 1: Table S2). We developed the machine learning algorithm and evaluated model accuracy in a cloud computing environment (Google Colaboratory, 25 GB memory). We calculated the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and F1 score. Among the classifiers, LightGBM showed the best computation speed and AUC, and the second-best F1 score, with a marginal difference from XGBoost (XGBoost 0.899, LightGBM 0.896). Based on these results, we used LightGBM for the analysis in this study. After developing a prediction model with all the variables, we reduced the number of predictors by selecting clinically relevant variables (Supplementary File 1: Table S2). Subsequently, we compared the performance of the LightGBM model using the selected variables with that of the model using all the variables. As a sensitivity analysis, we re-analyzed the data in a different computing environment, Amazon Web Services SageMaker, with the following settings: image, Data Science 3.0; kernel, Python 3; and instance type, ml.t3.medium (memory 64 GB).
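For readers unfamiliar with the metrics used to compare the classifiers, the following sketch computes them from true labels and predicted probabilities using only the standard library. It is not the study's evaluation code: the function names are hypothetical, and the 0.5 decision threshold is an assumption, since the text does not state the threshold used. The AUC is computed through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case), which is quadratic in sample size and intended only for illustration.

```python
def roc_auc(y_true, y_prob):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y]
    neg = [p for y, p in zip(y_true, y_prob) if not y]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity, and F1 score at an assumed decision threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = p >= threshold
        if y and predicted:
            tp += 1
        elif y and not predicted:
            fn += 1
        elif not y and predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}
```

Unlike the AUC, sensitivity, specificity, and F1 all depend on the chosen threshold, which is one reason a model can rank first on AUC and second on F1, as reported for LightGBM.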
To identify the variables that contributed most to the prediction model, we used Shapley additive explanations (SHAP) values. A SHAP value quantifies the impact of each feature on the model output, which improves the interpretability of machine learning models. We reported each SHAP value as an absolute magnitude together with the direction (positive or negative) of the association between the variable and the outcome. SHAP individual force plots displayed several features on a common scale, with a color bar indicating each feature's contribution to the onset of oliguria in an individual instance, thereby clarifying the connection between patient characteristics and the occurrence of oliguria. For the subgroup analyses, we compared the accuracies of the models in predicting oliguria according to sex, age (≤ 65 or ≥ 66 years), and furosemide administration. To quantify the difference between the AUC plots of two groups, we took the absolute difference between the groups' AUCs at each prediction horizon from 6 to 72 h and averaged these values to obtain the mean absolute error (MAE).
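The MAE used for the subgroup comparison reduces to a few lines. In this sketch, each subgroup's AUC curve is represented as a dictionary mapping the prediction horizon in hours to the AUC at that horizon; the function name and this data layout are assumptions for illustration.

```python
def mean_abs_auc_difference(auc_a, auc_b):
    """Mean absolute difference between two AUC curves over their shared horizons.

    auc_a, auc_b: dicts mapping prediction horizon (h) to AUC for each subgroup.
    """
    horizons = sorted(set(auc_a) & set(auc_b))
    if not horizons:
        raise ValueError("the two curves share no prediction horizons")
    total = sum(abs(auc_a[h] - auc_b[h]) for h in horizons)
    return total / len(horizons)
```

Because absolute differences are averaged, the measure captures how far apart the two curves are overall but not which subgroup performs better; the direction has to be read from the AUC plots themselves.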
Data were expressed as medians with interquartile ranges for continuous values and as absolute numbers and percentages for categorical values. A P value < 0.05 was considered statistically significant. The machine learning algorithms were implemented in Python 3.10.11 using the following main packages: pandas 1.5.3, numpy 1.22.4, matplotlib 3.7.1, scikit-learn 1.2.2, xgboost 1.7.2, lightgbm 2.2.3, catboost 1.1.1, and shap 0.41.0.
- Source: https://www.nature.com/articles/s41598-024-51476-y