Uplift modeling to predict individual treatment effects of renal replacement therapy in sepsis-associated acute kidney injury patients – Scientific Reports

Setting

This retrospective study used the Medical Information Mart for Intensive Care (MIMIC-III), a large, publicly available critical care database. The database comprises 61,532 intensive care unit (ICU) admissions from 46,520 patients at the Beth Israel Deaconess Medical Center in Boston, MA, between 2001 and 2012. Data were collected from all admissions to the center's ICUs during this period. The median length of ICU stay was 2.1 days (interquartile range 1.2–4.6), and the hospital mortality rate was 11.5%15. The MIMIC-III database integrates de-identified, comprehensive clinical data, including demographics, hourly vital signs, clinical measurements, laboratory results, treatments, and the International Classification of Diseases, Ninth Revision (ICD-9) codes of diagnoses and procedures.

The data in MIMIC-III have been de-identified, and the institutional review boards of the Massachusetts Institute of Technology (No. 0403000206) and Beth Israel Deaconess Medical Center (2001-P-001699/14) both approved the use of the database for research. All data analysis and reporting were performed in accordance with institutional guidelines and regulations.

Inclusion and exclusion criteria

Inclusion criteria: (1) diagnosis of sepsis; (2) presence of acute kidney injury (AKI). Exclusion criteria: (1) not the first ICU admission; (2) age < 18 years; (3) end-stage renal disease (ESRD); (4) blood potassium > 6.5 mmol/L.

To diagnose sepsis, we utilized the third sepsis definition (sepsis-3), which defines sepsis as a life-threatening condition characterized by organ dysfunction caused by a dysregulated host response to infection16. We screened patients with documented or suspected infection and an acute change in total Sequential Organ Failure Assessment (SOFA) score of ≥ 2. This method closely aligns with the sepsis-3 definition and has been demonstrated to be effective in the Medical Information Mart for Intensive Care III (MIMIC-III) database17.

AKI was defined in accordance with the Kidney Disease: Improving Global Outcomes (KDIGO) criteria18, which classify AKI into three stages based on urine output and serum creatinine levels. Diagnosis of AKI was confirmed if the highest KDIGO stage during the ICU stay was greater than or equal to 1. In addition to recording the baseline KDIGO stage of each patient, we also continuously documented the KDIGO stage until the initiation of RRT.

We excluded all patients with blood potassium > 6.5 mmol/L, a level considered one of the most important reasons for urgently initiating CRRT6. We also excluded admissions other than the first ICU admission to avoid multiple records for the same patient.

The study population was divided into two groups: the RRT group and non-RRT group, based on whether they received RRT treatment during their ICU stay.
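The selection rules and the group split described above can be sketched as a simple filter. This is only an illustration: the record type `IcuStay` and all of its field names are hypothetical and do not correspond to MIMIC-III table columns, and the text does not specify which potassium measurement was used for the exclusion.

```python
from dataclasses import dataclass

@dataclass
class IcuStay:
    # Hypothetical fields for illustration; not MIMIC-III column names.
    patient_id: int
    icu_stay_seq: int        # 1 = first ICU admission of this patient
    age: float               # years
    sepsis3: bool            # meets the sepsis-3 screening described above
    max_kdigo_stage: int     # highest KDIGO stage during the ICU stay (0-3)
    esrd: bool               # end-stage renal disease
    potassium_mmol_l: float  # which measurement to use is an assumption
    received_rrt: bool       # RRT during the ICU stay

def is_eligible(s: IcuStay) -> bool:
    """Apply the inclusion and exclusion criteria listed in the text."""
    included = s.sepsis3 and s.max_kdigo_stage >= 1
    excluded = (
        s.icu_stay_seq != 1
        or s.age < 18
        or s.esrd
        or s.potassium_mmol_l > 6.5
    )
    return included and not excluded

def split_groups(stays):
    """Return (RRT group, non-RRT group) among eligible stays."""
    eligible = [s for s in stays if is_eligible(s)]
    rrt = [s for s in eligible if s.received_rrt]
    non_rrt = [s for s in eligible if not s.received_rrt]
    return rrt, non_rrt
```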

Primary endpoint

The primary endpoint of the study was 28-day mortality, which encompassed both in-hospital and out-of-hospital deaths.

Statistical analysis

Continuous variables were expressed as mean ± SD or as median with interquartile range (IQR), as appropriate. Student’s t-test was used for normally distributed variables, and the Mann–Whitney U test for skewed variables. Categorical variables were presented as counts and percentages and compared using the chi-square test. Sample size was estimated by a power analysis based on a two-sample t-test.
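As an illustration of the sample size estimation, the sketch below uses the common normal approximation to the two-sample t-test, n per group ≈ 2(z₁₋α/₂ + z_power)² / d², where d is the standardized effect size. The effect size, significance level, and power shown are assumptions chosen for the example; the text does not report the values actually used, and an exact t-based calculation gives a slightly larger n.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group sample size for a two-sided two-sample comparison.

    Normal approximation to the t-test; effect_size is the standardized
    mean difference (Cohen's d). Default alpha and power are illustrative.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the two-sided test
    z_power = z.inv_cdf(power)           # quantile matching the target power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)

# Example with an assumed medium effect size of 0.5
print(n_per_group(0.5))
```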

To estimate the association between RRT and outcomes, and to select the best-matched patients for further artificial intelligence analysis, we performed propensity score matching (PSM) with a greedy nearest-neighbor matching algorithm. Patients were matched at a 1:1 ratio, each RRT patient being matched to one non-RRT patient on the estimated propensity score. The efficacy of PSM in reducing between-group differences was assessed by calculating the standardized mean difference (SMD).
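A minimal sketch of greedy 1:1 nearest-neighbor matching and of the SMD is given below, assuming the propensity scores have already been estimated. The function names, the processing order of treated patients, and the optional `caliper` argument are assumptions made for illustration; the text does not state whether a caliper was applied.

```python
from math import sqrt
from statistics import mean, variance

def greedy_match(treated, controls, caliper=None):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping patient id -> propensity score.
    caliper: optional maximum score distance (an assumption, not from the text).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_ps in treated.items():
        if not available:
            break
        # Closest remaining control on the propensity score
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        if caliper is not None and abs(available[c_id] - t_ps) > caliper:
            continue  # no acceptable match for this treated patient
        pairs.append((t_id, c_id))
        del available[c_id]  # each control is used at most once
    return pairs

def smd(x_treated, x_control):
    """Standardized mean difference for a continuous covariate."""
    pooled_sd = sqrt((variance(x_treated) + variance(x_control)) / 2)
    return (mean(x_treated) - mean(x_control)) / pooled_sd
```

Because the matching is greedy, the result can depend on the order in which treated patients are processed.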

Uplift modeling

The uplift model aims to predict the difference in outcomes between individuals who receive treatment and those who do not, and thereby to identify patients who are more likely to benefit from treatment. Previous studies that employed S-learner and T-learner methods19,20 modeled uplift only indirectly. We instead used a class transformation method to create a new variable Z, where Z = 1 if a patient was in the RRT group and survived for 28 days, Z = 1 if a patient was in the non-RRT group and died within 28 days, Z = 0 if a patient was in the RRT group and died within 28 days, and Z = 0 if a patient was in the non-RRT group and survived for 28 days. It can be shown that if the numbers of cases in the RRT and non-RRT groups are equal, P(Z = 1 | Xi) is linearly related to the uplift score, which can be calculated as Uplift Score = 2P(Z = 1 | Xi) − 1. This method applies when both treatment and outcome are binary variables, and it converts the prediction target so that a single model can be used.
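The class transformation can be illustrated with a short sketch. With equal group sizes, P(Z = 1 | X) = ½·P(survive | X, RRT) + ½·(1 − P(survive | X, no RRT)), so 2P(Z = 1 | X) − 1 equals the difference in survival probability. The function names and the toy counts below are illustrative only; in practice P(Z = 1 | Xi) would come from a classifier trained on Z.

```python
def transform_label(treated: bool, survived_28d: bool) -> int:
    """Z = 1 for treated survivors and untreated deaths, otherwise Z = 0."""
    return 1 if treated == survived_28d else 0

def uplift_score(p_z1: float) -> float:
    """Convert P(Z = 1 | X) into an uplift score (valid for balanced groups)."""
    return 2.0 * p_z1 - 1.0

# Toy stratum with balanced groups (made-up counts):
# 10 treated patients, 7 survive; 10 untreated patients, 4 survive.
rows = ([(True, True)] * 7 + [(True, False)] * 3
        + [(False, True)] * 4 + [(False, False)] * 6)
z = [transform_label(t, y) for t, y in rows]
p_z1 = sum(z) / len(z)            # empirical P(Z = 1) in this stratum
direct_uplift = 7 / 10 - 4 / 10   # survival difference, treated minus untreated

print(uplift_score(p_z1), direct_uplift)
```

Both printed values agree up to floating-point rounding, which is the identity stated in the text.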

The validity of different models was evaluated using the adjusted Qini curve, obtained by connecting the values of the adjusted Qini index at successive proportions of the ranked patients. The adjusted Qini index is defined by the following formula.

$$Q(\varphi)=\frac{n_{t,1}(\varphi)}{N_{t}}-\frac{n_{c,1}(\varphi)\,n_{t}(\varphi)}{N_{t}\,n_{c}(\varphi)}$$

\(\varphi\) is the proportion of patients, ranked from highest to lowest Uplift Score, in either the treatment or the non-treatment group. For instance, \(\varphi = 0.3\) denotes the 30% of patients with the highest uplift scores in the treatment group or the non-treatment group. \(n_{t,1}(\varphi)\) (outcome \(y = 1\)) is the number of patients in the treatment group who are predicted to survive among the given percentage of patients, and \(n_{c,1}(\varphi)\) is the corresponding number in the non-treatment group among the same percentage of patients. \(n_{t}(\varphi)\) and \(n_{c}(\varphi)\) are the numbers of treated and non-treated patients within that percentage. \(N_{t}\) and \(N_{c}\) are the total sample sizes of the treatment and non-treatment groups. The Area under the Uplift Curve (AUUC) is the area between the uplift curve and the random line, which serves as an indicator of model validity. A higher AUUC corresponds to higher model validity.
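The adjusted Qini index can be computed directly from the formula. The sketch below assumes that patients from both groups are ranked jointly by uplift score and that the top fraction φ of this joint ranking is evaluated; the text leaves open whether ranking is joint or within each group. The function names and the handling of a segment with no control patients are likewise assumptions.

```python
def adjusted_qini(scores, treated, outcome, phi):
    """Adjusted Qini index Q(phi).

    scores:  predicted uplift scores
    treated: 1 for the treatment group, 0 for the non-treatment group
    outcome: 1 if the patient survived (y = 1), else 0
    phi:     fraction of top-ranked patients to evaluate, in (0, 1]
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    k = max(1, round(phi * len(order)))
    top = order[:k]

    N_t = sum(treated)                       # total treated patients
    n_t = sum(treated[i] for i in top)       # treated patients in the segment
    n_c = k - n_t                            # control patients in the segment
    n_t1 = sum(1 for i in top if treated[i] and outcome[i])
    n_c1 = sum(1 for i in top if not treated[i] and outcome[i])

    if n_c == 0:
        # Control term is undefined; treating it as zero is an assumption.
        return n_t1 / N_t
    return n_t1 / N_t - (n_c1 * n_t) / (N_t * n_c)

def qini_curve(scores, treated, outcome, steps=10):
    """Points (phi, Q(phi)) for phi = 1/steps, 2/steps, ..., 1."""
    return [(s / steps, adjusted_qini(scores, treated, outcome, s / steps))
            for s in range(1, steps + 1)]
```

Connecting the points returned by `qini_curve` gives the adjusted Qini curve for one model.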

The candidate predictors of our model included demographics, vital signs, SOFA and qSOFA scores, KDIGO stage, comorbidities, and laboratory tests in the first 24 h after ICU admission. Predictor contributions were evaluated using the Shapley additive explanations (SHAP) strategy.

After modeling, we divided the patients in the validation cohort into a high benefit group and a low benefit group, and confirmed the characteristics of the high benefit group. We labeled the validation cohort according to the model, which allowed us to create a nomogram for patients who may benefit from RRT.

Ethical approval

The clinical data used for this research were obtained from a publicly available, de-identified database, the Medical Information Mart for Intensive Care (MIMIC-III), and their use does not require a separate ethics approval or consent process. The Institutional Review Board at the Chinese PLA General Hospital waived review of the research plan because the data were obtained from a publicly available database with no potential violation or infringement of ethical regulations.