Background
Internet-based interventions are about as effective as face-to-face therapy in treating depression. Still, more than half of patients do not respond to treatment. Machine learning (ML) methods could help to overcome these low response rates by predicting therapy outcomes at the individual level and tailoring treatment accordingly. A few studies have applied ML algorithms to internet-based depression treatment using baseline self-report data, but their differing results hinder inferences about clinical practicability. This work compares algorithms that use features gathered at baseline or early in treatment in their ability to predict non-response to a 6-week online program targeting depression.
Methods
Our training and test samples comprised 1270 and 318 individuals, respectively. We trained random forest algorithms on self-report and process features gathered at baseline and after 2 weeks of treatment. Non-responders were defined as participants who did not fulfill the criteria for reliable and clinically significant change on the Patient Health Questionnaire-9 (PHQ-9) at post-treatment. Our benchmark models were logistic regressions trained on the baseline PHQ-9 sum score or early PHQ-9 change, using 100 iterations of randomly sampled 80/20 train-test splits.
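As an illustration of the outcome definition, the following minimal sketch labels non-responders using the Jacobson-Truax notion of reliable and clinically significant change. It is not the study's code. The baseline standard deviation (5.0), the scale reliability (0.85), and the clinical cutoff (a post-treatment PHQ-9 score below 10) are assumed values chosen only for illustration, since the abstract does not report them, and the function names are hypothetical.

```python
import math


def reliable_change_threshold(sd_baseline: float, reliability: float) -> float:
    """Smallest pre-post difference treated as reliable at the 95% level.

    Jacobson-Truax: 1.96 * sqrt(2) * SD * sqrt(1 - reliability).
    """
    standard_error = sd_baseline * math.sqrt(1.0 - reliability)
    return 1.96 * math.sqrt(2.0) * standard_error


def is_non_responder(pre: float, post: float,
                     rc_threshold: float, clinical_cutoff: float) -> bool:
    """True unless the change is both reliable and clinically significant."""
    improved_reliably = (pre - post) >= rc_threshold
    below_cutoff = post < clinical_cutoff
    return not (improved_reliably and below_cutoff)


# Assumed, illustrative parameters (not taken from the study).
ASSUMED_SD = 5.0
ASSUMED_RELIABILITY = 0.85
ASSUMED_CUTOFF = 10.0

threshold = reliable_change_threshold(ASSUMED_SD, ASSUMED_RELIABILITY)

# Made-up (pre, post) PHQ-9 sum scores for four hypothetical participants.
participants = [(18, 8), (15, 12), (12, 9), (20, 13)]
labels = [is_non_responder(pre, post, threshold, ASSUMED_CUTOFF)
          for pre, post in participants]
print(round(threshold, 2), labels)
```

Under these assumptions, only the first hypothetical participant counts as a responder: the third ends below the cutoff without a reliable change, and the fourth changes reliably but remains above the cutoff.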
Results
The best performance was achieved by models that included early-treatment characteristics (recall: 0.75–0.76; AUC: 0.71–0.77). Therapeutic alliance and early symptom change were the most important predictors. Models trained on baseline data did not perform significantly better than our benchmark.
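For readers less familiar with the two reported metrics, the sketch below computes recall and AUC on a toy example. It assumes that non-response is coded as the positive class (1) and that predicted probabilities are dichotomized at 0.5; both are assumptions, and the labels and scores are invented for illustration and are not study data.

```python
def recall(y_true, y_pred):
    """Share of true non-responders (label 1) that the model flags."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if true_pos + false_neg == 0:
        raise ValueError("recall is undefined without positive cases")
    return true_pos / (true_pos + false_neg)


def auc(y_true, scores):
    """Probability that a random positive case outscores a random negative
    case, counting ties as one half (rank-based AUC)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = non-responder, 0 = responder.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(recall(y_true, y_pred))  # 0.75
print(auc(y_true, scores))     # 0.8125
```

Recall depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two metrics are reported side by side.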
Conclusions
Fair accuracy was attainable only when information from early treatment stages was included. In-treatment adaptation, rather than a priori selection, might be a more feasible approach to improving response when relying on easily accessible self-report features. Implementation trials are needed to determine clinical usefulness.