An Automated Approach to Identify Responding Subpopulations

Our customer had run five trials in treatment-resistant depression (TRD), amassing a dataset of drug efficacy spanning different doses and treatment regimens. They first needed to standardize and unify this trial data into a single dataset, and then needed a repeatable approach for applying data science techniques to stratify patients, identify cohorts of responders, and ultimately discover novel biomarkers that could predict future clinical response. The trial data included biological assays of protein levels, clinically validated self-assessments, and multiple treatment levels.
We developed an algorithm capable of learning clinically meaningful combinations of baseline lab values that were predictive of both primary and secondary endpoint outcomes, automatically identifying subpopulations with significantly elevated likelihood of experiencing treatment benefit.
This algorithm required harmonizing data from five previously conducted TRD trials, each featuring different dose levels, regimens, and biomarker panels.
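To make the harmonization step concrete, the following is a minimal sketch of mapping records from two trials onto one shared schema. Everything specific in it is an assumption for illustration only: the trial identifiers, the column names, the CRP marker, and the unit conversions are hypothetical and are not taken from the customer's data.

```python
# Hypothetical per-trial mappings from source column names to a shared schema.
# Trial IDs, column names, and the CRP marker are illustrative assumptions.
COLUMN_MAPS = {
    "trial_a": {"subj": "patient_id", "dose_mg": "dose_mg", "crp": "crp_mg_l"},
    "trial_b": {"patient": "patient_id", "dose_ug": "dose_mg", "crp_mgdl": "crp_mg_l"},
}

# Multipliers that convert a source column into the shared unit.
UNIT_FACTORS = {
    ("trial_b", "dose_ug"): 0.001,  # micrograms -> milligrams
    ("trial_b", "crp_mgdl"): 10.0,  # mg/dL -> mg/L
}

SHARED_FIELDS = ("patient_id", "dose_mg", "crp_mg_l")


def harmonize(trial_id, rows):
    """Rename columns and convert units so every trial shares one schema.

    Fields that a trial did not measure stay None, so differing biomarker
    panels remain visible in the unified dataset instead of being guessed.
    """
    mapping = COLUMN_MAPS[trial_id]
    unified = []
    for row in rows:
        record = {"trial_id": trial_id}
        record.update({field: None for field in SHARED_FIELDS})
        for source_name, shared_name in mapping.items():
            value = row.get(source_name)
            if value is None:
                continue
            factor = UNIT_FACTORS.get((trial_id, source_name))
            record[shared_name] = value * factor if factor is not None else value
        unified.append(record)
    return unified
```

Keeping the mappings and unit factors as plain data, separate from the function, means adding another trial is a matter of adding one more mapping entry rather than writing new code.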
Our approach emphasized transparency and explainability, allowing translational scientists to validate findings and generate hypotheses around potential mechanisms of action and novel biomarkers.
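As an illustration of what a transparent subgroup rule can look like, the sketch below runs a simple exhaustive search over pairs of threshold rules on baseline markers and reports the pair whose subgroup has the largest response-rate difference from the remaining patients. This is a stand-in technique chosen for readability, not the algorithm described above. The field names (`responder`, the marker names passed in) and the `min_size` parameter are hypothetical, and the sketch performs no significance testing, multiplicity correction, or held-out validation, all of which a real analysis would need.

```python
from itertools import combinations


def response_rate(patients):
    """Fraction of patients flagged as responders (0.0 for an empty group)."""
    if not patients:
        return 0.0
    return sum(p["responder"] for p in patients) / len(patients)


def candidate_rules(patients, markers):
    """Yield (marker, operator, threshold) rules using observed values as cut points."""
    for marker in markers:
        for threshold in sorted({p[marker] for p in patients}):
            yield (marker, ">=", threshold)
            yield (marker, "<", threshold)


def matches(patient, rule):
    """Return True if the patient's baseline value satisfies the rule."""
    marker, operator, threshold = rule
    if operator == ">=":
        return patient[marker] >= threshold
    return patient[marker] < threshold


def best_subgroup(patients, markers, min_size=3):
    """Find the two-marker rule pair with the largest response-rate lift.

    Lift is the response rate inside the subgroup minus the rate outside it.
    Returns None if no pair yields a subgroup of at least min_size patients.
    """
    rules = list(candidate_rules(patients, markers))
    best = None
    for pair in combinations(rules, 2):
        if pair[0][0] == pair[1][0]:
            continue  # require the two rules to use different markers
        inside = [p for p in patients if all(matches(p, r) for r in pair)]
        outside = [p for p in patients if not all(matches(p, r) for r in pair)]
        if len(inside) < min_size or not outside:
            continue
        lift = response_rate(inside) - response_rate(outside)
        if best is None or lift > best["lift"]:
            best = {"rules": pair, "lift": lift, "n": len(inside)}
    return best
```

Because the result is a pair of readable threshold rules rather than an opaque score, a scientist can inspect which markers define the subgroup and judge whether the combination is biologically plausible.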
Ultimately, our work provided a reusable framework to accelerate insight generation and patient stratification efforts across CNS studies.