Aims: Telemedical interventions in heart failure patients aim to prevent unfavourable, indication-related events through early, individualized care that responds to the patient's current needs. However, telemedical support is expensive, and usually only patients at high risk of unfavourable follow-up events will benefit from it. Mockel et al. therefore adopted a new design, which we call the 'prognostic-efficacy-combination design'. This design makes it possible to define a biomarker cut-off and to perform a randomized controlled trial (RCT) in the biomarker-selected population within a single study. So far, however, it has not been evaluated whether this double use of the control group, for biomarker cut-off definition and for efficacy assessment within the RCT, biases the treatment effect estimate. In this methodological work, we therefore evaluate whether the 'prognostic-efficacy-combination design' leads to biased treatment effect estimates and compare it with alternative designs. If there is a bias, we further analyse its magnitude under different parameter settings.
Methods: We perform a systematic Monte Carlo simulation study to investigate, among other measures, the potential bias, the root mean square error, the sensitivity and specificity, and the total treatment effect estimate in various realistic trial scenarios that mimic and vary the data characteristics of the published TIM-HF2 trial. In particular, we vary the event proportion, the sample size, the biomarker distribution, and the lower bound for the sensitivity.
Results: The proposed design does lead to some bias in the effect estimators, in the direction of overestimating the treatment effect. However, this bias is relatively small in most scenarios.
Conclusions: The 'prognostic-efficacy-combination design' can generally be recommended for clinical applications because it is more efficient than two separate trials. We recommend a sufficiently large sample size, chosen according to the trial scenario. Our simulation code can be adapted to explore suitable sample sizes for other settings.
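To make the double use of the control group concrete, the following is a minimal Monte Carlo sketch in Python. It is not the authors' simulation code. The log-normal biomarker, the logistic risk model, the constant risk ratio, the cut-off rule (the largest cut-off whose sensitivity among control-group events reaches the lower bound), and all function and parameter names are assumptions made purely for illustration. Because the assumed risk ratio is the same for every patient, the true log risk ratio in any biomarker-selected subgroup is known, so bias and root mean square error can be computed directly.

```python
import math
import random
import statistics


def choose_cutoff(control, min_sensitivity):
    """Largest cut-off whose sensitivity among control events is >= min_sensitivity.

    `control` is a list of (biomarker, event) pairs; a patient counts as
    biomarker-positive if biomarker >= cut-off. Returns None without events.
    """
    event_values = sorted((b for b, e in control if e), reverse=True)
    if not event_values:
        return None
    # Number of control events that must lie at or above the cut-off.
    k = max(math.ceil(min_sensitivity * len(event_values)), 1)
    return event_values[k - 1]


def simulate_trial(n, min_sensitivity, risk_ratio, rng):
    """Simulate one trial; return the estimated log risk ratio or None."""
    patients = []
    for _ in range(n):
        biomarker = rng.lognormvariate(0.0, 1.0)  # assumed biomarker distribution
        treated = rng.random() < 0.5              # 1:1 randomization
        # Assumed prognostic model: event risk increases with log(biomarker).
        p_control = 1.0 / (1.0 + math.exp(1.5 - math.log(biomarker)))
        p = p_control * risk_ratio if treated else p_control
        patients.append((biomarker, treated, rng.random() < p))

    # First use of the control group: define the biomarker cut-off.
    control = [(b, e) for b, t, e in patients if not t]
    cutoff = choose_cutoff(control, min_sensitivity)
    if cutoff is None:
        return None

    # Second use: efficacy analysis in biomarker-positive patients only.
    positive = [(t, e) for b, t, e in patients if b >= cutoff]
    n_t = sum(1 for t, _ in positive if t)
    n_c = len(positive) - n_t
    ev_t = sum(1 for t, e in positive if t and e)
    ev_c = sum(1 for t, e in positive if not t and e)
    if ev_t == 0 or ev_c == 0:
        return None  # log risk ratio undefined; replication is skipped
    return math.log((ev_t / n_t) / (ev_c / n_c))


def run_simulation(n_rep=2000, n=500, min_sensitivity=0.95,
                   risk_ratio=0.7, seed=1):
    """Repeat the trial and summarize bias and RMSE on the log risk ratio scale."""
    rng = random.Random(seed)
    true_log_rr = math.log(risk_ratio)
    errors = []
    for _ in range(n_rep):
        estimate = simulate_trial(n, min_sensitivity, risk_ratio, rng)
        if estimate is not None:
            errors.append(estimate - true_log_rr)
    return {
        "n_used": len(errors),
        "bias": statistics.fmean(errors),
        "rmse": math.sqrt(statistics.fmean([err * err for err in errors])),
    }


if __name__ == "__main__":
    print(run_simulation())
```

With a risk ratio below 1, a negative bias on the log scale would correspond to an overestimated benefit. Varying `n`, `min_sensitivity`, and the parameters of the assumed risk model gives a rough impression of how such a bias depends on the trial scenario; skipped replications are reported through `n_used` and should be kept in mind when interpreting small-sample runs.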