OBJECTIVE: Participants enrolled into randomized controlled trials (RCTs) often do not reflect real-world populations. Previous research on how best to transport RCT results to target populations has focused on weighting RCT data to look like the target data. Simulation work, however, has suggested that an outcome model approach may be preferable. Here, we describe such an approach using source data from the 2 × 2 factorial NAVIGATOR (Nateglinide And Valsartan in Impaired Glucose Tolerance Outcomes Research) trial, which evaluated the impact of valsartan and nateglinide on cardiovascular outcomes and new-onset diabetes in a prediabetic population.

MATERIALS AND METHODS: Our target data consisted of people with prediabetes served by the Duke University Health System. We used random survival forests to develop separate outcome models for each of the 4 treatments and estimated the 5-year risk difference for progression to diabetes. We then estimated the treatment effect in our local patient population, as well as in subpopulations, and compared the results with those of the traditional weighting approach.

RESULTS: Our models suggested that the treatment effect for valsartan in our patient population was the same as in the trial, whereas the treatment effect for nateglinide was stronger than that observed in the original trial. Our effect estimates were more efficient than those from the weighting approach, and we effectively estimated subgroup differences.

CONCLUSIONS: The described method represents a straightforward approach to efficiently transporting an RCT result to any target population.
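The outcome model approach described in the methods can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the random survival forests for a simple least-squares linear probability model on a binary 5-year outcome and ignores censoring, purely to keep the example dependency-light. All function names, argument names, and the synthetic data are hypothetical. The idea is to fit one outcome model per trial arm, predict the risk for every person in the target population under each arm, and average the difference.

```python
import numpy as np


def fit_linear_risk_model(X, y):
    """Fit a least-squares linear probability model (stand-in for a survival forest).

    Returns the coefficient vector, intercept first.
    """
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_risk(coef, X):
    """Predict risk for each row of X, clipped to the valid range [0, 1]."""
    design = np.column_stack([np.ones(len(X)), X])
    return np.clip(design @ coef, 0.0, 1.0)


def transported_risk_difference(X_trial, arm, y, X_target, treated_label, control_label):
    """Estimate the average risk difference (treated minus control) in the target data.

    One outcome model is fit per trial arm; both models are then applied to the
    covariates of the target population and the predicted risks are differenced.
    """
    treated = arm == treated_label
    control = arm == control_label
    coef_treated = fit_linear_risk_model(X_trial[treated], y[treated])
    coef_control = fit_linear_risk_model(X_trial[control], y[control])
    diff = predict_risk(coef_treated, X_target) - predict_risk(coef_control, X_target)
    return float(np.mean(diff))


if __name__ == "__main__":
    # Synthetic illustration only: the covariate distribution differs between
    # the trial and the target population, and the effect varies with x.
    rng = np.random.default_rng(0)
    x_trial = rng.uniform(0.0, 0.5, size=(2000, 1))
    arm = rng.integers(0, 2, size=2000)
    true_risk = np.where(arm == 1, 0.5 - 0.2 * x_trial[:, 0], 0.5)
    y = rng.binomial(1, true_risk)
    x_target = rng.uniform(0.5, 1.0, size=(5000, 1))
    print(transported_risk_difference(x_trial, arm, y, x_target, 1, 0))
```

A subgroup estimate follows the same pattern: pass only the rows of the target covariates that belong to the subgroup. Note that this sketch extrapolates a linear model beyond the trial's covariate range, which a real analysis would need to justify.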

Original publication




Journal article


J Am Med Inform Assoc

Publication Date





429 - 437


electronic health records; machine learning; public health informatics; treatment heterogeneity; Antihypertensive Agents; Cardiovascular Diseases; Diabetes Mellitus, Type 2; Disease Progression; Electronic Health Records; Evidence-Based Medicine; Humans; Hypoglycemic Agents; Machine Learning; Nateglinide; Outcome Assessment, Health Care; Prediabetic State; Randomized Controlled Trials as Topic; Translational Research, Biomedical; Valsartan