Bias introduced after looking at study results
Biases can be introduced when knowledge of the results of studies influences analysis and reporting decisions.
Key Concepts addressed:
- 2-11 All fair comparisons and outcomes should be reported
- 2-5 People should not know which treatment they get
- 2-12 Subgroup analyses may be misleading
Biases can be introduced when knowledge of the results of studies influences analysis and reporting decisions, for example, when studies stop earlier than planned, or when the treatment outcomes to be analysed are selected in a biased way.
Bias results from processes that tend to produce information that departs systematically from the truth. Avoiding bias is relevant when analysing the results of studies statistically. Analysis biases may be introduced during the design of studies, when decisions about which analyses to do might lead to the favouring of one of the treatments compared over another. This might include decisions about how to deal with data for participants who don’t adhere to their allocated intervention, the analysis of those who experience other outcomes before the main outcomes for the study, or the definition, counting and combination of particular outcomes in the analyses. These design biases are akin to those that can arise when the choice of the comparator to test in the study has been biased so that the eventual results will be unduly favourable to the newer treatment.
Things can get much worse after study results have been inspected. Changes might then be made to how the analyses will be done or reported, with foreknowledge of how these changes will favour one or other of the treatments compared. If these changes occur between the collection of the study data and their eventual reporting, the reader of the published results might be misled, especially if the changes are not clearly described and explained.
Biased analyses before the planned end of a study
Biases after looking at study results can occur both after formal statistical analyses, and through more informal routes. For example, if the researchers are collecting the outcomes or observing these outcomes because they are providing the treatments to participants in the study, they may get a sense of the accumulating results, for example, about which patients are doing particularly well or badly. This might lead them to alter the planned analyses, such as changing what they feel is the “most important” outcome, choosing an earlier time point as the main one to emphasise, or dividing the data in different ways in subgroup analyses. One way to avoid this is by keeping the researchers and the practitioners blind (masked) to the treatment allocated to each participant.
When study results are being analysed more formally, the problems can become worse as these initial analyses may reveal what the results would be if the analyses are modified. Such biases might occur before or after the study has reached its intended completion.
During a study, accumulating results might be examined to see if there is clear evidence of benefit or harm for one intervention, which might make it unethical to continue the study. On the other hand, it may become clear that the effect that was hoped for is not achievable in the study and that it would be better to stop the study for futility rather than to continue to recruit participants to a study that will use resources but will not resolve the initial uncertainties. These early stopping decisions can lead to bias when the interim results happen to be high or low simply by chance, especially if there is a vested interest in closing the study and turning these interim results into its final results (Trotta 2008).
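The way chance highs at interim looks can inflate the reported effect can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a method taken from the studies cited: the outcome is assumed to be normally distributed with a standard deviation of 1, the true difference (0.2), the schedule of looks (50, 100, 150 and 200 participants per arm) and the stopping rule (stop the first time z exceeds 1.96) are all hypothetical, and the rule is deliberately naive rather than one a Data Monitoring Committee would adopt.

```python
import math
import random


def simulate_trial(rng, true_diff, looks, z_crit=1.96):
    """Simulate one two-arm trial with normal outcomes (SD 1 in each arm).

    `looks` lists the per-arm sample sizes at which the data are inspected;
    the last entry is the planned final analysis. Returns a tuple:
    (stopped_early, estimate_at_stop, estimate_at_planned_end).
    """
    sum_treat = sum_ctrl = 0.0
    n = 0
    stopped_early = False
    estimate_at_stop = None
    for look in looks:
        # Recruit participants until the next scheduled look.
        while n < look:
            sum_treat += rng.gauss(true_diff, 1.0)
            sum_ctrl += rng.gauss(0.0, 1.0)
            n += 1
        diff = (sum_treat - sum_ctrl) / n
        z = diff / math.sqrt(2.0 / n)
        is_final = look == looks[-1]
        # Naive rule (an assumption): stop the first time z crosses z_crit.
        if estimate_at_stop is None and (z > z_crit or is_final):
            estimate_at_stop = diff
            stopped_early = not is_final
    # The simulation keeps recruiting after a "stop" so that `diff` holds
    # the estimate the same trial would have given at its planned size.
    return stopped_early, estimate_at_stop, diff


def summarise(n_trials=2000, true_diff=0.2, looks=(50, 100, 150, 200), seed=1):
    """Compare estimates from early-stopped trials with full-size estimates."""
    rng = random.Random(seed)
    early, full = [], []
    for _ in range(n_trials):
        stopped, est_stop, est_full = simulate_trial(rng, true_diff, looks)
        full.append(est_full)
        if stopped:
            early.append(est_stop)
    return {
        "true_diff": true_diff,
        "n_stopped_early": len(early),
        "mean_estimate_stopped_early": sum(early) / len(early) if early else None,
        "mean_estimate_full_size": sum(full) / len(full),
    }


if __name__ == "__main__":
    for key, value in summarise().items():
        print(key, value)
```

Under these assumptions, a trial can only stop early when its interim estimate is well above the true difference, so the average estimate among early-stopped trials overstates the effect, while the average across all trials run to their planned size stays close to the truth.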
One way to avoid biases that might arise if the researchers themselves are responsible for these interim decisions is to have an independent Data Monitoring Committee consider the accumulating results. The committee can agree guidelines for deciding when to make interim analyses available to an oversight group for the study, such as a Trial Steering Committee (Grant 2005).
Sometimes, interim results may be presented more publicly, to allow practitioners and potential participants in the trial to make up their own minds about whether or not to continue with the study. This happened with the preliminary results of the ISIS-2 trial of aspirin and streptokinase for people having a heart attack (myocardial infarction): the trial Steering Committee published a half-page interim report showing benefits reported to them the previous month. These showed a reduction in the risk of death in the short term among patients who had received streptokinase within 4 hours of experiencing symptoms of heart attack (ISIS-2 1987). Despite this information, some insufficiently persuaded clinicians continued to recruit patients to the trial within this time window, as well as others who had presented more than 4 hours after their symptoms had begun (ISIS-2 1988).
Biased analyses after the planned end of a study
At the end of a study, changes to the analyses after looking at the results can lead to bias through:
- changes in the primary outcome, or in how outcomes are defined or combined in composite outcomes;
- introduction or modification of subgroup analyses, in which different groups of participants are analysed separately, perhaps to highlight the presence or absence of benefit in certain types of person or setting. In addition to the problems of bias in these analyses, chance might mean that the findings are not a reliable guide to the truth (Counsell 1994, Clarke 2001);
- selective reporting of particular outcomes, analyses or treatment comparisons. For example, in a study comparing three treatments, there are seven different ways in which the treatments might be compared. This gives researchers opportunities to highlight some comparisons over others, based purely on their results; and
- changes to the statistical techniques, such as the introduction of adjustments for differences in the baseline characteristics of the participants which had not been pre-planned or pre-specified.
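The count of seven comparisons among three treatments can be reached in more than one way, and the text does not say which seven are meant. The sketch below shows one plausible enumeration (three head-to-head comparisons, three comparisons of one treatment against the other two pooled, and one overall comparison); the treatment labels A, B and C and both function names are hypothetical. The second function gives a rough sense of why many comparisons or subgroups invite chance findings. It assumes the tests are independent, which comparisons that share participants are not, so it is an approximation only.

```python
from itertools import combinations


def comparisons_for_three_arms(arms):
    """List one possible set of seven comparisons among three treatments."""
    arms = list(arms)
    if len(set(arms)) != 3 or len(arms) != 3:
        raise ValueError("this sketch only handles three distinct treatments")
    found = []
    # Three head-to-head comparisons.
    for first, second in combinations(arms, 2):
        found.append(f"{first} vs {second}")
    # Three comparisons of one treatment against the other two pooled.
    for arm in arms:
        others = [a for a in arms if a != arm]
        found.append(f"{arm} vs ({others[0]} + {others[1]})")
    # One overall comparison of all three treatments at once.
    found.append(" vs ".join(arms))
    return found


def chance_of_any_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n independent tests is 'significant' by luck."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    for line in comparisons_for_three_arms(["A", "B", "C"]):
        print(line)
    print(round(chance_of_any_false_positive(7), 3))
```

With seven tests at the conventional 5% level, the chance of at least one spuriously "significant" result under the independence assumption is roughly 30%, which is why comparisons chosen for emphasis only after the results are known deserve caution.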
The potential impact of some of these biases has been studied, and some of these studies have themselves been considered in systematic reviews. For example, systematic reviews by Kerry Dwan and colleagues have brought together information on how the methods used in the analyses and reporting of randomised trials changed between the design phase of the trial and the publication of its results.
In their most recent review, they identified 22 studies (containing more than 3000 randomised trials) published between 2000 and 2013 that found discrepancies in statistical analyses (8 studies), composite outcomes (1), the handling of missing data (3), unadjusted versus adjusted analyses (3), handling of continuous data (3) and subgroup analyses (12), concluding that discrepancies in analyses between publications and other study documentation were common, but not discussed in the trial reports (Dwan 2014). In their systematic reviews of studies of selective reporting, comparisons of trial publications with protocols showed that 40-62% of studies had at least one primary outcome that was changed, introduced, or omitted (Dwan 2011; Dwan 2013).
In systematic reviews of the impact of early stopping, Montori et al in 2005 and Bassler et al in 2010 have shown how early stopping might bias conclusions about the effects of interventions. The first review included 143 randomised trials stopped early for benefit, with 92 of these published in 5 high-impact, influential medical journals. On average, the trials recruited about two-thirds of their planned sample size. Montori et al concluded that randomised trials stopped early for benefit were becoming more common, often failed to adequately report relevant information about the decision to stop early, and showed implausibly large treatment effects, particularly when the number of events was small. They wrote that “clinicians should view the results of such trials with scepticism” (Montori 2005). Five years later, Bassler et al compared 91 truncated randomised trials with 424 matched non-truncated trials, finding a pooled ratio of relative risks of 0.71 (95% confidence interval, 0.65-0.77). This showed that the effect estimates in the trials that stopped early were on average more favourable to the treatments than those from similar trials that did not stop early (Bassler 2010).
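To make the pooled ratio of relative risks concrete, the short sketch below applies it to a made-up example. It reads the ratio as the relative risk in truncated trials divided by the relative risk in matched non-truncated trials. The relative risk of 0.80 for the non-truncated trials is purely hypothetical, and the function names are illustrative. The second function recovers an approximate standard error on the log scale from the reported confidence interval, assuming the interval is symmetric on that scale.

```python
import math


def implied_rr_in_truncated_trials(rr_non_truncated, ratio_of_rrs=0.71):
    """Relative risk expected in truncated trials, given the pooled ratio."""
    return rr_non_truncated * ratio_of_rrs


def se_of_log_ratio_from_ci(lower, upper, z=1.96):
    """Approximate standard error of log(ratio) from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


if __name__ == "__main__":
    # Hypothetical: non-truncated trials show a relative risk of 0.80.
    print(implied_rr_in_truncated_trials(0.80))
    print(se_of_log_ratio_from_ci(0.65, 0.77))
```

In this hypothetical case, a treatment that reduces risk by 20% in trials run to completion would appear to reduce it by about 43% in comparable trials that stopped early.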
If users of study reports are to have confidence in them, they need to be reassured that bias was not introduced into the reported results after the early results had been seen. Although the aforementioned reviews show that protocols are no guarantee against this, access to a protocol or a study’s statistical analysis plan might identify any changes that were made; and, since 2013, guidance on the structured reporting of protocols has been available from the SPIRIT group (Chan 2013). In relation to the choice of outcomes to analyse and report, those designing studies should consider the use of core outcome sets as the minimum that they should measure, analyse and report in all trials in a particular condition. Work by the COMET initiative has already identified 200 such outcome sets (Gargon 2014), which are now available through the COMET database (www.cometinitiative.org/studies/search).
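Checking a report against its protocol can be as simple as comparing two lists of outcomes. The sketch below is one hypothetical way to do this, not a procedure from the reviews cited: the outcome names, the "primary" and "secondary" labels, and the function name are all invented for illustration. It flags outcomes that were omitted, introduced, or had their role changed, the three kinds of discrepancy counted in the reviews of selective reporting.

```python
def compare_outcomes(protocol, report):
    """Compare pre-specified outcomes with those in the published report.

    Each argument maps an outcome name to its role ("primary" or "secondary").
    """
    planned, reported = set(protocol), set(report)
    return {
        # Planned in the protocol but missing from the report.
        "omitted": sorted(planned - reported),
        # Reported but never mentioned in the protocol.
        "introduced": sorted(reported - planned),
        # Present in both, but with a different role.
        "changed_role": sorted(
            name for name in planned & reported if protocol[name] != report[name]
        ),
    }


if __name__ == "__main__":
    # Hypothetical trial: outcome names are invented for illustration.
    protocol = {
        "death at 12 months": "primary",
        "hospital readmission": "secondary",
        "quality of life": "secondary",
    }
    report = {
        "death at 12 months": "secondary",
        "hospital readmission": "secondary",
        "death at 3 months": "primary",
    }
    for kind, names in compare_outcomes(protocol, report).items():
        print(kind, names)
```

A discrepancy flagged this way is not proof of bias, since changes can be justified, but it identifies what the report should have described and explained.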
It is tempting for people to change their views on what is important about a study after they have knowledge of the results. Such biases need to be avoided by careful planning of what analyses will be done, and clear explanations of any changes that were made to those plans and the reasons for them.
The text in these essays may be copied and used for non-commercial purposes on condition that explicit acknowledgement is made to The James Lind Library (www.jameslindlibrary.org).
References
Bassler D, Briel M, Montori VM, Lane M, Glasziou P, Zhou Q, Heels-Ansdell D, Walter SD, Guyatt GH; STOPIT-2 Study Group (2010). Stopping randomized trials early for benefit and estimation of treatment effects: systematic review and meta-regression analysis. JAMA 303:1180-1187.
Chan A-W, Tetzlaff JM, Altman DG, Laupacis A, Gotzsche PC, Krleža-Jerić K, Hróbjartsson A, Mann H, Dickersin K, Berlin J, Doré C, Parulekar W, Summerskill W, Groves T, Schulz K, Sox H, Rockhold FW, Rennie D, Moher D (2013). SPIRIT 2013 Statement: Defining standard protocol items for clinical trials. Annals of Internal Medicine 158:200-207.
Clarke M, Halsey J (2001). DICE 2: a further investigation of the effects of chance in life, death and subgroup analyses. International Journal of Clinical Practice 55:240-242.
Counsell CE, Clarke MJ, Slattery J, Sandercock PAG (1994). The miracle of DICE therapy for acute stroke: fact or fictional product of subgroup analysis? BMJ 309:1677-1681.
Dwan K, Altman DG, Clarke M, Gamble C, Higgins JP, Sterne JA, Williamson PR, Kirkham JJ (2014). Evidence for the selective reporting of analyses and discrepancies in clinical trials: a systematic review of cohort studies of clinical trials. PLoS Medicine 11(6):e1001666.
Dwan K, Altman DG, Cresswell L, Blundell M, Gamble CL, Williamson PR (2011). Comparison of protocols and registry entries to published reports for randomised controlled trials. Cochrane Database of Systematic Reviews (1):MR000031.
Dwan K, Gamble C, Williamson PR, Kirkham JJ; Reporting Bias Group (2013). Systematic review of the empirical evidence of study publication bias and outcome reporting bias – an updated review. PLoS One 8(7):e66844.
Gargon E, Gurung B, Medley N, Altman DG, Blazeby JM, Clarke M, Williamson PR (2014). Choosing important health outcomes for comparative effectiveness research: a systematic review. PLoS One 9(6):e99111.
Grant AM, Altman DG, Babiker AB, Campbell MK, Clemens FJ, Darbyshire JH, Elbourne DR, McLeer SK, Parmar MK, Pocock SJ, Spiegelhalter DJ, Sydes MR, Walker AE, Wallace SA; DAMOCLES study group (2005). Issues in data monitoring and interim analysis of trials. Health Technology Assessment 9(7):1-238.
ISIS-2 Steering Committee (1987). Intravenous streptokinase given within 0-4 hours of onset of myocardial infarction reduced mortality in ISIS-2. Lancet 329:502.
ISIS-2 (Second International Study of Infarct Survival) Collaborative Group (1988). Randomised trial of intravenous streptokinase, oral aspirin, both, or neither among 17 187 cases of suspected acute myocardial infarction: ISIS-2. Lancet 332:349-360.
Montori VM, Devereaux PJ, Adhikari NK, Burns KE, Eggert CH, Briel M, Lacchetti C, Leung TW, Darling E, Bryant DM, Bucher HC, Schunemann HJ, Meade MO, Cook DJ, Erwin PJ, Sood A, Sood R, Lo B, Thompson CA, Zhou Q, Mills E, Guyatt GH (2005). Randomized trials stopped early for benefit: a systematic review. JAMA 294:2203-2209.
Trotta F, Apolone G, Garattini S, Tafuri G (2008). Stopping a trial early in oncology: for patients or for industry? Annals of Oncology 19(7):1347-1353.