Realizations in Biostatistics: O'Brien-Fleming designs in practice

Sunday, January 4, 2009


It seems that the O'Brien-Fleming design is the most popular of all group sequential clinical trial designs. It accounts for over 80% of the adaptive clinical trials I have been involved with, and I've been involved with quite a few.

Of course, there are tweaks to this design. For example, you can run a trial as a two-sided O'Brien-Fleming (so that you stop early if you show exceptional benefit or harm) or as an efficacy-futility design. For the efficacy-futility design you can use an 'inner-wedge' strategy, so that you continue the trial in the case of moderate benefit or harm, or you can use a one-sided efficacy-futility design, so that you continue in the case of moderate benefit but stop in the case of futility or any apparent harm. You can design the trial as a classical O'Brien-Fleming design or as a Lan-DeMets approximation using spending functions. Despite the combinatorial explosion of choices, most of these decisions are straightforward because ethical and logistical constraints settle them. The main choices left are the number of interim analyses and, when there are just one or two, their timing.

Statistically, this design is easy to implement, though there are enough "gotchas" that I would want someone experienced in the methodology to at least supervise the statistics. (Simply running software such as the SAS routines or EaST, good as those packages are, won't suffice.) After the design is done, the fun (or the hard part, depending on your point of view) begins. Other articles in this blog cover some of the finer details of design. (Just click the tags to find related entries.)
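To make the Lan-DeMets option mentioned above a little more concrete, here is a minimal sketch of the O'Brien-Fleming-type alpha spending function, which gives the cumulative type I error "spent" by each interim analysis as a function of the information fraction t. It assumes a one-sided alpha of 0.025 and looks at 50%, 75%, and 100% of the information; those numbers and the function names are mine, chosen for illustration, and are not taken from any particular trial or package.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def obf_type_alpha_spent(t, alpha=0.025):
    """Cumulative one-sided type I error spent at information fraction t.

    Uses the Lan-DeMets O'Brien-Fleming-type spending function
        alpha(t) = 2 * (1 - Phi(z_{1 - alpha/2} / sqrt(t))),  0 < t <= 1.
    The default alpha of 0.025 is an illustrative assumption.
    """
    if not 0.0 < t <= 1.0:
        raise ValueError("information fraction t must be in (0, 1]")
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(z / t ** 0.5))


def spending_schedule(fractions, alpha=0.025):
    """Return (fraction, cumulative alpha, incremental alpha) for each look.

    `fractions` must be increasing information fractions ending at 1.0.
    """
    cumulative = [obf_type_alpha_spent(t, alpha) for t in fractions]
    increments = [cumulative[0]] + [
        later - earlier for earlier, later in zip(cumulative, cumulative[1:])
    ]
    return list(zip(fractions, cumulative, increments))


if __name__ == "__main__":
    # Hypothetical design: two interim looks plus the final analysis.
    for t, cum, inc in spending_schedule([0.5, 0.75, 1.0]):
        print(f"t = {t:.2f}  cumulative alpha = {cum:.5f}  increment = {inc:.5f}")
```

The schedule shows why this family is so conservative early on: very little alpha is spent at the first look, and nearly all of it is saved for the final analysis. Note that this sketch only computes the alpha spent. Turning the increments into critical values after the first look requires numerical integration over the joint distribution of the test statistics, which is one of the places where design software and an experienced statistician earn their keep.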

The first decision is how to implement the interim analyses, and exactly who should know what information at each interim analysis. In Phase 2, this isn't as big a deal as it is in Phase 3, where regulatory authorities tend to be very picky. If subjects' treatment codes, or even the results of the analysis based on those codes, are revealed publicly at the interim analysis stage, the trial may be compromised. The greatest fear is that the sponsor will make unethical adjustments to the trial based on the interim analysis, but this is not the only concern. If word gets out that the treatment is "successful" or "not successful" at the interim analysis, and the trial is supposed to continue, then potential subjects may refuse to give consent, either because they want to be guaranteed the treatment (if successful) or because they do not want to receive a futile treatment (if not). Therefore, the most common way to implement the interim analysis is to use an independent board, consisting of at least a biostatistician and a clinician, whose members do not otherwise make decisions in the trial. These individuals analyze the unblinded data and make recommendations based on the results. The sponsor receives only the recommendations, which usually amount to one of three: continue the study, terminate the study, or modify the study.

Once the actual interim analysis process is chosen (including what information is released), the clinical trial operations and data management groups also have to prepare for the interim analyses. Clinical study monitors need to go to the clinics and verify all interim data to make sure they are consistent with the investigators' assessments, just as if the interim analysis were the study's final analysis. Data management has to enter and verify the data in the database. Then the independent statistician (i.e., one whose only job on the study is running the interim analysis) needs to run and verify the analysis. Electronic data capture (EDC) usually makes this process faster and easier, shortening the lag between the cutoff date for the interim data and the analysis (and recommendations). The minutes from the review of the analysis need to be archived, but sealed until the end of the study. In my experience, the time between data cutoff and the interim analysis is very busy and very hectic, whether or not EDC is used.

Of course, the above logistical considerations apply to any sort of group sequential, adaptive, or sequential Bayesian design. If a sequential trial runs to the maximum sample size, then it is more expensive than the fixed sample size counterpart because of the added planning in the beginning and the additional effort in the middle. These designs can show their strengths when they terminate early, however.

Sequential and adaptive designs have several subtle caveats that I will address in the coming posts.