Realizations in Biostatistics: Surreality in noninferiority trials: Advanced Life "statistically not inferior"

Thursday, June 21, 2007

Noninferiority trials aim to show that one drug (or, more generically, factor) is not worse than another, older active drug by more than a prespecified amount. This is accomplished by setting a "noninferiority margin," running the trial, taking the difference between the two treatment effects (new minus control, with higher being better), and finding out where the lower end of the 95% confidence interval for that difference lies. If the lower 95% confidence limit is higher than the noninferiority margin (expressed as a negative difference), then the new drug is declared noninferior to the other one.
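To make the procedure concrete, here is a minimal sketch for a binary outcome such as cure rate. It assumes a simple Wald (normal approximation) interval for the difference in proportions, which is only one of several possible choices. The function name, the counts, and the 10-point margin are all hypothetical and are not taken from any actual trial.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_check(x_new, n_new, x_ctrl, n_ctrl, margin, level=0.95):
    """Compare the lower confidence limit of (new - control) with -margin.

    x_* are success counts, n_* are sample sizes, and margin is a positive
    number on the proportion scale (0.10 means 10 percentage points).
    Uses a Wald interval, which is an assumption of this sketch.
    """
    p_new = x_new / n_new
    p_ctrl = x_ctrl / n_ctrl
    diff = p_new - p_ctrl
    # Standard error of the difference between two independent proportions
    se = sqrt(p_new * (1 - p_new) / n_new + p_ctrl * (1 - p_ctrl) / n_ctrl)
    # Two-sided critical value, about 1.96 for a 95% interval
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = diff - z * se
    upper = diff + z * se
    return diff, lower, upper, lower > -margin


# Hypothetical data: the new drug cures 85%, the control cures 87%.
diff, lower, upper, noninferior = noninferiority_check(425, 500, 435, 500, margin=0.10)
print(f"difference={diff:.3f}, 95% CI=({lower:.3f}, {upper:.3f}), noninferior={noninferior}")
```

With these made-up numbers the point estimate is negative (the new drug did slightly worse), yet the lower limit of roughly -0.063 still sits above -0.10, so the trial would declare noninferiority. Tightening the margin to 0.05 reverses the conclusion.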

These trials are useful (in pharmaceuticals) when placebo-controlled trials are considered unethical or otherwise infeasible and there are other treatments on the market. They are used all the time in anti-infectives (especially antibiotics). And, of course, they have their problems, like how to choose noninferiority margins (and I hope to shed a little darkness on this issue at an upcoming talk at the Joint Statistical Meetings). Don't look to the FDA for guidance; they've retracted all guidances and say that all margins must be justified statistically and clinically on a case-by-case basis.

And then there is this problem. The short of it goes like this: Advanced Life's new antibiotic treatment showed an effect that was slightly smaller than that of the active control, yet the lower 95% confidence limit still fell above the noninferiority margin. (This is the meaning of "statistically not inferior to ... Biaxin, although the latter drug technically performed better.") Of course, the consequences of this seemed pretty bad, and the press release language is certainly bizarre.

Noninferiority trials, in my opinion, are one of the failings of statisticians. We simply haven't figured out, at least in the classical statistics camp, how to do this kind of analysis effectively. The way we have it set up right now, a drug that is slightly inferior can slip under the radar. If we approve that slightly inferior drug, then use it as the active control against another drug that is slightly inferior to it, and so forth, we can end up with a truly worthless drug coming through with flying colors.
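The arithmetic behind this chain of slightly inferior drugs is easy to sketch. The example below is deliberately simplified: it works with true cure rates in whole percentage points, ignores sampling variability, and assumes every trial is large enough that a true deficit smaller than the margin gets declared noninferior. All the numbers (an 85% first control, a 60% placebo response, a 3-point loss per generation) are hypothetical.

```python
def noninferiority_chain(first_control, placebo, loss_per_trial):
    """List the true cure rates (in percentage points) of successive drugs.

    Each new drug is loss_per_trial points worse than the drug it was
    compared against. The chain stops once a drug is no better than placebo.
    """
    if loss_per_trial <= 0:
        raise ValueError("loss_per_trial must be positive")
    rates = [first_control]
    while rates[-1] > placebo:
        rates.append(rates[-1] - loss_per_trial)
    return rates


# Hypothetical: original drug cures 85%, placebo 60%, each successor loses
# 3 points, comfortably inside a 10-point noninferiority margin.
chain = noninferiority_chain(first_control=85, placebo=60, loss_per_trial=3)
print(chain)
print("generations until no better than placebo:", len(chain) - 1)
```

Under these assumptions every drug in the chain is "noninferior" to its immediate predecessor, but the ninth successor has a true cure rate of 58%, below the assumed placebo response.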

I haven't explored the Bayesian version of noninferiority, but it seems better to evaluate a drug on the basis of a posterior probability that the new drug is at least as good as the control than to set an arbitrary margin and see where the confidence interval of the difference falls. Unless we can come up with a better solution based on classical statistics.
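For what it's worth, one simple way such a posterior probability could be computed is sketched below. This is only an illustration under strong assumptions: a binary outcome, independent uniform Beta(1, 1) priors on each arm's cure rate, and a Monte Carlo estimate rather than an exact calculation. The function name and the data are hypothetical, and a real analysis would need to justify the priors.

```python
import random


def prob_new_at_least_as_good(x_new, n_new, x_ctrl, n_ctrl, draws=20000, seed=0):
    """Monte Carlo estimate of P(p_new >= p_ctrl | data).

    Assumes independent Beta(1, 1) priors, so each posterior is
    Beta(1 + successes, 1 + failures).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    a_new, b_new = 1 + x_new, 1 + n_new - x_new
    a_ctrl, b_ctrl = 1 + x_ctrl, 1 + n_ctrl - x_ctrl
    hits = 0
    for _ in range(draws):
        # Draw one cure rate per arm from its posterior and compare them
        if rng.betavariate(a_new, b_new) >= rng.betavariate(a_ctrl, b_ctrl):
            hits += 1
    return hits / draws


# Hypothetical data: new drug 425/500 cured, control 435/500 cured.
print(prob_new_at_least_as_good(425, 500, 435, 500))
```

With made-up data like these, where the new drug did slightly worse, the posterior probability of being at least as good as the control comes out well below one half, which conveys something quite different from the phrase "statistically not inferior."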