Tuesday, April 20, 2010

How to waste millions of dollars with clinical trials, Part II: Rexahn

Lots of people have been calling BS on Rexahn's press release about its Phase 2a data on Serdaxin and subsequent "additional statements." Read the articles below; they offer good background and analysis.

Here, I think Rexahn really did it to themselves, but I also think the environment around drug development is partly to blame. Everybody loves a p-value less than 0.05 [to Adam Feuerstein's credit in the first reference below, he blows off the "not statistically significant" issue], and companies are ready to comply by cherry-picking the best-looking p-value to present. (I know: I've been asked to do this cherry-picking.)

Why shouldn't we care about statistical significance? Simply because it's a Phase 2a study. This is the stage where we know next to nothing about a drug's efficacy and are just starting to learn. We don't know how to power a study to reach statistical significance, simply because we don't know the best endpoint, the best dosing schedule, the best dose, or really anything except what was observed preclinically. And we know that the drug development graveyard is littered with drugs that did well in preclinical studies and bombed in the clinic. So how can we expect to know how many subjects to enroll to show efficacy? We could use some nouveau Bayesian adaptive design (and I could probably design one for most circumstances), but tweaking more than two or three things in the context of one study is a recipe for disaster.
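To make that concrete, here is a minimal sketch of the usual normal-approximation sample size formula. The effect sizes are made-up placeholders (nothing to do with Serdaxin); the point is that the required sample size swings by an order of magnitude depending on an assumption we are in no position to make at Phase 2a.

```python
# Sketch: required sample size is extremely sensitive to the assumed
# standardized effect size -- which is exactly what we don't know yet.
from scipy.stats import norm

def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation n per arm for a two-sided two-sample test."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return 2 * ((z_alpha + z_beta) / effect_size) ** 2

for d in (0.2, 0.35, 0.5, 0.8):   # plausible guesses at the true effect
    print(f"effect size {d:.2f}: ~{n_per_arm(d):.0f} subjects per arm")
# effect size 0.20 needs ~392 per arm; 0.80 needs ~25 per arm --
# a 16-fold swing driven entirely by an assumption.
```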

Here's what I would prescribe (while ignoring the needs of press release consumers):
1. Forget about statistical significance. Whatever calculations are made for power analysis or number of subjects are usually a joke anyway. The real reason for a sample size usually has to do with budget, plus the desire to collect at least the minimum amount of information needed to design a decent Phase 2 trial; if there is a stated power calculation, it has usually been reverse engineered. Instead, it is sometimes possible to base sample size calculations on the width of confidence intervals (reflecting the precision of an estimate) for different endpoints (see #2, and the first sketch after this list).
2. Forget about a primary endpoint. There is no need. Instead, use several clinically relevant instruments, and pick one after the study that offers the best combination of resolution (i.e., ability to show a treatment effect) and clinical relevance.
3. Set some criteria for "success," i.e., decision criteria for further development of the drug, that do not include statistical significance. These might include an apparent dose effect (say, an 80% confidence interval around a parameter in a dose-response design that shows a positive dose-related effect), tolerability at a reasonably high dose, or, if you implement a Bayesian design (adaptive or otherwise), a given probability (say 80%) of a successful adequate and well-controlled trial of reasonable size with clinically relevant criteria (a sketch of such a predictive-probability calculation follows this list). What these criteria are of course needs to be considered carefully -- they have to be reasonable enough to be believable to investors and scientifically sound enough to warrant further treatment of patients with your unproven drug, yet not so ridiculously high that your press release is almost certain to kill your ability to woo investors.
4. Be transparent and forthcoming in your press release, because enough data is out there and there are enough savvy investors and bloggers who can call BS on shenanigans.
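On point 1, here is a minimal sketch of sizing a study by confidence-interval width rather than power. The standard deviation and target half-width below are hypothetical placeholders, not values from any real trial.

```python
# Sketch: pick n so the CI for a difference in means has a target half-width.
from math import ceil
from scipy.stats import norm

def n_for_ci_halfwidth(sigma, halfwidth, conf=0.95):
    """Per-arm n so the CI for a difference in means has roughly the target half-width."""
    z = norm.ppf(1 - (1 - conf) / 2)
    return ceil(2 * (z * sigma / halfwidth) ** 2)

# e.g. a rating scale with SD of 8 points, and we want the 95% CI to pin the
# treatment difference down to within +/- 3 points:
print(n_for_ci_halfwidth(sigma=8.0, halfwidth=3.0))   # ~55 per arm
```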
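And on point 3, here is one way the "probability of a successful adequate and well-controlled trial" criterion could be computed: propagate the uncertainty in the Phase 2a estimate through to the chance that a future trial of fixed size hits p &lt; 0.05. The estimates, SD, prior, and future sample size below are all hypothetical assumptions for illustration.

```python
# Sketch: predictive probability that a future trial succeeds, given Phase 2a results.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(20100420)

# Hypothetical Phase 2a summary: estimated treatment difference and its standard error.
est_diff, se_diff = 2.5, 1.8   # rating-scale points
sigma = 8.0                    # assumed outcome SD
n_future = 150                 # per arm in the future trial

# Approximate posterior for the true difference (flat prior => normal posterior).
true_diff = rng.normal(est_diff, se_diff, size=100_000)

# For each posterior draw, the power of the future trial's two-sided z-test.
se_future = sigma * np.sqrt(2 / n_future)
lam = true_diff / se_future
power_given_diff = norm.sf(1.96 - lam) + norm.cdf(-1.96 - lam)

print(f"Predictive probability of a successful trial: {power_given_diff.mean():.2f}")
```

If that number comes out at, say, 0.80 or better under honest assumptions, you have a defensible go decision without ever quoting a cherry-picked p-value.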

Also see: