Tuesday, January 2, 2007

Confidence limits on NNTs - a guide to comparing NNTs

Previously in this series I discussed the definition of the NNT (i.e. when comparing therapy to placebo, it's 1/absolute risk reduction of therapy) and how to interpret it (it's the expected number of people that you would have to treat to prevent one unfavorable outcome). Its evil twin, the NNH, is similarly calculated and interpreted in association with an adverse event.

In my first entry on the subject, a commenter asked whether the NNT is a number or a statistic with an error. The answer is that it's a statistic with an error. Problem is, most people do not report the error (or confidence interval) along with the NNT. The error or confidence interval helps us answer questions such as "Drug A has NNT of 21 and Drug B has NNT of 22. Is there really a difference?"

To begin with, I'll assume that all needed data is available in this form:
  • Risk of unfavorable outcome for placebo group
  • Risk of unfavorable outcome for treatment group
  • Sample size
To calculate the NNT itself, just do 1/(risk in placebo - risk in treatment). The sample size is not needed. However, to calculate the error, the sample size is necessary.
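As a minimal sketch of that calculation (the function name `nnt` is my own, and risks are assumed to be given as fractions rather than percentages):

```python
def nnt(risk_placebo: float, risk_treatment: float) -> float:
    """Number needed to treat: 1 / (risk in placebo - risk in treatment)."""
    risk_reduction = risk_placebo - risk_treatment
    if risk_reduction == 0:
        # No difference between the groups, so the NNT is undefined.
        raise ValueError("risk reduction is zero; NNT is undefined")
    return 1.0 / risk_reduction


# Niacin figures from the example further down: 36.5% vs 31.5%.
print(round(nnt(0.365, 0.315), 1))  # 20.0
```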

Next, we calculate the error for (risk in placebo - risk in treatment) (i.e. 1/NNT). The expression is a little more complicated, but not hard to put into a spreadsheet or calculator:

std error = sqrt(risk placebo * (1 - risk placebo) / (# in placebo group) + risk treatment * (1 - risk treatment) / (# in treatment group)),

where sqrt means take the square root of the whole thing. A simple explanation goes as follows:

risk group A * (1 - risk group A) / (# in group A)

is the variance of the estimate in group A. Add the variances in the placebo and treatment groups to get the variance of the risk difference, and then take the square root to get the error. So two principles: variances (often) add, and the error is the square root of the variance.
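The two principles can be sketched in a few lines of Python. The function and argument names here are my own choices for illustration, and risks are assumed to be fractions:

```python
import math


def risk_difference_se(risk_placebo, n_placebo, risk_treatment, n_treatment):
    """Standard error of (risk in placebo - risk in treatment)."""
    # Variance of each group's risk estimate: risk * (1 - risk) / n
    var_placebo = risk_placebo * (1 - risk_placebo) / n_placebo
    var_treatment = risk_treatment * (1 - risk_treatment) / n_treatment
    # Variances add; the error is the square root of the summed variance.
    return math.sqrt(var_placebo + var_treatment)
```

For instance, `risk_difference_se(0.365, 1390, 0.315, 1390)` uses the niacin figures from the example further down.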

To get a 95% confidence interval of the risk reduction, you take the difference and add/subtract 2 times the error [1].
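Here is a rough sketch of that step. The helper name `confidence_interval` and the `multiplier` argument are assumptions of mine; the default of 2 is the back-of-the-envelope value used in this post, and `z_exact` shows how the more precise normal quantile could be looked up with Python's standard library instead:

```python
from statistics import NormalDist


def confidence_interval(estimate, std_error, multiplier=2.0):
    """Approximate interval: estimate -/+ multiplier * standard error."""
    half_width = multiplier * std_error
    return estimate - half_width, estimate + half_width


# More precise multiplier for a 95% interval (about 1.96 rather than 2).
z_exact = NormalDist().inv_cdf(0.975)

# Simvastatin figures from the example: risk reduction 8%, standard error 1.1%.
low, high = confidence_interval(0.08, 0.011)
print(f"{low:.1%} to {high:.1%}")  # 5.8% to 10.2%
```

Passing `multiplier=z_exact` gives a slightly narrower interval.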

Example. In the last entry I compared niacin and simvastatin. The article has some of the information we need:

Drug          Risk placebo   Risk treatment   Sample size
Niacin        36.5%          31.5%            1390
Simvastatin   21.5%          13.5%            2221 (sim), 2223 (placebo)


I had to do some sleuthing to get the sample size numbers. For niacin a Google search for the Coronary Heart Disease project landed this draft of a report, from which I found a total sample size and divided by 6 (there were six groups). For the simvastatin number I used the Wikipedia entry on Scandinavian Simvastatin Survival Study to get the sample size. But anyway, we're able to do a confidence interval calculation. We start with niacin:
  • risk reduction = (36.5% - 31.5%) = 5% (so NNT = 1/5% = 20)
  • error = sqrt(0.365 * 0.635/1390 + 0.315*0.685/1390) = 0.018 = 1.8%
  • 95% confidence interval of risk reduction is 5%-2*1.8% to 5%+2*1.8% = 1.4% to 8.6%
The 95% confidence interval for risk reduction for simvastatin is 5.8% to 10.2%. (I got a risk reduction of 8% and standard error of 1.1%.)

End example.

The simplest way to get a 95% confidence interval for the NNT is to just take 1/(confidence limits of the risk reduction). You will also have to invert the order of the limits. Granted, this isn't necessarily the best way, and I'll probably show how to do it another way one day, but it's easy and gives actual (approximate) 95% confidence limits. So the NNT limits for niacin are (1/8.6% to 1/1.4%) = (11.6 to 71.4). The NNT limits for simvastatin are (9.8 to 17.2).
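The inversion step could be sketched like this. The function name `nnt_limits` is hypothetical, and the check for a non-positive lower limit is my own addition: if the risk-reduction interval reaches zero, the simple 1/limit trick no longer gives a single sensible interval.

```python
def nnt_limits(risk_reduction_low, risk_reduction_high):
    """Turn confidence limits on the risk reduction into limits on the NNT."""
    if risk_reduction_low <= 0:
        # Interval includes zero (or harm): simple inversion breaks down.
        raise ValueError("risk reduction interval must be entirely above zero")
    # Inverting flips the order: the largest risk reduction gives the smallest NNT.
    return 1.0 / risk_reduction_high, 1.0 / risk_reduction_low


# Simvastatin: risk reduction between 5.8% and 10.2%.
low, high = nnt_limits(0.058, 0.102)
print(round(low, 1), round(high, 1))  # 9.8 17.2
```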

From this quick and dirty calculation, it's not absolutely clear that simvastatin has higher efficacy than niacin, because the two intervals overlap. Part of the reason for this is the wide error in the risk reduction estimate for niacin, which comes from the smaller sample size and from the fact that 31.5% of subjects in the niacin group of the study had a cardiovascular event (the variance, risk * (1 - risk), grows as the risk approaches 50%).

A few other issues are worth pointing out here, and they cloud the issue even more. I took these numbers from two different studies: the 4S study and the Coronary Heart Disease project. If you look at these two studies, they have different inclusion criteria (e.g. the CHD project had an inclusion criterion of men only). Eventually, in trying to get the information we need, we come across such barriers. Preferably, the numbers I used above would have come from the same study. Given the differences between objectives and populations in the studies, the comparison between the simvastatin and niacin NNTs is not as straightforward as the back-of-the-envelope calculations above might lead you to believe. It's important to keep the limitations of both the data and the statistics in mind.

[1] Technical details: this is an approximate interval, and those who have been through stats classes may prefer to use z0.025 = 1.96 instead of 2. I don't think it matters too much except in cases where accuracy is very important, such as academic reports and regulatory submissions. Also, this confidence interval has fallen out of favor with statisticians, but it is easy and useful for the kinds of back-of-the-envelope things I'm doing here.