Wednesday, November 25, 2015

Even the tiniest error messages can indicate an invalid statistical analysis

The other day, I was reading a data set into R, and the import function warned about a parsing error on one line. I went ahead with the analysis anyway, but that small parsing error kept bothering me. I thought it was just one line of goofed-up data, or perhaps a quote in the wrong place. I finally opened the CSV file in a text editor and found that the reason for the parsing error was that the data set was duplicated within the CSV file. The parsing error resulted from the header being read a second time, as a data row. As a result, anything I did afterward was suspect.
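A cheap way to catch this kind of duplication (sketched here in Python rather than R, with a made-up toy file) is to scan for data rows that are identical to the header:

```python
import csv
import io

# Hypothetical CSV whose entire contents (header included) were pasted twice
raw = "x,y\n1,2\n3,4\nx,y\n1,2\n3,4\n"

reader = csv.reader(io.StringIO(raw))
header = next(reader)
rows = list(reader)

# A data row identical to the header is the telltale sign of a duplicated file
dupes = [i for i, row in enumerate(rows) if row == header]
print(dupes)  # → [2]: the header reappears as data row index 2
```

The same check takes one line in any language; the point is to run it before the analysis, not after.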

Word to the wise: track down the reasons for even the most innocuous-seeming warnings. Every stage of a statistical analysis is important, and small errors anywhere along the way can have huge consequences downstream. Perhaps this is obvious, but you still have to slow down and take care of the details.

(Note that I'm editing this to be a part of my Little Debate series, which discusses the tiny decisions dealing with data that are rarely discussed or scrutinized, but can have a major impact on conclusions.)

Thursday, October 22, 2015

Statisticians ruin the day again, this time with a retraction

Authors retract second study about medical uses of honey - Retraction Watch at Retraction Watch:

For the second time, authors have had to retract a paper about medical uses of honey because of serious data analysis errors. While the details of the actual errors are scant, we do know that the article was published, a company tried and failed to replicate the results, the journal editor brought in a third-party statistician who found serious errors in the data analysis, and the errors were serious enough that the paper, to stay accepted, would have had to go through a major revision and further peer review.

Better to get a statistician to ruin your day before publication (and require another revision) than to have to eat crow because a third party did it. Other thoughts:

  • Did the authors have a statistician review this before submitting?
  • Did the journal have this go through a statistical peer review before accepting?
  • Come to think of it, was a statistician involved in the planning of this study?

Thursday, October 15, 2015

The thirty-day trial

Steve Pavlina wrote about a self-help technique called the thirty-day trial. To perform the technique, you commit to some new habit, such as quitting smoking or writing in a journal, for 30 days. The idea is that it's psychologically easier to commit to something for 30 days than to make a permanent change, but after the 30 days you have broken the attachment to your old habits and can decide whether to continue with the new habit, go back, or go in a completely different direction.

For activities like journaling or quitting smoking, this technique might work. After all, psychologist Jeremy Dean announced that it takes 21 days to form a new habit. (This claim has been treated with some due skepticism, as it should be.) However, if you try a 30-day quit-smoking trial and it doesn't work, try again. And if that doesn't work, try again, until you succeed.

However, for activities such as a new diet, treatments, or any sort of healthcare advice, the 30 day trial should not be used.

For cases where one might try a 30-day trial just to see if it feels better, such as an elimination diet, confirmation bias is likely to be at play. This is especially true of an elimination diet, where one eliminates some component of the diet, like dairy or gluten, and sees whether symptoms such as bloating or fatigue go away. In such trials, the apparent result may be due to the eliminated item in question, a placebo/nocebo effect, or some third, confounded eliminated item. For instance, bloating from gluten-containing foods probably comes from carbohydrates called FODMAPs. Just because you feel better for 30 days after discontinuing gluten-containing foods doesn't mean that gluten is the culprit. In fact, it probably isn't, as determined by a well-designed clinical trial. Likewise, eliminating milk from a diet isn't likely to do much unless lactose intolerance is the culprit, and there are tests for that.

Returning to Pavlina's advice, he recommends a 30-day trial over the results of a clinical trial. This is sloppy and irresponsible advice. First, it is very unlikely that an individual will have access to the screening assays, animal models, or millions of dollars needed to discover and develop a drug. That is, unless said individual is up for trying millions of compounds, many of them toxic enough to tragically cut the trial short. Instead, a pharmaceutical should be used under the supervision of a doctor, who should be aware of the totality of the literature (ideally randomized controlled trials, where possible, and a systematic review) and can navigate the benefits and possible adverse effects. A 30-day trial of a pharmaceutical or medical device may be completely inappropriate for realizing benefits or assessing risks. Here is the primary case where the plural of anecdote is not data.

The bottom line is that the 30-day trial is one of several ways (and perhaps not the best one) to change habits that you already know, from independent evidence, need changing, like quitting smoking. It's a terrible way to adjust a diet or determine whether a course of treatment is the right one. Treatments should be based on science and supervised by a physician who can objectively determine whether a treatment is working correctly.

Friday, May 8, 2015

Statistics: P values are just the tip of the iceberg : Nature News & Comment

Statistics: P values are just the tip of the iceberg : Nature News & Comment:

This article is very important. Yes, p-values reported in the literature (or in your own research) need scrutiny, but so does every step in the analysis process, starting with the design of the study.

Friday, April 17, 2015

The many faces of the placebo response

This week, a study was published that claimed that the placebo response is mediated by genetics. Though I need to dig a little deeper and figure out exactly what this article is saying, I do think we need to take a step back and remember what can constitute a placebo response before we start talking about what this means for medical treatment and clinical trials.

In clinical trials, the placebo response can refer to a number of apparent responses to sham treatment:

  • The actual placebo response, i.e. a body’s physiological response to something perceived to be a treatment
  • Natural course of a disease, including fluctuations, adaptations
  • Investigator and/or subject bias on subjective instruments (hopefully mitigated by blinding/masking treatment arms in a study)
  • Regression to the mean (an apparent time-based relationship caused by one measurement that varies markedly from the average measurement)
  • … and many, many other sources
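Of the sources above, regression to the mean is the easiest to demonstrate with a simulation. Here is a minimal sketch, with all numbers assumed for illustration (true mean 100, measurement noise SD 10, "severe enough to enroll" cutoff 115):

```python
import random

random.seed(42)

TRUE_MEAN, NOISE_SD, CUTOFF = 100.0, 10.0, 115.0
N = 10_000

# Two independent noisy measurements of the same underlying quantity
baseline = [random.gauss(TRUE_MEAN, NOISE_SD) for _ in range(N)]
followup = [random.gauss(TRUE_MEAN, NOISE_SD) for _ in range(N)]

# Enroll only "severe" subjects: those whose baseline exceeded the cutoff
enrolled = [(b, f) for b, f in zip(baseline, followup) if b > CUTOFF]
mean_base = sum(b for b, _ in enrolled) / len(enrolled)
mean_follow = sum(f for _, f in enrolled) / len(enrolled)

# With no treatment at all, the enrolled group's follow-up mean falls back
# toward 100 -- an apparent "response" caused purely by how subjects were selected
print(round(mean_base, 1), round(mean_follow, 1))
```

The enrolled group's baseline mean sits well above 115 by construction, while its untreated follow-up mean lands near 100, which is exactly the pattern that gets misread as a placebo response.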

This week's discovery does suggest that there is something physiological to the actual placebo response, and certainly genetics can influence this response. This may be useful for enriching clinical trials where a large placebo response is undesirable, e.g. by removing subjects who are likely to respond well to anything, active or inert. After all, you don't want an estimate of a treatment effect contaminated by a placebo response, nor do you want an impossibly high bar for showing an effect.

But we still need to remember the mundane sources of “placebo response” and lower those before we get too excited about genetic tests for clinical trials.

Tuesday, April 14, 2015

Helicopter parenting because your mind is "terrible at statistics" (or, rather, you are unaware of the denominator)

In Megan McArdle's Seven Reasons We Hate Free-Range Parenting - Bloomberg View, she states that because of the news cycle, and because our minds are terrible at statistics, we think the world is a much less safe place than it was 30 years ago. It's true. We have more opportunity (because of the internet and cable news) to hear about tragedies and crime from far-away places. It's worse if such tragedy strikes closer to home. Thus, we tend to think the world is very unsafe (and are therefore encouraged to become helicopter parents). We are acutely aware of the numerator.

What we do not hear, because it does not sell on cable news, is the denominator. For the (very few) children abducted by strangers, for instance, we do not hear of the ones who successfully played at the park, or walked to the library, or went down to the field to play stickball (or I guess nerf softball, because we shouldn't be throwing hard objects anymore) without getting abducted. Those stories do not sell.

I guess the second-best thing is reporting on trends in parenting, and how they are driven by how bad we are at statistics (even statisticians).

Thursday, March 19, 2015

Lying with statistics, CAM version

Full disclosure here: at one time I wanted to be a complementary and alternative medicine (CAM) researcher. Or integrative, or whatever the cool kids call it these days. I thought that CAM research would yield positive fruit if researchers could just tighten up their methodology and leave nothing to question. While this is not intended to be a discussion of my career path, I'm glad I did not go down that road.

This article is a discussion of why. Its basic premise is that positive clinical trials do not really provide strong evidence for an implausible therapy, for much the same reason that doctors will follow up with a stronger test when an individual tests positive for HIV. A positive test provides some, but not conclusive, evidence of HIV simply because HIV is so rare in the population: the predictive value of even a pretty good test is poor. And the predictive value of a pretty good clinical trial is pretty poor if the plausibility of the treatment has not been established.
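The arithmetic behind that claim is worth seeing once. With illustrative numbers (not from the article: a test with 99% sensitivity and 99% specificity, and a condition carried by 1 in 1,000 people), Bayes' rule gives the positive predictive value:

```python
# Assumed, illustrative numbers -- not taken from the article
prevalence = 0.001    # 1 in 1,000 people has the condition
sensitivity = 0.99    # P(test positive | condition)
specificity = 0.99    # P(test negative | no condition)

# Law of total probability: overall chance of a positive test
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' rule: P(condition | positive test)
ppv = sensitivity * prevalence / p_positive
print(round(ppv, 3))  # → 0.09: roughly 91% of positives are false positives
```

Even a test that is right 99% of the time in both directions yields mostly false positives when the condition is rare, which is why the confirmatory second test exists.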

Put it this way: if we have a treatment that has zero probability of working (the "null hypothesis" in statistical parlance), there is still a 5% probability that it will show a significant result in a conventional clinical trial. But let's turn that on its head using Bayes' rule:

P(useless | positive) × P(positive) = P(positive | useless) × P(useless)

(This is just the definition of P(useless AND positive), written two ways.)

Expanding, and using the law of total probability:

P(useless | positive) = P(positive | useless) × P(useless) / [P(positive | useless) × P(useless) + P(positive | not useless) × P(not useless)]

Now we can substitute, assuming that our treatment is in fact truly useless, so that P(useless) = 1:

P(useless | positive) = α × 1 / (α × 1 + who cares × 0) = 1

where α (conventionally 0.05) is P(positive | useless), the trial's false-positive rate.

That is to say, if we already know the treatment is useless, the clinical trial offers no new knowledge of the result, even if it was well conducted.

Drugs that enter human trials are required to have some evidence of efficacy and safety, such as that gained from in vitro and animal testing. The drug development paradigm isn't perfect in this regard, but the principle of requiring scientific and empirical evidence of safety and efficacy is sound. When we get better models for predicting safety and efficacy, we will all be a lot happier. The point is to reduce the probability of futility to something low and to maximize the probability of a positive trial given that the treatment is not useless, which would result in something like:

P(useless | positive) = α × (something tiny) / [α × (something tiny) + (something large) × (something close to 1)] = something tiny

where "something large" is the trial's power, P(positive | not useless), and α is again the false-positive rate.
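The same arithmetic can be made concrete. A minimal sketch, assuming a conventional 0.05 false-positive rate and a hypothetical 80% power when the treatment actually works (the prior values below are made up for illustration):

```python
def prob_useless_given_positive(prior_useless, alpha=0.05, power=0.80):
    """P(treatment useless | positive trial) via Bayes' rule.

    alpha: P(positive trial | useless) -- the false-positive rate.
    power: P(positive trial | not useless).
    """
    p_positive = alpha * prior_useless + power * (1 - prior_useless)
    return alpha * prior_useless / p_positive

# If we are already certain the treatment is useless, the trial tells us nothing
print(prob_useless_given_positive(1.0))                  # → 1.0

# With solid preclinical evidence (say a 20% prior chance of uselessness),
# a positive trial makes uselessness very unlikely
print(round(prob_useless_given_positive(0.20), 3))       # → 0.015

# For an implausible therapy (say a 99% prior chance of uselessness),
# a positive trial still leaves uselessness the most likely explanation
print(round(prob_useless_given_positive(0.99), 3))       # → 0.861
```

This is the whole argument in three lines of arithmetic: the same "significant" trial result means almost nothing when the prior plausibility is near zero.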

Of course, there are healthy debates regarding the utility of the p-value. I question it as well, given that it requires a reference to trials that can never be run. These debates need to be had among regulators, academia, and industry to determine the best indicators of evidence of efficacy and safety.

But CAM studies have a long way to go before they can even think about such issues.

Monday, March 16, 2015

Lying with statistics, anti-vax edition 2015

Sometimes Facebook’s suggestions of things to read lead to some seriously funny material. After clicking on a link about vaccines, Facebook recommended I read an article about health outcomes in unvaccinated children. Reading this rubbish made me as annoyed as a certain box of blinking lights, but it again affords me the opportunity to describe how people can confuse, bamboozle, and twist logic using bad statistics.

First of all, Health Impact News has all the markings of a crank site. For instance, its banner claims it is a site for “News that impacts your health that other media sources may censor.” This in itself ought to be a red flag, just like Kevin Trudeau’s Natural Cures They Don’t Want You to Know About.

But enough about that. Let's see how this article and the study it refers to abuse statistics.

First of all, this is a bit of a greased pig. The article's link leads to a malformed PDF file on a site whose apparent reason for existence is to host a questionnaire for parents who did not vaccinate their children. So I'll have to go on what the article says. There appears to be another discussion on the site, which I'll get to in a moment.

The authors claim

No study of health outcomes of vaccinated people versus unvaccinated has ever been conducted in the U.S. by CDC or any other agency in the 50 years or more of an accelerating schedule of vaccinations (now over 50 doses of 14 vaccines given before kindergarten, 26 doses in the first year).

Here's one. A simple PubMed search will bring up others fairly quickly. These don't take long to find. What follows this statement is a long chain of unsupported assertions about what data the CDC has and has not collected, which I really don't have an interest in debunking right now (and so leave as an exercise).

So on to the good stuff. They have a pretty blue and red bar graph that's just itching to be shredded, so let's do it. The graph is designed to demonstrate that vaccinated children are more likely than unvaccinated children to develop certain medical conditions, such as asthma and seizures. Pretty scary stuff, if their evidence were actually valid.

One of the most important principles in statistics is defining your population. If you fail at that, you might as well quit, get your money back from SAS, and call it a day, because nothing that comes afterward is meaningful. You might as well make up a bunch of random numbers, because they would tell you just as much.

This study fails miserably at defining its population. As best I can tell, the comparison is between a population in an observational study called KIGGS and respondents to an open-invitation survey conducted on the questionnaire site.

What could go wrong? Rhetorical question.

We don't know who responded to the questionnaire, but it is aimed at parents who did not vaccinate their children. This pretty much tanks the rest of their argument. From what I can tell, the respondents seem motivated to give answers favorable to the antivaccine movement; that the data they present are supplemented with testimonials gives this away. They are comparing apples to rotten oranges.

The right way to answer a question like this is a matched case-control study of vaccinated and unvaccinated children. An immunologist is probably the best person to determine which factors need to be included in the matching. That way, an analysis conditioned on the matching can clearly point to the effect of the vaccinations, rather than leave open the question of whether differences between the groups were due to differences in inherent risk factors.

I'm wondering if there isn't some ascertainment bias going on as well. Though I really couldn't tell what the KIGGS population was, it was represented as the vaccinated population. So in addition to imbalances in risk factors, I'm wondering whether the "diagnoses" in the unvaccinated population were derived from asking parents which medical conditions their children have. In that case, we have no clue what the real rate is, because we would be comparing parents' judgments (and parents probably more likely than most to ignore mainstream medicine, at that) with, presumably, a GP's more rigorous diagnosis. That's not to say that no children in the survey were diagnosed by an MD, but without that documentation (which this web-based survey isn't going to be able to provide), the red bars in the pretty graph are essentially meaningless. (Which they were even before this discussion.)

But let’s move on.

The site cites some other studies that seem to agree with its little survey. For instance, McKeever et al. published a study in the American Journal of Public Health in 2004 from which the site claims an association between vaccines and the development of allergies. However, that apparent association, as stated in the study itself, is possibly the result of ascertainment bias (the association was only strong in the stratum with the least frequent GP visits). Even objections to the discussion of ascertainment bias leave the evidence for an association between vaccines and allergic diseases unclear.

The site also cites the Guinea-Bissau study reported by Kristensen et al. in BMJ in 2000. They claim, falsely, that the study showed higher mortality in vaccinated children.

They also cite a New Zealand study.

What they don’t do is describe how they chose the studies to be displayed on the web site. What were the search terms? Were these studies cherry-picked to demonstrate their point? (Probably, but they didn’t do a good job.)

What follows the discussion of other studies is an utter waste of internet space. They report the results of their "survey," I think. Or somebody else's survey. I really couldn't figure out what was meant by "Questionnaire for my unvaccinated child ('Salzburger Elternstudie')". The age breakdown of the "children" is interesting: 2 out of the 1,004 "children" were over 60! At any rate, if you are going to talk about diseases in children, you need to present the data by age, because, well, age is a risk factor in disease development. But they did not do this.

What is interesting about the survey, though, is the reasons the parents did not vaccinate their children, if only to give a preliminary notion of the range of responses.

In short, the survey site and the reporting site Health Impact News present statistics designed to scare rather than inform. Proper epidemiological studies, contrary to the sites' claims, have been conducted, and they provide no clear evidence for the notion that vaccinations cause allergies except in rare cases. In trying to compile evidence for their claims, the sites failed to show that they performed a proper systematic review, and they even misquoted the conclusions of the studies they presented.

All in all, a day in the life of a crank website.