Sunday, August 19, 2007

What does the First Ever Pharma Blogosphere Survey tell us

First, let me make a few comments. I find John Mack's Pharma Marketing blog useful. Marketing tends to be a black box for me. From my perspective, the inputs are guys who want to sell things, and the outputs are commercials and other promotional materials. I (partly by choice and partly by the way my brain works) understand very little about what happens between input and output. All I know is to play to my strengths and downplay my weaknesses. This is part of the reason I'm limiting myself to discussing statistical issues, at least on this blog.

However, when he came out with his First Ever Pharma Blogosphere Survey®©™, I was skeptical. In fact, I didn't pay much attention to it. But then he started making claims based on the survey, especially surrounding Peter Rost's new gig at BrandweekNRX. He predicted Brandweek would "flush its credibility down the toilet" by hiring Rost, and cited his survey data to back up his case (he had other arguments as well, but, as noted above, I'm just covering what I know). And since I'm skeptical of his data, I'm skeptical of his analysis and, therefore, of his arguments, conclusions, and predictions based on that data. To his credit, however, he posts the raw data, so at least we know he didn't use a graphics program to draw his bar graphs.
Rost's counterarguments are worthy of analysis as well. He notes that most of the survey's respondents read the Pharma Marketing blog (the survey was conducted from its sister site, Pharma Blogosphere), which raises the question of which population Mack was really sampling. The correct answer, of course, is people who happened to read that blog entry around the day it was posted and who cared enough to bother to take a web survey. I would agree that Mack's following probably makes up the bulk of the survey respondents.
But more important is the comparison of the survey to more objective data, such as site counters. (Note that site counter data isn't perfect, either, but it is more objective than web polls since the data collection does not require user interaction.) And it looks like that objective data doesn't match Mack's data.
Then you throw in the data from eDrugSearch, which has its own algorithm for ranking healthcare websites, but its rankings seem very out of line with those from Technorati, which uses some modifications to the number of incoming links (I think to adjust for the fact that some blogs just all link to one another).
So, at any rate, you can be sure that Peter Rost will keep you abreast of his rankings, and for now they certainly do not seem to match Mack's predictions. And, while the eDrugSearch and Technorati rankings seem far from perfect, they do tend to agree on the upward trend in readership of BrandweekNRx and Rost's personal blog, at least for now. Mack's survey, and the predictions based on it, are the only data I've seen so far that do not agree.
In the meantime, I say the proof is in the pudding. Read these sites, or, better yet, put them in an RSS reader so you can skim for the material you like. Discard the material you don't like. As for me, well, I like to keep abreast of the news in my industry because, well, it could affect my ability to feed my children. So far, Rost's blog breaks news that doesn't get picked up anywhere else (as do Pharmalot and PharmGossip). Mack's blogs did, too, at least until he got obsessed with his subjective evaluation of Rost's content.

Web polls in blog entries - I don't trust them

I distrust web polls. While there are more trustworthy polling tools such as Surveymonkey, these web surveying sites have to be backed up with essentially the same operational techniques found in standard paper surveys. The web polls I distrust are the ones bloggers embed in their entries to poll readers on their thoughts about certain issues. Sometimes they will even follow up with an entry saying "this isn't a scientific poll, but here are the results."

A small step up from this are web surveys, such as John Mack's First Ever Pharma Blogosphere Survey®™©. They have a lot of the same problems as the simple web poll, and few of the controls necessary to ensure valid results. So I'll discuss simple one-off web polls and web surveys together.

Most of the problems and biases with these web polls aren't statistical; rather, they are operational. The data from these polls are so bad that no amount of statistics can rescue them. It's better not even to bring statistics into the equation here. Following are the operational biases I consider unavoidable and insurmountable:
  • Most web polls do not control whether one person can vote multiple times. Most services will now use cookies or IP addresses to block multiple votes from one person, but these measures are imperfect at best. Changing an IP address is easy (just go to a different Starbucks), and cookies are easily deleted.
  • Wording questions in surveys is a tricky proposition, and millions of billable hours are spent agonizing over the wording. (Perhaps 75% of those hours are overkill, but you get the point.) Very little time is generally spent wording the question of a web poll. The end result is that readers may not be answering the same question the blogger thinks was asked.
  • Forget random sampling, matching cases, identifying demographic information, or any of the classical statistical controls that are intended to separate noise and false signal from true signal. Web poll samples are "people who happen to find the blog entry and care enough to click on a web poll." At best, the readers who feel strongly about an issue are the ones likely to click, while people who feel less strongly (but might lean a certain way) will probably just glaze over (see the sketch after this list).
  • Answers to web polls will typically be immediate reactions to the blog post, rather than thoughtful, considered answers. Internet life is fast-paced, and readers (in general) simply don't have the time to thoughtfully answer a web poll.
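To make the self-selection point concrete, here is a minimal R sketch. It is not based on any of Mack's data; the readership size, the true opinion split, and the response probabilities are all invented for illustration. The point is only that when strength of feeling drives who bothers to click, the poll result can land far from the readership's actual view.

```r
# Hypothetical illustration of self-selection bias in a web poll.
# All numbers below are invented for the example.
set.seed(2007)

n <- 10000                                  # readers who see the poll
favors <- rbinom(n, 1, 0.40)                # suppose 40% of readers actually favor the proposition
# Strong feelings are assumed to be more common among supporters.
strong <- rbinom(n, 1, ifelse(favors == 1, 0.50, 0.15))

# Only readers with strong feelings are likely to click; everyone else glazes over.
p_respond <- ifelse(strong == 1, 0.60, 0.05)
responds <- rbinom(n, 1, p_respond) == 1

true_support <- mean(favors)                # what we would like to know
poll_support <- mean(favors[responds])      # what the web poll reports

cat(sprintf("True support: %.1f%%  Web poll says: %.1f%%  (respondents: %d)\n",
            100 * true_support, 100 * poll_support, sum(responds)))
```

Run it a few times: the poll figure sits well above the true 40% on every run, and nothing in the responses alone tells you so.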
Web polls and surveys might be useful for gauging whether readers are interested in a particular topic posted by the blogger, and so they do have a use in guiding the future material in a blog. But beyond that, I can't trust them.

Next step: an analysis of the John Mack/Peter Rost kerfuffle



Shout outs

Realizations is a tiny blog, getting just a tiny bit of traffic. After all, I cover a rather narrow topic. Every once in a while, someone finds an article on here worth reading, and, even less often, they link to it.

So, shout outs (some long overdue) to the people who have linked:

Friday, August 17, 2007

Pharmacogenomics: "Fascinating science" and more

The FDA has issued two press releases in two days in which pharmacogenomics has played the starring role: one on warfarin dosing and one on codeine use by nursing mothers.

Pharmacogenomics is the study of the interaction of genes and drugs. Most of the work, and certainly the most mature part of the field, has been on drug metabolism, especially the cytochrome P450 enzymes, which are found in most forms of life. The FDA's press releases are based on this science.

Pharmalot has reported on the mixed reactions to the warfarin news. One reaction was that "It's fascinating science, but not ready for prime time." Maybe not in general for all drugs, but the pharmacogenomics of warfarin has been studied for some time, and a greater understanding of the metabolism of this particular drug is critical to its correct application. Warfarin is a very effective drug, but it has two major problems:
1. The difference between the effective dose and a toxic dose is small enough to require close monitoring (i.e. it has a narrow therapeutic window)
2. It is implicated in a large number of adverse effects during ER visits (probably mostly for stroke or blood clots)

The codeine use update is even more urgent. Codeine is activated by the CYP2D6 enzyme, which has a wide variation in our population (gory detail at the Wikipedia link). In other words, the effects of codeine on people vary widely. The morphine that results from codeine metabolism is excreted in breast milk. If a nursing mother is one of the uncommon people who have an overabundance of CYP2D6, a lot of morphine can get excreted into breast milk and find its way into the baby. The results can be devastating. Fortunately, CYP2D6 tests have been approved by the FDA, and the price will probably start falling. Whether this science is ready for prime time or not (and CYP2D6 is probably the most studied of all the metabolism enzymes, so it probably is), it's fairly urgent to start applying this right away.

I applaud the FDA for taking these two steps toward applying pharmacogenomics to important problems. There may be issues down the road, but it's high time we started applying this important science.

Thursday, August 16, 2007

Good Clinical Trial Simulation Practices

I didn't realize they had gotten this far, and they did so 8 years ago! A group has put together a collection of good clinical trial simulation practices. While I only partly agree with the parsimony principle, I think the guiding principles are in general sound. I'd like to see this effort get wider recognition in the biostatistical community so that clinical trial simulations will get wider use. That can only help bring down drug development costs and promote deeper understanding of the compounds we are testing.
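Since the guidance stays at the level of principles, here is a minimal sketch of what a clinical trial simulation can look like in R. It is my own toy example, not something from the simulation-practices document: a two-arm parallel trial with a continuous endpoint, where simulation estimates power and shows how an assumed dropout rate erodes it. The effect size, standard deviation, sample size, and dropout rate are all invented.

```r
# Toy clinical trial simulation: two-arm parallel trial, continuous endpoint.
# Effect size, SD, sample size, and dropout rate are assumed values.
set.seed(42)

simulate_trial <- function(n_per_arm, effect, sd, dropout) {
  completers <- rbinom(2, n_per_arm, 1 - dropout)        # completers in control and treatment arms
  control    <- rnorm(completers[1], mean = 0,      sd = sd)
  treatment  <- rnorm(completers[2], mean = effect, sd = sd)
  t.test(treatment, control)$p.value                     # two-sample t-test on completers
}

power_estimate <- function(n_per_arm, effect = 0.5, sd = 1, dropout = 0, nsim = 2000) {
  pvals <- replicate(nsim, simulate_trial(n_per_arm, effect, sd, dropout))
  mean(pvals < 0.05)                                     # proportion of simulated trials that "win"
}

power_estimate(64, dropout = 0)     # roughly the textbook 80% power for an effect of 0.5 SD
power_estimate(64, dropout = 0.15)  # the same design after 15% dropout
```

Real simulations layer on recruitment, interim analyses, dose selection, and dropout mechanisms that are anything but completely at random, but even this toy version shows why simulating a design before running it is worth the trouble.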

IRB abuses

Institutional Review Boards are a very important part of human research. They are the major line of defense against research that degrades our humanity, and they protect subjects in clinical research. Thank goodness they're there to avoid a repeat of a nasty part of our history.

Unfortunately, as institutions do, IRBs have suffered from mission creep and a growing conservatism. There is a growing opinion that IRBs are overstepping their bounds and bogging down research and journalism that has no chance of harming human subjects. Via Statistical Modeling etc. I found IRBWatch, which details some examples of IRB abuses.

Tuesday, August 14, 2007

Review of Jim Albert's Bayesian Computation with R

When I first read Andrew Gelman's quick off-the-cuff review of the book Bayesian Computation with R, I thought it was a bit harsh. So did Gelman.

I thumbed through the book at the Joint Statistical Meetings and decided to buy it along with Bayesian Core. And I'm glad I did. Albert clearly positioned the book as a companion to an introductory, and perhaps even intermediate, course in Bayesian statistics. I've found it very useful for learning about Bayesian computation and for deepening my understanding of Bayesian statistics.

The Bad

I include the bad first because there are few bad things.

  • I thought the functions laplace (which computes the normal approximation to a posterior using Laplace's method) and the linear regression functions were a bit black-boxish. The text described these functions generally, but not nearly in the detail it devoted to other important functions such as rwmetrop and indepmetrop (which run random walk and independence Metropolis chains). Since laplace is a very useful function, I think it would have been better to go into a little more detail. However, Albert does show the function in action in many different situations, including the computation of Bayes factors.
  • The choice of starting points for laplace seemed black-boxish as well. They were clearly chosen to be close to the mode (finding the mode of the log posterior is one of the things laplace does), but Albert doesn't really go into how to choose "intelligent" starting points. I recommend a grid search using the R function expand.grid (and patience); see the sketch after this list.
  • I wish the chapter on MCMC included a problem on Gibbs sampling, though there is a chapter on Gibbs sampling toward the end.
  • I wish it included a little more detail about accounting for the Jacobian when parameters are transformed. (Most parameters are transformed to the real line.)
  • I wish the book included more about adaptive rejection sampling.
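As promised in the list above, here is a minimal sketch of the kind of grid search I have in mind for choosing starting values. The log posterior is a stand-in (a normal model with parameters mu and log sigma under flat priors), not one of Albert's examples, and the data are invented; the only point is that expand.grid plus apply will find the rough neighborhood of the mode before anything fancier takes over.

```r
# Hypothetical log posterior: normal data, flat priors, parameters (mu, log sigma).
# The data and the grid limits are invented for illustration.
y <- c(26, 33, 65, 28, 34, 55, 25, 44, 50, 36)

logpost <- function(theta, y) {
  mu <- theta[1]
  sigma <- exp(theta[2])
  sum(dnorm(y, mu, sigma, log = TRUE))
}

# Coarse grid over plausible values of mu and log sigma.
grid <- expand.grid(mu = seq(10, 70, by = 2), logsigma = seq(0, 4, by = 0.1))
grid$lp <- apply(grid, 1, function(theta) logpost(as.numeric(theta), y))

best <- which.max(grid$lp)
start <- c(mu = grid$mu[best], logsigma = grid$logsigma[best])
start  # a rough starting point to hand to an optimizer or to laplace
```

From there, something like optim(start, logpost, y = y, control = list(fnscale = -1)) can polish the mode before handing it to laplace.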
The Good

In no particular order:
  • Albert includes detailed examples from a wide variety of fields. The examples vary in difficulty from run-of-the-mill (such as estimating a single proportion) to the sophisticated (such as Weibull survival regression with censored data). Regression and generalized linear models are covered.
  • The exercises really deepen the understanding of the material. You really need a computer with the R statistical package to get the most out of this book. Take the time to work through the examples. Because I did, I understand much better the Metropolis algorithms and the importance of choosing the right algorithm (and the right tuning parameters) for an MCMC run. Do it incorrectly and the results are compromised by high (sometimes very high) autocorrelation and poor mixing (see the sketch after this list).
  • The book is accompanied by a package, LearnBayes, that contains a lot of good datasets and some very useful functions for learning and general use. The laplace, metropolis, and gibbs (which actually implements Metropolis within Gibbs sampling) functions can all be used outside the context of the book.
  • The book covers several different sampling algorithms, including importance sampling, rejection sampling (not adaptive), and sampling importance resampling. Along with this material are examples and exercises that show the importance of good proposal densities and what can happen with bad ones.
  • A lot of the exercises extend exercises in previous chapters, so that the active reader gets to compare different approaches to the same problem.
  • The book refers heavily to other books on Bayesian statistics, such as Berry and Stangl's Bayesian Biostatistics, Carlin and Louis's Bayes and Empirical Bayes Methods for Data Analysis, and Gelman et al.'s Bayesian Data Analysis. In doing so, this book increases the instructive value of the other Bayesian books on the market.
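To illustrate the tuning point from the second bullet above, here is a small random walk Metropolis sampler written from scratch in base R (deliberately not using LearnBayes, so nothing here should be read as the book's code), targeting a standard normal. The only difference between the two runs is the proposal scale, and that alone swings the acceptance rate and the lag-one autocorrelation dramatically.

```r
# Minimal random walk Metropolis on a standard normal target.
# Written from scratch for illustration; not code from the book or from LearnBayes.
rw_metropolis <- function(n_iter, scale, start = 0) {
  draws <- numeric(n_iter)
  current <- start
  accepts <- 0
  for (i in seq_len(n_iter)) {
    proposal <- current + rnorm(1, sd = scale)
    # Accept with probability min(1, target(proposal) / target(current)).
    if (log(runif(1)) < dnorm(proposal, log = TRUE) - dnorm(current, log = TRUE)) {
      current <- proposal
      accepts <- accepts + 1
    }
    draws[i] <- current
  }
  list(draws = draws, accept_rate = accepts / n_iter)
}

set.seed(1)
too_small <- rw_metropolis(10000, scale = 0.05)  # tiny steps: almost every move accepted, but the chain crawls
decent    <- rw_metropolis(10000, scale = 2.5)   # larger steps: fewer acceptances, much better mixing

c(too_small$accept_rate, decent$accept_rate)
c(acf(too_small$draws, plot = FALSE)$acf[2],     # lag-1 autocorrelation close to 1
  acf(decent$draws,    plot = FALSE)$acf[2])     # substantially lower lag-1 autocorrelation
```

The same lesson carries over to the book's rwmetrop examples: a proposal variance that is too small (or far too large) leaves you with draws that are nearly worthless without enormous run lengths.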
Overall, this book is a great companion to any effort to learn about Bayesian statistics (estimation and inference) and Bayesian computation. Like any book, its rewards are commensurate with the effort. I highly recommend working the exercises and going beyond their scope (for example, investigating diagnostics when not explicitly directed to do so). Read and work through this book in conjunction with other heavy-hitter books such as Bayes and Empirical Bayes or Bayesian Data Analysis.

Wednesday, August 1, 2007

A good joint statistical meetings week

While I was not particularly enthralled with the location, I found this year's Joint Statistical Meetings to be very good. By about 5 pm yesterday, I thought it was going to be merely so-so. There were good presentations on adaptive trials and Bayesian clinical trials, and even a few possible answers to some serious concerns I have about noninferiority trials. Last night I went to the biopharmaceutical section business meeting and struck up conversations with a few people from industry and the FDA (including the speaker who had some ideas on how to improve noninferiority trials). And shy, bashful me, who a couple of years ago had to drink three glasses of wine to get up the courage to approach a (granted, rather famous) colleague, was one of the last to leave the mixer.

This morning, I was still feeling a little burned out, but decided to drag myself to a session on Bayesian trials in medical devices. I found the speakers (who came from both industry and the FDA) top notch, and at the end the session turned into a very nice dialog on the CDRH draft guidance.

I then went to a session on interacting with the FDA in a medical device setting, and again the speakers from both the FDA and industry were top notch. Again, the talks turned into very good discussions about how to communicate most effectively with the FDA, especially from a statistician's or statistical consultant's point of view. I asked how to handle the situation where, though it's not in their best interest, a sponsor wants to kick the statistical consultants out of the FDA interactions. The answer: speak the sponsor's language, which is dollars. Quite frankly, statistics is a major part of any clinical development plan, and unless the focus is specifically on chemistry, manufacturing, and controls (CMC), a statistician needs to be present for any contact with the FDA. (In a few years, that may be true for CMC as well.) If this is not the case, especially if it's consistently not the case throughout the development cycle of the product, the review can be delayed, and time is money. Other great questions were asked about the use of software and the submission of data. We all got an idea of what is required statistically in a medical device submission.

After lunch was a session given by the section on graphics and the International Biometric Society (Western North American Region). Why it wasn't cosponsored by the biopharmaceutical section, I'll never know. The talks were all about using graphs to understand the effects of drugs, and how to use graphs to effectively support a marketing application or medical publication. The underlying message: get out of the 1960s line-printer era, with its illegible statistical tables, and take advantage of the new tools available. Legibility is key in producing a graph, followed by the ability to present a large amount of data in a small area. In some cases, many dimensions can be included on a graph so that the human eye can spot potentially complex relationships among variables. Some companies, notably big pharma, are far ahead in this arena. (I guess they have well-paid talent to work on this kind of stuff.)

These were three excellent sessions, and worth demanding more of my aching feet. Now I'm physically tired and ready to chill with my family for the rest of the week/weekend before doing "normal" work on Monday. But professionally, I'm refreshed.

Tuesday, July 31, 2007

Biostatistics makes the news, and hope for advances

Black hole or black art, biostatistics (and its mother field statistics) is a topic people tend to avoid. I find this unfortunate, because it makes discussions of drug development and surveillance news rather difficult.

Yet these discussions affect millions of lives, from Avandia to echinacea to zinc. So I get tickled pink when a good discussion of statistics comes out in the popular press, as one did recently in BusinessWeek. The article even quoted two biostatisticians who know what they are talking about, Susan Ellenberg and Frank Harrell, Jr. Thanks, BusinessWeek! Talking about the pros and cons of meta-analysis is a difficult task, and that's if you're aiming for an audience of statisticians. To tackle the topic in a popular news magazine is courageous, and I hope it establishes a trend.

On the other hand, I have a few friends who cannot pick up a copy of USA Today without casting a critical eye. It turns out they had a professor who was constantly mining the paper for examples of bad statistical graphics. (I have nothing against USA Today. In fact, I've appreciated their treatment of trans fats.)

In other news, two new books on missing data have been released this year. Little and Rubin have released the second edition of their useful 1987 book. Molenberghs and Kenward have come out with a book designed specifically for missing data in clinical studies. I ended up picking up the latter for its focus, and I attended a pretty good workshop by Geert Molenberghs earlier this year. I'm very glad these books have been released because they're sorely needed. And at the Joint Statistical Meetings this year, there was a very good session on missing data (including a very good presentation by a colleague). I hope this means we can think more intelligently about how to handle missing data in the future because, well, in clinical trials you can count on patients dropping out.

Friday, July 20, 2007

Whistleblower on "statistical reporting system"

Whether you love or hate Peter Rost (and there seems to be very little in between), you can't work in the drug or CRO industry and ignore him. Yesterday, he and Ed Silverman (Pharmalot) broke a story on a director of statistics who blew the whistle on Novartis. Of course, this caught my eye.

While I can't really determine whether Novartis is "at fault" from these two stories (and related echos throughout the pharma blogs), I can tell you about statistical reporting systems, and why I think that these allegations can impact Novartis's bottom line in a major way.

Gone are the days of doing statistics with pencil, paper, and a desk calculator. These days, and especially in commercial work, statistics are all done with a computer. Furthermore, no statistical calculation is done in a vacuum. Especially in a clinical trial, there are thousands of these calculations which must be integrated and presented so that they can be interpreted by a team of scientists and doctors who then decide whether a drug is safe and effective (or, more accurately, whether a drug's benefits outweigh its risks).

A statistical reporting system, briefly, is a collection of standards, procedures, practices, and computer programs (usually SAS macros, but it may involve programs in any language) that standardize the computation and reporting of statistics. Assuming they are well written, these processes and programs are general enough to process the data from any kind of study and produce reports that are consistent across all studies and, hopefully, across all product lines in a company. For example, there may be one program to turn raw data into summary statistics (n, mean, median, standard deviation) and present them in a standardized way in a text table. Since this is a procedure we do many times, we'd like to be able to just "do it" without having to fuss over the details. We feed in the variable name (and perhaps some other details, like the number of decimal places) and, voilà, out comes the table. Not all statistics is that routine (good for me, because that means job security), but perhaps 70-80% is, and that part can be made more efficient. Other programs and standards take care of titles, footnotes, column headers, formatting, tracking, and validation in a standardized and efficient way. This saves a lot of time both in programming and in the review and validation of tables.
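To make the "feed in the variable name and out comes the table" idea concrete, here is a minimal sketch of one such building block. Production systems of this kind are usually SAS macros, as noted above; this is just an R stand-in I wrote for illustration, with invented data and a hypothetical function name, showing the sort of standardized summary a reporting program would emit.

```r
# Hypothetical building block of a statistical reporting system:
# one function that turns a numeric variable into a standardized summary row.
summarize_var <- function(x, label, digits = 1) {
  fmt <- function(v) formatC(v, format = "f", digits = digits)
  data.frame(
    Variable = label,
    n        = sum(!is.na(x)),
    Mean     = fmt(mean(x, na.rm = TRUE)),
    Median   = fmt(median(x, na.rm = TRUE)),
    SD       = fmt(sd(x, na.rm = TRUE)),
    stringsAsFactors = FALSE
  )
}

# Invented example data standing in for a clinical database extract.
set.seed(123)
patients <- data.frame(age = round(rnorm(200, 62, 10)), weight = round(rnorm(200, 78, 14), 1))

summary_table <- rbind(
  summarize_var(patients$age,    "Age (years)"),
  summarize_var(patients$weight, "Weight (kg)")
)
summary_table  # every study gets the same layout, decimal places, and headers
```

The value is not in the few lines of arithmetic but in the standardization: decimal places, headers, and validation are decided once, in one place, instead of being re-decided for every table in every study.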

So far, so good. But what happens when these systems break? As you might expect, you have to pay careful attention to a statistical reporting system, even going so far as to apply a software development life cycle methodology to it. If it breaks, you affect not just one calculation but perhaps thousands. And there is no easy way of knowing the extent: an obscure bug in the code might influence just 10 out of a whole series of studies, whereas a more serious bug might affect everything. If the system is applied to every product in house (and it should probably be general enough to apply to at least one category of products, such as all cancer products), the integrity of the data analysis for a whole series of products is compromised.

Allegations were also made that a contract programmer was told to change dates on adverse events. That could be a benign but bizarre request if the reasons for the change were well documented (it's better to change dates in the database than at the program level, because changes to a database are easier to audit, and hard-coding specific changes to specific dates keeps a program from being generalizable to other similar circumstances), or an ethical nightmare if the changes were made to make the drug's safety profile look better. From Pharmalot's report, the latter was alleged.

You might guess the consequences of systematic errors in data submitted to the FDA. The FDA does have the authority to kick out an application if it has good reason to believe that its data is incorrect. This application has to go through the resubmission process, after it is completely redone. (The FDA will only do this if there are systematic problems.) This erodes the confidence the reviewers have in the application, and probably even all applications submitted by a sponsor who made the errors. This kind of distrust is very costly, resulting in longer review periods, more work to assure the validity of the data, analysis, and interpretation, and, ultimately, lower profits. Much lower.

It doesn't look like the FDA has invoked its Application Integrity Policy on Novartis's Tasigna or any other product. But it has invoked its right to three more months of review time, saying it needs to "review additional data."

So, yes, this is big trouble as of now. Depending on the investigation, it could get bigger. A lot bigger.

Update: Pharmalot has posted a response from Novartis. In it, Novartis reiterates their confidence in the integrity of their data and claims to have proactively shared all data with the FDA (as they should). They also claim that the extension to the review time for the NDA was for the FDA to consider amendments to the submission.

This is a story to watch (and without judgment, for now, since this is currently a matter of "he said, she said"). And, BTW, I think Novartis responded very quickly. (Ed seems to think that 24 hours was too long.)