Friday, December 31, 2010

Dirty data processing in SAS and R

For many data management needs, I have typically relied on SAS. The data vector is perfectly suited to dealing with datasets on a record-by-record basis (which was why it was invented!), and sorting, subsetting, and merging datasets is a relatively easy task. (One of the questions I will ask SAS programmer candidates involves showing them a bit of code using a merge without a by statement. The trick is to see if they catch on.)
In a clean data environment, such as premarketing clinical trials, these operations are usually adequate, though even then it's sometimes hard to identify key variables in a dataset when the database is not properly set up. However, I'm now moving into an environment where data from several sources has to be merged. SAS has some tools to handle this, such as fuzzy matching, hash tables, and regular expressions (supporting Perl regular expressions as of version 9). I find them rather cumbersome to use, though. Consider the following code (from a SUGI paper by Ron Cody):
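 TITLE "Perl Regular Expression Tutorial – Program 1";
 DATA _NULL_;
 *Exact match for the letters 'cat' anywhere in the string;
 IF _N_ = 1 THEN PATTERN_NUM = PRXPARSE("/cat/");
 RETAIN PATTERN_NUM; INPUT STRING $30.;
 POSITION = PRXMATCH(PATTERN_NUM, STRING); PUT STRING= POSITION=;
 DATALINES;
There is a cat in this line.
Does not match CAT
cat in the beginning
At the end, a cat
 ;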

This is just to match using a Perl regular expression. The regular expression must be compiled (using the cumbersome IF _N_=1 device) and then the compiled regular expression referred to in the body of the data step.
I think that SAS also supports the SQL sounds-like operator (=*, as opposed to LIKE, which does pattern matching), which determines if two words sound reasonably alike. However, I don't think it supports Levenshtein distance unless you buy the text analytics package. Because I haven't experimented with the text analytics package, I cannot say whether it is worth the money.
The brute force way to use these tools is to create a large dataset that is keyed by the direct product of the two datasets to merge. The regular expressions or whatever criteria you like can be used to filter out records. Other methods are possible, but I'm only getting started down this road, so I'll have to share more clever methods later.
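As a rough sketch of that brute-force idea in R: merge() with by = NULL builds the direct product, and base R's adist() supplies a Levenshtein-type edit distance to filter on. The two data frames, the column names, and the cutoff of 2 are all made up for illustration.

```r
# Hypothetical site lists from two sources; every name here is invented.
source_a <- data.frame(id_a = 1:3,
                       name_a = c("Duke Medical Center", "UNC Hospitals", "Wake Forest"),
                       stringsAsFactors = FALSE)
source_b <- data.frame(id_b = 1:3,
                       name_b = c("Duke Medcal Center", "UNC Hospital", "Rex Hospital"),
                       stringsAsFactors = FALSE)

# Direct (Cartesian) product: every record in A paired with every record in B.
pairs <- merge(source_a, source_b, by = NULL)

# Generalized Levenshtein distance between the two names in each pair.
pairs$dist <- mapply(function(a, b) adist(a, b)[1, 1],
                     pairs$name_a, pairs$name_b, USE.NAMES = FALSE)

# Keep only the pairs within an arbitrary edit-distance cutoff.
matches <- pairs[pairs$dist <= 2, c("id_a", "id_b", "name_a", "name_b", "dist")]
print(matches)
```

The direct product grows as the product of the two row counts, so this is only practical for small datasets or after blocking on some cleaner variable first.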
In fuzzy matching cases, R is not so bad to use despite the lack of implicit looping that SAS has. Mostly, it's because matching and otherwise using regular expressions is much simpler than the implementation in SAS. (SAS BASE DEVELOPERS! DO YOU HEAR ME!) However, that's not really an improvement above the SAS Base implementation. There is the book Data Mashups in R, which shows some very interesting ideas for data cleaning and merging using web services (specifically, geocoding services), but their one manual merge had a very clean key.
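For comparison, here is a minimal sketch of the same kind of match in R, using the four test strings from the SAS example above. There is no separate compile step; regexpr() takes the pattern directly. One difference to keep in mind is that regexpr() reports -1 when there is no match.

```r
# The four test strings from the SAS example above.
strings <- c("There is a cat in this line.",
             "Does not match CAT",
             "cat in the beginning",
             "At the end, a cat")

# regexpr() returns the position of the first match, or -1 if there is none;
# perl = TRUE asks for Perl-compatible regular expressions.
position <- as.integer(regexpr("cat", strings, perl = TRUE))
print(data.frame(string = strings, position = position))
```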
Over the last few days, I've found a couple of really cool tools that seem to expand the possibilities of data cleaning and merging from different sources (or, so-called "mashups"). One is the Google Refine tool and the other the RecordLinkage package in R. Both hold a lot of promise, and I'll be trying both out in the near future.

Friday, December 24, 2010


A few items:

  • I'm still settling into the new job. I'll be back to writing substantive posts probably early in the new year. The new company doesn't have a strong social media presence, so I will probably advocate for a stronger strategy.
  • Google has recognized the growing number of mobile devices accessing Blogger blogs, so has developed a mobile view. All we bloggers have to do is flip a switch. I flipped the switch, so if you are on your Android, iPhone, or other phone, enjoy. (I think Wordpress blogs have been doing this for a while so it's about time. I had looked into services such as Mobify but it was getting more complicated than I wanted.)
  • I'm off to play Santa Claus, so enjoy whatever holiday (or just time off) that you celebrate.

Monday, December 6, 2010

Open learning page on statistics

I found the open learning page on statistics via Gelman's blog, and it looks to be interesting.

Wednesday, November 24, 2010

New pastures

As of Monday, November 29, I will be moving on to new opportunities. These last few years working in pre-marketing drug development have been exciting and interesting, and now I move on to late phase work. This will be a bit of a challenge, because I have for the last 8 years been focused on how to achieve the level of evidence required by the Food, Drug, and Cosmetic Act (and similar rules in other countries), and late-phase work has been in my blind spot.
Cato Research has been kind enough to encourage the professional aspect of blogging, and has even let me post on the company blog. I intend to continue here, but until I get a feel for how my new company feels about social media I probably won't tie this blog so closely to my professional work for a while.

Sunday, November 7, 2010

Having data is not the same as using data--what a statistician reviews on case report forms

I have seen three different philosophies of statistical review of case report forms:
  • None at all, either because the statistician never sees it or doesn't know what to look for
  • Some review, where the statistician sees the case report form and the developer ignores the comments
  • Full review
Of course, I believe the last case is the best, but I've seen the first two much more often. In those two scenarios, the lack of attention to these details endangers the trial's ability to fulfill protocol objectives. For example, I have seen cases where the sponsor had to forgo some very important analyses because the data were not collected in such a way that they could be analyzed appropriately.
When I review a case report form, I have in mind how the data is going to be displayed in the final analysis, perhaps even the SAS or R code to perform the analysis (assuming the trial is in the therapeutic areas I am most used to). I can do this because I have been involved in a large number of trials now, and I have seen the common threads over a wide range of trials and also within a few therapeutic areas. I know all trials will need an enrollment and disposition table, a background and demography table, efficacy tables, adverse event tables, and laboratory tables, for instance. I also know if the trial requires electrocardiogram measurements (because I've fully reviewed the protocol!). These are tables I've produced over and over, and about 90% of the elements are the same from table to table. Therefore, I can tell if a case report form and the validation specifications of the case report form are capable of producing the required analysis.
The following are some examples of what I review in case report forms:
  1. The ability to calculate complex composite endpoints
    All components need to be present and accessible for any composite endpoints we need to calculate and analyze. For example, disease progression in oncology trials is often complex and difficult to calculate using a SAS program, whether it is based on RECIST or some other working group criteria. To complicate matters, the time to disease progression may be not observed or not observable. For example, if a subject completes the full course on study without progressing, the disease progression is not observed. If the subject discontinues from the study and begins a new treatment, the disease progression may be considered unobservable in some studies, but in other studies the disease progression may be followed up.
  2. Whether endpoints for analysis are captured in a structured way
    With the exception of adverse events and concomitant medications, any data that is going to be summarized or analyzed should be collected in a structured field. To see why, let's look at the exceptions. The medical coding dictionary Medical Dictionary for Regulatory Activities (MedDRA) is used to have the best of both worlds for adverse events: investigators can write the adverse event any which way, and the adverse events can be summarized in a meaningful way. However, it usually requires two subscriptions to the MSSO (one for the clinical research organization performing the coding and one for the sponsor) each at over $10 thousand per year in addition to the labor cost of an MD. Thus, we spend a lot of money and effort being able to analyze free text. (There are other advantages to MedDRA, as well.) For specialized endpoints, it is better to use planning and creativity in collecting the data in a way to make it usable than cut corners on the data collecting.
  3. Whether collection of laboratory and other external data is reconcilable and analyzable
    Sometimes lab data is recorded on the case report form, in which case everything is ok as long as the data is structured. Sometimes, however, data is sent directly from the laboratory to the data management or statistics group, in which case it is preferable to reconcile the collection dates and times on the case report form with the dates and times in the database. The best way to do this is to record requisition or accession numbers on the case report form.
This is not an exhaustive list, but it should give a flavor of the kinds of issues occurring in the case report form that can ruin an analysis.
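To make the third item concrete, here is a small sketch of what reconciliation by accession number might look like in R. The data frames, variable names, and issue labels are hypothetical; a real study would follow its own data management plan.

```r
# Hypothetical CRF lab-collection records and central-lab transfer records.
crf <- data.frame(subject = c("001", "001", "002"),
                  accession = c("A100", "A101", "A200"),
                  crf_date = as.Date(c("2010-10-01", "2010-10-15", "2010-10-02")),
                  stringsAsFactors = FALSE)
lab <- data.frame(accession = c("A100", "A101", "A300"),
                  lab_date = as.Date(c("2010-10-01", "2010-10-16", "2010-10-05")),
                  stringsAsFactors = FALSE)

# Full outer join on the accession number keeps unmatched records from both sides.
recon <- merge(crf, lab, by = "accession", all = TRUE)

# Flag records missing from one source and records whose dates disagree.
recon$issue <- ifelse(is.na(recon$crf_date), "not on CRF",
               ifelse(is.na(recon$lab_date), "not in lab data",
               ifelse(recon$crf_date != recon$lab_date, "date mismatch", "ok")))
print(recon[, c("accession", "subject", "crf_date", "lab_date", "issue")])
```

Without the accession number on the case report form, the join would have to fall back on subject and date, which is exactly where the mismatches live.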


Friday, October 29, 2010

Quiz time: what does the following R code do?

Prize is brownie points.

y <- 7
z <- 1
y.last <- 0
while (abs(y.last-y)>1e-5) {
  y.last <- y
  y.temp <- 0.5*(y+1/z)
  z <- 0.5*(z+1/y)
  y <- y.temp
}

Thursday, October 21, 2010

Belated world statistics day ruminations

Yesterday, my wedding anniversary took precedence over World Statistics Day. So here are my belated thoughts on the subject.

First, I'm very happy that statistical reasoning is getting more airtime in the news. It's about time. While not everyone needs to be a statistician, I think everyone is capable of learning enough about statistics to understand the increasing number of statistical arguments (and lack thereof) in the world around us. For example, the chart in this image was made by my 4 year old son. Certainly, his father is a statistician, but there is no reason why first and second graders can't make similar charts, and start to draw conclusions from them. Later grades can build on this exercise so that a basic understanding of statistics is achievable by the end of high school. The alternative is that too many people (even in scientific disciplines) fall vulnerable to anecdotal or even superstitious arguments. (Case in point: Jenny McCarthy's anti-vaccine campaign.)

I am pleased that SAS is pushing their educational agenda for mathematics and technology at the secondary school level, and Revolution Analytics has made their premium Revolution R product free for all academics. I, like these companies, have been displeased with the state of statistical and technological education in the grade schools and even undergraduate schools. Let's all work together to make this important tool accessible to everybody, as statistical reasoning is set to become an essential part of civil participation.

Friday, October 8, 2010

Trials with bolted on adaptive components

All too often, I get a request to make a trial adaptive. In a lot of cases, adaptations were considered but rejected, but the sample size was too large given considerations such as dropout. Of course, this is a delicate time in sponsor-CRO relations, because emotions are already running high due to the frustration in spending the time considering a lot of alternatives that are already rejected. There is further danger in that the sponsor is, in fact, asking for a fundamental change to a trial that has already been designed.

Adaptive trials are best designed with the adaptation already in mind. This is because the adaptive component affects many aspects of the trial. In addition, the additional planning required for an adaptive trial can be more easily done if it is worked in from the beginning.

In the case where adaptation is used to rescue a trial, it's probably best to take the time to effectively start from the beginning, at least in making sure the time and events table makes sense. Barring that, I will often recommend one futility analysis be performed. The reasons I do this are as follows:

* no adjustment to stated Type 1 error rate is required

* it's relatively easy to "bolt on" to an existing trial

* under the most common circumstances under which this late consideration is done (late Phase 1 or Phase 2 trial) this strategy will prevent wasting too much money on a worthless compound

Of course, not all trials benefit from a futility analysis, but I recommend this strategy almost categorically in cases where a sponsor wants to add one interim analysis to an otherwise designed trial.
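One common way to frame a futility look is conditional power: the chance of a significant final result given the interim data. The sketch below uses the B-value formulation for a one-sided z-test; the function name, the interim numbers, and the 10% stopping threshold are my own illustrative assumptions, not a recommendation for any particular trial. Because stopping for futility can only reduce the chance of rejecting the null hypothesis, it does not inflate the stated Type 1 error rate.

```r
# Conditional power for a one-sided z-test, using the B-value formulation.
# z_interim: interim z-statistic; info_frac: information fraction in (0, 1);
# alpha: one-sided Type 1 error; drift: assumed expected final z-statistic.
# If drift is NULL, the current trend (z_interim / sqrt(info_frac)) is used.
conditional_power <- function(z_interim, info_frac, alpha = 0.025, drift = NULL) {
  stopifnot(info_frac > 0, info_frac < 1)
  b_value <- z_interim * sqrt(info_frac)
  if (is.null(drift)) drift <- b_value / info_frac
  z_crit <- qnorm(1 - alpha)
  1 - pnorm((z_crit - b_value - drift * (1 - info_frac)) / sqrt(1 - info_frac))
}

# Hypothetical interim look at half the planned information.
cp <- conditional_power(z_interim = 0.5, info_frac = 0.5)
stop_for_futility <- cp < 0.10   # the 10% threshold is an arbitrary illustration
print(c(conditional_power = cp, stop_for_futility = stop_for_futility))
```

Passing a drift based on the originally planned effect instead of the current trend gives a more optimistic number, and it is worth prespecifying which one the decision rule uses.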

Posted via Blogaway

Tuesday, September 28, 2010

Future of Clinical Trials conference

Up over at Ask-Cato.


Tuesday, September 21, 2010

Future of clinical trials recap

Clinical trials are complex, so any meeting about the future of trials is going to be complex. Indeed, the Future of Clinical Trials meeting had something from many perspectives, from recruitment to ethics to statistics. Of course, I viewed most of the presentations with an eye for how to apply them to adaptive trials. So, here are the themes of what I heard in the presentations:

  • Relationships are going to be the most important key to the success of any clinical trial. Pharma companies are starting to outsource in such a way that they expect a strategic partner-level participation by the vendor (such as a clinical research organization-CRO), and the CRO had best bring its A-game regarding project management, design and execution of trials.
  • I had not thought about this particular area, but business development is going to play a key role as well. We discussed several aspects, but one that sticks in my mind is structuring contracts in such a way to minimize change orders. I think this will be helpful because change orders take precious time away from the team and make the relationship more difficult to maintain.
  • Regulatory uncertainty drives us to be more efficient, but we are also uncertain about the changes that are required to make us more efficient. We can expect the difficult regulatory environment to get worse before it gets better because of the recent politicization of drug safety.
  • I think a new wave of technologies is going to make designing and running trials more efficient. Improvements are being made to study startup, clinical trial management, patient recruitment, site selection, and ethics approval of protocols. It may take a while, but any company wanting to stay competitive will need to either employ some of these technologies or use something else to make up the lag in efficiency.
This is only a small overview. I think we will be hearing a lot more about these issues in the years to come.

Wednesday, September 15, 2010

Adaptive trials can be hard on the team

Clinical trials are hard enough to do as it is, because many people coming from many different backgrounds and having many different focuses have to coordinate their efforts to make a good quality finished product--a clinical trial with good data that answers the research questions in a persuasive and scientifically valid way. Add to that mix several interim analyses with tight turnaround times (required to make the interim analysis results useful enough to adapt the trial) and you really are putting your sites, clinical, data management, and statistical teams in the pressure cooker. Making stupid mistakes that your teams would not ordinarily make is a real danger (believe me, it is and I have made a few of those myself), and one that can endanger the results of the interim analysis. Here are some ideas to cut down on those stupid mistakes:

  • Overplan during study startup.
  • Get the whole trial execution team, including data management and stats, together around the table in the beginning.
  • Do a dry run of the interim analysis, with everybody around the table. Personally, I think it's worth it to fly people in if they are scattered around the world, but at the very least use the web conferencing technologies.
  • Draw a diagram of data flow for the interim analysis. Use Visio, a white board, note cards and string, or whatever is useful. The process of making this diagram is more important than the diagram itself, but the diagram is important as well. Of course, this process will more than likely change during the course of the study but these diagrams can be updated as well.
  • Fuss over details. Little details can trip up the team when the chips are down. Make the process as idiot-proof as possible. I once had a situation where I screwed up an interim analysis because I forgot to change the randomization directory from a dummy randomization (so blinded programmers could write programs) to the real randomization (so I could produce the reports). After that, I talked with the lead programmer and refined the report production process even further.
  • Plan for turnover. You like members of your team, and some of them will go away during the execution of the trial. New members will come on board. Business continuity planning is very important and is increasingly being scrutinized. Scrutinize it on your trials. Because you've overplanned, done some dry runs, drawn diagrams, and fussed over details, you've written all these down, so the content's readily available to put together in a binder (or pdf). You might even repeat the dry run process with new staff.
  • And, for the statisticians, run clinical trial simulations. A well-done simulation will not only show how the trial performs, but also illuminate the assumptions behind the trial. Then simulations can be performed to show how robust the trial is regarding those assumptions.
Running adaptive trials is hard, but a thoughtful process and a prepared staff will help you realize the potential gains that adaptive trials can bring.
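As a bare-bones illustration of the simulation point, the sketch below estimates power for a simple two-arm trial and then reruns it with a less favorable standard deviation. All design values (64 per arm, effect of 0.5, the two SDs) are hypothetical, and a real adaptive design would simulate the interim decisions as well.

```r
# Estimate power by simulation for a two-arm trial analyzed with a t-test.
simulate_power <- function(n_per_arm, effect, sd, n_sims = 2000, alpha = 0.05) {
  rejections <- replicate(n_sims, {
    control <- rnorm(n_per_arm, mean = 0, sd = sd)
    treated <- rnorm(n_per_arm, mean = effect, sd = sd)
    t.test(treated, control)$p.value < alpha
  })
  mean(rejections)
}

set.seed(2010)
power_planned <- simulate_power(n_per_arm = 64, effect = 0.5, sd = 1.0)
# Robustness check: what if the SD assumption was too optimistic?
power_pessimistic <- simulate_power(n_per_arm = 64, effect = 0.5, sd = 1.3)
print(c(planned = power_planned, pessimistic = power_pessimistic))
```

The second run is the interesting one: it shows how much of the design's performance rests on an assumption nobody can verify before the trial starts.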

Saturday, September 11, 2010

Bayesian dose-ranging trials, ASTIN, and execution of adaptive clinical trials

Bayesian adaptive trials have a lot of potential to cut down sample sizes in the dose-ranging trials and enable better selection of the best dose to take into pivotal trials. The canonical example is the ASTIN trial, published in Clinical Trials in 2005.

The power of the Bayesian adaptive trial as it is used in the ASTIN trial is that data from all subjects is used to find the dose of choice (in the case of ASTIN, the ED95, or the dose that gives 95% of the efficacy beyond the control). This is in contrast to most parallel-group multi-dose trials, where only data from a particular treatment group are used to estimate the treatment effect at that dose, and also different from most dose-effect models such as Emax where the dose-response curve is assumed to have a certain shape. For example, the ASTIN trial was able to detect a non-monotone dose-response curve (and good thing, too!).
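To make the ED95 idea concrete, here is a toy sketch that reads it off a set of estimated effects on a dose grid. It is not the ASTIN methodology (the trial's Bayesian model and adaptive allocation are not reproduced here); the function name and all the numbers are invented for illustration.

```r
# Smallest dose whose estimated effect over control reaches a given fraction
# of the maximum estimated effect on the dose grid.
find_ed <- function(dose, effect_over_control, fraction = 0.95) {
  target <- fraction * max(effect_over_control)
  min(dose[effect_over_control >= target])
}

# Hypothetical fitted dose-response estimates (not ASTIN data); the curve
# turns down at the highest doses, i.e., it is non-monotone.
dose   <- c(0, 10, 20, 40, 60, 80, 100, 120)
effect <- c(0, 0.8, 1.9, 3.1, 3.9, 4.0, 3.6, 3.0)

ed95 <- find_ed(dose, effect)
print(ed95)
```

A model that forces a monotone shape would have to place the maximum at the top dose, which is one reason flexibility in the curve matters.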

What is notable about the ASTIN trial is that the literature is very transparent on the methodology and the operational aspects of the trial. Thus, the whole clinical trial project team can learn important lessons in the running of any adaptive trial, including modern flexible adaptive trials such as ASTIN.

Though a little heavy on the math, I recommend any clinical trial professional check out the literature on the ASTIN trial (ignoring the math if necessary and concentrating on the overall idea), starting with the article linked above.

Thursday, September 9, 2010

Meta-analysis is under the microscope again, this time for drug safety

FDA Asks For “Restraint” On Drug Safety Worries - Matthew Herper - The Medicine Show - Forbes

Meta-analysis is a class of techniques used to combine data from multiple, often disparate, studies on a given topic. Essentially the methodology involves reverse-engineering published literature or data from a website and then statistically combining the results. Of course, as with all statistical analyses, there are several ways of doing a meta-analysis, and within each way there are lots of smaller assumptions that affect the way a meta-analysis should be interpreted. Bias, especially publication bias, is a primary worry.
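To show how little it takes to produce one, here is a sketch of the simplest version, a fixed-effect (inverse-variance) pooling of study-level estimates. The log odds ratios and standard errors are invented, and this is only one of the several ways mentioned above; a random-effects model would make different assumptions and could give a different answer.

```r
# Fixed-effect (inverse-variance) pooling of study-level estimates.
# estimates: per-study effects (e.g., log odds ratios); se: their standard errors.
pool_fixed_effect <- function(estimates, se) {
  weights <- 1 / se^2
  pooled <- sum(weights * estimates) / sum(weights)
  pooled_se <- sqrt(1 / sum(weights))
  c(estimate = pooled,
    se = pooled_se,
    lower = pooled - qnorm(0.975) * pooled_se,
    upper = pooled + qnorm(0.975) * pooled_se)
}

# Hypothetical log odds ratios from four studies (invented numbers).
log_or <- c(0.30, 0.10, 0.45, -0.05)
se     <- c(0.20, 0.15, 0.30, 0.25)
result <- pool_fixed_effect(log_or, se)
print(round(result, 3))
print(round(exp(result[c("estimate", "lower", "upper")]), 3))  # odds-ratio scale
```

Nothing in this calculation knows about unpublished studies or differences in study populations, which is where the interpretive worries come in.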

In the article linked above, FDA reviewers are calling for restraint in the use of this tool, and for good reason. In the drive toward transparency and open data (or at least open results in our industry), coupled with the wide availability of statistical software, anybody can easily create a meta-analysis. The Vioxx and Avandia examples show that a meta-analysis can kick off a process of scrutiny that will eventually cause a drug to be pulled from the market or relegated to a "last resort" status. The ugly downside of this, of course, is that some drugs may be inappropriately targeted and their use inappropriately reduced due to market withdrawal, patient fears, or refusal of reimbursement. The reviewers note that Tiotropium should not follow the path of Vioxx and Avandia despite a negative meta-analysis.

My comment is that they are absolutely right in that the meta-analysis is only one aspect of the whole picture. In the cases of Vioxx and Avandia, further investigations were made into the data, and these further investigations supported the original meta-analysis. It is not automatic, however, that a drug should be targeted for removal or usage reduction in light of a negative meta-analysis; rather, a negative meta-analysis should prompt a more detailed analysis that includes the original approval data and any subsequent post-marketing data.

What does likability have to do with statistics?

Likeability is a very important skill for statisticians. While the best among us are recognized for our skill, being likeable will entice our clients to listen to us more closely, and with active listening skills we will be able to better understand our clients' problems. This is Tukey's saying "Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise" in action.

With so many people now needing statistical services, we statisticians need to be good listeners, good communicators, and likeable. So I heartily recommend Bruna Martinuzzi's Likeability: It's an Inside Job.

Statistics and Statisticians in Clinical Trials – Beginning with the End in Mind

Up over at Ask Cato.

Sunday, September 5, 2010

Great things coming up

In just a couple of weeks, I'll be giving my talk at the Future of Clinical Trials conference. For the next few weeks, I'll be posting material here and at Ask Cato about the best ways to negotiate with the FDA, design, and execute adaptive clinical trials so they can reach their potential.

Wednesday, August 18, 2010

A "What Now?!" moment in Alzheimer's research

Derek Lowe gives the rundown on Lilly's failure in the clinic with a gamma secretase inhibitor. This is big news because it casts serious doubt on the beta amyloid plaque hypothesis of Alzheimer's disease. Given that beta amyloid plaque has been a major target for drug development in Alzheimer's for several decades, and there are not many other promising leads, we are pretty much wondering what's next. The most promising news to come out of Alzheimer's research is a major collaboration effort among big pharma which we really do not see in industry too often. You can get a further overview of the graveyard of drugs in this area in Lowe's archives (he used to do research in Alzheimer's disease).

My main comment here is that this is an example of how expensive assuming or hoping that correlation=causation can be, and not just in terms of money (although no one can ignore that billions have been spent chasing this hypothesis). The gamma secretase inhibitor did what it was supposed to do, which was block one of the enzymes that produce beta amyloid plaque, and the Phase 2 results seemed pretty good on that measure (which was an achievement, as most previous efforts ran afoul of other biological pathways or had other problems). However, the clinical outcome (cognitive decline) was worse in the treatment group than in placebo, which suggests either that beta amyloid is a byproduct of the Alzheimer's process, or even that beta amyloid is the body's response to another process. Who knows?
What now?! Back to the drawing board.

Friday, August 13, 2010

National Academies of Science book on prevention and analysis of missing data in clinical trials

The National Academies of Science is letting people display the prepublication book on the prevention and analysis of missing data on our sites. So here you go. Enjoy!

Wednesday, August 11, 2010

The curse of missing data, and potential regulatory action

Missing data is taken for granted now in clinical trials. This issue colors protocol design, case report form (CRF) development, monitoring, and statistical analysis. Statistical analysis plans (SAPs) must have a section covering missing data, or they are incomplete.

Of course, the handling of missing data is a hard question. Though Little and Rubin came out with the first thorough treatise in 1987 (link is to second edition), the methods to deal with missing data are hard enough that only statisticians understand them (and not very well at that). Of particular interest is the case when missing data depends on what the value would have been had it been observed, even after conditioning on values you have observed (this is called "Missing not at random" [MNAR] or "nonignorably missing data"). Methods for dealing with MNAR data are notoriously difficult and depend on unverifiable assumptions, so historically we biostatisticians have relied on simple, but misleading, methods such as complete case analysis, last observation carried forward, or conditional mean imputation (i.e., replace with some adjusted mean or regression prediction).

The FDA has typically balked at these poor methods, but in the last few years has started to focus on the issue. They empaneled a group of statisticians a few years ago to research the issue and make recommendations, and the panel has now issued its report (link when I can find it). This report will likely find its way into a guidance, which will help sponsors deal more intelligently with this issue.

For now, the report carries few specific recommendations for methods and strategies for use, but the following principles apply:

  • everything should be prespecified and then executed according to plan
  • distinction should be made between dropouts and randomly missed visits
  • single imputations such as LOCF should be avoided in favor of methods that adjust the standard error correctly for the missing data
  • any missing data analysis should include a sensitivity analysis, where alternate methods are used in the analysis to make sure that the missing data are not driving the result (this still leaves open a huge can of worms, and it is hoped that further research will help here).

It's time to start thinking harder about this issue, and stop using last observation carried forward blindly. Pretty soon, those days will be over for good.
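For a sense of what "adjust the standard error correctly" means in practice, here is a sketch of Rubin's rules for combining results from several imputed datasets. The five estimates and variances are made-up numbers; the point is only that the pooled standard error includes a between-imputation term that any single imputation leaves out.

```r
# Rubin's rules for combining m multiply imputed analyses.
# estimates: the m point estimates; variances: their m squared standard errors.
pool_rubin <- function(estimates, variances) {
  m <- length(estimates)
  pooled  <- mean(estimates)
  within  <- mean(variances)           # average within-imputation variance
  between <- var(estimates)            # between-imputation variance
  total   <- within + (1 + 1 / m) * between
  c(estimate = pooled, se = sqrt(total), naive_se = sqrt(within))
}

# Hypothetical treatment-effect estimates from five imputed datasets.
est <- c(2.1, 2.6, 1.8, 2.4, 2.2)
v   <- c(0.25, 0.27, 0.24, 0.26, 0.25)
pooled_result <- pool_rubin(est, v)
print(round(pooled_result, 3))
```

The gap between se and naive_se is the uncertainty due to the missing data, which is exactly what LOCF and other single imputations pretend is zero.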

From my JSM 2010 notes on the topic

Posted via email from Randomjohn's posterous

Friday, August 6, 2010

Joint statistical meetings 2010 - reflections for biostatistics

Ok, so I am going to leave R, SAS, big data, and so forth aside for a bit (mostly) and focus on trends in biostatistics.

Adaptive trials (group sequential trials, sample size re-estimation, O'Brien-Fleming designs, triangular boundary trials) have a fairly mature literature, at least as far as the classical group sequential boundaries go. However, these designs leave a lot to be desired as they do not take advantage of full information at interim analyses, especially partial information on the disease trajectory from enrolled subjects who have not completed follow up. On the upside, they are easy to communicate to regulators, and software is available to design them, whether you use R, SAS, or EaST. The main challenge is finding project teams who are experienced in implementing adaptive trials, as not all data managers understand what is required for interim analyses, not all clinical teams are aware of their important roles, not all sites understand, and not all drug supply teams are aware of what they need to do.
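As a small taste of the classical machinery, the sketch below evaluates the Lan-DeMets O'Brien-Fleming-type spending function, which says how much of the Type 1 error has been spent by each interim look. The look times are hypothetical. Turning the spent alpha into actual stopping boundaries takes numerical integration, which is what the dedicated group sequential software does; that step is not shown here.

```r
# Lan-DeMets O'Brien-Fleming-type spending function for a two-sided test:
# cumulative Type 1 error spent by information fraction t.
obf_spending <- function(t, alpha = 0.05) {
  stopifnot(all(t > 0), all(t <= 1))
  2 * (1 - pnorm(qnorm(1 - alpha / 2) / sqrt(t)))
}

info_frac <- c(0.25, 0.50, 0.75, 1.00)   # hypothetical timing of the looks
cumulative_alpha <- obf_spending(info_frac)
print(data.frame(info_frac = info_frac,
                 cumulative_alpha = signif(cumulative_alpha, 3),
                 spent_at_look = signif(diff(c(0, cumulative_alpha)), 3)))
```

Almost nothing is spent at the early looks, which is why these designs are conservative early and cost very little at the final analysis.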

Bayesian methods have a lot of promise both in the incorporation of partial information in making adaptation decisions and in making drug supply decisions. I think it will take a few years for developers to find this out, but I'm happy to evangelize. With the new FDA draft guidance on adaptive trials, I think more people are going to be bold and use adaptive trials. The danger, of course, is that they have to be done well to be successful, and I'm afraid that more people are going to use them because they are the in thing and they promise to save money, without a good strategy in place to actually realize those savings.

Patient segmentation (essentially, the analysis of subgroups from a population point of view) seems to be an emerging topic. This is because personalized medicine, which is a logical conclusion of segmentation, is a long way off (despite the hype). We have the methods to do segmentation today (perhaps with a little more development of methodology), and many of the promises of personalized medicine can be realized with an effective segmentation strategy. For example, if we can identify characteristics of subgroups who can benefit more from one class of drugs, that will be valuable information for physicians when they decide first line treatment after diagnosis.

Missing data has always been a thorn in the side, and the methodology has finally developed enough to where the FDA believes they can start drafting a guidance. A few years ago they empaneled a committee to study the problem of missing data and provide input into a draft guidance on the matter. The committee has put together a report (will link when I find the report), which is hot off the press and thought-provoking. Like the adaptive design and noninferiority guidances, the guidance will probably leave it up to the sponsor to justify the missing data method but there are a few strong suggestions:

  • don't use single imputation methods, as they underestimate the standard error of the treatment effect
  • specify one method as primary, but do other methods as sensitivity analyses. Show that the result of the trial is robust to different methods.
  • Distinguish between dropouts and "randomly" missing data such as missed visits.
  • Try not to have missing data, and institute followup procedures that decrease missing data rates. For dropouts, try to follow up anyway.
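The first suggestion is easy to see in a toy example. The sketch below simulates an outcome, deletes 30 of 100 values completely at random, fills them in with the observed mean, and compares the naive standard error with the complete-case one. All numbers are simulated and hypothetical; mean imputation stands in here for single imputation in general.

```r
set.seed(2010)
n <- 100
y <- rnorm(n, mean = 10, sd = 2)          # hypothetical complete outcome
missing <- sample(n, size = 30)            # 30 values missing completely at random
y_obs <- y
y_obs[missing] <- NA

# Standard error of the mean from the observed cases only.
observed <- y_obs[!is.na(y_obs)]
se_complete_case <- sd(observed) / sqrt(length(observed))

# Single (mean) imputation, then a naive standard error that treats
# the filled-in values as if they were real observations.
y_imputed <- ifelse(is.na(y_obs), mean(observed), y_obs)
se_single_imputation <- sd(y_imputed) / sqrt(n)

print(c(complete_case = se_complete_case, single_imputation = se_single_imputation))
```

The imputed dataset has the same mean but a smaller standard error, even though no information was added. That is false precision, and it flows straight into the treatment-effect test.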

The use of graphics in clinical trial reporting has increased, and that's a good thing. A new site is being created to show the different kinds of graphs that can be used in clinical reports, and is designed to increase the usage of graphs. One FDA reviewer noted that graphics that are well done can decrease review times, and, in response to a burning question, noted that the FDA will accept graphs that are created in R.

Finally, I will mention one field of study that we do not apply in our field. Yes, it's a fairly complex method but I believe the concept can be explained, and even if it is used in an exploratory manner it can yield a lot of great insights. It's functional data analysis, which is the study of curves as data rather than just points. My thought is that we can study whole disease trajectories (i.e., the changes in disease over time) rather than just endpoints. Using the functional data analysis methods, we can start to characterize patients as, for example, "quick responders," "slow responders," "high morbidity," and so forth depending on what their disease trajectory looks like. Then we can make comparisons in these disease trajectories between treatment groups. Come to think of it, it might be useful in patient segment analysis.
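A crude flavor of the idea, using only base R: smooth each subject's visits into a curve and then summarize the curve rather than a single endpoint. This is a stand-in for real functional data analysis methods, not an implementation of them; the two simulated subjects, the response scale, and the "time to reach 0.5" summary are all invented for illustration.

```r
set.seed(2010)
visit_week <- 0:10
# Two hypothetical subjects whose response approaches 1 at different rates.
trajectories <- list(
  quick = 1 - exp(-1.0 * visit_week) + rnorm(length(visit_week), sd = 0.02),
  slow  = 1 - exp(-0.2 * visit_week) + rnorm(length(visit_week), sd = 0.02)
)

# Treat each subject's data as a curve: smooth it, then summarize the curve
# by the first time the smoothed response reaches 0.5.
grid <- seq(0, 10, by = 0.1)
time_to_half <- sapply(trajectories, function(y) {
  fit <- smooth.spline(visit_week, y)
  smoothed <- predict(fit, grid)$y
  grid[which(smoothed >= 0.5)[1]]
})
print(time_to_half)
```

Both subjects end up at a similar response by week 10, so an endpoint analysis would see little difference between them, while the curve summary separates the quick responder from the slow one.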

At any rate I think biostatisticians are going to feel the pinch in the coming years as drug developers rely more heavily on us to reduce costs and development time, yet keep the scientific integrity of a drug development program intact. I am confident we will step up to the challenge, but I think we need to be more courageous in applying our full knowledge to the problems and be assertive in those cases where inappropriate methods will lead to delays, higher costs, and/or ethical issues.

Posted via email from Randomjohn's posterous

Joint statistical meetings 2010 - second half

The second half of JSM was just as eventful as the first half. Jim Goodnight addressed the new practical problems requiring analytics. Perhaps telling, though, is his almost begrudging admission that R is big. The reality is that SAS seems to think they are going to have to work with R in the future. There is already some integration in SAS/IML studio, and I think that is going to get tighter.

The evening brought a couple of reunions and business meetings, including the UNC reunion (where it sounds like my alma mater had a pretty good year in terms of faculty and student awards and contributions) and the statistical computing and graphics sections, where I met some of my fellow tweeters.

On Tuesday, I went a little out of my normal route and attended a session on functional data analysis. This is one area where I think we biostatisticians could use more ideas. Ramsay (who helped create and advance the field) discussed software needs for the field (with a few interesting critiques of R), and two others talked about two interesting applications to biostatistics, including studying cell apoptosis and a brain imaging study of lead exposure. On Wednesday afternoon, we discussed patient population segmentation and tailored therapeutics, which is I guess an intermediate step between marketing a drug to everybody and personalized medicine. I think everybody agreed that personalized medicine is the direction we are going, but we are going to take a long time to get there. Patient segmentation is happening today. Tuesday night brought Revolution Analytics's big announcement about their commercial big data package for R, where you can analyze 100 million row datasets in less than a minute on a relatively cheap laptop. I saw a demo of the system, and they even tried to respect many of the conventions in R, including the use of generic functions. Thanks to them for the beeR, as well. Later in the evening came more business meetings. I ended up volunteering for some work for next year, and I begin next week.

On Wednesday, I attended talks on missing data, vaccine trials and practical issues in implementing adaptive trials. By then, I was conferenced out, having attended probably 10 sessions over 4 days, for a total of 20 hours absorbing ideas. And that didn't include the business part.

I will present some reflections on the conference, including issues that will either emerge or continue to be important in statistical analysis of clinical trials.


Tuesday, August 3, 2010

Joint statistical meetings 2010 - first half

The first part of the Joint statistical meetings for 2010 has come and gone, and so here are a few random thoughts on the conference. Keep in mind that I have a bias toward biostatistics, so your mileage may vary.

Vancouver is a beautiful city with great weather. I enjoy watching the sea planes take off and land, and the mountainous backdrop of the city is just gorgeous. Technology has been heavily embraced at least where the conference is located, and the diversity of the people served by the city is astounding. The convention center is certainly the swankiest I've ever been in.

The quality of the conference is certainly a lot higher than previous conferences I've been to, or perhaps I'm just better about selecting what to attend.

  • The ASA's session on strategic partnerships in academia, industry, and government (SPAIG) was not well-enough attended. I think these partnerships are going to be essential to the best way to conduct scientific research and the data analysis coming out of and going into that research. Presentations included reflections on the ASA's strategic plan from a past president, and efforts for the future coming from the incoming president-elect Bob Rodriguez. I wish everybody could have heard it.
  • The 4pm session on adaptive designs was very good. This important area (for which I enthusiastically evangelize to my company and clients) advances, and it is good to see some of the latest updates.
  • Another session I attended had a Matrix theme, in which we were encouraged to break out of a mind prison by reaching out to those in other disciplines and making our work more accessible. The session was packed, and it did not disappoint. It may seem like an obvious point, but it does not seem to be emphasized in education or even on the job.
  • Another session focused on segmenting patient populations for tailoring therapeutics. A lot of good work is going on in this area. We are not able to do personalized medicine yet despite the hype, but tailored therapeutics (i.e. tailoring for a subgroup) is an intermediate step that is happening today.
  • At the business meeting on statistical computing and graphics I met some of my fellow tweeters. I am very pleased to make their acquaintance.

There are other themes too. R is still huge, and SAS is getting huger. This all came together in Jim Goodnight's talk on the importance of analytics and how the education system needs to support it. His tone seemed to exhibit a begrudging acceptance of R. (I'll get into philosophizing about SAS and R another time.) Revolution Analytics is addressing some of the serious issues with R, including high performance computing and big data, and this will certainly be something to follow.

Hopefully the second half will be as good as the first half.


Wednesday, July 28, 2010

Using ODBC and R to analyze Lotus Notes databases (including email)

For several reasons, I want to analyze data that comes out of Lotus Notes. One kind of such data, of course, is email. So here's how I did it. This requires MS Windows, but it may be possible to do this on Linux and Mac, because IBM supports those platforms as well. Also, I'm sure that other solutions exist for other email platforms, but I won't go into that here.
  1. Download NotesSQL, which is an ODBC (open database connectivity) driver for Lotus Notes. In a nutshell, ODBC allows most kinds of databases, such as Oracle, MySQL, or even Microsoft Access and Excel, to be connected with software, such as R or SAS, that can analyze data from those databases.
  2. The setup for ODBC on Windows is a little tricky, but worth it. Install NotesSQL, then add the following to your PATH (instructions here):
    1. c:\Program Files\lotus\notes
    2. c:\NotesSQL
  3. Follow the instructions here to set up the ODBC connection. There is also a set of instructions here. Essentially, you will run an application installed by the NotesSQL to set up the permissions to access the Lotus databases, and then use Microsoft's ODBC tool to set up a Data Source Name (DSN) to your Lotus mail file. Usually, your mail file will be named something like userid.nsf. In what follows, I have assumed that the DSN is "lotus" but you can use any name in the control panel.
  4. Start up R, and install/load the RODBC package. Set up a connection to the Lotus database:
       library(RODBC)
       ch <- odbcConnect("lotus")
  5. You may have to use sqlTables to find the right name of the table, but I found the database view _Mail_Threads_, so I used that. Consult the RODBC documentation for how to use the commands.
       foo <- sqlFetch(ch, "_Mail_Threads_")
  6. Here's where the real fun begins. foo is now a data frame with the sender, the date/time, and the subject line of your emails (INBOX and filed). So have some fun.
       # find out how many times each sender has ever sent you email, and plot it
       bar <- table(foo[,1])
       # sort in descending order
       bar <- bar[rev(order(bar))]
       barplot(bar, names.arg="")
Say, is that a power law distribution?
Oh, don't forget to clean up after yourself by closing the connection, for example with odbcClose(ch).
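
One rough way to eyeball the power-law question is to plot the counts against their rank on log-log axes; a roughly straight line is suggestive, though it is not proof. Here is a minimal sketch. It uses simulated sender counts as a stand-in for the real table bar, and the Zipf-like simulation and the names sender_counts, fit, and slope are my assumptions for illustration only.

```r
set.seed(42)

# Simulated stand-in for table(foo[,1]): 200 hypothetical senders whose
# chance of sending an email falls off as 1/rank (a Zipf-like assumption).
n_senders <- 200
probs <- (1 / seq_len(n_senders)) / sum(1 / seq_len(n_senders))
senders <- sample(seq_len(n_senders), size = 5000, replace = TRUE, prob = probs)
sender_counts <- sort(table(senders), decreasing = TRUE)

# Rank-frequency data
rank <- seq_along(sender_counts)
freq <- as.numeric(sender_counts)

# Log-log plot with a least-squares line; a straight-line pattern is
# consistent with (but does not establish) a power law.
plot(log(rank), log(freq), xlab = "log(rank)", ylab = "log(number of emails)")
fit <- lm(log(freq) ~ log(rank))
abline(fit)
slope <- unname(coef(fit)[2])
```

With real data, sender_counts would simply be replaced by bar from the steps above. A least-squares line on a log-log plot is a crude diagnostic, so treat the slope as descriptive only.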

More on the petabyte milestone

One area I think can break the petabyte milestone soon if not today, is genomics research. Again, you have a relatively automated system as far as data collection and storage is concerned.


Tuesday, July 27, 2010

In which I speculate about breaking through the petabyte milestone in clinical research

Spotfire (who owns a web data analysis package and now also S-Plus), recently posted on petabyte databases, and I started wondering if petabyte databases would come to clinical research. The examples they provided–Google, Large Hadron Collider, and World of Warcraft/Avatar–are nearly self-contained data production and analysis systems in the sense that nearly the whole part of the data collection and storage process is automated. This property allows the production of a large amount of high-quality data, and our technology has gotten to a point where petabyte databases are possible.

By contrast, clinical research has a lot of inherently manual processes. Even with electronic data capture, which has usually improved the collection of clinical data, the process still has enough manual parts to make the database fairly small. Right now, individual studies have clinical databases on the order of tens of megabytes, with the occasional gigabyte database if a lot of laboratory data is collected (which does have a bit more automation to it at least on the data accumulation and storage end). Large companies having a lot of products might have tens of terabytes of storage, but data analysis only occurs on a few gigabytes at a time at the most. At the FDA, this kind of scaling is more extreme as they have to analyze data from a lot of different companies on a lot of different products. I don't know how much storage they have, but I can imagine that they would have to have petabytes of storage, but on the single product scale the individual analyses focus on a few gigabytes at a time.

I don't think we will hit petabyte databases in clinical research until electronic medical records are the primary data collection source. And before that happens, I think the systems that are in place will have to be standardized, interoperable, and simply of higher quality than they are now. By then, we will look at the trails that Google and LHC have blazed.



Friday, July 23, 2010

Information allergy

Frank Harrell gave a talk at this year's useR! conference on "information allergy." (I did not attend the conference, but it looks like I should have.) Information allergy, according to the abstract, is defined as a two-part problem, which exhibits a willful refusal to do what it takes to make good decisions:
  1. refusing to obtain key information to make a sound decision
  2. ignoring important available information
One of the major areas pointed out is the refusal to acknowledge "gray areas," which forces one into a false binary choice. I have observed this on many occasions, which is why I usually recommend the analysis of a continuous endpoint in conjunction with a binary endpoint.
At any rate, I look forward to reading the rest of the talk.
Update: the slides from a previous incarnation of the talk can be found here.

Sunday, June 27, 2010

Reproducible research - further options

Mostly just a list of possible reproducible research options as a follow up to a previous entry. I still don't like these quite as much as R/Sweave, but they might do in a variety of situations.

  • Inference for R - connects R with Microsoft Office 2003 or later. I evaluated this a couple of years ago and I think there's a lot to like about it. It is very Weave-like, with a slight disadvantage that it really prefers the data to be coupled tightly with the report. However, I think it is just as easy to decouple these without using Inference's data features, which is advantageous when you want to regenerate the report when data is updated. Another disadvantage is that I didn't see a way to easily redo a report quickly, as you can with Sweave/LaTeX by creating a batch or shell script file (perhaps this is possible with Inference). Advantages - you can also connect to Excel and Powerpoint. If you absolutely require Office 2003 or later, Inference for R is worth a look. It is, however, not free.
  • R2wd (link is to a very nice introduction), which is a nice package a bit like R2HTML, except it writes to a Word file. (Sciviews has something similar, I think.) This is unlike many of the other options I've written about, because everything must be generated from R code. It is also a bit rough around the edges (for example, you cannot just write wdBody(summary(lm(y~x,data=foo)))). I think some of the dependent packages, such as Statcomm, also allow connections to Excel and other applications, if that is needed.
  • There are similar solutions that allow connection to Openoffice or Google Documents, some of which can be found in the comments section of the previous link.

The solutions that connect R with Word are very useful for businesses that rely on the Office platform. The solutions that connect to Openoffice are useful for those who rely on the Openoffice platform, or need to exchange documents with those who rely on Microsoft Office but do not want to purchase it. However, for reproducible research in the way I'm describing, these solutions are not ideal, because they allow the display version to be edited easily, which would make it difficult to update the report if there is new data. Perhaps if there were a solution to make the document "comment-only" (i.e. no one could edit the document but could only add comments) this would be a workable solution. So far, it's possible to manually set a protection flag to allow redlining but not source editing of a Word file, but my Windows skills are not quite sufficient to have that happen from, for example, a batch file or other script.

Exchanging with Google Docs is a different beast. Google Docs allows easy collaboration without having to send emails with attachments. I think that this idea will catch on, and once IT personnel are satisfied with security this idea (whether it's Google's system, Microsoft's attempt at catching up, or someone else's) will become the primary way of editing small documents that require heavy collaboration. Again, I'm not clear if it's possible to share a Google document while putting it into a comment-only mode, which I think would be required for a reproducible research context to work, but I think this technology will be very useful.


The effect of protocol amendments on statistical inference of clinical trials

Lu, Chow, and Zhang recently released an article [1] detailing some statistical adjustments they claim need to be made when a clinical trial protocol is amended. While I have not investigated their method (they seem to revert to my first choice when there is no obvious or straightforward algorithm – the maximum likelihood method), I do appreciate the fact that they have even considered this issue at all. I have been thinking for a while that the way we tinker with clinical trials during their execution (all for good reasons, mind you) ought to be reflected in the analysis. For example, if a sponsor is unhappy with enrollment they will often alter the inclusion/exclusion criteria to speed enrollment. This, as Lu et al. point out, tends to increase the variance of the treatment effect (and possibly affect the means as well). But rather than assess that impact directly, we end up analyzing a mixture of populations.
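
To see why analyzing a mixture matters, here is a small simulation sketch. It is not the method of Lu, Chow, and Zhang; it only illustrates the variance inflation that comes from pooling two populations with different means. All of the numbers (means, standard deviation, sample sizes) are assumptions I made up for illustration.

```r
set.seed(1)

# Hypothetical responses enrolled before and after an amendment that
# broadens the inclusion criteria (all numbers are assumptions).
n_before <- 5000
n_after  <- 5000
before <- rnorm(n_before, mean = 10, sd = 4)
after  <- rnorm(n_after,  mean = 7,  sd = 4)
pooled <- c(before, after)

# Average within-population variance versus the variance of the pooled data
var_within <- (var(before) + var(after)) / 2
var_pooled <- var(pooled)

# For an equal mixture, theory gives:
# within-variance + (difference in means / 2)^2
var_theory <- 4^2 + ((10 - 7) / 2)^2

round(c(within = var_within, pooled = var_pooled, theory = var_theory), 2)
```

The pooled variance exceeds the within-population variance by roughly the squared half-difference in means, which is the extra noise we take on when the amendment shifts the population and the analysis ignores it.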

This and related papers seem to be rather heavy on the math, but I will be reviewing these ideas more closely over the coming weeks.


Sunday, June 13, 2010

Drug development from a kindergartener's point of view: sharing is good

Derek Lowe notes this effort by several drug makers to share data from failed clinical trials in Alzheimer's disease. The reason we do not have very good treatments for Alzheimer's is that it's a very tough nut to crack, and we're not even sure the conventional wisdom about the mechanisms (the amyloid plaque theory, for instance) is correct. The hope is that in sharing data from a whole string of failed clinical trials, someone will be able to find something that can move a cure–or at least an effective treatment–forward.

It should be appreciated that participating in this initiative is not easy. Desire to protect the privacy of research participants is embedded deeply within the clinical trial process, and if any of the sensitive personal-level data is to be made public, it has to be anonymized (and documented).

The data is also very expensive to collect, and the desire to protect it vigorously as a trade secret is very strong.

I think this effort is notable in light of the drive toward open data discussed by Tim Berners-Lee in his recent TED talk. This effort seems to be the first of several in difficult diseases such as Parkinson's. Stay tuned, because this will be something to watch closely.


Sunday, June 6, 2010

How to waste millions of dollars with clinical trials: MS drug trial 'a fiasco' – and NHS paid for it - Health News, Health & Families - The Independent

The most expensive publicly funded drug trial in history is condemned today as a "fiasco" which has wasted hundreds of millions of NHS cash and raised fresh concerns about the influence of the pharmaceutical industry.
The scheme involved four drugs for multiple sclerosis launched in the 1990s which were hailed as the first treatment to delay progression of the disabling neurological condition that affects 80,000 people in the UK.
It was set up in 2002 after the National Institute for Clinical Excellence (Nice) unexpectedly ruled that the drugs were not cost effective and should not be used on the NHS. To head off opposition from patient groups and the pharmaceutical industry, the Department of Health established the largest NHS "patient access scheme", to provide patients with the drugs, costing an average £8,000 a year, on the understanding that if they turned out to be less effective than expected, the drug companies would reduce the price.
The first report on the outcome was due after two years but was not published until last December, seven years later. It showed that the drugs failed to delay the onset of disability in patients – defined as walking with a stick or using a wheelchair – and may even have hastened it. On that basis, the drug companies would have had to pay the NHS to use them to make them cost effective.
Despite this finding, the price was not reduced and the scientific advisory group monitoring the scheme advised that "further follow up and analyses" were required. It said that disability may yet improve, the disease may have become more aggressive and the measure of disability used may have underestimated benefit. There were 5,583 patients in the scheme at a cost to the NHS of around £50m a year, amounting to £350m over seven years to 2009. The Multiple Sclerosis Society said twice as many patients were using the drugs outside the trial. That implies a total NHS cost of £700m for a treatment that does not work.
In a series of articles in today's British Medical Journal, experts criticise the scheme. James Raftery, professor of health technology assessment at the University of Southampton and an adviser to Nice, said the scientific advisory group included representatives from the four drug companies, two MS groups, and the neurologists treating patients, all of whom had lobbied for the continued use of the drugs on the NHS.
"The independence of this group is questionable," he said. "Monitoring and evaluation of outcomes must be independent. Transparency is essential, involving annual reports, access to data, and rights to publish. Any of these might have helped avoid the current fiasco."
Professor Christopher McCabe, head of health economics at the University of Leeds, writing with colleagues in the BMJ, said: "None of the reasons for delaying the price review withstand critical assessment." Professor McCabe told The Independent: "We should be asking questions about paying for these drugs. In terms of disability avoidance, the evidence is not there."
Alastair Compston, professor of neurology at the University of Cambridge, defended the scheme. He said that despite a disappointing outcome, the scheme had "advanced the situation for people with multiple sclerosis" by improving understanding and care of the disease. Neil Scolding, professor of neurosciences at the University of Bristol, said the proportion of British patients treated with drugs (10-15 per cent) was tiny compared to France and Germany (40-50 per cent). He said the scheme had also led to the appointment of 250 multiple sclerosis nurses.
"[Though] expensive and flawed, if it turns out to have been no better than a clever wooden horse, then the army of MS healthcare specialists it delivered may make it more than worthwhile," he wrote. The MS Society claimed success for the scheme up to 2007 but after publication of the results last December, withdrew its support.
MS: why the drugs don't work
Multiple sclerosis is a chronic disease. It may take 40 years to run its course. In developing drugs to slow its progression, doctors have used brain scans to show lesions which the drugs appeared to prevent, and gave quicker results. Some experts thought the lesions were the disease but little effort was made to check. But preventing lesion formation does not prevent disability caused by the condition. The drugs deal with the lesions, not the disease.
Jeremy Laurance

Shouldn't those scientific questions about disease progression have been dealt with before the trial began?

Friday, June 4, 2010

Healthcare and the drive toward open data

If you haven't seen the TED talk by Tim Berners-Lee, you should. Follow the link right now if you haven't. It will only take 6 minutes, and I'll be waiting right here when you're done.

So now you've seen the talk, right? Good. We can take the following as axioms in today's technology and culture:
  • There is a stronger push toward greater transparency, and resisting that push is futile.
  • With technology, people will create their own open data. When they create their own open data, they will create their own analyses. And when they create their own analyses, they will create their own conclusions.
To see the first point, I direct you to the excellent Eye on FDA, a blog run by Mark Senak, a communications professional who is very familiar with the workings of the FDA and pharmaceutical industry. Further evidence is the opening of data by the government, as evidenced by the creation of the site in the US and corresponding sites in other countries. Even municipalities such as San Francisco are joining the movement. And, of course, the efforts of Berners-Lee are further evidence.

The second point is perhaps a little harder to see. However, realize that there are discussion forums where doctors discuss adverse events of drugs, without any intervention or oversight from the companies that create these products. This also gets discussed, in an informal way, on the website, the wild, wild west of pharma rep discussion boards. With text analysis and web search tools, it is possible to analyze this data, too, much like the Google Flu trend tool.

The traditionally tight-lipped pharmaceutical and biotech industries, who rely on data for their livelihoods, have to adjust, far beyond what is presented in the prescribing information labels. So far, this movement hasn't been kind to pharma companies, as evidenced by the Avandia meta-analysis that essentially brought the drug down and created a multibillion dollar nightmare for GlaxoSmithKline. That meta-analysis, by the way, was based on data that GSK had published through its efforts on being transparent with clinical trial data. At first, it sounded like Steve Nissen (the primary author of the meta-analysis) was simply trying to bring the company down. However, as more details emerged, it turned out that GSK was aware of the results of the meta-analysis before publication. GSK was caught with its pants down, in essence, not sure as a company how to function in this new environment of data.

Attacking an analysis on the basis of methodological questions doesn't seem to work, at least so far. For example, it did not help to point out that the analysis method Nissen used in the Avandia meta-analysis wasn't quite correct, as it ignored studies where no cardiovascular events occurred. Nor did it help to point out that the oft-cited 43% increase in risk was a bit misleading, because the absolute cardiovascular event risk with Avandia was very small, and small absolute risks can produce large relative risks.
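
The relative-versus-absolute point is easy to check with a little arithmetic. The sketch below uses a made-up baseline risk; these are not the Avandia data, and the variable names are mine. Only the 43% relative increase comes from the discussion above.

```r
# Hypothetical numbers for illustration only; these are NOT the Avandia data.
baseline_risk <- 0.007   # assumed event risk in the comparison group (0.7%)
relative_risk <- 1.43    # a 43% relative increase

treated_risk <- baseline_risk * relative_risk
risk_diff    <- treated_risk - baseline_risk
nnh          <- 1 / risk_diff   # number needed to harm

# Risks expressed as percentages
round(100 * c(baseline = baseline_risk,
              treated = treated_risk,
              difference = risk_diff), 2)
round(nnh)
```

Under these assumed numbers, a 43% relative increase amounts to about 0.3 percentage points in absolute terms, or roughly one extra event for every 330 or so patients treated. The headline number and the absolute number tell very different stories.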

Here, GSK was a leader in opening its data, and it got burned. However, this kind of openness will have to continue, as there is too much public good in having open data. One way or another, we will have to adjust to living in an open-data world.

Update: Bonus: Ask-Cato notes this issue from a different perspective

Monday, May 24, 2010

Reproducible research in the drug development industry

1 What is reproducible research?

Reproducible research, in a nutshell, is the process of publishing
research in such a way that a person can pick up the materials and
reproduce the research exactly. This is an ideal in
science. Essentially, all data, programming code, and interpretation
is presented in such a way that it is easy to see what was done, how
it was done, and why.

A report written in reproducible research style is written in such a
way that any result that comes from analyzing data is written in some
programming language inside the report. The written report is then
processed by software that will interpret the programming code and
replace it with both the code and the output from the code. The reader
of the report then sees exactly what code is executed to produce the
results, and the results that are shown in the report are guaranteed
to be from the code that is shown. This is different from, for
example, writing the code in the document and running it separately to
generate results which are copied and pasted back into the report. In
essence, the report and the analysis are done together, at the same
time, as a unit. A demo of how this works using the LaTeX, Sweave,
and R packages can be found here, and another example using R and
LaTeX, but not Sweave, can be found at Frank Harrell's rreport page.
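
To make the idea concrete, here is a minimal sketch of the weave step,
assuming only a standard R installation. It writes a tiny Sweave source
file (LaTeX text with an embedded R chunk) to a temporary location and
runs Sweave on it. The file contents and the names rnw_file and tex_file
are made up for illustration; a real report would of course live in its
own file rather than in a character vector.

```r
# Write a tiny Sweave (.Rnw) source file: LaTeX text with an embedded R chunk.
rnw_file <- tempfile(fileext = ".Rnw")
tex_file <- sub("\\.Rnw$", ".tex", rnw_file)
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<>>=",
  "x <- c(1, 2, 3, 4)",
  "mean(x)",
  "@",
  "The mean is \\Sexpr{mean(x)}.",
  "\\end{document}"
), rnw_file)

# Weave: run the R code and write a .tex file that contains both the code
# and its output, with \Sexpr{} replaced by the computed value.
utils::Sweave(rnw_file, output = tex_file, quiet = TRUE)
tex <- readLines(tex_file)
```

The resulting .tex file can then be run through LaTeX. Because the
number in the sentence is computed when the report is woven, it cannot
drift out of sync with the code that produced it.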

Further information can be found at some of the links below (and the
links from those pages).

When should you write your statistical analysis plans?

Up over at Ask Cato.

Monday, May 10, 2010

Friday, May 7, 2010

Perverse incentives in clinical trials

Decisions whether to progress to Phase 3 are not always based on the efficacy and safety of a drug or the feedback on the drug from Phase 2. Derek Lowe notes that this is because reaching Phase 3 is a milestone for smaller companies, and their valuations depend on how advanced their products are.

A common mistake I see is that a Phase 2 study will fail soundly, but the sponsor still runs a Phase 3 trial without further tests. To "justify" these studies, a case will often be made that a secondary endpoint is "trending" and, if enough subjects are enrolled, it will be significant. Of course, most such Phase 3 trials will fail. Unfortunately, this same strategy carries forth into the investment arena with press releases chock full of meaningless p-values. P-values that savvy investors are double-checking.

We have statistical techniques available or under development that will aid in Phase 3 go/no-go decision-making without requiring statistical significance or baseless guesswork. We can compute a probability of Phase 3 success, with an estimate of uncertainty of that probability, using techniques we have now. I guess too many small companies want to go to Phase 3 at any cost rather than make a cool-headed decision. Maybe investors should start requesting a failure plan for drugs under development in which they invest to reduce this perverse incentive.
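
As a rough illustration of what such a calculation can look like, here is a minimal sketch of one common approach (sometimes called "assurance"): average the Phase 3 power over the plausible true effects implied by the Phase 2 result. Every input below is hypothetical, and the normal approximation with a flat prior is my simplifying assumption, not a recommendation for any particular program.

```r
set.seed(123)

# All inputs are hypothetical Phase 2 results, for illustration only.
est_p2       <- 0.30    # estimated treatment effect (standardized units)
se_p2        <- 0.15    # standard error of that estimate
n_per_arm_p3 <- 200     # planned Phase 3 subjects per arm
sigma        <- 1       # assumed outcome standard deviation
alpha        <- 0.025   # one-sided significance level

# Draw plausible true effects given the Phase 2 data (normal approximation,
# flat prior), then compute the Phase 3 power under each draw.
true_effect <- rnorm(10000, mean = est_p2, sd = se_p2)
se_p3 <- sigma * sqrt(2 / n_per_arm_p3)
power <- pnorm(true_effect / se_p3 - qnorm(1 - alpha))

prob_success <- mean(power)                    # probability of Phase 3 success
power_range  <- quantile(power, c(0.1, 0.9))   # spread across plausible effects

round(prob_success, 2)
round(power_range, 2)
```

The point of the spread is that a single "power" figure computed at the Phase 2 point estimate hides how much the chance of success depends on an effect we have only estimated noisily.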

Tuesday, April 20, 2010

How to waste millions of dollars with clinical trials, Part II: Rexahn

Lots of people have been calling BS on Rexahn's press release about its Phase 2a data on Serdaxin and subsequent "additional statements." Read the articles below; they offer good background and analysis.

Here, I think Rexahn really did it to themselves, but I also think the environment around drug development is partly to blame. Everybody loves the p-value less than 0.05 [to Adam Feuerstein's credit in the first reference below, he blows off the "not statistically significant" issue], and companies are ready to comply by cherry-picking the best-looking p-value to present. (I know: I've been asked to do this cherry picking.)

Why shouldn't we care about statistical significance? Simply because it's a Phase 2a study. This is the place where we don't know anything about a drug's efficacy, and we're starting to learn. We don't know how to power a study to reach statistical significance, simply because we don't know the best endpoint, the best dosing schedule, the best dose, or really anything except what is observed in preclinical. And we know that the drug development graveyard is littered with drugs that did well in preclinical and bombed in the clinic. So how can we expect to know how many subjects to enroll to show efficacy? We could also use some nouveau Bayesian adaptive design (and I could probably design one for most circumstances), but tweaking more than two or three things in the context of one study is a recipe for disaster.

Here's what I would prescribe (while ignoring the needs of press release consumers):
1. Forget about statistical significance. Whatever calculations are made for power analysis or number of subjects are usually a joke, anyway. The real reason for a sample size usually has to do with budget, and the desire to collect at least the minimum amount of information needed to design a decent Phase 2 trial. If there is a stated power calculation, it has usually been reverse engineered. Instead, it is sometimes possible to do sample size calculations based on size of confidence intervals (to reflect the certainty of an estimate) of different endpoints (see #2).
2. Forget about a primary endpoint. There is no need. Instead, use several clinically relevant instruments, and pick one after the study that is the best from a combination of resolution (i.e. ability to showcase treatment effect) and clinical relevance.
3. Set some criteria for "success," i.e. decision criteria for further development of the drug, that do not include statistical significance. This might include an apparent dose effect (say, 80% confidence interval around a parameter in a dose-response design that shows positive dose-related effect), tolerability at a reasonably high dose, or, if you implement a Bayesian design (adaptive or otherwise), a given probability (say 80%) of a successful adequate and well-controlled trial of reasonable size with clinically relevant criteria. What these criteria are of course needs to be carefully considered -- they have to be reasonable enough to be believable to investors and scientifically sound enough to warrant further treatment of patients with your unproven drug, and yet not so ridiculously high that your press release is almost definitely going to kill your ability to woo investors.
4. Be transparent and forthcoming in your press release, because enough data is out there and there are enough savvy investors and bloggers who can call BS on shenanigans.
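
For the confidence-interval approach mentioned in #1, a minimal sketch might look like the following. It assumes a two-arm comparison of means, a normal approximation, and a common standard deviation that is treated as known; the function name n_for_ci_halfwidth and the inputs are hypothetical.

```r
# Subjects per arm so that the two-sided confidence interval for a difference
# in means has (approximately) the requested half-width.
# Normal approximation with a common, assumed-known standard deviation.
n_for_ci_halfwidth <- function(sigma, halfwidth, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level) / 2)
  ceiling(2 * (z * sigma / halfwidth)^2)
}

# Hypothetical inputs: SD of 10 points, estimate wanted to within +/- 5 points.
n_95 <- n_for_ci_halfwidth(sigma = 10, halfwidth = 5)
n_80 <- n_for_ci_halfwidth(sigma = 10, halfwidth = 5, conf_level = 0.80)

c(n_95 = n_95, n_80 = n_80)
```

The question being answered is "how precisely do we need to know this effect to plan the next study?" rather than "how many subjects do we need to cross p < 0.05?", which seems to me a more honest framing for an early-phase trial.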

Also see:

Friday, April 16, 2010

R - not the epic fail we thought

I usually like AnnMaria's witty insight. I can relate to a lot of what she is saying. After all SAS and family life are large parts of my life, too. But you can imagine the reaction she provoked in saying the following:

I know that R is free and I am actually a Unix fan and think Open Source software is a great idea. However, for me personally and for most users, both individual and organizational, the much greater cost of software is the time it takes to install it, maintain it, learn it and document it. On that, R is an epic fail. 
With the exception of the last sentence, I am in full agreement. Especially in the industry I work in, qualification and documentation are hugely important, and a strength of SAS is a gigantic support department that has worked through these issues. Having some maverick install and use R, as I do, simply does not work for the more formal work that we do. (I use R for consulting and other things that do not necessarily fulfill a predicate rule.)

However, another company, REvolution Computing, has recognized this need as well. With R catching on at the FDA, larger companies in our industry have taken a huge interest in R, partly because of the flexibility in statistical calculations (and, frankly, the S language beats SAS/IML hands down for the fancier methods), and mostly to save millions of dollars in license fees. Compare REvolution's few hundred dollars a year for install and operation qualification on an infinite-seat license to SAS's license model, and it's not hard to see how.

And SAS, to its credit, has made it easier to interact with R.

Epic win for everybody.

Friday, April 2, 2010

FDA guidance on adaptive trials

The FDA guidance on adaptive trials (group sequential designs, sample size re-estimation, and so forth) can be found here. I've only skimmed through it but it looks fairly informative. So far so good. Hopefully this will be a strong document that will lead to an increase in these kinds of studies.

I'll discuss more after a more thorough look.

Adventures in graduate school

I was recently reflecting that, basic classes aside, I use information from mainly three graduate classes. Two of them were special topics classes, and one was a class that had finally evolved from a special topics class.

In one of the special topics classes, we were given a choice of two topics: a survey of the field of Gaussian processes, which would have been useful but was not so interesting to the professor, and local time (i.e. the amount of time a continuous process spends in the neighborhood of a point), which was much more specialized (and for which I did not meet the prerequisites) and much more interesting to the professor. I chose local time because I figured that if the professor was excited about it, I would be excited enough to learn what I needed to understand the class. As a result, I have a much deeper understanding of time series and stochastic processes in general.

The second special topics class seemed to have a very specialized focus, pattern recognition. It covered the abstract Vapnik-Chervonenkis theory in detail, and we discussed rates of convergence, exponential inequalities on probabilities, and other hard-core theory. I could have easily forgotten that class, but the professor was excited about it, and because of it I am having a much easier time understanding data mining methods than I would have otherwise.

The third class, though it was not labeled a special topics class, was a statistical computing class where the professor shared his new research in addition to the basics. There I learned a lot about scatterplot smoothing, Fourier analysis, local polynomial and other nonparametric regression methods that I still use very often.

In each of these cases, I decided to forgo a basic or survey class for a special topics class. Because of the professor's enthusiasm toward the subject in each case, I was willing to go the extra mile and learn whatever prerequisite information I needed to understand the class. In each case as well, that willingness to go the extra mile and fill in the gaps has carried over more than a decade later: I have kept up my interest and am always looking to apply these methods to new cases, when appropriate.

I am currently taking a bootstrapping course and am happy to say that I am experiencing the same thing. (In fact, I was introduced to the bootstrap in the computing class mentioned above, but we never got beyond the basics due to time.) We are getting the basics and current research, and I'm already able to apply it to problems I have now.

Monday, March 15, 2010

Odds are, you need to read this

With my recent attacks on p-values and many common statistical practices, it's good to know at least someone agrees with me.

Odds are, it's wrong, ScienceNews

(via American Statistical Association Facebook page)

Saturday, March 13, 2010

Observation about recent regulatory backlash against group sequential trials

Maybe it's just me, but I'm noticing an increased backlash against group sequential trials from regulatory authorities in the last couple of years. The argument against these trials seems to be twofold:

  1. Group sequential trials that stop early for efficacy tend to overstate the evidence for efficacy. While true, this can be corrected easily, and should be. Standard texts on group sequential trials and the accompanying software make the application of this correction easy.
  2. Trials that stop early tend to have too little evidence for safety.
The effect of this seems to be that the groups that need to use group sequential designs the most, the small companies who have to get funding for every dollar they spend on clinical trials, are being scared away from them, especially in Phase 3.

The second point about safety is a major one, and one where the industry would do better to keep up with the methodology. Safety analysis is usually descriptive because hypothesis testing doesn't work so well there: a Type I error (claiming a safety problem where there is none) is not as serious a problem as a Type II error (claiming no safety problem where there is one). Because safety issues can take many different forms (does the drug hurt the liver? heart? kidneys?), there is a massive multiple testing problem, and the efforts to control the Type I error that we are used to are no longer conservative. There is the general notion that more evidence is better (and, to an extent, I would agree), but I think it is better to solve the hard problem and attempt to characterize how much evidence we have of the safety of a drug. We have started to do this with adverse events; for example, Berry and Berry have implemented a Bayesian analysis that I allude to in a previous blog post. Other efforts include using false discovery rates and other Bayesian models.
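As an illustration of the false discovery rate idea, here is a minimal sketch of the Benjamini-Hochberg step-up procedure applied to a list of per-event p-values. This is the generic procedure, not the Berry and Berry Bayesian model, and the function name, the FDR level, and the p-values are made up for the example.

```python
def benjamini_hochberg(p_values, q=0.10):
    """Return one boolean per p-value: True where the adverse event
    is flagged while controlling the false discovery rate at level q."""
    m = len(p_values)
    # Indices of the p-values, smallest p-value first
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k (1-based) with p_(k) <= k * q / m
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Flag every hypothesis ranked at or below the cutoff
    flagged = [False] * m
    for idx in order[:cutoff_rank]:
        flagged[idx] = True
    return flagged


# Hypothetical p-values for four adverse event comparisons
print(benjamini_hochberg([0.01, 0.03, 0.035, 0.5], q=0.05))
```

A Bonferroni correction at the same level would require p <= 0.0125 for every comparison, so it would flag fewer events here; that loss of sensitivity is exactly the concern when a missed safety signal is the costlier error.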

We are left with another difficult problem: how much of a safety issue are we willing to tolerate for the efficacy of a drug? Of course, it would be lovely if we could make a pill that cured our diseases and left everything else alone, but it's not going to happen. The fact of the matter is that during the review cycle regulatory agencies have to make the determination of whether safety risk is worth the efficacy, and I think it would be better to have that discussion up front. This kind of hard discussion before the submission of the application will help inform the design of clinical trials in Phase 3 and reduce the uncertainty in Phase 3 and the application and review process. Then we can talk with a better understanding about the role of sequential designs in Phase 3.

Saturday, March 6, 2010

When a t-test hides what a sign test exposes

John Cook recently posted a case where a statistical test "confirmed" a hypothesis but failed to confirm a more general hypothesis. Along those same lines, I had a recent case where I was comparing a cheap and a more expensive way of determining the potency of an assay. If they were equivalent, the sponsor could get by with the cheaper method. A t-test was not powerful enough to show a difference, but I noticed one method showed consistently lower potency than the other. I did a sign test (compared the number with lower results against the expected number based on a binomial) and got a significant result. I could not recommend the cheaper method.

Lessons learned:

- a t-test is not necessarily more powerful than a sign test
- a t-test can "throw away" information
- dichotomizing data is often good: it gains one type of information (qualitative) at the cost of quantitative information
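The sign test described above can be sketched in a few lines. This is a minimal two-sided exact version that assumes paired measurements with ties already dropped; the counts in the example are hypothetical, not the data from the assay comparison.

```python
from math import comb


def sign_test_p_value(n_lower, n_higher):
    """Two-sided exact sign test.

    n_lower / n_higher: number of pairs where method A read lower / higher
    than method B. Ties are assumed to have been dropped beforehand.
    """
    n = n_lower + n_higher
    k = min(n_lower, n_higher)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical: the cheaper method read lower in 9 of 10 paired samples
print(sign_test_p_value(9, 1))  # 0.021484375
```

The test looks only at the direction of each paired difference, so a small but consistent bias shows up even when the variability of the differences swamps a t-test.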

Friday, March 5, 2010

Another strike against p-values

Though I'm asked to produce the darn things every day, I have grown to detest p-values, mostly for the way that people want to engineer and overinterpret them. The fact that they do not follow the likelihood principle serves as an additional impetus to shove them overboard while no one is looking. Now, John Cook has brought up another reason: p-values are inconsistent (in the sense that they do not provide evidence for a set of hypotheses in the way that you would expect--I suspect if they were statistically inconsistent in the sense that no unbiased test could exist they would have been abandoned a while back).

Sunday, February 28, 2010

How to waste millions of dollars on clinical trials

When I started working for my current company (a clinical research organization), I was presented with a project that had been discontinued due to lack of continued funding. The sponsor simply wanted to stop enrolling and get any conclusions from the study they could find. We presented some basic summary statistics, and then they presented me with an ethical dilemma: they found a couple of numbers that "looked good" and wanted me to generate a p-value that they could put into a press release so that they could get more funding. In statistics this is known as data dredging or p-value hunting. P-values generated this way are known to be extremely biased in favor of the drug being studied simply because the brain is very good at picking out patterns. Recently, however, I read that this company had a failed Phase 3 trial, which cost millions of dollars.

In a previous life, I worked with someone whose large Phase 2 trial failed on its primary endpoint. However, a secondary endpoint looked very good. They commissioned a Phase 3 study with thousands of patients to study and hopefully confirm the new endpoint. However, that study ended up failing as well, and I believe development of the drug was discontinued.

In my opinion, those failed studies could have been avoided. A Phase 2 study need not reach statistical significance (and certainly should not be designed so that it has to), but results in Phase 2 should be robust enough and strong enough to inspire confidence going into Phase 3. For example, the estimated treatment effect should be clinically relevant, even if confidence intervals are wide enough to extend to 0. Related secondary endpoints should show similar trends. Relevant subgroups should show similar effects, and different clinical sites should show a solid trend.

I personally would prefer Bayesian methods, which can quantify the concepts I just listed and can even give a probability of success in a Phase 3 trial (with given enrollment) based on the treatment effect and variation present in the Phase 2 trial. However, these methods aren't necessary to apply the concepts above.
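A minimal sketch of such a probability-of-success calculation follows. It assumes a normally distributed endpoint with known standard deviation, a flat prior (so the posterior for the true effect is centered on the Phase 2 estimate), and a one-sided test in Phase 3; the function name and the numbers are illustrative only, and a real analysis would need a more carefully chosen prior and model.

```python
from math import sqrt
from statistics import NormalDist


def probability_of_success(delta_hat, se_phase2, sd, n_per_arm, alpha=0.025):
    """Predictive probability that a two-arm Phase 3 trial is significant.

    delta_hat, se_phase2: Phase 2 effect estimate and its standard error.
    sd, n_per_arm: assumed SD and planned enrollment per arm in Phase 3.
    """
    std_normal = NormalDist()
    se_phase3 = sd * sqrt(2 / n_per_arm)
    z_crit = std_normal.inv_cdf(1 - alpha)
    # The Phase 3 estimate carries both the uncertainty about the true
    # effect (from Phase 2) and its own sampling error
    predictive_sd = sqrt(se_phase2 ** 2 + se_phase3 ** 2)
    # Success means the Phase 3 estimate exceeds z_crit * se_phase3
    return std_normal.cdf((delta_hat - z_crit * se_phase3) / predictive_sd)


# Hypothetical Phase 2 result: effect 0.3 (SE 0.2), SD 1, 200 per arm in Phase 3
print(round(probability_of_success(0.3, 0.2, 1.0, 200), 3))
```

Unlike a conventional power calculation, which treats the Phase 2 estimate as the truth, this averages over the uncertainty in that estimate, so a noisy Phase 2 result pulls the probability toward 50%.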

In both of the cases I listed above, the causes were extremely worthy, and products that are able to accomplish what the sponsors wanted would have been useful additions to medical practice. However, these products are probably now on the shelves, millions of dollars too late. The end of Phase 2 can be a very difficult soul-searching time, especially when a Phase 2 trial gives equivocal or negative results. It's better to shelve the compound or even run further small proof-of-concept studies than waste such large sums of money on failed large trials.

Sunday, February 7, 2010

Barnard’s exact test -- a test that ought to be used more

Barnard’s exact test – a powerful alternative for Fisher’s exact test (implemented in R) | R-statistics blog describes the use of Barnard's test, which I think is preferable to Fisher's exact test.

Barnard's exact test has one further advantage over Fisher's exact: Fisher's exact requires two fixed margins (e.g. the number of subjects in a treatment group and the number of subjects with a given adverse event), whereas in most places where it is used only one of the margins is fixed (i.e. the number in a treatment group but not the number with a given adverse event).
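To show what fixing only one margin means in practice, here is a rough sketch of an unconditional exact test in the spirit of Barnard's. It assumes the pooled z statistic as the ordering of tables and a simple grid search over the nuisance event probability; the R implementation linked above may use a different statistic or search, so treat this as an illustration rather than a reference implementation.

```python
from math import comb, sqrt


def _pooled_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic (0 when the pooled rate is 0 or 1)."""
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0.0, 1.0):
        return 0.0
    return (x1 / n1 - x2 / n2) / sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))


def barnard_exact_p_value(x1, n1, x2, n2, grid_points=200):
    """Two-sided unconditional exact p-value; only group sizes n1, n2 are fixed."""
    observed = abs(_pooled_z(x1, n1, x2, n2))
    # Every possible table that is at least as extreme as the observed one
    extreme = [(i, j) for i in range(n1 + 1) for j in range(n2 + 1)
               if abs(_pooled_z(i, n1, j, n2)) >= observed - 1e-12]
    worst = 0.0
    # Maximize over the unknown common event probability (nuisance parameter)
    for step in range(1, grid_points):
        pi = step / grid_points
        total = sum(comb(n1, i) * pi ** i * (1 - pi) ** (n1 - i)
                    * comb(n2, j) * pi ** j * (1 - pi) ** (n2 - j)
                    for i, j in extreme)
        worst = max(worst, total)
    return worst


# 3 of 3 events in one group versus 0 of 3 in the other
print(barnard_exact_p_value(3, 3, 0, 3))  # 0.03125
```

For this small table the two-sided Fisher's exact p-value is 0.1, because conditioning on both margins leaves very few possible tables; the unconditional test sums over all tables with the given group sizes instead.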

The downside is that not too many software packages implement it. Specifically, SAS doesn't seem to implement it, so it doesn't get much use in the pharmaceutical industry. Having an implementation in R is a good start, so maybe more people will explore it and it will see more use.

Tuesday, February 2, 2010

Bradford Cross's "100 proof" project

While it sounds like the way to make the perfect whiskey, the 100 proof project is actually a way to remember the fundamentals of mathematics and how to write a proof. Bradford Cross is taking up the project just for this reason, and we can all benefit.

Every once in a while I pull out the functional analysis book just to brush up on things like the spectral theorem (a ghost of grad school past), but that doesn't have the impact that this project will have.

I found the project via John Cook's AnalysisFact Twitter feed.