Ok, so I am going to leave R, SAS, big data, and so forth aside for a bit (mostly) and focus on trends in biostatistics.
Adaptive trials (group sequential trials, sample size re-estimation, O'Brien-Fleming designs, triangular boundary designs) rest on a fairly mature literature, at least as far as the classical group sequential boundaries go. However, they leave a lot to be desired: they do not take advantage of the full information available at interim analyses, especially partial information on the disease trajectory from enrolled subjects who have not completed follow-up. On the upside, they are easy to communicate to regulators, and software is available to design them, whether you use R, SAS, or East. The main challenge is finding project teams experienced in implementing adaptive trials: not all data managers understand what is required for interim analyses, not all clinical teams are aware of their important roles, not all sites understand theirs, and not all drug supply teams know what they need to do.
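To make the classical machinery a bit more concrete, here is a minimal sketch (in Python; the 0.05 two-sided level and four equally spaced looks are my assumptions for illustration) of the Lan-DeMets O'Brien-Fleming-type alpha-spending function. It shows the hallmark of these designs: almost no type I error is spent at early interim looks, so early stopping requires overwhelming evidence.

```python
from statistics import NormalDist

N = NormalDist()  # standard normal

def obf_spending(t, alpha=0.05):
    """Cumulative type I error spent at information fraction t (0 < t <= 1),
    using the Lan-DeMets O'Brien-Fleming-type spending function:
    alpha*(t) = 2 * (1 - Phi(z_{alpha/2} / sqrt(t)))."""
    z = N.inv_cdf(1 - alpha / 2)
    return 2.0 * (1.0 - N.cdf(z / t ** 0.5))

# four equally spaced looks (an assumption for the example)
fractions = [0.25, 0.50, 0.75, 1.00]
cumulative = [obf_spending(t) for t in fractions]
# alpha spent at each individual look is the increment between looks
increments = [cumulative[0]] + [
    cumulative[k] - cumulative[k - 1] for k in range(1, len(cumulative))
]
```

At the first look (a quarter of the information), the design spends well under a thousandth of the 0.05 budget; nearly all of it is saved for the final analysis.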
Bayesian methods hold a lot of promise, both for incorporating partial information into adaptation decisions and for making drug supply decisions. I think it will take a few years for developers to figure this out, but I'm happy to evangelize. With the new FDA draft guidance on adaptive trials, I think more people are going to be bold and use adaptive designs. The danger, of course, is that adaptive trials have to be done well to be successful, and I'm afraid that many will adopt them because they are the in thing and promise to save money, without a good strategy in place to actually realize those savings.
Patient segmentation (essentially, the analysis of subgroups from a population point of view) seems to be an emerging topic, partly because personalized medicine, the logical conclusion of segmentation, is still a long way off (despite the hype). We have the methods to do segmentation today (perhaps with a little more development of methodology), and many of the promises of personalized medicine can be realized with an effective segmentation strategy. For example, if we can identify characteristics of subgroups who benefit more from one class of drugs than another, that will be valuable information for physicians deciding on first-line treatment after diagnosis.
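As a toy illustration (entirely hypothetical data, in Python), segmentation at its simplest just contrasts the estimated treatment effect across subgroups defined by a baseline characteristic — here an invented binary biomarker:

```python
# hypothetical trial records: (subgroup, arm, responded?)
rows = [
    ("marker+", "drug", 1), ("marker+", "drug", 1),
    ("marker+", "placebo", 0), ("marker+", "placebo", 1),
    ("marker-", "drug", 0), ("marker-", "drug", 1),
    ("marker-", "placebo", 1), ("marker-", "placebo", 0),
]

def response_rate(subgroup, arm):
    """Proportion of responders in one subgroup-by-arm cell."""
    outcomes = [r for s, a, r in rows if s == subgroup and a == arm]
    return sum(outcomes) / len(outcomes)

# drug-minus-placebo effect within each subgroup
effect = {s: response_rate(s, "drug") - response_rate(s, "placebo")
          for s in ("marker+", "marker-")}
```

In a real analysis this contrast would come from an interaction term in a model, with proper uncertainty; the point is only that the question — does the effect differ by segment — is already answerable with today's tools.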
Missing data has always been a thorn in our side, and the methodology has finally developed to the point where the FDA believes it can start drafting a guidance. A few years ago the agency empaneled a committee to study the problem of missing data and provide input into a draft guidance on the matter. The committee has put together a report (will link when I find the report), which is hot off the press and thought-provoking. Like the adaptive design and noninferiority guidances, this one will probably leave it to the sponsor to justify the missing data method, but there are a few strong suggestions:
- Don't use single imputation methods (such as last observation carried forward), as they underestimate the standard error of the treatment effect.
- Specify one method as primary, but run other methods as sensitivity analyses, and show that the result of the trial is robust to the choice of method.
- Distinguish between dropouts and "randomly" missing data such as missed visits.
- Try not to have missing data in the first place: institute follow-up procedures that decrease missing data rates, and for dropouts, try to follow up anyway.
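The first suggestion is easiest to see through Rubin's rules for multiple imputation: the pooled variance adds a between-imputation component on top of the average within-imputation variance, and that extra component is exactly what single imputation throws away. A minimal sketch in Python (the m = 5 estimates and variances are made up for illustration):

```python
from statistics import mean, variance

def rubin_pool(estimates, variances):
    """Combine m complete-data analyses via Rubin's rules.
    Returns the pooled estimate and total variance W + (1 + 1/m) * B."""
    m = len(estimates)
    q_bar = mean(estimates)       # pooled point estimate
    w = mean(variances)           # within-imputation variance
    b = variance(estimates)       # between-imputation variance
    total = w + (1 + 1 / m) * b   # always >= w: extra uncertainty from imputing
    return q_bar, total

# hypothetical treatment-effect estimates from 5 imputed datasets
est, total_var = rubin_pool([1.2, 1.5, 1.1, 1.4, 1.3],
                            [0.04, 0.04, 0.04, 0.04, 0.04])
```

A single imputation analysis would report a variance of about 0.04 here; the pooled total variance is larger, which is precisely the honest widening of the confidence interval the committee is asking for.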
The use of graphics in clinical trial reporting has increased, and that's a good thing. A new site is being created to showcase the different kinds of graphs that can be used in clinical reports, with the aim of increasing their use. One FDA reviewer noted that well-done graphics can decrease review times and, in response to a burning question, that the FDA will accept graphs created in R.
Finally, I will mention one field of study that we do not yet apply in our field: functional data analysis, the study of curves, rather than just points, as data. Yes, it's a fairly complex method, but I believe the concept can be explained, and even used in an exploratory manner it can yield a lot of great insights. My thought is that we can study whole disease trajectories (i.e., the changes in disease over time) rather than just endpoints. Using functional data analysis methods, we can start to characterize patients as, for example, "quick responders," "slow responders," "high morbidity," and so forth, depending on what their disease trajectories look like. Then we can compare these disease trajectories between treatment groups. Come to think of it, this might also be useful in patient segmentation analysis.
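A crude, hypothetical sketch of the idea in Python: summarize each patient's trajectory by a fitted feature (here just a least-squares slope of a disease score over time, with an arbitrary cutoff of -1.5 per week) and label patients by how fast their score falls. Real functional data analysis would use basis expansions or functional principal components rather than a single slope, but the flavor is the same:

```python
def slope(times, values):
    """Least-squares slope of one patient's trajectory."""
    n = len(times)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    num = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    den = sum((t - t_bar) ** 2 for t in times)
    return num / den

weeks = [0, 1, 2, 3]
# hypothetical weekly disease scores (lower = better)
patients = {
    "A": [10, 6, 3, 1],  # improves fast
    "B": [10, 9, 8, 7],  # improves slowly
}

# classify by rate of improvement; -1.5/week is an arbitrary illustrative cutoff
labels = {pid: ("quick responder" if slope(weeks, scores) < -1.5
                else "slow responder")
          for pid, scores in patients.items()}
```

Once every patient carries a trajectory-derived label (or a full fitted curve), comparing treatment groups on those labels or curves is exactly the kind of analysis I have in mind.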
At any rate, I think biostatisticians are going to feel the pinch in the coming years as drug developers rely more heavily on us to reduce costs and development time while keeping the scientific integrity of a drug development program intact. I am confident we will step up to the challenge, but I think we need to be more courageous in applying our full knowledge to these problems and more assertive in the cases where inappropriate methods will lead to delays, higher costs, and/or ethical issues.