Wednesday, September 19, 2012

Data cleaning is harder than statistical analysis

Statistical analysis is relatively hard, but it is a piece of cake compared to data collection, cleaning, and manipulation. In fact, in clinical trials research, we spend millions of dollars to develop and advance the capability to manage data effectively. Just about any clinical research organization worth the price has a strong data management department that it has spent a lot of time cultivating.

It’s time to take this a step further. In my workplace, the statistics group (consisting of statisticians and statistical programmers) and the data management group are very closely integrated. In the latest issue of its newsletter, the Society for Clinical Data Management has included an article on optimal collaboration between statisticians and data managers. (I would go one step further and include the medical writer.) This collaboration takes a lot of time – time I could be spending doing statistical analysis. However, if the statistical analysis involves working around fewer data issues, it’s all worth it.