Wednesday, March 20, 2013

Review of Caltech's Learning from Data e-course

Caltech offers an online course, Learning from Data, taught by Professor Yaser Abu-Mostafa, that seeks to make the course material accessible to everybody. Unlike most of the online courses I've taken, this one is offered independently, through a platform created just for the class. I took the course during its second offering, January to March 2013.

The platform on which the course is offered isn't as slick as Coursera's. The lectures are delivered through a YouTube playlist, and the homework is graded through multiple-choice questions. That's perhaps a weakness of the class, but somehow the course faculty made it work.

The class's content was its strong point. Abu-Mostafa wove theory and pragmatic concerns together throughout the class, and invited students to write code in just about any language (I, of course, chose R) to explore the theoretical ideas in a practical setting. Between this class and Andrew Ng's Machine Learning class on the Coursera platform, a student will have a very strong foundation for applying these techniques in a real-world setting.
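To give a flavor of those exercises, here is a minimal R sketch in the spirit of the course's early homework on the perceptron learning algorithm: generate a random linear target function, label random points with it, run the perceptron until it separates the training set, and estimate the out-of-sample error on fresh data. The particulars (point counts, seed, and so on) are my own choices for illustration, not the course's specification.

# Perceptron learning algorithm on synthetic 2D data.
set.seed(1)

# Random target function: a line through two random points in [-1, 1]^2.
p1 <- runif(2, -1, 1)
p2 <- runif(2, -1, 1)
target <- function(x) sign((p2[1] - p1[1]) * (x[2] - p1[2]) -
                           (p2[2] - p1[2]) * (x[1] - p1[1]))

# Training set: N random points labeled by the target; first column is the bias.
N <- 100
X <- cbind(1, matrix(runif(2 * N, -1, 1), ncol = 2))
y <- apply(X[, 2:3], 1, target)

# PLA: repeatedly pick a misclassified point and update the weights.
w <- c(0, 0, 0)
repeat {
  mis <- which(sign(X %*% w) != y)
  if (length(mis) == 0) break
  i <- mis[sample.int(length(mis), 1)]  # avoids R's sample() single-value gotcha
  w <- w + y[i] * X[i, ]
}

# Estimate the out-of-sample error on 1000 fresh points.
Xtest <- cbind(1, matrix(runif(2 * 1000, -1, 1), ncol = 2))
ytest <- apply(Xtest[, 2:3], 1, target)
print(mean(sign(Xtest %*% w) != ytest))

Because the training data are separable by construction, the loop is guaranteed to terminate, and the final line gives a quick empirical check of how well the learned line generalizes.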

I have only one objection to the content, which came in the last lecture. In his description of Bayesian techniques, Abu-Mostafa claimed that in most circumstances you could only model a parameter with a delta function, that is, a prior that puts all of its mass on a single fixed value. This, of course, falls in line with the frequentist notion that there is a constant but unknowable "state of nature." I held that view for a long time, but I no longer believe it in a variety of contexts. I think he played up the Bayesian vs. frequentist squabble a bit much; that emphasis may have been appropriate 20 years ago, but it is not so much an issue now.
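To make the contrast concrete, here is a small R sketch of my own (not from the lecture): a coin with an unknown heads probability, treated first as a fixed-but-unknown constant summarized by a point estimate, and then with a genuine Beta prior updated by the data.

set.seed(2)
flips <- rbinom(20, size = 1, prob = 0.7)   # 20 observed flips of the coin

# Fixed-but-unknown-constant view: report a point estimate of theta.
print(mean(flips))

# Bayesian view: uniform Beta(1, 1) prior updates to Beta(1 + heads, 1 + tails).
heads <- sum(flips)
tails <- length(flips) - heads
print(qbeta(c(0.025, 0.5, 0.975), 1 + heads, 1 + tails))  # posterior median and 95% interval

The second output is a full distribution over the parameter rather than a single number, which is exactly the modeling choice the delta-function framing rules out.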

Otherwise, I found the perspective from the course extremely valuable, especially in the context of supervised learning.

If you plan on taking the course, I recommend leaving a lot of time for it or having a very strong statistical background.