Saturday, March 30, 2013

Presenting without slides

Tired of slides, I’ve been experimenting with different ways of presenting. At the recent Conference on Statistical Practice, I decided to use slides only for an outline and references. As it turns out, the most critical feedback I got was that the audience couldn’t follow the organization of the talk because I had no slides.

I tried presenting without slides because, well, I had started to use them as a crutch. I also saw a lot of people presenting by essentially putting together slides and reading from them. So I figured I would expand my horizons.

Next time I present, I’ll do slides, I guess, but I may try something a bit different.

Wednesday, March 27, 2013

Last session of Caltech's Learning from Data course starts April 2

I just received this email:

Caltech's Machine Learning MOOC is coming to an end this spring, with the final session starting on April 2. There will be no future sessions. The course has attracted more than 200,000 participants since its launch last year, and has gained wide acclaim. This is the last chance for anyone who wishes to take the course (
The Caltech Team
I strongly recommend this course if you can take it, even if you have taken other machine learning classes. It lays a great theoretical foundation for machine learning, sets it off nicely from classical statistics, and gives you some experience working with data as well.

If you were for some reason waiting for the right time, it looks to be now or never.

Wednesday, March 20, 2013

Review of Caltech's Learning from Data e-course

Caltech has an online course, Learning from Data, taught by Professor Yaser Abu-Mostafa, that seeks to make the course material accessible to everybody. Unlike most of the online courses I've taken, this one is independently offered through a platform created just for the class. I took the course during its second offering, in January-March 2013.

The platform on which the course is offered isn't as slick as Coursera. The lectures are offered through a YouTube playlist, and the homework is graded through multiple-choice questions. That's perhaps a weakness of the class, but somehow the course faculty made it work.

The class's content was its strong point. Abu-Mostafa wove theory and pragmatic concerns together throughout the class, and invited students to write code on just about any platform (I, of course, chose R) to explore the theoretical ideas in a practical setting. Between this class and Andrew Ng's Machine Learning class on the Coursera platform, a student will have a very strong foundation for applying these techniques in real-world settings.

I have only one objection to the content, which came in the last lecture. In his description of Bayesian techniques, he claimed that in most circumstances you could only model a parameter with a delta function. This, of course, falls in line with the frequentist notion that you have a constant but unknowable "state of nature." I felt this way for a long time, but I no longer really believe it in a variety of contexts. I think he played up the Bayesian vs. frequentist squabble a bit much, which may have been appropriate 20 years ago but is not so much an issue now.
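
To make the delta-function point concrete, here is a minimal sketch of what that view amounts to. It assumes a single scalar parameter θ with a fixed but unknown true value θ* and observed data D; the notation is introduced here for illustration and is not taken from the lecture.

```latex
p(\theta) = \delta(\theta - \theta^{*})
\quad\Longrightarrow\quad
p(\theta \mid D)
  = \frac{p(D \mid \theta)\,\delta(\theta - \theta^{*})}
         {\int p(D \mid \theta')\,\delta(\theta' - \theta^{*})\,d\theta'}
  = \delta(\theta - \theta^{*}),
\qquad \text{provided } p(D \mid \theta^{*}) > 0 .
```

Under this reading, the data never move the prior, and because θ* is unknown, the prior cannot actually be written down. A Bayesian analysis would instead use a spread-out p(θ) that encodes uncertainty about θ, so that the posterior does change as data arrive.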

Otherwise, I found the perspective from the course extremely valuable, especially in the context of supervised learning.

If you plan on taking the course, I recommend leaving a lot of time for it or having a very strong statistical background.

Tuesday, March 12, 2013

Distrust of R

I guess I've been living in a bubble for a bit, but apparently there are a lot of people who still mistrust R. I got asked this week why I used R (and, specifically, the package rpart) to generate classification and regression trees instead of SAS Enterprise Miner. Never mind the fact that the rpart code has been around for a very long time, and has probably been subject to more scrutiny than any other decision tree code. (And never mind the fact that I really don't like classification and regression trees in general because of their limitations.)

At any rate, if someone wants to pay the big bucks for me to use SAS Enterprise Miner just on their project, they can go right ahead. Otherwise, I've got a bit of convincing to do.