A small step up from this are the web surveys, such as John Mack's First Ever Pharma Blogsphere Survey®™©. They have a lot of the same problems as the simple web poll, and few of the controls necessary to ensure valid results. So I'll discuss simple one-off web polls and web surveys together.
Most of the problems and biases with these web polls aren't statistical; rather, they are operational. The data from these polls are so bad that no amount of statistics can rescue them. It's better not to even bring statistics into the equation here. Following are the operational biases I consider unavoidable and insurmountable:
- Most web polls do not reliably control whether one person can vote multiple times. Most services will now use cookies or IP addresses to block multiple votes from one person, but these measures are imperfect at best. Changing an IP address is easy (just go to a different Starbucks), and cookies are easily deleted.
- Wording questions in surveys is a tricky proposition, and millions of billable hours are spent agonizing over the wording. (Perhaps 75% of that is going a bit too far, but you get the point.) Very little time is generally spent wording the question of a web poll. The end result is that readers may not be answering the same question a blogger asks.
- Forget random sampling, matching cases, identifying demographic information, or any of the classical statistical controls that are intended to isolate noise and false signal from true signal. Web poll samples are "People who happen to find the blog entry and care enough to click on a web poll." At best, the readers who feel strongly about an issue are the ones likely to click, while people who feel less strongly (but might lean a certain way) will probably just glaze over.
- Answers to web polls will typically be immediate reactions to the blog post, rather than thoughtful, considered answers. Internet life is fast-paced, and readers (in general) simply don't have the time to thoughtfully answer a web poll.
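To make the self-selection problem concrete, here is a rough simulation sketch. Every number in it is an assumption I made up for illustration (40% of readers would say "yes," "yes" readers are more likely to feel strongly, and readers who feel strongly are far more likely to click); none of it comes from a real poll. The function name `simulate_web_poll` is likewise hypothetical.

```python
import random

def simulate_web_poll(n_readers=100_000, seed=0):
    """Toy model of a self-selected web poll. All rates are illustrative assumptions."""
    rng = random.Random(seed)
    p_yes = 0.40                            # assumed true share of readers who would say "yes"
    p_strong = {True: 0.50, False: 0.10}    # assumed chance of feeling strongly, keyed by opinion
    p_click = {True: 0.30, False: 0.02}     # assumed chance of voting, keyed by strong feeling

    yes_readers = yes_votes = total_votes = 0
    for _ in range(n_readers):
        says_yes = rng.random() < p_yes
        feels_strongly = rng.random() < p_strong[says_yes]
        clicks = rng.random() < p_click[feels_strongly]
        yes_readers += says_yes
        if clicks:
            total_votes += 1
            yes_votes += says_yes

    # Share of "yes" among all readers vs. among those who actually voted
    return yes_readers / n_readers, yes_votes / total_votes

true_share, poll_share = simulate_web_poll()
print(f"Readers who would say yes: {true_share:.1%}")
print(f"Poll result (voters only): {poll_share:.1%}")
```

Under these made-up rates, the poll reports a clear "yes" majority even though most readers would say "no." Nobody voted twice and nobody misread the question; the gap comes entirely from who bothered to click.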