Tuesday, October 13, 2009

About Those Polls...

Alright, so Raphael and I have restarted our poll tracker, so I'll take a moment to discuss the boring technicalities of how I compile the polls. Some of what I do may seem simplistic, but I have put some thought into why I am doing it that way.

The Politics of Binning

The first thing that I do is 'bin' the polls based on their polling end date. Binning the polls is a way of accumulating more data to obtain a smaller numerical error at the cost of a larger 'error' in time. During the previous election, with the large number of polls being released virtually every day, I had the luxury of playing with the methodology a bit. Because of the relative scarcity of polls now, I am binning them into weeks.

Polling companies generally don't release polls conducted over the weekend. This isn't a hard and fast rule, but the one polling company that is continuing to release polls week after week (Ekos) always polls during the weekdays and never on the weekends. So the bin boundary was set at Friday evening.
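
For the curious, here's a minimal sketch of what that binning looks like in code. The function names and the data layout are mine for illustration, not an actual implementation:

```python
from collections import defaultdict
from datetime import date, timedelta

def week_bin(poll_end: date) -> date:
    """Key a poll to the Friday that closes its weekly bin.

    Polls ending Saturday through Friday share a bin, which puts the
    bin boundary at Friday evening.
    """
    # date.weekday(): Monday == 0 ... Friday == 4 ... Sunday == 6
    return poll_end + timedelta(days=(4 - poll_end.weekday()) % 7)

def bin_polls(polls):
    """Group polls (each a dict with an 'end' date) into weekly bins."""
    bins = defaultdict(list)
    for poll in polls:
        bins[week_bin(poll["end"])].append(poll)
    return bins
```

Keying each bin by the Friday that closes it means a poll that wraps up on a Saturday or Sunday rolls forward into the following week's bin.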

A Weighty Issue

That means that every Friday I accumulate all of the new polls from the previous week and perform a straight weighted average. Now, you might be thinking I've taken the lazy man's way out and am ignoring the intricacies of polls: decided versus leaners included, the reliability of polling companies, and so on and so forth.
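
By 'straight weighted average' I just mean weighting each poll's numbers by its sample size, nothing fancier. A sketch, again with illustrative field names:

```python
def weighted_average(polls):
    """Sample-size-weighted average of party support within one weekly bin.

    Each poll is a dict like {"n": 2000, "support": {"CPC": 36.0, ...}};
    the keys are illustrative, not from any real data feed.
    """
    total_n = sum(poll["n"] for poll in polls)
    parties = polls[0]["support"].keys()  # assumes all polls report the same parties
    return {
        party: sum(poll["n"] * poll["support"][party] for poll in polls) / total_n
        for party in parties
    }
```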

The truth is, I could create a complicated and massive system designed to account for all of these issues. I could use the previous two election results to measure the difference between 'leaners included' and 'decided only' methods. I could estimate the reliability of polling companies (as has been done elsewhere). But fundamentally, such methods ignore a few key details.

First of all, they ignore the dreaded 'margin of error'. Statistically speaking, you can't get your results more accurate than the margin of error, which says that 95% of the time the true value will fall within the reported value, plus or minus the margin of error. So a poll that pegged the Tories at 36% with a margin of error of 3% the day before the 2008 election wouldn't have been 'wrong' or a 'bad poll'; the actual result still fell within the statistical margin of error.
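
For the arithmetic-inclined: under simple random sampling, the 95% margin of error on a proportion p from a sample of size n is roughly 1.96 × sqrt(p(1−p)/n). A quick sketch (the sample size of 1,000 is an assumption chosen because it yields roughly that ±3 points):

```python
from math import sqrt

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """95% margin of error on a proportion from a simple random sample."""
    return z * sqrt(p * (1 - p) / n)

# A party at 36% in a poll of ~1,000 respondents:
moe = margin_of_error(0.36, 1000)    # about 0.030, i.e. +/- 3 points
interval = (0.36 - moe, 0.36 + moe)  # roughly 33% to 39%
```

The Conservatives' actual 2008 share, roughly 37.7%, sits inside that interval, which is exactly the point.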

Secondly, they assume that polling companies are static, and that they don't change their polling or weighting methods. If a polling company knows that its polling is systematically off one way or another, it will change its weighting methods to try to obtain a more accurate result. A biased polling company doesn't get any work.

Finally, I invoke the power of Occam's Razor (otherwise known as the KISS principle: Keep It Simple, Stupid!). To prove that a more complicated method of analysis is needed, the bar has to be set higher than simply "because we can". Other groups have tried to find systematic biases and errors in polling results, only to see their final numbers come up off the mark when push comes to shove. The reason, I believe, is that they are using a complicated method where a simple method would deliver similar accuracy.

I'll discuss this further later.
