Shaving My Shoulders: Statistical Weighting

Mrs. X called her brother, Mr. A, last night and they got to talking about the election (he lives in PA). Their conversation got us talking about polling specifically. Most people take polls pretty much at face value, but every pollster manipulates the numbers they get to conform to certain preconceived notions that they have about a particular race or geographic region. This is done with something known as statistical weighting. I’ll give a numerical example:

Let’s say that I’m a pollster and I call 1000 people in a geographic region. Of those, 453 say that they are members of party A, 438 say they are members of party B, and 109 say they are independent. Each group gives its voting preference for candidate X, candidate Y, or a third party (3p) as follows:

Party     X     Y    3p   Total
A       387    54    12     453
B        44   390     4     438
I        52    51     6     109

Taken as shown, candidate Y would be leading 49.5-48.3 (which most pollsters would just report as a 50-48 race). This might seem a reasonable breakdown, with the two major parties roughly even and a reasonable share of independents.
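
For anyone who wants to check the arithmetic, here is a minimal sketch in Python that adds up the columns of the table above. The names `raw` and `toplines` are my own labels for illustration, not anything a real pollster uses.

```python
# Raw responses from the example: party -> (X, Y, third party)
raw = {
    "A": (387, 54, 12),
    "B": (44, 390, 4),
    "I": (52, 51, 6),
}

def toplines(table):
    """Return each candidate's share of all respondents, in percent."""
    total = sum(sum(row) for row in table.values())
    column_sums = [sum(col) for col in zip(*table.values())]
    return [round(100 * s / total, 1) for s in column_sums]

print(toplines(raw))  # [48.3, 49.5, 2.2] for X, Y, third party
```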

However, let’s say that as the pollster, I don’t believe that this is the proper party breakdown. Instead, I believe that in the region I am sampling, it’s 45% membership for party A, 39% for party B, and 16% for independents. Thus, I would then multiply my numbers by a weighting factor, which is determined by taking my perceived party affiliation percentage and dividing it by that party’s percentage of my sample. In our example, I would have weighting factors of 0.99 for party A (45/45.3), 0.89 for party B (39/43.8), and 1.47 for independents (16/10.9). This would change my numbers as follows (rounded to whole respondents):

Party     X     Y    3p   Total
A       383    54    12     449
B        39   347     4     390
I        76    75     9     160
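
The weighting factors themselves are easy to reproduce. The sketch below assumes the sample counts and the target percentages from this example; `sample`, `target`, and `weighting_factors` are hypothetical names chosen just for this illustration.

```python
# Party counts in the sample and the pollster's assumed party shares
sample = {"A": 453, "B": 438, "I": 109}
target = {"A": 0.45, "B": 0.39, "I": 0.16}

def weighting_factors(sample_counts, target_shares):
    """Assumed share of each party divided by its share of the sample."""
    n = sum(sample_counts.values())
    return {
        party: target_shares[party] / (count / n)
        for party, count in sample_counts.items()
    }

factors = weighting_factors(sample, target)
rounded = {party: round(f, 2) for party, f in factors.items()}
print(rounded)  # {'A': 0.99, 'B': 0.89, 'I': 1.47}
```

A factor below 1 shrinks a group that the pollster thinks is over-represented in the sample, and a factor above 1 inflates a group thought to be under-represented.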

Summing these numbers up, we now have candidate X leading 49.8-47.6 (or 50-48), a swing of nearly 3.5 points. The fundamental flaw this introduces is that the pollster assumes right off the bat that the party makeup of the electorate is fixed, so that only a mass defection of a party away from its candidate or a large swing in unaffiliated voters will determine the outcome of the election. In fact, a pollster has no way of knowing what the actual party breakdown will be come Election Day.
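
Here is a rough sketch of the whole reweighting in one go, again with made-up names (`raw`, `target`, `shares`). One assumption to flag: it uses the unrounded weighting factors, so it comes out at about 50.0 to 47.6 with a swing of about 3.6 points, slightly different from the hand-rounded figures above.

```python
# Raw responses: party -> (X, Y, third party), plus assumed party shares
raw = {"A": (387, 54, 12), "B": (44, 390, 4), "I": (52, 51, 6)}
target = {"A": 0.45, "B": 0.39, "I": 0.16}

def shares(table, target_shares=None):
    """Candidate percentages, optionally reweighted to target party shares."""
    n = sum(sum(row) for row in table.values())
    totals = [0.0, 0.0, 0.0]
    for party, row in table.items():
        # Factor is 1 (no change) unless target shares are supplied
        factor = 1.0
        if target_shares is not None:
            factor = target_shares[party] / (sum(row) / n)
        for i, count in enumerate(row):
            totals[i] += factor * count
    weighted_n = sum(totals)
    return [100 * t / weighted_n for t in totals]

raw_x, raw_y, _ = shares(raw)
x, y, third = shares(raw, target)
# Swing = change in the X-minus-Y margin caused by weighting alone
swing = (x - y) - (raw_x - raw_y)
print(f"X {x:.1f}, Y {y:.1f}, third party {third:.1f}, swing {swing:.1f}")
```

Changing the numbers in `target` and rerunning shows how much the reported lead depends on the party breakdown the pollster chooses to believe in.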

This is the key element of danger for the Democratic Party. During the primary, many people came and registered to vote as Democrats; there have been massive drives to sign up first-time voters (many of whom would probably favor Democratic candidates); and Mr. Obama is said to have a large get-out-the-vote apparatus in place. These three things cause many pollsters to weight their samples a bit heavier on the Democratic end of the scale.

Now one or all of these things could be true, and Democratic turnout could be much heavier than Republican. But if polls are shown to favor Mr. Obama by large margins, it could lead to a sense of complacency among Democratic voters, and the hubris that it might foster among Democrats could galvanize Republicans into their own large-scale get-out-the-vote effort.

One other problem is that Democrats and media outlets that are friendly to Democrats have been trumpeting these polls, telling everyone that they should vote for Mr. Obama to be part of the “cool kids table” since he’s going to win anyway. That attitude might work if we all sat around and raised our hands in public to vote (see Oct. 16 post) but we don’t. We get to vote in a quiet little booth and only you and God know who you voted for. You could talk all you want about voting for one candidate, but then quietly vote for another and not a soul would know.

Things are certainly stacked in Mr. Obama’s favor, but there is enough uncertainty and danger out there that I would suggest that everyone sit back and just wait to see what happens. Of course, that’s what makes this process so much fun.

Wednesday, October 29, 2008

