After 136 trials the DEFAULT assumption no longer applies in the face of ample hard data. - Edit 1
Before modification by Joel at 06/10/2012 04:42:05 PM
...typically coins are effectively random. It is rare to find a non-effectively random coin.
I really don't want to get into explaining sample size and how to calculate it. I'm tired. Long day yesterday, didn't sleep well last night, had a test first thing this morning.
But if you want to see for yourself whether your sample size is large enough, find a quarter and conduct five to ten tests, flipping the coin 100 times per test. Count the number of heads in each test and calculate the percentage of heads. You should see how much the percentages vary. It isn't until you get to around 1000 flips per test that you start to see percentages that are close together.
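For anyone who would rather not flip a quarter a few thousand times, the experiment can be roughly simulated. This is only a sketch: it assumes a perfectly fair coin driven by Python's pseudo-random generator, and the function name and seed are made up for illustration.

```python
import random

def head_percentages(flips_per_test, tests, rng):
    """Return the percentage of heads in each simulated test of a fair coin."""
    results = []
    for _ in range(tests):
        heads = sum(rng.random() < 0.5 for _ in range(flips_per_test))
        results.append(100.0 * heads / flips_per_test)
    return results

rng = random.Random(2012)  # arbitrary fixed seed so a run can be repeated
for n in (100, 1000):
    pcts = head_percentages(n, 10, rng)
    spread = max(pcts) - min(pcts)  # gap between the highest and lowest test
    print(f"{n} flips per test: spread of {spread:.1f} points across 10 tests")
```

The spread between the best and worst test should usually come out noticeably smaller at 1000 flips than at 100, though any single run can differ.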
I preface this by saying you owe me an hour of life spent verifying things I already knew. I only did the minimum 500 flips because there were no surprises. Anyway:
Trial 1: H 52% T 48% (H+4)
Trial 2: H 44% T 56% (T+12)
Trial 3: H 47% T 53% (T+6)
Trial 4: H 40% T 60% (T+20)
Trial 5: H 48% T 52% (T+4)
The difference between heads and tails was <10% in the majority of cases, <20% in all but one, and never anywhere NEAR the 50% a 3:1 split requires. In fact, rather than matching the 3:1 margin of national polls Obama led outright (i.e., giving Romney the 8% of ties), the most lopsided result in 5 trials of 100 tosses was 1.5:1.
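The margins and ratios quoted here can be rechecked from the five trial results. A small sketch, using only the percentages listed above (the helper name is mine):

```python
# Heads/tails percentages for the five 100-toss trials reported above.
trials = {1: (52, 48), 2: (44, 56), 3: (47, 53), 4: (40, 60), 5: (48, 52)}

def margin_and_ratio(heads, tails):
    """Return (lead in points, ratio of the larger count to the smaller)."""
    return abs(heads - tails), max(heads, tails) / min(heads, tails)

summary = {n: margin_and_ratio(h, t) for n, (h, t) in trials.items()}
for n, (margin, ratio) in summary.items():
    print(f"Trial {n}: margin {margin} points, ratio {ratio:.2f}:1")

margins = [m for m, _ in summary.values()]
print("margins under 10:", sum(m < 10 for m in margins), "of", len(margins))
print("margins under 20:", sum(m < 20 for m in margins), "of", len(margins))
print("most lopsided ratio:", max(r for _, r in summary.values()))
```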
That one was instructive though, because it was the starkest example of another well known phenomenon: The process's randomness makes a lead of significant size hard to surmount. In trial 4, the ratio of tails:heads peaked at 34:19, nearly 2:1. The ratio through the remainder of the trial was (obviously) a more even 26:21, which still increased the final margin (from 15 to 20) even as it lowered the final ratio (to 1.5:1). Which, of course, is why journalists usually project election winners well before 100% of votes are counted: Once early returns give a candidate a lead so large that opponents need a prohibitively large share of the remaining votes, the candidate may be safely declared the victor.
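How steep the climb is can be put in numbers. A minimal sketch, assuming the only inputs are the two running counts and the number of tosses (or votes) still to come; the function name is hypothetical, and the example uses the trial 4 counts at the 34:19 peak, with 47 tosses left.

```python
def needed_to_overtake(leader, trailer, remaining):
    """Smallest number of the remaining tosses the trailing side must win
    to finish strictly ahead, or None if that is no longer possible."""
    # Winning x of the rest means: trailer + x > leader + (remaining - x),
    # so x > (leader - trailer + remaining) / 2.
    needed = max(0, (leader - trailer + remaining) // 2 + 1)
    return needed if needed <= remaining else None

# Trial 4 at its peak: tails 34, heads 19, with 47 of 100 tosses remaining.
need = needed_to_overtake(34, 19, 47)
print(f"heads needs {need} of the last 47 tosses ({100 * need / 47:.0f}%)")
```

Under those counts heads would have needed 32 of the last 47 tosses just to edge ahead; it actually got 21.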
That is not to say there are never exceptions; in trial 5 heads opened a lead of 11 after just over 40 tosses, but tails came back in the final 30, took the lead on toss 93, and won 5 of the last 7 to finish ahead by 4. Which, of course, means that the nearly 2:1 edge heads had after a SMALL number of tosses evened out to 50±2% after a LARGE number. It also explains why, since the FL debacle in 2000, US journalists have been VERY hesitant to call elections unless absolutely sure of the outcome. To clarify "absolutely sure," once tails trailed 11% with 41% counted, it took tails on a whopping 39 of the remaining 59 tosses (nearly 2:1) to give it that meager 4% final margin. Moral (which I can attest from decades of election watching): Candidates who lead 10+% with >40% of votes counted almost ALWAYS win.
If we take all 500 throws together, of course, we end up with a final tally of H 231 T 269 (T+38), or H 46% T 54% (T+8%). Not 50/50, but exactly 250 H/T is extremely unlikely. The majority of trials varied even less than that 8%, but the two outliers (especially trial 4) pushed the total variance higher. It still never reached anything like a 3:1 ratio, and only a single trial reached even half that ratio (had I done another 36 tosses to reach a full 136, I am nearly certain that ratio would fall, since the ~18 tails expected from 36 more tosses plus the 60 already thrown will more often than not be less than the 21.6+60 needed to stay at 60%). If I ever do get 75 heads/tails in 100 tosses, I will therefore assume the coin is weighted to the other side.
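Both probability claims here can be checked exactly with the binomial formula. A sketch assuming a fair coin and independent tosses; the helper names are mine, and it uses only `math.comb` from the standard library.

```python
from math import comb

def prob_exactly(k, n):
    """Probability of exactly k heads in n tosses of a fair coin."""
    return comb(n, k) / 2**n

def prob_at_least(k, n):
    """Probability of k or more heads in n tosses of a fair coin."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

p_even_split = prob_exactly(250, 500)   # a dead-even 250/250 in 500 tosses
p_75_of_100 = prob_at_least(75, 100)    # 75 or more heads in 100 tosses
print(f"exactly 250 heads in 500: {p_even_split:.4f}")
print(f"75 or more heads in 100:  {p_75_of_100:.2e}")
```

The first figure comes out to a few percent, and the second is far below one in a million, which is why a 75-25 result would point to the coin rather than to luck.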
Perhaps the biggest thing to take away from this is that Obama does not simply lead a large AMOUNT of this year's presidential polls, but a large RATIO. The plurality of polls Obama leads is a relative thing, proportionate to the total number of polls taken: The ratio of polls he wins to polls Romney wins inherently includes that factor, and as the total number of polls grows, a lopsided ratio is strongly indicative of a genuine advantage for Obama. To win 3/4 of the polls, Obama's support must be strong enough that it usually sustains him even against errors in Romney's favor rather than his.
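To see why the same 3:1 ratio means more as the number of polls grows, here is a rough sketch of how likely a pure coin-flip race is to hand one side at least three quarters of the results. It assumes each poll is an independent fair toss, which real polls certainly are not, so take the numbers as an illustration of the trend only; the function name is made up.

```python
from math import comb

def prob_three_quarters(n):
    """Chance that one named side wins at least 3/4 of n fair coin tosses."""
    k = (3 * n + 3) // 4  # smallest whole number that is >= 3n/4
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

for n in (8, 20, 50, 136):
    print(f"{n:>3} polls: {prob_three_quarters(n):.2e}")
```

With 8 tosses a 3:1 split is unremarkable (about 14%); by 136 it is vanishingly rare under the coin-flip assumption.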
Binomial expansion is neither new nor difficult; what took me a few months was coming up with a general POLYnomial expansion theorem.
