Chapter 03

The Forecast

Election Modeling

You are a data journalist. It's the night before the election, and your candidate leads 52 to 48 in the polls. Your editor wants a headline. “Call it,” she says. “Are they going to win?”

The distribution curve on your right shows what you actually know. The bell curve represents the range of plausible outcomes given the polling data. The green shaded area past 50% is the probability your candidate wins. Right now, it looks pretty good.

The Margin That Matters

Drag the Margin of Error slider up to 5%. Watch the curve flatten and spread. Your candidate still leads 52–48 in the polls, but with a 5-point margin of error, outcomes anywhere from 47% to 57% are plausible. The green shaded region shrinks. Suddenly that 4-point lead doesn't feel so safe.

Now drag it down to 1%. The curve snaps tight. With a 1-point margin of error, a 4-point lead makes a win nearly certain. The polling data is precise enough to call it.
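If you want to check the shaded area yourself, here is a minimal Python sketch. It rests on assumptions the chapter does not spell out: the margin of error is treated as the half-width of a 95% interval (about 1.96 standard deviations), outcomes follow a normal curve centered on the poll number, and the race has only two candidates. The name `win_probability` is illustrative and is not taken from the interactive.

```python
from statistics import NormalDist

# Assumption: the margin of error is the half-width of a 95% interval,
# which corresponds to about 1.96 standard deviations of a normal curve.
Z_95 = 1.96


def win_probability(poll_share, margin_of_error):
    """Area of the normal curve above 50%, i.e. the green shaded region.

    poll_share and margin_of_error are both in percentage points.
    """
    sigma = margin_of_error / Z_95
    outcomes = NormalDist(mu=poll_share, sigma=sigma)
    return 1.0 - outcomes.cdf(50.0)


# Same 52-48 lead, three different slider settings.
for moe in (5.0, 3.0, 1.0):
    print(f"margin of error {moe:.0f}: P(win) = {win_probability(52.0, moe):.4f}")
```

Under these assumptions, the 5-point setting leaves roughly a one-in-five chance of losing, while the 1-point setting leaves almost none.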

The Power of Aggregation

Here's where it gets interesting. Drag the Number of Polls slider from 1 to 50. Watch the curve narrow in real time. Each individual poll has a wide margin of error. But when you average 50 independent polls, the effective margin shrinks by a factor of about seven, the square root of 50.

The semi-transparent curves behind the main curve show what the distribution looked like with fewer polls. One poll: a vague hump. Twenty-five polls: getting sharper. Fifty polls: a spike. This is the statistics of averaging in action: independent errors partly cancel, the central limit theorem keeps the result bell-shaped, and aggregation turns noise into signal.
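The narrowing follows a simple rule: the margin of an average of n independent polls is the single-poll margin divided by the square root of n. The sketch below computes that and then checks it with a small simulation. It assumes every poll has the same margin and that poll errors are independent and normally distributed; the function names and the example numbers (a 3-point margin, a 1.5-point standard deviation) are illustrative, not values from the interactive.

```python
import math
import random
import statistics


def effective_margin(single_poll_margin, n_polls):
    """Margin of error of the average of n independent, equally noisy polls."""
    return single_poll_margin / math.sqrt(n_polls)


def simulated_spread(true_share, sigma, n_polls, trials=2000, seed=0):
    """Standard deviation of the poll average, estimated by simulation."""
    rng = random.Random(seed)
    averages = []
    for _ in range(trials):
        polls = [rng.gauss(true_share, sigma) for _ in range(n_polls)]
        averages.append(statistics.fmean(polls))
    return statistics.stdev(averages)


for n in (1, 25, 50):
    print(f"{n:2d} polls: margin {effective_margin(3.0, n):.2f}, "
          f"simulated spread {simulated_spread(52.0, 1.5, n):.2f}")
```

Going from 1 poll to 25 cuts the margin by a factor of five, and going to 50 cuts it by a little more than seven. The gain per extra poll keeps shrinking, which is why the curve sharpens quickly at first and slowly afterward.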

The Fatal Flaw

But aggregation has a fatal flaw. Drag the Systematic Bias slider to +2%. The entire curve shifts. Every poll overestimates your candidate by 2 points.

No amount of averaging can fix this. If every poll is biased in the same direction, fifty biased polls are just as wrong as one. The curve gets narrower — you become more precisely wrong.
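You can watch this happen in a short simulation. The sketch below assumes each poll reports the true share plus a shared bias plus its own independent normal noise. The true share of 50, the 2-point bias, and the 1.5-point noise level are made-up numbers for illustration, and `poll_average` is a hypothetical helper, not part of the interactive.

```python
import random
import statistics


def poll_average(true_share, bias, sigma, n_polls, rng):
    """Average of n polls that all share the same systematic bias."""
    polls = [true_share + bias + rng.gauss(0.0, sigma) for _ in range(n_polls)]
    return statistics.fmean(polls)


# Illustrative assumptions: a dead-even race that every poll overstates by 2 points.
TRUE_SHARE = 50.0
BIAS = 2.0
SIGMA = 1.5

rng = random.Random(42)
for n in (1, 50, 5000):
    avg = poll_average(TRUE_SHARE, BIAS, SIGMA, n, rng)
    print(f"{n:4d} polls: average {avg:.2f}, error {avg - TRUE_SHARE:+.2f}")
```

As the number of polls grows, the random part of the error fades, but the average settles on the true share plus the bias rather than on the true share itself. More polls buy you a tighter estimate of the wrong number.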

This is what happened in the 2016 and 2020 U.S. presidential elections. The polling errors weren't just random noise. They leaned in the same direction, a systematic bias that aggregation couldn't cure.

Three Disguises

You've now seen three versions of the same problem. A doctor deciding whether a test result is real. A radio astronomer deciding whether a signal is real. A journalist deciding whether a poll lead is real. In each case, the answer depends on how much noise surrounds the signal — and whether you know which direction the noise leans.

In the next chapter, you'll learn what happens when the evidence arrives one piece at a time, and you have to update your beliefs as you go.