Soccer, Python, the Election, and Understanding a Stochastic World

I have a former colleague whose daughter is an outstanding soccer player on one of the best high school teams in the area. The paper today reported that her team had lost in the quarterfinals of the playoffs by a 1-0 score. That made me start thinking of how unfair it is that athletic contests have a binary outcome, and then thinking about how we overvalue the results of arbitrary binary events (we love “winners,” we don’t love “losers”), even though most of life isn’t binary at all. If I apply for a job and don’t get it, my career doesn’t end; it just continues on a different path.

Then at the gym the TV had on two of the most influential election-predictors, Nate Silver of the site FiveThirtyEight and Nate Cohn of the New York Times’ site The Upshot, talking about their election predictions. That got me thinking about my colleague’s daughter’s soccer game again, and about how, in my view, most people don’t have a good way to think about randomness, probability, and stochastic events. We don’t have a good way to mentally separate the specific outcomes of stochastic systems from the as-yet-undetermined pre-outcome state of those systems.

Here’s an example. I drive to work down a main road called Old Redwood Highway. The first stoplight I come to is at a main intersection, and if I miss that light, I end up sitting a good while waiting for the next green. As a result, I usually “jump through” a yellow light rather than stopping for it (I note that the intersection has no pedestrians, and cross traffic is backed up and unlikely to move into the intersection, making such a move relatively safe). After that light there are several others further down the road.

One morning I jumped through the yellow and noticed that the car in the lane next to me made the opposite decision. Four stoplights down, as I was stopped for a red light, the very same car pulled up right beside me.

I automatically said to myself, “What a useless thing it was to jump the yellow … I took a risk and it didn’t get me anywhere.”

But later I realized that I was wrong. The fact that in one specific instance jumping the yellow didn’t get me ahead of the car that made the opposite decision says nothing about whether it’s a good decision in general. I may not always be ahead by the fourth stoplight if I jump the yellow, but on average I will, and only in a very, very rare “black swan” scenario would I ever be worse off (e.g., going through the yellow results in my being in an accident that wouldn’t have happened had I stopped). Jumping the yellow is a good decision if I want to get to work faster (even though sometimes I won’t), and my colleague’s daughter’s soccer team is still better than its playoff opponent even though it lost its playoff game.

Getting a feel for the nature of randomness is difficult. One of the best (and easiest) ways I’ve found is to simply watch the discrete results from a random process as they happen. One of the hobbies I’ve taken up in semi-retirement is programming in the Python language. The online class I took from MIT when I was first learning (outstanding, by the way, and starting again in January on edX) spent a good deal of time on how to program to evaluate stochastic systems (surprisingly easy to do). In doing that I found myself often just watching the output of a random event, which really did seem to give me a better “gut” feel for how “randomness” and “probability” manifest themselves in the world.

I’ve created a short, no-brainer Python program to help you get that “gut” feel as well. If you go to this link and hit the “run” button, the program will spit out on the right (at 3 second intervals) random numbers (rounded to two decimal places) from a “normal” distribution (the good-old bell-shaped curve) having a mean of 5 and a standard deviation of 2 (meaning that about 95% of the spit-out numbers will be between 1 and 9, and about 68% of the spit-out numbers will be between 3 and 7).
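For anyone who would rather run something locally, here is a minimal sketch of a program along the lines described above. It is not the program behind the link; the function name draw_samples and its parameters (echo, delay_seconds, seed) are illustrative assumptions, and it uses only Python’s standard random and time modules.

```python
import random
import time


def draw_samples(count, mean=5.0, std_dev=2.0, delay_seconds=0.0, seed=None, echo=True):
    """Draw `count` values from a normal distribution, rounded to two decimals.

    delay_seconds pauses between draws so the numbers can be watched one at a
    time; seed makes a run repeatable. All names here are hypothetical.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        value = round(rng.gauss(mean, std_dev), 2)
        if echo:
            print(value)
        samples.append(value)
        if delay_seconds > 0:
            time.sleep(delay_seconds)
    return samples


# Quick check of the 68% and 95% rules of thumb on a larger batch of draws.
batch = draw_samples(10_000, echo=False)
within_one_sd = sum(3 <= x <= 7 for x in batch) / len(batch)
within_two_sd = sum(1 <= x <= 9 for x in batch) / len(batch)
print(f"Share between 3 and 7: {within_one_sd:.1%}")
print(f"Share between 1 and 9: {within_two_sd:.1%}")
```

To mimic the slow, one-at-a-time pacing, call draw_samples(20, delay_seconds=3) and simply watch the numbers arrive. The two shares printed at the end should land near 68% and 95%, though they will differ a little from run to run.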

What will intrigue you, I suspect, is how much more frequently than expected you will see numbers far from the mean of 5, and numbers that seem to come in “runs” (e.g., several values near 4 in a row).

Returning to my colleague’s daughter’s soccer team, suppose that her team is much, much better than the other playoff team, so much better that, on average, the daughter’s team will win 70% of the games played between the two. A spit-out of actual Wins and Losses for this scenario is here. The first time I ran it, the first spat-out result for my friend’s daughter’s team was a Loss, just as happened in the real world, even though you’ll see if you run the program that her team mostly kicks butt.
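The same idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the linked program: the names simulate_games and longest_losing_streak are hypothetical, each game is treated as an independent draw, and the 70% figure is the one supposed above.

```python
import random


def simulate_games(num_games, win_probability=0.7, seed=None):
    """Return a list of "Win"/"Loss" results, one independent draw per game."""
    rng = random.Random(seed)
    return ["Win" if rng.random() < win_probability else "Loss"
            for _ in range(num_games)]


def longest_losing_streak(results):
    """Length of the longest run of consecutive "Loss" entries."""
    longest = current = 0
    for result in results:
        current = current + 1 if result == "Loss" else 0
        longest = max(longest, current)
    return longest


# One imagined 20-game series between the two teams.
series = simulate_games(20)
print(" ".join(series))
print(f"Wins: {series.count('Win')} of {len(series)}")
print(f"Longest losing streak: {longest_losing_streak(series)}")
```

Run it a handful of times. The better team wins most series comfortably, yet a Loss in the very first game shows up in roughly three runs out of ten, and losing streaks of two or three games are not unusual.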

As for the election, well … who knows what will happen, but we can hope that the actual, particular outcome of the election will be more in line with what the polls show than the actual, particular outcomes of my commute and my friend’s daughter’s soccer game.

Categories: Politics, Python
