Expected Learning Week 3 - The Prerequisites of Expected Goals - Part 1

After spending last week explaining how hockey analytics evolved from the quantity-based Corsi to expected goals, which account for both quantity and quality, there’s a little bit more work to do. Not too much, but we have to go through some lessons in probability and statistics first so that it’s easier to get a good grasp of what an expected goal value means.

To start, a shot’s expected goal value is the probability that the shot results in a goal. Okay, now let’s pause for math:

Just as a lot of data analysis has its roots in writing events down, describing them, and counting them, probability’s origin in mathematics is counting. Usually, that means counting events or outcomes. An example you may have stored away in your long-term memory from a past math class is one where you were given, say, 5 shirts, 4 pants, and 2 pairs of shoes and asked how many outfit combinations there are. And since I’m not actually grading you and am not, despite what Chad and Anthony have convinced themselves, a professor, I’m not going to write out the whole tree diagram of [shirt 1 pant 1 shoe 1, shirt 1 pant 1 shoe 2, shirt 1 pant 2 shoe 1 … shirt 5 pant 4 shoe 2]. After all of that, though, you learned that you can multiply the three numbers together (5 × 4 × 2) and find that there are 40 outfit combinations.
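If you’d rather let a computer write out the tree diagram, here is a minimal Python sketch of the outfit count. The shirt, pant, and shoe labels are just placeholder names I made up for illustration; the only things taken from the example are the counts of 5, 4, and 2.

```python
from itertools import product

# Placeholder labels for the closet in the example: 5 shirts, 4 pants, 2 pairs of shoes.
shirts = [f"shirt {i}" for i in range(1, 6)]
pants = [f"pant {i}" for i in range(1, 5)]
shoes = [f"shoe {i}" for i in range(1, 3)]

# product() walks every branch of the tree diagram for us.
outfits = list(product(shirts, pants, shoes))

print(outfits[0])    # first branch of the tree
print(outfits[-1])   # last branch of the tree
print(len(outfits))  # 40
print(len(shirts) * len(pants) * len(shoes))  # 40, the multiplication shortcut
```

Listing every branch and multiplying the three counts land on the same number, which is the whole point of the shortcut.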

Other examples you may remember are how many different ways can a race with five people finish, or maybe rolling a die and flipping a coin at the same time, or picking a ball out of an urn (or a jar – Shout out to my MATH 301 and 302 textbooks at Fisher for using an urn because it’s still hilarious to me). These are all instances of simply counting outcomes.

Then you would have been asked for the probability that the picked shirt was shirt 5. You would go through and count the scenarios where the fifth shirt was used and find that the fifth shirt was picked in 8 of the 40 outfits (one for each of the 4 × 2 pant-and-shoe pairings), so 20% of the time the person in this scenario picked an outfit that included the fifth shirt.

So you’ve got the act of picking an outfit. In math/science terms, we’re going to call that the experiment. Every outfit that you can wear based on your closet is part of the sample space, the set of all possible outcomes. Then think about wearing a single full outfit as a single event. Every unique outfit in this example is, wait for it, a fraction of the sample space. So yes, finding probabilities is all about fractions. Back to the probability of the fifth shirt, where 8 of the 40 outfits in the sample space include the fifth shirt:

Event: Wearing the fifth shirt	Numerator (Top of the fraction)	8
Sample Space: Every possible outfit combination	Denominator (Bottom of the fraction)	40

8/40 = 0.2 = 20% for the probability of wearing the fifth shirt
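The same fraction can be built by counting in code. This is a rough sketch that assumes every outfit is equally likely to be picked, which is what the counting approach above quietly assumes too. The variable names are mine, and each outfit is written as a (shirt, pant, shoe) tuple of numbers.

```python
from fractions import Fraction
from itertools import product

# Sample space: every (shirt, pant, shoe) combination from 5 shirts, 4 pants, 2 pairs of shoes.
sample_space = list(product(range(1, 6), range(1, 5), range(1, 3)))

# Event: the outfits where the shirt is shirt 5.
event = [outfit for outfit in sample_space if outfit[0] == 5]

# Probability = size of the event / size of the sample space.
probability = Fraction(len(event), len(sample_space))

print(len(event), len(sample_space))  # 8 40
print(probability)                    # 1/5
print(float(probability))             # 0.2
```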

Now that we know that probabilities are fractions, we know two of the important foundational pieces of the subject. To change it up, let’s use rolling a standard die (values 1-6) as the example:

There are no negative numbers in probability. The smallest possible number for a probability is 0.	If asked for the probability of rolling a 7, that value isn’t on the die. There are zero sides of the cubic die with a seven on them, so it occurs on 0/6 sides. It’s a 0% probability (it wouldn’t be a negative value).
The largest possible number for a probability is 1.	If asked for the probability that the value of a roll is less than 10 (of course it’s not a realistic question; I’m going for easy at first), all six values on the die are less than 10, so the event would occur in 6 of the 6 possible outcomes. It’s a 100% probability (6/6 = 1 = 100%).
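Here is a small sketch of those two die examples, assuming a fair six-sided die. The probability() helper is something I wrote for illustration, not a library function; it just counts the faces that satisfy a condition and divides by six.

```python
from fractions import Fraction

DIE = [1, 2, 3, 4, 5, 6]  # faces of a standard die, assumed equally likely


def probability(condition):
    """Count the faces that satisfy `condition` and divide by the number of faces."""
    favorable = [face for face in DIE if condition(face)]
    return Fraction(len(favorable), len(DIE))


p_seven = probability(lambda face: face == 7)      # no face shows a 7
p_under_ten = probability(lambda face: face < 10)  # every face is under 10

print(p_seven)      # 0
print(p_under_ten)  # 1
```

Because the count of matching faces can never be below zero or above six, no condition you pass in can push the result outside the range from 0 to 1.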

These are the first two axioms of probability. From the good folks at Merriam-Webster, “In mathematics or logic, an axiom is an unprovable rule or first principle accepted as true because it is self-evident or particularly useful.” TL;DR? An axiom is a foundational assumption for studying a topic. The fractional aspects of probability are axioms because that’s simply how fractions work.

The third axiom of probability involves mutual exclusion. When two (or more) events cannot occur at the same time, the probability that one or the other (or the other, etc.) occurs is the probability of the one event occurring plus the probability of the other event (plus the other, etc.) occurring.

P(A or B) = P(A) + P(B)

The probability of rolling a 2 or a 3 equals the probability of rolling a 2 plus the probability of rolling a 3 (1/6 + 1/6 = 2/6 = 1/3).
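The addition rule can be checked by counting, again assuming a fair six-sided die. The helper and the event names below are my own for illustration. I also tacked on a pair of overlapping events (rolling an even number, rolling higher than 3) to show why the mutual exclusion part matters: when events overlap, simply adding the two probabilities double counts the shared faces.

```python
from fractions import Fraction

DIE = [1, 2, 3, 4, 5, 6]  # faces of a standard die, assumed equally likely


def probability(event):
    """Probability of an event, where the event is a set of die faces."""
    return Fraction(len([face for face in DIE if face in event]), len(DIE))


# Mutually exclusive events: a single roll cannot be both a 2 and a 3.
two, three = {2}, {3}
p_either = probability(two | three)
p_sum = probability(two) + probability(three)
print(p_either, p_sum)  # 1/3 1/3

# Overlapping events: 4 and 6 are both even and higher than 3.
even, high = {2, 4, 6}, {4, 5, 6}
p_even_or_high = probability(even | high)
p_naive_sum = probability(even) + probability(high)
print(p_even_or_high, p_naive_sum)  # 2/3 1
```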

While sometimes tedious, determining probabilities isn’t the most intensive exercise when the number of events in the sample space is known. Of course, most applications in sports are not going to have a known sample space because the number of possible events ends up very large or even approaching infinite. Luckily, the field of statistics emerged to estimate these probabilities and parameters so that analysis could still be done on applications where the possibilities aren’t as clear. We’ll work through statistics in the next prerequisite post.

To wrap up this first part, allow me to hop on the soapbox for a second. Anti-math people love to discredit probability calculations when there’s, say, a 99+% chance of something happening but that thing doesn’t happen. For example, take a team that has a two-goal lead in the final minute of a game and loses in regulation. The win probability for the team that blew the lead was probably approaching 100%, and because the team lost, you’ll of course see someone out there discrediting the odds because the other team won.

Well, now that you’ve made it this far into the post, you know that the leading team’s win probability being 99% means that in 1 of every 100 scenarios the leading team could still lose, which is what happened in this case. The other 99 of every 100 scenarios featured the leading team getting to the garage by keeping the puck, or scoring an empty-net goal, or at least only giving up a single goal and then winning the faceoff and keeping the puck to force overtime. It would’ve taken something crazy (which ended up happening in this fancy scenario): the trailing team scoring, getting the puck back, keeping it to make sure the other team didn’t keep it to force overtime, putting shots on net, and one of those shots going in for the comeback to be completed. That scenario seems so far-fetched that it was incredibly unlikely to happen.

But that didn’t mean it couldn’t have happened. As long as it was an event in the sample space, it was possible. Heck, every single game’s play-by-play is a unique part of that sample space of what could happen in a game and that’s what’s being estimated by win probability.
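To put a number on “unlikely but possible,” here is a quick simulation sketch. Everything in it is an assumption for illustration: the 0.99 win probability is just the round number from the example, the 10,000 replays are hypothetical, each replay is treated as independent, and the seed is only there so the run is repeatable. It is not a win probability model.

```python
import random

WIN_PROBABILITY = 0.99  # assumed win probability for the leading team (illustrative only)
N_GAMES = 10_000        # hypothetical number of times we replay the same situation

rng = random.Random(2024)  # fixed seed so the sketch gives the same result each run

# A replay counts as a blown lead when the random draw lands in the top 1% of [0, 1).
losses = sum(1 for _ in range(N_GAMES) if rng.random() >= WIN_PROBABILITY)
loss_rate = losses / N_GAMES

print(f"Blown leads: {losses} of {N_GAMES} ({loss_rate:.2%})")
```

The blown leads are rare, landing around 1% of the replays, but they are not zero, and that is exactly what a 99% win probability claims.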

That concludes part 1 of the prereqs. I’m hoping that these prerequisite posts help transition through July with the holiday weekend and the Cup leading into the draft and the offseason carousel. We will get to the full primer on expected goals, but not before understanding what that probability of scoring a goal really means first so there are no secrets. Thanks again for reading.

Bonus: At First Line, we made a video that follows this primer into probability, centered around the grumpy people who said games aren’t played on spreadsheets after the Falcons famously blew their 28-3 Super Bowl lead. You can watch that and see the transcription here. We intended to make more than two videos in this series but then got busy with clients. But here’s this one for your viewing pleasure.

Photo credit: Timothy T. Ludwig-USA TODAY Sports