
Expected Learning History Class – More Sam Ventura Content

The alternate headline for this post would be “I searched Sam Ventura and WAR On Ice on Google and searched as far as I could so you don’t have to”. Unfortunately, there isn’t much more, but we will continue by looking at two articles that are very important to the current state of hockey analytics. Let’s pick up where we left off last time.

More Early Expected Goals Development

This presentation from OTTHAC in 2015 is similar to the one he would have given a few months earlier in Pittsburgh, which we looked at previously, explaining how expected goals blend shot quantity and quality together. It’s also a really good bridge piece for Expected Learning, covering the transition from Corsi to expected goals. One slide that wasn’t in the previous presentation shows that for predicting future goal share (goals-for percentage), Corsi is the most predictive measure for teams, but when the roster is split into forwards and defensemen, scoring chances are more predictive than Corsi for forwards, and Fenwick (unblocked shot attempts) is more predictive for defensemen. This is the first step toward quality being as important as quantity in predicting future goals. Fenwick filters out blocked shots, and a defenseman shooting from long distance through traffic is more likely to have a shot blocked. For forwards, it is straightforward that scoring chances are more valuable than a simple shot attempt from anywhere: a player is more likely to score on a high-quality shot.
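To make the difference between these measures concrete, here is a minimal Python sketch that scores the same handful of shot attempts four ways. Everything in it is an assumption for illustration: the shot data is made up, the 25-foot scoring chance cutoff is a toy stand-in for a location-based definition, and the per-shot goal probabilities are invented rather than taken from any real expected goals model or from the presentation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ShotAttempt:
    team: str           # "home" or "away"
    result: str         # "goal", "save", "miss", or "block"
    distance_ft: float  # distance from the net in feet
    xg: float           # invented goal probability, for illustration only


# Hypothetical cutoff: real scoring chance definitions vary by source.
SCORING_CHANCE_MAX_FT = 25.0

# Made-up shot attempts; blocked attempts are given an xg of 0.0 here.
ATTEMPTS = [
    ShotAttempt("home", "goal", 12.0, 0.22),
    ShotAttempt("home", "save", 18.0, 0.12),
    ShotAttempt("home", "block", 55.0, 0.0),
    ShotAttempt("home", "miss", 40.0, 0.04),
    ShotAttempt("away", "save", 10.0, 0.20),
    ShotAttempt("away", "block", 58.0, 0.0),
    ShotAttempt("away", "block", 50.0, 0.0),
    ShotAttempt("away", "save", 45.0, 0.03),
]


def corsi_weight(a: ShotAttempt) -> float:
    """Corsi: every shot attempt counts as 1."""
    return 1.0


def fenwick_weight(a: ShotAttempt) -> float:
    """Fenwick: only unblocked attempts count."""
    return 0.0 if a.result == "block" else 1.0


def scoring_chance_weight(a: ShotAttempt) -> float:
    """Toy scoring chance: any attempt inside the distance cutoff."""
    return 1.0 if a.distance_ft <= SCORING_CHANCE_MAX_FT else 0.0


def xg_weight(a: ShotAttempt) -> float:
    """Expected goals: each attempt counts as its goal probability."""
    return a.xg


def share_for(team: str, weight: Callable[[ShotAttempt], float]) -> float:
    """Share of the weighted total that belongs to `team`."""
    team_total = sum(weight(a) for a in ATTEMPTS if a.team == team)
    all_total = sum(weight(a) for a in ATTEMPTS)
    return team_total / all_total if all_total else 0.0


for name, weight in [
    ("Corsi", corsi_weight),
    ("Fenwick", fenwick_weight),
    ("Scoring chances", scoring_chance_weight),
    ("Expected goals", xg_weight),
]:
    print(f"{name}: home share = {share_for('home', weight):.3f}")
```

The only thing that changes between the four measures is the weight each attempt gets. Corsi, Fenwick, and scoring chances all use weights of 0 or 1, while expected goals lets the weight fall anywhere in between, which is the "quantity plus quality" idea in code form.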

A scoring chance, however, is defined by an arbitrary threshold that has traditionally been mostly location-based (think of the home plate diagram). Looking only at scoring chances to predict future goals treats every other shot as so unhelpful that, if it isn’t a scoring chance, it wouldn’t be expected to become a goal. Corsi measures go the other way and presume that every shot attempt starts with an equal likelihood of becoming a goal before defense and goaltending are considered. Expected goals, as we know now (though the concept was in its infant/toddler stages in early 2015), combine quantity and quality in one measure. What does this have to do with Sam, though? Well, it shows his role in the evolution of hockey analytics in thinking critically about what the next frontier is. Which is a great segue.

WAR being, well, on ice

Okay, so the article I linked to isn’t a “This is how WAR works” piece; I guess I technically cheated here. They actually have an 11-part blog series on the site explaining how it worked back in the mid-2010s. But since we are still a ways away from WAR and GAR in the curriculum, I’m highlighting this blog from Sam and Andrew [Thomas]’s old site because it is a very good critical thinking piece. This is how it starts:

One of the biggest questions when it comes to WAR is exactly what question we’re trying to answer by constructing these measures. There are a few that we need to address, because they’re often all asked at once.

  1. What actually happened?
  2. What would have happened, given what we know now, and events repeated over again?
  3. What would have happened (if we had made a change)?
  4. What’s going to happen next?
  5. What will happen next, if we can make a change?

For each of these points, the answer seems clear. The biggest issue is that they often overlap, and figuring out exactly what questions we’re answering is a bit more difficult. When it comes to what we’re trying to learn about players

Absorb those steps, students. This is the process for developing the majority of statistical methods from scratch, and it fits in with the second of the two expected goals pre-req posts. Each question translates to a general statistical step:

  1. What actually happened? Acquire event data
  2. What would have happened, given what we know now, and events repeated over again? Summarize the event data and estimate parameters
  3. What would have happened (if we had made a change)? Compare two (or more) groups of summarized event data
  4. What’s going to happen next? Use parameters in predictive models and simulations
  5. What will happen next, if we can make a change? Compare two (or more) groups of predictive models and simulations

While some concepts won’t reach the fourth or fifth steps because of things like a lack of data or limited benefit from future predictions, the foundation of at least the first three steps has a wide range of applications and underpins statistical analysis across many industries.
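As a rough illustration of how the five steps fit together, here is a small Python sketch using only the standard library. The data is invented (1 means a shot became a goal, 0 means it didn’t), the "with player" and "without player" groups are hypothetical, and the simple goal rate and simulation are stand-ins chosen for brevity. This is not how WAR On Ice built its measures.

```python
import random

# Step 1: what actually happened? Made-up event data, one entry per shot.
WITH_PLAYER = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
WITHOUT_PLAYER = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]


def goal_rate(events):
    """Step 2: summarize the events into one estimated parameter."""
    return sum(events) / len(events)


def rate_difference(group_a, group_b):
    """Step 3: compare two groups of summarized event data."""
    return goal_rate(group_a) - goal_rate(group_b)


def simulate_goals(rate, shots, runs, seed):
    """Step 4: average goals over `runs` simulated replays of `shots` shots."""
    rng = random.Random(seed)  # seeded so the result is repeatable
    total = 0
    for _ in range(runs):
        total += sum(1 for _ in range(shots) if rng.random() < rate)
    return total / runs


def simulated_change(rate_a, rate_b, shots, runs, seed):
    """Step 5: compare the simulations for two scenarios."""
    goals_a = simulate_goals(rate_a, shots, runs, seed)
    goals_b = simulate_goals(rate_b, shots, runs, seed)
    return goals_a - goals_b


rate_with = goal_rate(WITH_PLAYER)
rate_without = goal_rate(WITHOUT_PLAYER)
print("Step 2 rates:", rate_with, rate_without)
print("Step 3 difference:", rate_difference(WITH_PLAYER, WITHOUT_PLAYER))
print("Step 4 simulated goals:", simulate_goals(rate_with, 30, 2000, seed=1))
print("Step 5 simulated change:", simulated_change(rate_with, rate_without, 30, 2000, seed=1))
```

Using the same seed for both scenarios in step 5 means they see the same random draws, so the difference between them reflects the change in rate rather than simulation noise. With only 20 shots per group, the estimated rates themselves are very noisy, which is exactly the kind of data limitation that can stop a concept from reaching steps four and five.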

Reward for reaching the end of the research – Make your own rink!

Even if you are unfamiliar with R coding, if you can copy and paste, you are one step away from making your own NHL-regulation rink, so that’s fun! It’s also the end of the road on WAR On Ice for this series of posts, because everything else is no longer on the site. As the homepage says, though, their GitHub will stay up indefinitely, and look, it has more on ice rinks. There’s a lot to parse through on their GitHub, but we don’t need to go through it here: while the contents no longer exist on their website, those working on the public side of the hockey analytics community have their own versions of things such as NHL stat scrapers, contract calculators, and WAR calculations.

So we can call it a class here. It’s a bummer that more of Sam’s past work isn’t still available online, but the limited pieces that are still around definitely give a great idea of the kind of experience the Sabres have brought in this summer and will have going forward.
