Expected Learning Week 2: The Why and How of Corsi (And Fenwick)
by Eddy Tabone - June 24, 2021

Welcome back to summer school. Thank you for not dropping the course after I went right into Chapter 1 without spending the first class on the non-existent syllabus. This’ll be another fairly short class, but we’re still building the foundation for understanding how some of the more widely used concepts became part of the regular lexicon in hockey analytics.

As I discussed last week, inference is the initial pillar of the analysis process, and in sports, everything traces back to a single question: How do I win? In hockey, you score more goals than the other team by the time the game ends, as cut and dried as the rulebook would state it. When a team is on offense, the question is how do we score. When a team is on defense, the question is how do we keep them from scoring.

Since a positive goal differential is the objective in any game, that’s a good first place to start for any analysis. The teams with the best goal differential in a season are, more often than not, the better teams. Of course, the emphasis here is that this is most helpful over a full-season sample, because that removes the noise of single lopsided blowouts in one direction or another. Good teams have their bad nights and bad teams have their good nights (remember when the 2018-19 Sabres won 2 of 16 games in March but finished the season with 5-2 and 7-1 wins over Ottawa and Detroit?). Metrics such as Pythagorean wins, which came out of other sports, estimate a team’s wins from what it scores and allows, but especially in hockey, where different results earn different numbers of standings points, we don’t have to go into that here.
From the standpoint of strategy, monitoring goal differential is a good tool, but for answering the big questions of how to score goals and how to prevent them, you lose a lot of context by looking only at the differential. While the final score is certainly the result of a game, analysis takes place one level out from that core, where goals are themselves the result of an event: the shot. Working at that level lets us analyze offense separately from defense.

This next “How do we score? Take shots.” layer would’ve been the first layer of strategy placed into hockey the second someone muttered for the first time, “The more shots you take, the more chances you have to score.” Then, prior to the 1959-60 season, some other person at the NHL offices muttered, “Hey, we should write these shots down,” and with that, shots on goal were first recorded by the league. Like I talked about last week, data collection is the first step of analytics, and for 1960, I’d call that analytics (okay fine, it’s a stretch, but come on, there weren’t that many computers at that point to automate visualizations and store large datasets).

So that’s what we had for decades from the NHL, and we can certainly analyze shots on goal. They’re especially useful for goalies: they let us understand save percentages and how they vary from season to season, and they gave us data on saves that we didn’t have in the previous decades of NHL hockey. It’s not a stretch to say that GAA and SV% were an advancement in analysis at one point. And conveniently enough, goalies are the reason for the next evolution in shot analysis. Goalies, and, according to legend, the Buffalo Sabres.
*Pause for dramatic effect*

The story goes, according to Bob McKenzie, that Jim Corsi, then the Sabres’ goaltending coach, would record missed and blocked shots in addition to shots on goal for his goalies. Those shots still contributed to a goalie’s workload, since the goalie would still be going down into the butterfly and back up as the shot was being taken. Vic Ferrari, a mid-2000s hockey analyst, picked the name Corsi mostly because of Jim Corsi’s mustache. (Read Bob’s article for more on the mustache’s role in the story.) A little while later on the timeline, Matt Fenwick, who was blogging about the Calgary Flames, would focus on the unblocked shot attempts only, and thus Fenwick was added to the lexicon.

So now in tabular form, because the names in the lexicon aren’t supposed to be a secret:

| Metric | NHL name | What it counts |
|---|---|---|
| Corsi | Shot Attempts | Goals + Saved Shots + Shots off post + Shots that miss net + Shots that are blocked |
| Fenwick | Unblocked Shot Attempts | Goals + Saved Shots + Shots off post + Shots that miss net |

The NHL actually uses Shot Attempts and Unblocked Shot Attempts on their stats pages, which may be the first time they have chosen to avoid confusion and a lack of clarity just for the heck of it.

Why else is it important to collect and analyze all shot attempts? Well, for starters, about one in four unblocked shot attempts misses the net, and that doesn’t even count the shots that are blocked (Moneypuck’s shot data covers shots that aren’t blocked). Since puck possession is estimated with shot attempts (a whole other can of worms that we will get to eventually), narrowing which attempts you count narrows the picture.

[Image borrowed from earlier in the season]

So throughout the 2010s, through the LA Kings’ run to multiple Stanley Cups, where they were the poster children of puck possession and winning the shot attempt battle, analysis was centered on the quantity of shot attempts as a driver of which teams score more, allow fewer goals, and win more frequently.
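If the Corsi and Fenwick definitions above are easier to digest as arithmetic, here’s a minimal Python sketch of the bookkeeping. Everything named here (`ShotTally`, `fenwick`, `corsi`, `attempt_share`) is a hypothetical helper I made up for illustration, and the game totals are invented numbers, not real data. The share function is just one common way to express “winning the shot attempt battle”; it’s an assumption of this sketch rather than anything defined above.

```python
from dataclasses import dataclass


@dataclass
class ShotTally:
    """One team's shot attempts in a game (hypothetical container, made-up fields)."""
    goals: int
    saved: int        # shots on goal the goalie stopped
    off_post: int
    missed_net: int
    blocked: int      # this team's attempts that a defender blocked


def shots_on_goal(t: ShotTally) -> int:
    # The classic shots-on-goal count: goals plus saves.
    return t.goals + t.saved


def fenwick(t: ShotTally) -> int:
    # Unblocked shot attempts, term by term from the Fenwick definition.
    return t.goals + t.saved + t.off_post + t.missed_net


def corsi(t: ShotTally) -> int:
    # All shot attempts: Fenwick plus the attempts that were blocked.
    return fenwick(t) + t.blocked


def attempt_share(team: int, opponent: int) -> float:
    # Fraction of all attempts in the game that belonged to `team`.
    # Assumes at least one attempt was taken by someone.
    return team / (team + opponent)


# Invented totals for a single game.
home = ShotTally(goals=3, saved=27, off_post=1, missed_net=10, blocked=14)
away = ShotTally(goals=2, saved=22, off_post=0, missed_net=8, blocked=12)

print(shots_on_goal(home), fenwick(home), corsi(home))    # 30 41 55
print(round(attempt_share(corsi(home), corsi(away)), 3))  # 0.556
```

In this made-up game, the home team’s 30 shots on goal are barely more than half of its 55 total attempts, which is the whole argument for widening the scope beyond shots on goal.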
And once we got a handle on shot quantities, people started asking the next logical question: What about shot quality? What shots are the most likely to be goals? Hopefully by now you’ve caught on to what is coming next, the star of the course: expected goals.

Photo Credit: Bill Wippert/NHLI via Getty Images