
Expected Learning History Class – Sam Ventura and WAR On Ice

It was announced on July 5 that Sam Ventura will be joining the Sabres as VP of Hockey Strategy and Research. Sam has been with the Penguins since the start of the 2015-16 NHL season in various analytics roles, starting as a consultant and working up to Director of Hockey Ops and Hockey Research this past season in Pittsburgh. 

He is part of the rapidly growing Carnegie Mellon Statistics and Data Science pipeline in North American sports: he got his PhD in statistics from the school and was then added to the Penguins staff as a consultant by Jason Karmanos. While consulting for the Penguins, he was also teaching at the university. The Penguins then hired him full time in a director of analytics role for the 2017-18 season. He had yet to turn 30.

Readers, I’m telling ya, Sam’s legit. The Penguins put out this bio for Sam in November after he had his title changed to Director of Hockey Ops. In that bio, you’ll see mention of War On Ice, one of the original hockey analytics sites of the more modern analytics era (as in they were doing way more than talking about corsi). 

War on Ice was founded by a group of three hockey fans and math enthusiasts in 2014. It had stats and contract databases, data visualizations and tables for game and season data, and, as the name suggests, what was probably the first public WAR model in hockey. Sam was the first of the three to leave the site to work for a team, in July of 2015. In early 2016, the other two co-founders were hired by the Minnesota Wild: Andrew Thomas, a Statistics PhD from Harvard who is now the Director of Data Science for SportsMedia Technology Corp, and Alex Mandrycky, who is the Director of Hockey Strategy and Research for the Seattle Kraken. Not a bad legacy for the trio.

So, bad news for us: unsurprisingly, their site went offline after they had all been hired away. But like dinosaur fossils, their site’s influence hasn’t disappeared, which is good because 6 years is an eternity in hockey analytics. And the website’s blog is still available to view, so we can see the foundation of 2021 hockey analytics beginning to reveal itself.

Early adjusted save percentage

Their adjusted save percentage calculation circa 2015 weighted shot quality based on the high, medium, and low danger attempts faced, so that it accounts for the different shot profiles that goalies face. Natural Stat Trick still categorizes shots with these three labels, so the calculation is still easy to reproduce. One thing to note is that high/medium/low danger shot calculations based on binned locations have mostly been replaced by expected goal calculations, which has been the big evolution since this site was up. From the linked article from Andrew Thomas:

Different goaltenders face different distributions of shots from across the ice due to the offenses they face and the defenses in front of them. We adjust save percentage by re-weighing the components according to the league-wide distribution of shots, so that the value better translates between different goaltenders. This is similar to stratified sampling in survey methodology, and also goes by the name benchmarking.

If the shots faced by the goalie have the same ratio as the league average, then their unadjusted and adjusted save percentages will be equal.
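To make the re-weighting concrete, here is a minimal sketch in Python. It assumes the three danger tiers described above and weights each tier's save percentage by the league-wide share of shots in that tier. The function names, the data layout, and all the shot and goal counts are made up for illustration; this is not WAR On Ice's actual code.

```python
from typing import Dict, Tuple

DANGER_LEVELS = ("low", "medium", "high")

# Hypothetical layout: danger level -> (shots faced, goals against)
GoalieLine = Dict[str, Tuple[int, int]]


def raw_save_pct(goalie: GoalieLine) -> float:
    """Plain save percentage: 1 - goals against / shots faced."""
    shots = sum(s for s, _ in goalie.values())
    goals = sum(g for _, g in goalie.values())
    return 1 - goals / shots


def adjusted_save_pct(goalie: GoalieLine, league_shots: Dict[str, int]) -> float:
    """Re-weight per-tier save percentages by the league-wide shot mix."""
    league_total = sum(league_shots[level] for level in DANGER_LEVELS)
    adjusted = 0.0
    for level in DANGER_LEVELS:
        shots, goals = goalie[level]
        if shots == 0:
            raise ValueError(f"no {level}-danger shots faced; tier is undefined")
        tier_sv = 1 - goals / shots                    # goalie's save % in this tier
        weight = league_shots[level] / league_total    # league share of this tier
        adjusted += weight * tier_sv
    return adjusted


# Made-up numbers for illustration only.
league = {"low": 5000, "medium": 3000, "high": 2000}
typical = {"low": (500, 10), "medium": (300, 20), "high": (200, 30)}
under_siege = {"low": (200, 4), "medium": (300, 20), "high": (500, 75)}
```

Both made-up goalies stop shots at the same rate within each tier, but `under_siege` sees far more high-danger shots, so the raw save percentage is .901 against .940 for `typical`. After adjustment both come out at .940, and for `typical`, whose shot mix matches the league's, the raw and adjusted numbers are equal.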

2014 Pittsburgh Hockey Analytics Workshop

Sam and Andrew hosted an analytics workshop in 2014 where Sam gave two presentations, one of which went along with the theme of transitioning from corsi-based analysis to expected goals-based analysis. In a 2014 context, corsi’s appeal as an explanatory variable was that it was the most predictive counting statistic (a reminder that corsi is a counting stat) for predicting goals over a period of time (whether it’s a game, a season, etc.). And it wasn’t a secret that using shot attempts to predict goals (typically CF% to predict GF%) was limited; public tracking data just wasn’t widely available. The slides explain that statistical modeling, such as regression, can account and control for other contextual variables that affect shot volumes over a period of time. They reference four models that existed at the time and took this approach, hinting at what has evolved into expected goal modeling.

The second presentation, about zone transition times, also gave hints of the present state of hockey analytics. My favorite thing about this one is how it goes slide by slide to show another concept’s evolution, this time how to evaluate the defensive abilities of a player. It explains how NHL box score data limits defensive counting stats to things such as time-on-ice, hits, and blocked shots, none of which are predictive of future shot attempt or goal volumes. Turning to play-by-play data as another data source (called RTSS in the NHL, for Real-Time Scoring System), Sam explained how this PxP data could provide not only on-ice volume and rate statistics for shots and goals, but also information on the zone where each event took place. With this, the estimated time it took for the puck to move from one zone (offensive, neutral, defensive) to another was analyzed to determine how efficient teams and players were at (1) getting the puck out of the defensive zone and into the offensive zone and (2) keeping the puck in the offensive zone before it was moved back into the defensive zone. As fun as it would be for me to explain the Markov process of this study further, let’s talk instead about how this has also evolved since 2014. As tracking data becomes public, and through the last few years of Corey Sznajder’s manual tracking, zone entry and exit data has joined time-related measures to give a much clearer understanding of which teams and players are good at the transition aspects of hockey.
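For a rough feel of the zone transition idea, here is a deliberately simplified Python sketch. It is not the Markov model from Sam's slides: it only looks at consecutive play-by-play events and, whenever the zone changes between them, records the elapsed time as an estimate of that transition. The event format (seconds elapsed, zone code from one team's perspective) and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical event format: (seconds elapsed, zone), where zone is
# "D", "N" or "O" from one team's point of view.
Event = Tuple[float, str]


def mean_transition_times(events: List[Event]) -> Dict[Tuple[str, str], float]:
    """Average gap in seconds between consecutive events in different zones."""
    gaps = defaultdict(list)
    for (t0, z0), (t1, z1) in zip(events, events[1:]):
        if z0 != z1:  # only count pairs where the puck changed zones
            gaps[(z0, z1)].append(t1 - t0)
    return {pair: sum(times) / len(times) for pair, times in gaps.items()}


# Made-up sequence of events, sorted by time.
sample = [(0, "D"), (5, "D"), (12, "N"), (15, "O"), (30, "O"), (41, "D"), (50, "O")]
times = mean_transition_times(sample)
```

In this sample, `times[("D", "N")]` is 7.0 seconds and `times[("O", "D")]` is 11.0 seconds. With real data, shorter defensive-to-offensive times and longer offensive-to-defensive times would point in the direction the presentation describes, though play-by-play events are sparse enough that a real analysis needs a proper model rather than raw gaps.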

WAR On Ice Class Dismissed for the Day

Outside of their very detailed annotated glossary, that’s about it…for now. Don’t want to put too much in one post that doesn’t have any images or charts to break up the words. Sam’s scope of work doesn’t start and end on the remnants of the WAR on Ice website, so there’ll be more to look through in the coming weeks to pair with Expected Learning’s curriculum continuing. As for the website, we’re not done here either. There’s one more piece that we will get to later, but we have a lot more math to get to first. But for now, hopefully this gives a better glance into what may be coming to the Sabres front office while also providing a nice trip down memory lane of how far the field has evolved in less than a decade. 
