Expected Learning – Week 1

Looks like it’s gloomy in Sabreland, and there are only so many ways to write about mock offseasons without being dejected and scrapping the work before it’s done. So that’s why I figured we could have a much more light-hearted topic to carry us through the summer. 

So let’s all hop on the bus! Are we going to the beach? The park? A baseball game? Nope, even better. We’re going to summer school!

You’re not being punished, don’t worry. Nobody failed at anything either. We just have a lot of time on our hands, and through the first two rounds of the playoffs, the talking heads in the hockey world and on the internet are getting silly again with the analytics discourse. Of course, it’s nothing new. The same arguments get repeated every year, and most of them have drifted away from the whole point of “applying analytics in sports”.

I’m by no means the end-all-be-all source for “teaching” this “sports analytics course”, but between a statistics education, experience in sports analytics, and especially some of the connections that I’ve made in the industry over the last few years, I think I can muster up a good series of explainers for how this all works and maybe debunk some misconceptions about the field too. The single word “analytics” really doesn’t capture the range of work the field covers. With the hype that Moneyball got on the big screen in the last decade, and with the publicity that organizations like the Astros and the Eagles generated as heavy proponents of analytics following their championships, it’s not surprising that the fanfare around the “analytics evolution” has grown faster than the public’s understanding of the field.

So something else I’m trying to accomplish is taking a step back, because most of what falls under the scope of “analytics” is not meant to be overwhelming to grasp.

When people casually say analytics, most of the time they are referring to any application of working with data, and data at its core is nothing more than gathered information. Thinking of data as any information, and not as some secret code that can only be deciphered by the smartest scientists on the planet, removes most of the perceived knowledge gap.

And let’s get the most important thing out of the way: THERE ISN’T SUPPOSED TO BE A WAR BETWEEN ANALYTICS AND THE EYE TEST. In no way is that supposed to be the point. While informal, watching games is its own form of acquiring information because your brain is processing what is happening, and the observations are stored in memory. 

So even though observations are limited to what the person’s eyes saw, and narrowed even further to what they remember over time, it’s still a series of observations being “measured.” Even more importantly, on the other side of the spectrum, outside of the most lucrative leagues over the last couple of years, data collection is almost entirely a manual process, with tracking done, wait for it, live or by watching countless hours of film. People who work in sports analytics watch the games. I promise you. The anonymous person arguing with you on Twitter might not, but it’s part of the job when working in sports.

Observations are the initial pillar of all science and mathematics. To get philosophical with it, the roots of it all come from humanity’s curiosity. People are curious, which makes them ask questions, which leads them to develop their own hypotheses, which they then test and evaluate on the way to attempting to draw conclusions. In an analytical context, the experiment is the root of the data collection. If you’re writing down what you see, asking good questions, and summarizing what you collected, you’re still “doing analytics”. No prerequisites are required to get started.
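To show how low the bar really is, here is a minimal sketch in Python of "writing down what you see and summarizing it". Everything in it is an assumption made for illustration: the `observations` list is made-up notes rather than real game data, and `summarize` is a hypothetical helper, not part of any tracking tool.

```python
from collections import Counter

# Hypothetical hand-tracked notes from watching one period: (team, event).
# These entries are made up for illustration, not real game data.
observations = [
    ("home", "shot attempt"),
    ("away", "shot attempt"),
    ("home", "shot attempt"),
    ("home", "zone entry"),
    ("away", "zone entry"),
    ("home", "shot attempt"),
]


def summarize(notes, event):
    """Count how often each team was noted for the given event."""
    return Counter(team for team, kind in notes if kind == event)


# The question being asked: who is generating more shot attempts?
attempts = summarize(observations, "shot attempt")
total = sum(attempts.values())
home_share = attempts["home"] / total

print(f"Home shot attempts: {attempts['home']} of {total} ({home_share:.0%})")
```

A notebook and tally marks would do the same job. The code is just the tally marks written down in a way a computer can repeat.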

To use some overly simple, broad examples: the three-point revolution in basketball, valuing on-base percentage over batting average in baseball, the increase in going for it on fourth down in football, and the ongoing decrease of point shots in hockey all would’ve had their foundation in people asking questions about what will make them more likely to win.

Because in the end, the desire to win is the reason for the data revolution in sports. 

This concludes the introduction week of summer school. This isn’t so bad, right? It’s nice to learn! If there’s anything you all would like to learn about this summer, please let me know. I certainly have plenty of topics on the backburner, but we’re all in this offseason together. 

Next week we’re going to focus on how the need for more data created the Corsi revolution in hockey.

Photo Credit: Ryan Yorgen/NHLI via Getty Images

3 thoughts on “Expected Learning – Week 1”

  1. Great idea. I look forward to this. One of my initial questions is why is this called analytics for hockey and other sports, but sabermetrics for baseball? Haha, just kidding. Actually, specifically for Buffalo I guess it should be called “Sabremetrics”, no? Seriously, what do you feel are some limitations to analytics in hockey right now? I know the data contained within them is static, but perhaps how one analyzes it might have a bias. A specific situation that occurred this past season had me thinking about this. Florida traded for Sam Bennett. Communicating with some analytics pundits who are Panthers fans, their initial reaction to the trade was “meh”. Sure, he had value, but wasn’t really expected to move the needle for them. But then a funny thing happened. He absolutely flourished with them. Now, I may be remembering this incorrectly, but Bennett’s “charts” actually improved once he joined his new team. (I could be wrong, so this may throw this whole argument out the window). So there is potentially a bias a team may have against acquiring a certain player if they have less than favourable analytics, when all said player may need is a different environment. Would you agree with this statement? Are there certain analytics that can be used to mitigate these situations…for better or for worse? Thanks…

  2. So for the main question, I wouldn’t call it a limitation, but tracking data is the next frontier. The amount of information that will be able to be recorded is going to skyrocket, and we won’t know for a long time how much we will truly be able to get out of it (More on that later…)

    What’s funny about this example is at First Line we’re working with a client on a lineup-related project in basketball that is running into the same problem. Lineup interactions are a frontier that will take a while before we *really* get a good idea of how archetypes interact with each other and how players perform in and out of their archetype. Not only will player tracking need to be able to encompass the impact of all 12 players on the ice (in the hockey case) at once (right now we really mostly have off puck and then positioning frozen when an event happens with what little tracking data has been publicized), but it will also need to cover a fair number of games to get distinguishable clarity. And even with that, as with everything in statistics, you’ll never know the true formula, just more accurate estimates when there are this many interactions.

    So with current evaluation, you are limited to how players produce individually and as part of a line. So in the Bennett example, you can assign identities to player types and look into how Bennett plays with different types of lines to get a general idea of what may happen, and then stick to what would happen over the course of a season instead of against different defenses.

    It’s both a pro and a con of hockey that positions aren’t really that different compared to other sports where this kind of analysis would be occurring. A pro because you can look at individual talent with results and not archetypes and not be too far off, but a con because it’s probably going to take the longest to determine what those true archetypes are, which is going to keep the range of expectations higher for longer.

    Hope that all helps.

  3. Thanks for the response, I appreciate it. I look forward to what the more advanced tracking data will provide. If it’ll be more games like Colorado-Vegas, then we will all benefit.
