Expected Learning History Class: Matthew Barlowe

Last summer, I wrote two posts about Sam Ventura after he was hired by the Sabres. Over the course of the season, the Sabres hired two more people for their analytics department: Domenic Galamini as a data scientist and Matthew Barlowe as a data engineer. As I did with Sam's work last summer, I took a Google deep dive on both new hires to see what public work is still available online so we can better acquaint ourselves with the Sabres' analytics department.

First, we'll look at Matt's background, thanks to LinkedIn. His work in the hockey analytics space began back in 2016, when he started writing for Cardiac Cane, the Carolina Hurricanes FanSided blog. Most of these posts appear to still be accessible on the site. Among them are Stat Roundups, which feature the occasional graph and chart (mostly Tableau) with Corsi and expected goal data, among other stats. Unfortunately, a lot of them contain tweets that have since been deleted due to his work in the NHL, but here's one from a game against the Sabres, and another from a week later when the Sabres and Canes played again. This powerplay analysis post does feature a Tableau chart that is still available, as well as references to some film tweets for the eye test people in the crowd. Want some mid-2000s Canes' downfall content? Here you go.

His first job in sports came in December 2018, when the NHL hired him as a SQL Developer and Stats Analyst. After about a year, he moved into an App Developer role that appears to have focused heavily on integrating the NHL Tracking Data across the league. Then, after some work as a software engineer in the corporate world, the Houston Rockets hired him in February 2021 as a Basketball Operations Analyst, a role he kept until the Sabres hired him in January 2022 as a Data Engineer.

Data engineering typically refers to the process of taking collected data and preparing it to be analyzed down the road without much further manipulation. If you've heard of the ETL process, this is typically the responsibility of the data engineer. Luckily, each letter is fairly straightforward: Extract, Transform, and Load.

  • Extract: Getting the data we want (in this case, for doing hockey analysis) often involves acquiring said data from different sources. For a one-off question, exporting a single file to Excel is the easiest way to go, but part of building a data pipeline is automating as much as possible for consistency and speed.
  • Transform: These are the steps that make the data clean, consistent, and joined together so that analysis can be done to the full extent of the desired scope.
  • Load: The loading process is simply taking this newly transformed data and saving it somewhere, whether that's within the same database or to a different part of the overall data warehouse.
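To make the three steps above concrete, here's a minimal ETL sketch in Python. Everything in it is hypothetical: the made-up shot data, the column names, and the derived shooting-percentage stat are all stand-ins, not anything from Barlowe's actual pipelines. It extracts rows from a CSV source, transforms them (type coercion, dropping incomplete records, deriving a new column), and loads the result into a SQLite table.

```python
import csv
import io
import sqlite3

# Extract: in a real pipeline this might be an API call or a database
# query; here an in-memory CSV of made-up shot data stands in.
RAW_CSV = """team,shots,goals
BUF,34,3
CAR,29,2
BUF,,1
"""

def extract(raw):
    """Read the raw CSV into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows):
    """Coerce types, drop rows missing a shot count, derive shooting %."""
    clean = []
    for row in rows:
        if not row["shots"]:
            continue  # skip incomplete records
        shots, goals = int(row["shots"]), int(row["goals"])
        clean.append((row["team"], shots, goals, round(goals / shots, 3)))
    return clean

def load(rows, conn):
    """Persist the transformed rows to a table analysts can query directly."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS shots "
        "(team TEXT, shots INTEGER, goals INTEGER, sh_pct REAL)"
    )
    conn.executemany("INSERT INTO shots VALUES (?, ?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
print(conn.execute("SELECT team, sh_pct FROM shots").fetchall())
```

The payoff of the Load step is that last query: once the data lands in the table, an analyst can pull clean, typed numbers without redoing any of the cleanup.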

See? Data isn’t always scary.

Website

Barlowe’s website features links to the work he’s made public, including on RPubs and GitHub. The GitHub page mostly has references to some of his basketball work, but there are a few hockey repositories, including an Elite Prospects scraper, a scraper file for NHL and NWHL data, an NHL game prediction model, and another general repo for scraping data across different sports.

His site also has tutorial pages for getting started in Python, R, Java, SQL, Git, and Unix, as well as a statistics tutorial on the gamma distribution. This post, called Learning To Code, is one I would highly recommend, along with his last two posts before starting with the Sabres, which cover Picking A Language to start learning how to code and Asking Questions when it comes to narrowing down what kind of research to do.

There is also his public Tableau profile, which collects a lot of the NHL-related work he has put together.