By Mark Dodd
As frivolous as sports may seem, there is no denying the passion and love that sport elicits in people. It is a passion that unites people in celebration and agony, bringing them together in joy and sorrow. This passion has created an entire sub-industry: sports analytics.
Sports are a results-based entertainment industry in which a winner is ultimately, and quantifiably, established, and this generates a wealth of data to be analyzed. This analysis is not fully embraced by everyone who discusses, manages, or plays the games, but that doesn’t mean it hasn’t had an impact. Decades-old “laws” regarding sports betting have changed; how teams pick players has changed; how we watch and enjoy the game has changed. Sports data analysis was immortalized in the movie Moneyball, starring Brad Pitt, based on the true story of how the Oakland Athletics revolutionized baseball.
My focus for this project will be applying data analysis to the professional hockey domain. Specifically, the goal of this project is to create a heatmap visualization that can be used to gain insights on how teams and players play.
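To make the heatmap idea concrete, here is a minimal sketch of the underlying computation: binning event coordinates into a grid of counts. It is an illustration only, not the project's actual implementation. The function name `shot_density`, the bin counts, the sample coordinates, and the assumption that coordinates are measured in feet from center ice are all hypothetical.

```python
import numpy as np

# Assumed coordinate extents in feet: a regulation NHL rink is 200 ft by 85 ft,
# and coordinates are assumed here to be centered at center ice.
X_RANGE = (-100.0, 100.0)
Y_RANGE = (-42.5, 42.5)


def shot_density(xs, ys, bins=(20, 9)):
    """Count events per grid cell; the returned grid is shaped (y_bins, x_bins)."""
    counts, x_edges, y_edges = np.histogram2d(
        xs, ys, bins=bins, range=[X_RANGE, Y_RANGE]
    )
    # histogram2d puts x on the first axis; transpose so that rows follow y,
    # which is the layout most heatmap plotting functions expect.
    return counts.T, x_edges, y_edges


# Made-up coordinates purely for illustration (not taken from the dataset).
xs = [75.0, 80.0, 78.0, -60.0]
ys = [5.0, -3.0, 2.0, 20.0]
grid, x_edges, y_edges = shot_density(xs, ys)
```

A grid like this could then be handed to a plotting call such as Plotly's `go.Heatmap(z=grid)` to draw the heatmap; finer bins give more spatial detail at the cost of noisier counts.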
Our data is a public dataset from Kaggle called NHL Game Data (see references). The dataset was created using the NHL API, which is documented at https://gitlab.com/dword4/nhlapi. In addition to the Kaggle dataset, we polled the API directly to gather specific information required for our analysis.
The Kaggle NHL dataset can be visualized with the following entity relationship diagram.
import pandas as pd
import numpy as np
import ipywidgets as widgets
from ipywidgets import interact
import plotly.graph_objects as go
import plotly.offline as py
py.init_notebook_mode(connected=False)