soccer_analytics
soccer_analytics is a Python project that aims to facilitate soccer analytics work and to serve as a starting point for new projects.
This project includes a number of notebooks that serve as tutorials on how to use the helper functions and may be a good starting point for soccer analytics in general.
The notebooks can be found here, and I recommend going through them in the following order:
Exploratory analysis event data: This notebook gives you an overview of the pre-processed Wyscout data
and runs a rudimentary exploratory analysis using pandas-profiling.
Goal kick analysis: In this notebook we identify the best teams with respect to goal kicks in the Bundesliga. Along the way we learn how to
Passing analysis: We continue our journey by looking at passes between players and analyze one match in more detail. Technically, we learn how to
use the helper function to:
Expected goal model with logistic regression: While the previous notebooks were mostly about visualization, in this notebook we start
looking into machine learning. We jointly build an expected goal model using logistic regression and learn about the fundamentals of machine learning, e.g.:
Challenges using gradient boosters: In this rather technical notebook we look into some of the challenges that often
arise in real-life situations when using gradient boosters like LightGBM or XGBoost, such as:
Introduction to tracking data: In this notebook we start looking into tracking data provided by Metrica Sports. We learn
about the fundamentals of working with tracking data, such as:
Passing probability model: In this notebook we look at a pass probability model as proposed by Peralta et al. and Spearman et al. (see papers below) and apply it to a first use case.
More precisely, this notebook includes:
If you are new to Python and soccer analytics, I recommend downloading the Anaconda distribution and following
the instructions under Conda.
conda create -n soccer_analytics python=3.7
conda activate soccer_analytics
pip install -r requirements.txt
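Once the environment is active, the idea behind the expected goal notebook can be tried out quickly. The following is only a minimal sketch and not the notebook's actual code: it uses nothing but the standard library, generates hypothetical synthetic shots (the distance range and the "true" scoring curve are assumptions made up for illustration), and fits a one-feature logistic regression by gradient descent. All function names here are illustrative and are not part of the project's helper functions.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_shots(n, seed=0):
    """Generate hypothetical shots as (distance to goal in metres, goal flag)."""
    rng = random.Random(seed)
    shots = []
    for _ in range(n):
        distance = rng.uniform(1.0, 35.0)
        # Assumed ground truth (illustration only): scoring chance decays with distance.
        p_goal = sigmoid(1.0 - 0.15 * distance)
        shots.append((distance, 1 if rng.random() < p_goal else 0))
    return shots


def fit_logistic_regression(shots, learning_rate=0.5, epochs=500):
    """Fit intercept and slope by batch gradient descent on the log loss."""
    # Standardise the feature so that a single learning rate works well.
    distances = [d for d, _ in shots]
    mean = sum(distances) / len(distances)
    std = math.sqrt(sum((d - mean) ** 2 for d in distances) / len(distances))
    intercept, slope = 0.0, 0.0
    for _ in range(epochs):
        grad_b, grad_w = 0.0, 0.0
        for d, goal in shots:
            x = (d - mean) / std
            error = sigmoid(intercept + slope * x) - goal
            grad_b += error
            grad_w += error * x
        intercept -= learning_rate * grad_b / len(shots)
        slope -= learning_rate * grad_w / len(shots)
    # Convert the coefficients back to the original distance scale.
    return intercept - slope * mean / std, slope / std


def expected_goal(distance, intercept, slope):
    """Predicted scoring probability for a shot taken from the given distance."""
    return sigmoid(intercept + slope * distance)


shots = make_synthetic_shots(1000)
intercept, slope = fit_logistic_regression(shots)
xg_close = expected_goal(5.0, intercept, slope)
xg_far = expected_goal(30.0, intercept, slope)
print(f"xG at 5 m: {xg_close:.2f}, xG at 30 m: {xg_far:.2f}")
```

The fitted slope should come out negative, so shots from close range receive a higher expected goal value than shots from far out. A real model would be trained on actual shot events and would typically include further features such as the shot angle.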
Event data: Wyscout
Tracking data: Metrica
Great repository on pitch control and pitch impact by Laurie Shaw here
Physics-Based Modeling of Pass Probabilities in Soccer by Spearman et al. here
Seeing in to the future: using self-propelled particle models to aid player decision-making in soccer by
Peralta et al. here