[PR #480] [MERGED] Add support for scraping Understat #596

Closed
opened 2026-03-02 15:58:41 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/probberechts/soccerdata/pull/480
Author: @JanVanHaaren
Created: 2/11/2024
Status: Merged
Merged: 2/12/2024
Merged by: @probberechts

Base: master ← Head: feature/add-understat-support


📝 Commits (7)

  • 6e75a3d Add support to extract Understat's JS variables to BaseRequestsReader
  • c75262b Add support to scrape advanced statistics from Understat
  • 1478ee8 Add tests for new functionality
  • ef33eec Add documentation for Understat scraper
  • 6dc3aa8 Update dependency pip to v24
  • 58e8574 Update dependency sphinx-autobuild to v2024
  • 1d90bf0 fix: Add 'player' to read_player_season_stats index

📊 Changes

17 files changed (+2795 additions, -29 deletions)

📝 .github/workflows/constraints.txt (+1 -1)
📝 README.rst (+4 -3)
📝 docs/conf.py (+1 -0)
➕ docs/datasources/Understat.ipynb (+1926 -0)
📝 docs/datasources/index.rst (+14 -0)
📝 docs/index.rst (+2 -2)
📝 docs/reference/index.rst (+1 -0)
➕ docs/reference/understat.rst (+10 -0)
📝 poetry.lock (+9 -8)
📝 pyproject.toml (+1 -1)
📝 soccerdata/__init__.py (+6 -4)
📝 soccerdata/_common.py (+26 -10)
📝 soccerdata/_config.py (+5 -0)
➕ soccerdata/understat.py (+707 -0)
📝 tests/conftest.py (+12 -0)
➕ tests/test_Understat.py (+60 -0)
📝 tests/test_common.py (+10 -0)

📄 Description

This pull request adds support for scraping advanced statistics, such as xG, xGBuildup and xGChain, as well as shot events with their associated xG values, from the Understat website.

Concretely, this pull request includes the following changes:

  • Extends the BaseRequestsReader class with functionality to extract JavaScript variables from the Understat website.
  • Adds the Understat class with functionality to scrape leagues, seasons, schedules, team-match statistics, player-match statistics, player-season statistics and shot events.
  • Adds documentation for the Understat class.
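
To illustrate the first change, here is a minimal sketch of how a JavaScript variable could be pulled out of a page's HTML. It is not the code from this pull request. It assumes the page embeds its data as `var <name> = JSON.parse('...')` with the JSON payload written as `\xNN` hex escapes. The function name `extract_js_variable`, the variable name `shotsData` and the sample HTML are hypothetical; the actual implementation in `soccerdata/_common.py` may work differently.

```python
import json
import re

# Matches a single \xNN hex escape inside the embedded string literal.
_HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def extract_js_variable(html: str, var_name: str):
    """Decode `var <var_name> = JSON.parse('...')` found in `html`.

    Hypothetical helper: assumes the payload is JSON whose special
    characters are written as \\xNN escapes.
    """
    pattern = re.compile(
        r"var\s+" + re.escape(var_name) + r"\s*=\s*JSON\.parse\('(.*?)'\)"
    )
    match = pattern.search(html)
    if match is None:
        raise ValueError(f"JavaScript variable {var_name!r} not found")
    # Replace each \xNN escape with its character, then parse the JSON text.
    decoded = _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), match.group(1))
    return json.loads(decoded)


# Synthetic page fragment (not real Understat data), used only for illustration.
SAMPLE_HTML = (
    r"<script>var shotsData = "
    r"JSON.parse('\x7B\x22xG\x22\x3A\x220.42\x22\x7D')</script>"
)

shots = extract_js_variable(SAMPLE_HTML, "shotsData")
```

Decoding only the `\xNN` sequences, rather than using a general escape decoder, leaves any non-ASCII characters in the payload untouched before the JSON parser sees them.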

This pull request closes #151.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
