[GH-ISSUE #904] [Understat] Team's stat unable to be read #196

Closed
opened 2026-03-02 15:56:35 +03:00 by kerem · 4 comments

Originally created by @maris-volk on GitHub (Dec 9, 2025).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/904

Describe the bug
Unable to scrape the schedule from Understat. Either no data arrives at all, or a KeyError: 'statData' is raised.

Affected scrapers
This affects the following scrapers:

  • Understat

Code example

import soccerdata as sd
understat = sd.Understat(leagues='GER-Bundesliga', seasons='2526', no_cache=True)
match_data = understat.read_team_match_stats()

Error message

process data for Bundesliga..
[12/09/25 07:52:00] INFO     Saving cached data to C:\Users\MarsVilo\soccerdata\data\Understat    _common.py:263
error while process Bundesliga: 'statData'

or

Traceback (most recent call last):
  ———
  File "...\site-packages\soccerdata\understat.py", line 144, in read_seasons
    league_data = data["statData"]
                  ~~~~^^^^^^^^^^^^
KeyError: 'statData'

Additional context
Literally less than a day ago everything was working.

Contributor Action Plan

  • I’m not able to fix this issue.
kerem 2026-03-02 15:56:35 +03:00
  • closed this issue
  • added the bug label

@pres-2 commented on GitHub (Dec 9, 2025):

Same issue on my side. The cached data saved as json are empty.
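One way to confirm the empty-cache symptom is to scan the cache directory for JSON files that hold nothing useful. The following is a minimal sketch, not part of soccerdata: `find_empty_json` is a hypothetical helper, and the directory you pass in is an assumption (the log above suggests something like `soccerdata\data\Understat` under the user's home folder).

```python
import json
from pathlib import Path


def find_empty_json(cache_dir: Path) -> list[Path]:
    """Return JSON files under cache_dir that are blank, unparsable, or hold an empty/falsy value."""
    empty = []
    for path in sorted(cache_dir.rglob("*.json")):
        text = path.read_text(encoding="utf-8").strip()
        try:
            # A blank file is treated the same as an empty payload.
            payload = json.loads(text) if text else None
        except json.JSONDecodeError:
            payload = None
        if not payload:  # None, {}, [], "" and other falsy values count as empty
            empty.append(path)
    return empty
```

Calling `find_empty_json(Path.home() / "soccerdata" / "data" / "Understat")` would list the affected files, assuming the default cache location matches the path shown in the log.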


@rontrim commented on GitHub (Dec 10, 2025):

I'm running into the same issue. Did some digging and it looks like Understat isn't putting statData or datesData in the HTML anymore - it's all being loaded via JavaScript after the page loads.

import cloudscraper

scraper = cloudscraper.create_scraper()

# Homepage
response = scraper.get("https://understat.com/")
print('statData' in response.text)  # False

# League page
response = scraper.get("https://understat.com/league/EPL/2024")
print('datesData' in response.text)  # False

Both return 200 OK with valid HTML, but the JS variables the library is looking for just aren't there anymore. Only basic stuff like THEME and BASE_URL show up.

Seems like Understat changed their site to load data via AJAX instead of embedding it in the HTML, which breaks how the scraper currently works.
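To make this failure mode easier to diagnose than a bare KeyError, a scraper could first list which JavaScript variables the page actually declares. This is only a sketch under an assumption: that inline scripts declare their data as `var <name> = ...`, which matches the names mentioned above (statData, datesData, THEME, BASE_URL). Both helper functions are hypothetical and are not part of soccerdata or cloudscraper.

```python
import re

# Assumption: embedded data is declared as `var <name> = ...` in inline scripts.
JS_VAR_PATTERN = re.compile(r"\bvar\s+([A-Za-z_$][\w$]*)\s*=")


def list_js_variables(html: str) -> list[str]:
    """Return the sorted, de-duplicated names of variables declared with `var`."""
    return sorted(set(JS_VAR_PATTERN.findall(html)))


def require_js_variable(html: str, name: str) -> None:
    """Raise a descriptive error if `name` is not declared anywhere in the page."""
    found = list_js_variables(html)
    if name not in found:
        raise LookupError(f"{name!r} is not embedded in the page; found only: {found}")
```

With the snippet above, `list_js_variables(response.text)` shows what the page still exposes, and `require_js_variable(response.text, "datesData")` fails with a message that points at the site change rather than at the parsing code.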

Versions:

  • soccerdata version: 1.8.7
  • Python: 3.11
  • OS: Windows 11

@maris-volk commented on GitHub (Dec 10, 2025):

Okay, so what can you recommend to get information on matches, XG, XGa, corners and yellow cards?


@rontrim commented on GitHub (Dec 10, 2025):

> Okay, so what can you recommend to get information on matches, XG, XGa, corners and yellow cards?

I use the FBref scraper functions that soccerdata provides to get that data. I also have other data in my pipeline coming from Understat, so this is still an issue for me that I don't know how to resolve.
