[GH-ISSUE #76] [FBref] Can't fetch schedule data #14

Closed
opened 2026-03-02 15:55:01 +03:00 by kerem · 3 comments

Originally created by @BelkacemB on GitHub (Aug 17, 2022).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/76

If you run:

import soccerdata as sd
fbref = sd.FBref(leagues="ENG-Premier League", seasons=2021)
print(fbref.__doc__)

epl_schedule = fbref.read_schedule()

you will get an error:

frame.py 3832 _set_item
value = self._sanitize_column(value)

frame.py 4535 _sanitize_column
com.require_length_match(value, self.index)

common.py 557 require_length_match
raise ValueError(

ValueError:
Length of values (0) does not match length of index (31)
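
For context, here is a minimal sketch of how pandas produces this message. It assumes, based only on the wording of the error, that a column of values parsed from the page came back empty while the table itself had 31 rows. The frame and column names are illustrative stand-ins, not soccerdata internals.

```python
import pandas as pd

# Stand-in for a parsed schedule table with 31 rows (illustrative data only).
schedule = pd.DataFrame({"match": range(31)})

# Stand-in for a column whose values could not be extracted from the page.
parsed_values = []

error_message = None
try:
    # Assigning a list whose length differs from the index raises ValueError.
    schedule["parsed"] = parsed_values
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The printed message should match the one in the traceback above, which points to an empty parse result rather than a problem with the DataFrame itself.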

kerem closed this issue 2026-03-02 15:55:02 +03:00

@MatsThijssen commented on GitHub (Aug 26, 2022):

This still appears to be an issue with pretty much any FBref function. I haven't investigated much, but my best guess at the moment is that FBref is making further attempts to discourage web scraping. If I have time to dive deeper, I will update here.


@probberechts commented on GitHub (Sep 2, 2022):

It does indeed seem related to FBref blocking bot traffic. However, soccerdata respects their policy. Since it already fails on the first request, I guess they simply block all headless traffic when the load is very high.

Most of the time, everything works just fine though. If it does not, I would recommend simply waiting a bit and trying again.
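
The "wait a bit" advice can be sketched as a small retry wrapper. `call_with_retries` is a hypothetical helper written for this thread, not part of soccerdata, and the attempt count and wait time are arbitrary assumptions.

```python
import time


def call_with_retries(func, attempts=3, wait_seconds=60.0, sleep=time.sleep):
    """Call func(); if it raises, wait and try again, up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            # Re-raise the last failure so the caller sees the real error.
            if attempt == attempts:
                raise
            sleep(wait_seconds)


# Hypothetical usage with the reader from the original report:
# epl_schedule = call_with_retries(fbref.read_schedule)
```

The `sleep` parameter is only there so the waiting can be swapped out, for example in tests. A long, fixed wait keeps the request rate low, in line with the point about respecting the site's policy.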


@probberechts commented on GitHub (Sep 27, 2022):

@BelkacemB I just found out that your error might be caused by using cached data. Try disabling caching with:

import soccerdata as sd
fbref = sd.FBref(leagues="ENG-Premier League", seasons=2021, no_cache=True)
epl_schedule = fbref.read_schedule()

If that works, just delete your cache (the default location is `~/soccerdata/data/FBref`) and scrape all data again.

FBref recently renamed some HTML attributes. This was fixed by 1f4128bc6ef9a00fab921f2f70cfad64ecab54fb. Naturally, running the latest version on cached data that still has the old HTML attributes now causes problems.
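
Deleting the cache can also be done from Python. This is a minimal sketch that assumes the default cache location mentioned above; `clear_cache` and `DEFAULT_FBREF_CACHE` are hypothetical names used for illustration, not part of soccerdata.

```python
import shutil
from pathlib import Path

# Default cache location quoted in this thread; adjust it if yours differs.
DEFAULT_FBREF_CACHE = Path.home() / "soccerdata" / "data" / "FBref"


def clear_cache(cache_dir=DEFAULT_FBREF_CACHE):
    """Delete cache_dir and everything in it; return True if it was removed."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        # Nothing to delete, for example on a fresh install.
        return False
    shutil.rmtree(cache_dir)
    return True


# Hypothetical usage (this permanently deletes the cached pages):
# clear_cache()
```

After clearing the cache, the next `read_schedule()` call has to download every page again, so expect it to take longer than usual.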
