mirror of
https://github.com/probberechts/soccerdata.git
synced 2026-04-25 18:15:58 +03:00
[GH-ISSUE #869] [FBref] 403 Forbidden error when running from Google Colab #188
Originally created by @gustavoalikan1910 on GitHub (Aug 20, 2025).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/869
Describe the bug
I've been using soccerdata to scrape data from fbref.com for about a year, and it's always worked fine. However, I'm now consistently getting a 403 Forbidden error, and I can no longer scrape any data.
I've already tested this locally, on Google Colab, and in other environments using different IP addresses. The error occurs in all of them, so the issue seems to go deeper than a simple IP block.
Affected scrapers
This affects the following scrapers: FBref
Code example
A minimal code example that fails. Use no_cache=True to make sure an invalid cached file does not cause the bug and make sure you have the latest version of soccerdata installed.
Error message
Additional context
It worked for a long time but stopped yesterday (2025-08-19). Is FBref blocking the soccerdata package?
Contributor Action Plan
@gustavoalikan1910 commented on GitHub (Aug 20, 2025):
Here's a curious fact:
When I run the script locally using my home network in Brazil, it works.

However, when I execute the exact same script on Google Colab, I get a 403 error. And similarly, when I run it on my Oracle server, I also encounter a 403 error.
@probberechts commented on GitHub (Aug 21, 2025):
FBref automatically blocks your IP address if you send more than 10 requests per minute. If multiple people scrape FBref from Google Colab notebooks, Google's IP addresses will inevitably get blocked. It might also be that they have permanently blocked Google's IP addresses. I cannot fix this. You either have to use a proxy or not use Google Colab.
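For readers who issue their own requests, the 10-requests-per-minute limit mentioned above can be respected with a small client-side throttle. The following is only a minimal sketch using the Python standard library. The class name `SlidingWindowLimiter` and its parameters are hypothetical and not part of soccerdata, and throttling does nothing for an IP range that is already blocked.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls within any `period`-second window.

    Hypothetical helper for illustration; not part of soccerdata.
    """

    def __init__(self, max_calls=10, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._calls = deque()    # timestamps of recent calls, oldest first

    def _prune(self, now):
        # Forget calls that have left the current window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self._clock()
        self._prune(now)
        if len(self._calls) >= self.max_calls:
            # Sleep until the oldest recorded call falls out of the window.
            self._sleep(self.period - (now - self._calls[0]))
            now = self._clock()
            self._prune(now)
        self._calls.append(now)
```

Calling `limiter.wait()` before each request keeps the request rate at or below the configured limit. Choosing a `max_calls` slightly below 10 leaves some margin for clock drift and retries.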
@mattpfreer commented on GitHub (Aug 21, 2025):
Was anyone able to find a way around the 403 error? I have troubleshot similar issues with the Sports Reference team before (for different sports) and they were able to resolve them, but I am not getting helpful responses for this latest FBref issue.
@gustavoalikan1910 commented on GitHub (Aug 22, 2025):
Unfortunately, even when using a proxy, the 403 error persists.