[GH-ISSUE #869] [FBref] 403 Forbidden error when running from Google Colab #188

Closed
opened 2026-03-02 15:56:31 +03:00 by kerem · 4 comments
Owner

Originally created by @gustavoalikan1910 on GitHub (Aug 20, 2025).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/869

Describe the bug
I've been using soccerdata to scrape data from fbref.com for about a year, and it's always worked fine. However, I'm now consistently getting a 403 Forbidden error, and I can no longer scrape any data.

I've already tested this locally, on Google Colab, and in other environments using different IP addresses. Since it fails in all of them, the problem appears to go beyond a simple IP block.

Affected scrapers
This affects the following scrapers:

  • ClubElo
  • ESPN
  • [ x ] FBref
  • FiveThirtyEight
  • FotMob
  • Match History
  • SoFIFA
  • Understat
  • WhoScored

Code example
A minimal code example that fails. Use no_cache=True to make sure an invalid cached file does not cause the bug and make sure you have the latest version of soccerdata installed.

import soccerdata as sd
fbref = sd.FBref(leagues="ENG-Premier League", seasons="24/25", no_cache=True)
fbref.read_schedule()

Error message

Error while scraping https://fbref.com/en/comps/. Retrying... (attempt 1 of 5).    [_common.py:545]
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/soccerdata/_common.py", line 525, in _download_and_save
    response.raise_for_status()
  File "/usr/local/lib/python3.12/dist-packages/requests/models.py", line 1026, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://fbref.com/en/comps/

Additional context
It had been working for a long time but stopped yesterday (2025-08-19). Is FBref blocking the soccerdata package?

Contributor Action Plan

  • I can fix this issue and will submit a pull request.
  • I’m unsure how to fix this, but I'm willing to work on it with guidance.
  • [ x ] I’m not able to fix this issue.

@gustavoalikan1910 commented on GitHub (Aug 20, 2025):

Here's a curious fact:

When I run the script locally on my home network in Brazil, it works.

Image: https://github.com/user-attachments/assets/031fdd85-9c72-4d08-bf7b-2583346333c5

However, when I run the exact same script on Google Colab, I get a 403 error. Similarly, when I run it on my Oracle server, I also get a 403 error.

Image: https://github.com/user-attachments/assets/949e9a1b-cb5d-48c0-815f-d6b5119f457f
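
One way to compare environments like this is to request the same URL from each of them and record the status code. The sketch below is only an illustration: `probe_status` is a hypothetical helper, not part of soccerdata, and it uses the standard library's `urllib` rather than the `requests` call soccerdata makes. The two send different default headers, so a 403 here does not by itself prove an IP-level block.

```python
import urllib.error
import urllib.request


def probe_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for `url`, including error codes such as 403.

    `opener` is injectable so the function can be exercised without network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # urllib raises 4xx/5xx responses as HTTPError; the code is what we want.
        return exc.code


# Example (performs a real request, so it is left commented out):
# print(probe_status("https://fbref.com/en/comps/"))
```

Running it once on the home network, once on Colab, and once on the server gives a quick side-by-side view of which environments are rejected.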

@probberechts commented on GitHub (Aug 21, 2025):

FBref automatically blocks your IP address if you send more than 10 requests per minute (see https://www.sports-reference.com/bot-traffic.html). If multiple people scrape FBref from Google Colab notebooks, Google's IP addresses will inevitably get blocked. It might also be that FBref has permanently blocked Google's IP addresses. I cannot fix this. You either have to use a proxy or not use Google Colab.
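
For readers writing their own scraping loop, the 10-requests-per-minute limit can be respected with a small sliding-window throttle. This is a minimal sketch under stated assumptions: `RateLimiter` is a hypothetical class, not soccerdata's internal mechanism, and the default of 10 calls per 60 seconds is taken from the limit quoted above. It only limits a single process, so it cannot help when many users share the same IP address, as on Colab.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most `max_calls` calls in any sliding window of `period` seconds."""

    def __init__(self, max_calls=10, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self._clock()
        # Drop timestamps that have already left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            # Window is full: sleep until the oldest call expires.
            self._sleep(self.period - (now - self._calls[0]))
            now = self._clock()
            self._calls.popleft()
        self._calls.append(now)
```

Calling `limiter.wait()` immediately before each HTTP request keeps the process at or below the configured rate.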

@mattpfreer commented on GitHub (Aug 21, 2025):

Was anyone able to find a way around the 403 error? I have troubleshot similar issues with the Sports Reference team before (for different sports) and they were able to resolve them, but I am not getting helpful responses for this latest FBref issue.


@gustavoalikan1910 commented on GitHub (Aug 22, 2025):

"Unfortunately, even when using a proxy, the 403 error persists."
