[GH-ISSUE #739] [FBref] team_match_stats for teams with slash "/" in the name results in FileNotFoundError #159

Open
opened 2026-03-02 15:56:16 +03:00 by kerem · 2 comments

Originally created by @ilyacherevkov on GitHub (Oct 30, 2024).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/739

Describe the bug
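read_team_match_stats fails for teams with a slash ("/") in the name, such as Bodø/Glimt.

The scraper tries to create the file matchlogs_Bodø/Glimt_2022_schedule.html. The slash is treated as a path separator, so the path points to a file Glimt_2022_schedule.html inside a directory matchlogs_Bodø that does not exist, and opening it for writing raises FileNotFoundError.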
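
The path resolution described above can be illustrated with a minimal sketch. It assumes the file name is built from a format string of the shape `matchlogs_{}_{}_{}.html` (inferred from the path in the error message, not taken from the soccerdata source) and uses `PurePosixPath` so that nothing is written to disk:

```python
from pathlib import PurePosixPath

# Data directory as it appears in the reported error message.
data_dir = PurePosixPath("/Users/user/soccerdata/data/FBref/historic")

# Assumed shape of the file mask; the real one lives in fbref.py.
filename = "matchlogs_{}_{}_{}.html".format("Bodø/Glimt", "2022", "schedule")
filepath = data_dir / filename

# The slash in the team name splits the file name into a directory and a file.
print(filepath.parent.name)  # matchlogs_Bodø
print(filepath.name)         # Glimt_2022_schedule.html
```

Because the directory matchlogs_Bodø is never created, opening this path for writing fails.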

Affected scrapers
This affects the following scrapers:

  • [ ] ClubElo
  • [ ] ESPN
  • [x] FBref
  • [ ] FiveThirtyEight
  • [ ] FotMob
  • [ ] Match History
  • [ ] SoFIFA
  • [ ] Understat
  • [ ] WhoScored

Code example
A minimal code example that fails. Use no_cache=True to make sure an invalid cached file does not cause the bug and make sure you have the latest version of soccerdata installed.

import soccerdata as sd
fbref = sd.FBref(leagues="SWE-Allsvenskan", seasons=[2022,2023], no_cache=True)
fbref.read_team_match_stats(stat_type="schedule", opponent_stats=False, team="Bodø/Glimt", force_cache=True)

Error message

Error while scraping https://fbref.com/en/squads/d86248bd/2022/matchlogs/all_comps/schedule. Retrying... (attempt 2 of 5).    _common.py:568
Traceback (most recent call last):
  File "/Users/user/.venv/lib/python3.12/site-packages/soccerdata/_common.py", line 564, in _download_and_save
    with filepath.open(mode="wb") as fh:
         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/pathlib.py", line 1013, in open
    return io.open(self, mode, buffering, encoding, errors, newline)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/Users/user/soccerdata/data/FBref/historic/matchlogs_Bodø/Glimt_2022_schedule.html'

Additional context
Note that the line number in _common.py shown in the traceback may differ from the released version, as I made minor changes to the code.

Contributor Action Plan

  • [ ] I can fix this issue and will submit a pull request.
  • [x] I’m unsure how to fix this, but I'm willing to work on it with guidance.
  • [ ] I’m not able to fix this issue.

@ilyacherevkov commented on GitHub (Oct 30, 2024):

I fixed it by changing the following line in fbref.py:
filepath = self.data_dir / filemask.format(team, skey, stat_type)
to
filepath = self.data_dir / filemask.format(team.replace('/',''), skey, stat_type)

Not sure if it breaks anything, though.
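
The effect of this change can be checked in isolation with a small sketch. The names `data_dir`, `filemask`, `skey` and `stat_type` mirror the line quoted above, but their values here are assumptions: the mask `matchlogs_{}_{}_{}.html` is inferred from the path in the error message rather than copied from fbref.py.

```python
from pathlib import PurePosixPath

# Assumed values, chosen to match the path in the reported error.
data_dir = PurePosixPath("/Users/user/soccerdata/data/FBref/historic")
filemask = "matchlogs_{}_{}_{}.html"
team, skey, stat_type = "Bodø/Glimt", "2022", "schedule"

# Same expression as the patched line: the slash is dropped from the team name.
filepath = data_dir / filemask.format(team.replace("/", ""), skey, stat_type)

print(filepath.parent.name)  # historic
print(filepath.name)         # matchlogs_BodøGlimt_2022_schedule.html
```

With the slash removed, the file lands directly in the data directory, which already exists.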


@probberechts commented on GitHub (Oct 30, 2024):

No, it won't break anything. A more generic solution would be to use something like Django's slugify() function (https://docs.djangoproject.com/en/4.0/_modules/django/utils/text/#slugify).
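
A slugify-style helper along these lines could be sketched with the standard library alone. The function name `safe_filename_part` is hypothetical and not part of soccerdata or Django. The sketch only loosely follows the approach of Django's `slugify()` with Unicode allowed (it keeps letter case, for instance), so treat it as an illustration rather than a drop-in replacement:

```python
import re
import unicodedata


def safe_filename_part(value: str) -> str:
    """Hypothetical helper: make a team name safe to embed in a file name."""
    # Normalize Unicode so equivalent spellings map to the same string.
    value = unicodedata.normalize("NFKC", value)
    # Drop everything except word characters, whitespace and hyphens.
    # This removes path separators such as "/" and "\".
    value = re.sub(r"[^\w\s-]", "", value)
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value).strip("-_")


print(safe_filename_part("Bodø/Glimt"))           # BodøGlimt
print(safe_filename_part("Paris Saint-Germain"))  # Paris-Saint-Germain
```

Unlike a plain `replace('/', '')`, this also handles other characters that are invalid or awkward in file names, at the cost of changing names that contain spaces.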