[GH-ISSUE #820] [Sofascore] Allow retrieving leagues from other regions #176

Open
opened 2026-03-02 15:56:25 +03:00 by kerem · 3 comments

Originally created by @jbrepogmailcom on GitHub (Mar 18, 2025).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/820

Describe the bug

Hello. The documentation advises adding more Sofascore leagues by looking them up at https://api.sofascore.com/api/v1/config/unique-tournaments/EN/football
However, I was looking for the Czech first league and did not find it there. I did find it on this page:
https://api.sofascore.com/api/v1/config/unique-tournaments/CZ/football

So I modified my local sofascore.py to read from that page and created a custom entry for the league:
{
    "CZE-Czech First League": {
        "Sofascore": "Czech First League"
    }
}

Unfortunately, that did not work: I am still getting index errors, even when I leave the entries for other scrapers in place.

Can you please advise how to use leagues from that second URL?
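
For readers following along: as far as the soccerdata documentation describes it, custom leagues go into a `league_dict.json` file in the config directory (assumed here to default to `~/soccerdata/config/`). The sketch below shows one way to add the entry above without overwriting entries that already exist. The helper name `add_custom_league` is hypothetical and not part of soccerdata.

```python
import json
from pathlib import Path


def add_custom_league(config_file: Path, league_key: str, sources: dict) -> dict:
    """Merge one league entry into a league_dict.json-style file.

    Hypothetical helper for illustration; soccerdata does not ship it.
    """
    # Load the existing mapping if the file is already there, else start empty.
    if config_file.exists():
        leagues = json.loads(config_file.read_text(encoding="utf-8"))
    else:
        leagues = {}

    # Merge rather than replace, so entries for other scrapers are preserved.
    leagues.setdefault(league_key, {}).update(sources)

    config_file.parent.mkdir(parents=True, exist_ok=True)
    config_file.write_text(
        json.dumps(leagues, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return leagues


# Example call (the path is an assumption; adjust it if SOCCERDATA_DIR is set):
# add_custom_league(
#     Path.home() / "soccerdata" / "config" / "league_dict.json",
#     "CZE-Czech First League",
#     {"Sofascore": "Czech First League"},
# )
```

Note that the value under `"Sofascore"` presumably has to match the tournament name exactly as the endpoint returns it.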

Affected scrapers
This affects the following scrapers:
sofascore.py

Code example
A minimal code example that fails. Use no_cache=True to make sure an invalid cached file does not cause the bug and make sure you have the latest version of soccerdata installed.

from soccerdata import Sofascore

al = Sofascore.available_leagues()
print("\nAvailable leagues:")
print(al)

# Initialize Sofascore instance with the "24-25" season
mh = Sofascore(leagues=["CZE-Czech First League"], seasons=["24-25"])

# Retrieve available leagues
available_leagues = mh.read_leagues()

# Display the list of leagues
print("\nAvailable leagues:")
print(available_leagues)

# Adjust the loop to iterate over the correct column or structure
for league in available_leagues.itertuples():
    print(league)

# Retrieve the schedule for the configured league ("CZE-Czech First League")
schedule = mh.read_schedule()

# Display the matches
print("\nMatches:")
for _, match in schedule.iterrows():
    print(match)

# Display the number of matches
# print(f"\nNumber of matches: {len(schedule)}")

Error message

Traceback (most recent call last):
  File "/home/janbenes/soccerdata-test2.py", line 11, in <module>
    available_leagues = mh.read_leagues()
                        ^^^^^^^^^^^^^^^^^
  File "/home/janbenes/.local/lib/python3.12/site-packages/soccerdata/sofascore.py", line 102, in read_leagues
    pd.DataFrame(leagues)
  File "/home/janbenes/.local/lib/python3.12/site-packages/pandas/core/indexing.py", line 1191, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/janbenes/.local/lib/python3.12/site-packages/pandas/core/indexing.py", line 1420, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/janbenes/.local/lib/python3.12/site-packages/pandas/core/indexing.py", line 1360, in _getitem_iterable
    keyarr, indexer = self._get_listlike_indexer(key, axis)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/janbenes/.local/lib/python3.12/site-packages/pandas/core/indexing.py", line 1558, in _get_listlike_indexer
    keyarr, indexer = ax._get_indexer_strict(key, axis_name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/janbenes/.local/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 6200, in _get_indexer_strict
    self._raise_if_missing(keyarr, indexer, axis_name)
  File "/home/janbenes/.local/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 6249, in _raise_if_missing
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index(['CZE-Czech First League'], dtype='object', name='league')] are in the [index]"


@jbrepogmailcom commented on GitHub (Mar 18, 2025):

OK, it started to work after I deleted the cached files :-)

However, I would recommend making this configurable, so that someone who wants to use a league that is not listed under ../EN/.. but under another region can customize it, for example in the custom leagues file.
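
A rough sketch of what such a region setting could look like is given below. It is not how soccerdata works today: `tournaments_url` and `find_tournament` are hypothetical names, and the payload shape (a top-level `uniqueTournaments` list of objects with `name` and `id`) is an assumption about the endpoint's response.

```python
from typing import Optional

# URL pattern taken from the two links in the issue; only the region part varies.
BASE_URL = "https://api.sofascore.com/api/v1/config/unique-tournaments/{region}/football"


def tournaments_url(region: str = "EN") -> str:
    """Build the unique-tournaments URL for a region code such as 'EN' or 'CZ'."""
    return BASE_URL.format(region=region.upper())


def find_tournament(payload: dict, name: str) -> Optional[dict]:
    """Return the first tournament whose name matches, ignoring case.

    Assumes payload looks like {"uniqueTournaments": [{"name": ..., "id": ...}]}.
    """
    for tournament in payload.get("uniqueTournaments", []):
        if tournament.get("name", "").casefold() == name.casefold():
            return tournament
    return None
```

With the decoded JSON from `tournaments_url("CZ")` in hand, `find_tournament(payload, "Czech First League")` would tell you whether the league is available for that region before you add it to the configuration.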


@Brahim2796 commented on GitHub (May 26, 2025):

How to delete cached files??
I added the Norwegian league as explained in the documentation, but it did not work.


@probberechts commented on GitHub (May 27, 2025):

> How to delete cached files??

See https://soccerdata.readthedocs.io/en/latest/intro.html#data-caching
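
For a quick illustration of what deleting the cache can amount to, here is a minimal sketch. It assumes cached files live under `<SOCCERDATA_DIR>/data/<source>` (for example `~/soccerdata/data/Sofascore`); check the linked documentation for the actual layout on your installation. `clear_cache` is a hypothetical helper, not a soccerdata function.

```python
import shutil
from pathlib import Path


def clear_cache(base_dir: Path, source: str = "Sofascore") -> bool:
    """Delete the cached files of one data source.

    Returns True if a cache directory was removed, False if none existed.
    The <base_dir>/data/<source> layout is an assumption.
    """
    cache_dir = base_dir / "data" / source
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True


# Example call (assumed default base directory):
# clear_cache(Path.home() / "soccerdata")
```

Alternatively, passing `no_cache=True` when creating the scraper, as the issue template suggests, should bypass stale cached files without deleting anything.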

Reference
starred/soccerdata#176