[GH-ISSUE #660] WhoScored issue #131

Closed
opened 2026-03-02 15:56:03 +03:00 by kerem · 8 comments
Owner

Originally created by @Messe57 on GitHub (Aug 3, 2024).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/660

Since I found out about this project a few months ago, scraping data has been very easy, so I am really thankful to whoever is currently working on it. However, I am now stuck on a problem that I cannot solve, and I hope someone can help me with it.
I admit that I am a beginner in coding, so it might be very easy to solve, just not with my current knowledge.
This is my code:

import soccerdata as sd
seasons = [ '2122', '2223', '2324'] # 
leagues = ['ENG-Premier League', 'ITA-Serie A'] #
for season in seasons:
    for league in leagues:
        ws = sd.WhoScored(leagues=league, seasons=season, headless=False, no_cache=True) #
        ws._driver.get("https://www.whoscored.com/")
        ws._driver.execute_script("location = 'https://whoscored.com/'")
        leagues = ws.available_leagues()
        print(leagues)
        schedule = ws.read_schedule(force_cache=True)
        epl_matches = ws.read_events(output_fmt='events')

This is the KeyError that I am receiving:

KeyError                                  Traceback (most recent call last)
Cell In[10], line 11
      9 leagues = ws.available_leagues()
     10 print(leagues)
---> 11 schedule = ws.read_schedule()
     12 epl_matches = ws.read_events(output_fmt='events') #

File c:\Users\filip\AppData\Local\Programs\Python\Python311\Lib\site-packages\soccerdata\whoscored.py:402, in WhoScored.read_schedule(self, force_cache)
    389 def read_schedule(self, force_cache: bool = False) -> pd.DataFrame:  # noqa: C901
    390     """Retrieve the game schedule for the selected leagues and seasons.
    391 
    392     Parameters
   (...)
    400     pd.DataFrame
    401     """
--> 402     df_season_stages = self.read_season_stages(force_cache=force_cache)
    403     filemask_schedule = "matches/{}_{}_{}_{}.json"
    405     all_schedules = []

File c:\Users\filip\AppData\Local\Programs\Python\Python311\Lib\site-packages\soccerdata\whoscored.py:331, in WhoScored.read_season_stages(self, force_cache)
    318 def read_season_stages(self, force_cache: bool = False) -> pd.DataFrame:
    319     """Retrieve the season stages for the selected leagues.
    320 
    321     Parameters
...
-> 6249         raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   6251     not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
   6252     raise KeyError(f"{not_found} not in index")

KeyError: "None of [Index(['ENG-Premier League'], dtype='object', name='league')] are in the [index]"

Thank you in advance.
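
A side note on the loop above, separate from the KeyError: `leagues = ws.available_leagues()` rebinds the name that the inner `for` iterates over. The inner loop that is already running is unaffected, but the next pass of the outer loop iterates over whatever was assigned. A minimal sketch of that effect, using a hypothetical stand-in function (`fake_available_leagues` is not part of soccerdata, and its return values are made up for illustration):

```python
def fake_available_leagues():
    """Hypothetical stand-in for ws.available_leagues(); not part of soccerdata."""
    return ["ENG-Premier League", "ESP-La Liga", "ITA-Serie A"]


def run(shadow):
    """Return the (season, league) pairs visited by the nested loop."""
    seasons = ["2122", "2223"]
    leagues = ["ENG-Premier League", "ITA-Serie A"]
    visited = []
    for season in seasons:
        for league in leagues:
            visited.append((season, league))
            if shadow:
                # Rebinding `leagues` here changes the next outer iteration.
                leagues = fake_available_leagues()
            else:
                # A distinct name leaves the requested list untouched.
                available = fake_available_leagues()
    return visited


shadowed = run(shadow=True)
clean = run(shadow=False)
print(len(shadowed), len(clean))
```

With the rebinding, the second season is paired with a league that was never requested; storing the result under a different name avoids that.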

kerem 2026-03-02 15:56:03 +03:00
  • closed this issue
  • added the
    WhoScored
    label
Author
Owner

@probberechts commented on GitHub (Aug 4, 2024):

Could you try with caching disabled everywhere?

schedule = ws.read_schedule(force_cache=False)

I am also intrigued why you added

ws._driver.get("https://www.whoscored.com/")
ws._driver.execute_script("location = 'https://whoscored.com/'")

Do you experience any problems without doing this?

Author
Owner

@Messe57 commented on GitHub (Aug 4, 2024):

I tried what you suggested, but unfortunately it is still not working.
I added
ws._driver.get("https://www.whoscored.com/")
ws._driver.execute_script("location = 'https://whoscored.com/'")
because the driver was opening the site in my native language, which caused problems. I found this workaround in earlier issues, and it had worked perfectly until now.

Author
Owner

@LoGreHub commented on GitHub (Aug 4, 2024):

Hi all,

same kind of issue here.

Code:

import soccerdata as sd
ws = sd.WhoScored(leagues = ['ITA-Serie A'], seasons = ['2122'])
ws.read_schedule()

and traceback:

Traceback (most recent call last)
Cell In[8], line 1
----> 1 ws.read_schedule()

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\soccerdata\whoscored.py:344, in WhoScored.read_schedule(self, force_cache)
    331 def read_schedule(self, force_cache: bool = False) -> pd.DataFrame:
    332     """Retrieve the game schedule for the selected leagues and seasons.
    333 
    334     Parameters
   (...)
    342     pd.DataFrame
    343     """
--> 344     df_season_stages = self.read_season_stages(force_cache=force_cache)
    345     filemask_schedule = "matches/{}_{}_{}_{}.json"
    347     all_schedules = []

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\soccerdata\whoscored.py:274, in WhoScored.read_season_stages(self, force_cache)
    261 def read_season_stages(self, force_cache: bool = False) -> pd.DataFrame:
    262     """Retrieve the season stages for the selected leagues.
    263 
    264     Parameters
   (...)
    272     pd.DataFrame
    273     """
--> 274     df_seasons = self.read_seasons()
    275     filemask = "seasons/{}_{}.html"
    277     season_stages = []

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\soccerdata\whoscored.py:225, in WhoScored.read_seasons(self)
    218 def read_seasons(self) -> pd.DataFrame:
    219     """Retrieve the selected seasons for the selected leagues.
    220 
    221     Returns
    222     -------
    223     pd.DataFrame
    224     """
--> 225     df_leagues = self.read_leagues()
    227     seasons = []
    228     for lkey, league in df_leagues.iterrows():

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\soccerdata\whoscored.py:210, in WhoScored.read_leagues(self)
    199     for league in region["tournaments"]:
    200         leagues.append(
    201             {
    202                 "region_id": region["id"],
   (...)
    206             }
    207         )
    209 return (
--> 210     pd.DataFrame(leagues)
    211     .assign(league=lambda x: x.region + " - " + x.league)
    212     .pipe(self._translate_league)
    213     .set_index("league")
    214     .loc[self._selected_leagues.keys()]
    215     .sort_index()
    216 )

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\indexing.py:1191, in _LocationIndexer.__getitem__(self, key)
   1189 maybe_callable = com.apply_if_callable(key, self.obj)
   1190 maybe_callable = self._check_deprecated_callable_usage(key, maybe_callable)
-> 1191 return self._getitem_axis(maybe_callable, axis=axis)

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\indexing.py:1420, in _LocIndexer._getitem_axis(self, key, axis)
   1417     if hasattr(key, "ndim") and key.ndim > 1:
   1418         raise ValueError("Cannot index with multidimensional key")
-> 1420     return self._getitem_iterable(key, axis=axis)
   1422 # nested tuple slicing
   1423 if is_nested_tuple(key, labels):

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\indexing.py:1360, in _LocIndexer._getitem_iterable(self, key, axis)
   1357 self._validate_key(key, axis)
   1359 # A collection of keys
-> 1360 keyarr, indexer = self._get_listlike_indexer(key, axis)
   1361 return self.obj._reindex_with_indexers(
   1362     {axis: [keyarr, indexer]}, copy=True, allow_dups=True
   1363 )

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\indexing.py:1558, in _LocIndexer._get_listlike_indexer(self, key, axis)
   1555 ax = self.obj._get_axis(axis)
   1556 axis_name = self.obj._get_axis_name(axis)
-> 1558 keyarr, indexer = ax._get_indexer_strict(key, axis_name)
   1560 return keyarr, indexer

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\indexes\base.py:6200, in Index._get_indexer_strict(self, key, axis_name)
   6197 else:
   6198     keyarr, indexer, new_indexer = self._reindex_non_unique(keyarr)
-> 6200 self._raise_if_missing(keyarr, indexer, axis_name)
   6202 keyarr = self.take(indexer)
   6203 if isinstance(key, Index):
   6204     # GH 42790 - Preserve name from an Index

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\core\indexes\base.py:6249, in Index._raise_if_missing(self, key, indexer, axis_name)
   6247 if nmissing:
   6248     if nmissing == len(indexer):
-> 6249         raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   6251     not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
   6252     raise KeyError(f"{not_found} not in index")

KeyError: "None of [Index(['ITA-Serie A'], dtype='object', name='league')] are in the [index]"
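
The last frame shows how pandas chooses between its two messages: when every requested label is missing it reports "None of [...] are in the [index]", otherwise it lists only the missing labels. Below is a pure-Python sketch of that branch. It is a simplified imitation for illustration, not the pandas implementation, and `raise_if_missing` and `message_for` are names chosen here.

```python
def raise_if_missing(requested, index_labels, axis_name="index"):
    """Imitate the two KeyError messages seen in the traceback (simplified)."""
    present = set(index_labels)
    not_found = [key for key in requested if key not in present]
    if not_found:
        if len(not_found) == len(requested):
            # Nothing matched at all, as in this issue.
            raise KeyError(f"None of [{requested}] are in the [{axis_name}]")
        # Only some labels are missing.
        raise KeyError(f"{not_found} not in index")


def message_for(requested, index_labels):
    """Return the error message, or None when every label is found."""
    try:
        raise_if_missing(requested, index_labels)
    except KeyError as exc:
        return exc.args[0]
    return None


index_labels = ["ENG-Premier League", "ESP-La Liga"]
all_missing = message_for(["ITA-Serie A"], index_labels)
some_missing = message_for(["ENG-Premier League", "ITA-Serie A"], index_labels)
none_missing = message_for(["ESP-La Liga"], index_labels)
print(all_missing)
print(some_missing)
print(none_missing)
```

Seeing the "None of" form therefore means that not a single requested league matched a label in the index built by `read_leagues`.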

@LoGreHub commented on GitHub (Aug 4, 2024):

I have now read the other issues (apologies for not doing that sooner). My issue above is most likely caused by being forced to load the Italian version of the website.


@Messe57 commented on GitHub (Aug 13, 2024):

Updating my issue: I think the problem might lie in the read_schedule function, because the code runs perfectly when I ask for the available leagues. Furthermore, the chromedriver is able to open the website without any issues. Can you suggest any additional changes to try?


@AbinThomas10 commented on GitHub (Aug 21, 2024):

Same issue for me, but only when scraping the Italian Serie A league.


@probberechts commented on GitHub (Aug 23, 2024):

The following works fine for me:

 import soccerdata as sd
 ws = sd.WhoScored(leagues = ['ITA-Serie A'], seasons = ['2122'], no_cache = True)
 ws.read_schedule()

I am closing this since I don't have sufficient information to debug your issue. Feel free to reopen if you can pinpoint the cause.


@Messe57 commented on GitHub (Sep 25, 2024):

I am happy to announce that I found the cause: in my tiers, the league names were written in my native language, which is why the scraper could not find them. I now need to use a VPN to set my IP address to another country. I don't understand how this happened, because the scraper previously worked fine; it may have started after updating Chrome or soccerdata. Anyway, I am very happy to have found a way to solve this issue, and I hope it helps someone else.
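
For readers hitting the same KeyError, here is a minimal, library-free sketch of why localized league names break the lookup. It is not soccerdata's actual code: the mapping `SITE_TO_ID`, the helpers `translate` and `select`, and the Italian labels are all assumptions made up for illustration. The idea is that scraped labels are translated to league IDs and the requested ID is then looked up among them; a label the mapping does not know passes through untranslated, so the requested ID is never found.

```python
# Hypothetical mapping from the site's English labels to soccerdata-style league IDs.
SITE_TO_ID = {
    "England - Premier League": "ENG-Premier League",
    "Italy - Serie A": "ITA-Serie A",
}


def translate(site_labels):
    """Map scraped labels to league IDs; unknown labels pass through unchanged."""
    return [SITE_TO_ID.get(label, label) for label in site_labels]


def select(site_labels, wanted):
    """Return the wanted IDs, raising KeyError if any is absent after translation."""
    index = set(translate(site_labels))
    missing = [league for league in wanted if league not in index]
    if missing:
        raise KeyError(f"{missing} not in index; scraped labels were {sorted(site_labels)}")
    return list(wanted)


english = ["England - Premier League", "Italy - Serie A"]
# Assumed labels for a localized (Italian) page; the real strings may differ.
localized = ["Inghilterra - Premier League", "Italia - Serie A"]

print(select(english, ["ITA-Serie A"]))
try:
    select(localized, ["ITA-Serie A"])
except KeyError as err:
    print("lookup failed:", err)
```

Printing the scraped labels in the error message, as the sketch does, makes a language mismatch like this one visible immediately.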
