[GH-ISSUE #160] [FBref] Columns differ between games for 2022/2023 season #38

Closed
opened 2026-03-02 15:55:14 +03:00 by kerem · 4 comments

Originally created by @waqz1993 on GitHub (Feb 10, 2023).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/160

I get a "Length mismatch: Expected axis has 34 elements, new values have 36 elements" error when scraping player match stats for the 2022/2023 season. The 2021/2022 season works fine.

kerem closed this issue 2026-03-02 15:55:15 +03:00

@probberechts commented on GitHub (Feb 10, 2023):

Could you include a minimal example and the full stack trace of the error?


@waqz1993 commented on GitHub (Feb 10, 2023):

Sure.

import soccerdata as sd 
fbref = sd.FBref(leagues='ENG-Premier League', seasons='2022')
player_match_stats=fbref.read_player_match_stats(stat_type='summary')

ValueError                                Traceback (most recent call last)
Input In [1], in <cell line: 3>()
      1 import soccerdata as sd 
      2 fbref = sd.FBref(leagues='ENG-Premier League', seasons='2022')
----> 3 player_match_stats=fbref.read_player_match_stats(stat_type='summary')

File ~\anaconda3\lib\site-packages\soccerdata\fbref.py:707, in FBref.read_player_match_stats(self, stat_type, match_id, force_cache)
    704     df_table["game_id"] = game["game_id"]
    705     stats.append(df_table)
--> 707 df = _concat(stats)
    708 df = df[~df.Player.str.contains(r"^\d+\sPlayers$")]
    709 df = (
    710     df.rename(columns={"Player": "player"})
    711     .replace({"team": TEAMNAME_REPLACEMENTS})
    712     .set_index(["league", "season", "game", "team", "player"])
    713     .sort_index()
    714 )

File ~\anaconda3\lib\site-packages\soccerdata\fbref.py:827, in _concat(dfs)
    824 columns.loc[mask, 1] = ""
    826 for df in dfs:
--> 827     df.columns = pd.MultiIndex.from_tuples(columns.to_records(index=False).tolist())
    829 return pd.concat(dfs)

File ~\anaconda3\lib\site-packages\pandas\core\generic.py:5588, in NDFrame.__setattr__(self, name, value)
   5586 try:
   5587     object.__getattribute__(self, name)
-> 5588     return object.__setattr__(self, name, value)
   5589 except AttributeError:
   5590     pass

File ~\anaconda3\lib\site-packages\pandas\_libs\properties.pyx:70, in pandas._libs.properties.AxisProperty.__set__()

File ~\anaconda3\lib\site-packages\pandas\core\generic.py:769, in NDFrame._set_axis(self, axis, labels)
    767 def _set_axis(self, axis: int, labels: Index) -> None:
    768     labels = ensure_index(labels)
--> 769     self._mgr.set_axis(axis, labels)
    770     self._clear_item_cache()

File ~\anaconda3\lib\site-packages\pandas\core\internals\managers.py:214, in BaseBlockManager.set_axis(self, axis, new_labels)
    212 def set_axis(self, axis: int, new_labels: Index) -> None:
    213     # Caller is responsible for ensuring we have an Index object.
--> 214     self._validate_set_axis(axis, new_labels)
    215     self.axes[axis] = new_labels

File ~\anaconda3\lib\site-packages\pandas\core\internals\base.py:69, in DataManager._validate_set_axis(self, axis, new_labels)
     66     pass
     68 elif new_len != old_len:
---> 69     raise ValueError(
     70         f"Length mismatch: Expected axis has {old_len} elements, new "
     71         f"values have {new_len} elements"
     72     )

ValueError: Length mismatch: Expected axis has 34 elements, new values have 36 elements
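
The last frame of the traceback can be reproduced with pandas alone. The sketch below is an illustration, not soccerdata code: `old_table` and `new_labels` are made-up stand-ins for a 34-column table and a 36-entry list of column labels. Assigning the longer label list to the narrower frame raises the same kind of `ValueError`.

```python
import pandas as pd

# Hypothetical stand-ins (not from soccerdata): a table with 34 columns
# and a list of 36 column labels taken from a differently shaped table.
old_table = pd.DataFrame([[0] * 34])
new_labels = [f"col_{i}" for i in range(36)]

try:
    # Same operation as `df.columns = ...` in the traceback above.
    old_table.columns = new_labels
except ValueError as exc:
    message = str(exc)
    print(message)
```

This shows only that the error comes from applying one set of column labels to tables of different widths; it says nothing about which side (34 or 36 columns) is the outdated one.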

@probberechts commented on GitHub (Feb 10, 2023):

I guess FBref has either added or removed some stats, so you are trying to merge old cached data with new data. Could you try again with caching disabled? This will overwrite the old data.

fbref = sd.FBref(leagues='ENG-Premier League', seasons='2022', no_cache=True)
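
To check whether a set of per-game tables really has mixed widths before concatenating them, one option is to group them by column count. This is a minimal sketch using pandas only; `group_by_column_count` and the `frames` dictionary are hypothetical names for illustration and are not part of soccerdata.

```python
from collections import defaultdict

import pandas as pd


def group_by_column_count(frames):
    """Map each column count to the keys of the frames that have it.

    Hypothetical helper for illustration; not part of soccerdata.
    """
    groups = defaultdict(list)
    for key, frame in frames.items():
        groups[frame.shape[1]].append(key)
    return dict(groups)


# Made-up example data: one narrower table among wider ones.
frames = {
    "game_a": pd.DataFrame([[0] * 34]),
    "game_b": pd.DataFrame([[0] * 36]),
    "game_c": pd.DataFrame([[0] * 36]),
}

groups = group_by_column_count(frames)
print(groups)
```

If more than one column count shows up, the tables come from different page layouts, which fits the explanation above of stale cached data being mixed with freshly downloaded data.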

@waqz1993 commented on GitHub (Feb 10, 2023):

Yeah you're correct. Works now, thank you! Great package btw :)
