[GH-ISSUE #372] Add support for pandas 2.1.0 #72

Closed
opened 2026-03-02 15:55:31 +03:00 by kerem · 3 comments

Originally created by @aegonwolf on GitHub (Sep 16, 2023).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/372

Hi there,
is it possible that there is a bug with newer pandas versions?
I didn't see this before, but I also haven't used this awesome package for a few months.

Calling `read_team_season_stats` or `player_season_stats` on any `fbref` object yields:

```
AxisError                                 Traceback (most recent call last)
Cell In[9], line 1
----> 1 season_stats = fbref.read_team_season_stats(stat_type='shooting')

File ~\anaconda3\envs\scraperfc3\lib\site-packages\soccerdata\fbref.py:288, in FBref.read_team_season_stats(self, stat_type, opponent_stats)
    285     stat_type += "_for"
    287 # get league IDs
--> 288 seasons = self.read_seasons()
    290 # collect teams
    291 teams = []

File ~\anaconda3\envs\scraperfc3\lib\site-packages\soccerdata\fbref.py:180, in FBref.read_seasons(self, split_up_big5)
    167 """Retrieve the selected seasons for the selected leagues.
    168 
    169 Parameters
   (...)
    177 pd.DataFrame
    178 """
    179 filemask = "seasons_{}.html"
--> 180 df_leagues = self.read_leagues(split_up_big5)
    182 seasons = []
    183 for lkey, league in df_leagues.iterrows():

File ~\anaconda3\envs\scraperfc3\lib\site-packages\soccerdata\fbref.py:147, in FBref.read_leagues(self, split_up_big5)
    144     df_table["url"] = html_table.xpath(".//th[@data-stat='league_name']/a/@href")
    145     dfs.append(df_table)
    146 df = (
--> 147     pd.concat(dfs)
    148     .pipe(standardize_colnames)
    149     .rename(columns={"competition_name": "league"})
    150     .pipe(self._translate_league)
    151     .drop_duplicates(subset="league")
    152     .set_index("league")
    153     .sort_index()
    154 )
    155 df["first_season"] = df["first_season"].apply(season_code)
    156 df["last_season"] = df["last_season"].apply(season_code)

File ~\anaconda3\envs\scraperfc3\lib\site-packages\pandas\core\reshape\concat.py:393, in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    378     copy = False
    380 op = _Concatenator(
    381     objs,
    382     axis=axis,
   (...)
    390     sort=sort,
    391 )
--> 393 return op.get_result()

File ~\anaconda3\envs\scraperfc3\lib\site-packages\pandas\core\reshape\concat.py:680, in _Concatenator.get_result(self)
    676             indexers[ax] = obj_labels.get_indexer(new_labels)
    678     mgrs_indexers.append((obj._mgr, indexers))
--> 680 new_data = concatenate_managers(
    681     mgrs_indexers, self.new_axes, concat_axis=self.bm_axis, copy=self.copy
    682 )
    683 if not self.copy and not using_copy_on_write():
    684     new_data._consolidate_inplace()

File ~\anaconda3\envs\scraperfc3\lib\site-packages\pandas\core\internals\concat.py:180, in concatenate_managers(mgrs_indexers, axes, concat_axis, copy)
    177     values = np.concatenate(vals, axis=1)  # type: ignore[arg-type]
    178 elif is_1d_only_ea_dtype(blk.dtype):
    179     # TODO(EA2D): special-casing not needed with 2D EAs
--> 180     values = concat_compat(vals, axis=1, ea_compat_axis=True)
    181     values = ensure_block_shape(values, ndim=2)
    182 else:

File ~\anaconda3\envs\scraperfc3\lib\site-packages\pandas\core\dtypes\concat.py:135, in concat_compat(to_concat, axis, ea_compat_axis)
    133 else:
    134     to_concat_arrs = cast("Sequence[np.ndarray]", to_concat)
--> 135     result = np.concatenate(to_concat_arrs, axis=axis)
    137     if not any_ea and "b" in kinds and result.dtype.kind in "iuf":
    138         # GH#39817 cast to object instead of casting bools to numeric
    139         result = result.astype(object, copy=False)

AxisError: axis 1 is out of bounds for array of dimension 1
```

Should I revert pandas versions? If so, to which one?
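
For readers trying to interpret the last frame: the traceback ends in `np.concatenate(to_concat_arrs, axis=axis)` with `axis=1`, and the message says the arrays have only one dimension. The snippet below is a minimal sketch that triggers the same NumPy message in isolation. It does not reproduce the soccerdata or pandas code path, and the helper name `reproduce_axis_error` and the sample arrays are made up for illustration.

```python
import numpy as np


def reproduce_axis_error():
    """Concatenate two 1-D arrays along axis=1 and return the error message."""
    first = np.array([1, 2, 3])   # 1-D array: its only valid axis is 0
    second = np.array([4, 5, 6])
    try:
        np.concatenate([first, second], axis=1)
    except ValueError as exc:     # NumPy's AxisError is a subclass of ValueError
        return str(exc)
    return None                   # not reached for 1-D inputs


print(reproduce_axis_error())
```

The same call with `axis=0` succeeds, which suggests the failure is about the shape of the arrays handed to NumPy rather than their contents.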


@aegonwolf commented on GitHub (Sep 16, 2023):

I went back to pandas 2.0 and it worked; 2.1 throws the error again.


@vishalmish commented on GitHub (Sep 17, 2023):

Seeing the same issue. I tried both pandas 2.0 and 2.1, and neither helped.


@probberechts commented on GitHub (Sep 17, 2023):

I know about this. Apparently, something changed in the `concat` function between pandas v2.0.3 and v2.1.0, but I have not figured out what exactly. You can downgrade to v2.0.3 as a temporary fix.
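
To check whether an environment is on the suggested version before calling the scraper, something like the following could be used. It is only a sketch: `LAST_KNOWN_GOOD`, `parse_release`, `is_newer_than_known_good`, and `check_installed_pandas` are hypothetical names, not part of soccerdata or pandas, and the 2.0.3 cut-off is taken from the comment above rather than from any tested compatibility matrix.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Assumption: 2.0.3 is the last release reported as working in this thread.
LAST_KNOWN_GOOD = (2, 0, 3)


def parse_release(version_string):
    """Return the leading numeric release as a tuple, e.g. '2.1.0rc0' -> (2, 1, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_string!r}")
    # A missing patch component (as in '2.1') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def is_newer_than_known_good(version_string):
    """True if the given version is later than the last release reported to work."""
    return parse_release(version_string) > LAST_KNOWN_GOOD


def check_installed_pandas():
    """Describe the installed pandas version relative to the 2.0.3 cut-off."""
    try:
        installed = version("pandas")
    except PackageNotFoundError:
        return "pandas is not installed"
    if is_newer_than_known_good(installed):
        return f"pandas {installed} is newer than 2.0.3; consider downgrading"
    return f"pandas {installed} is at or below 2.0.3"


print(check_installed_pandas())
```

If the check reports a newer version, pinning with `pip install "pandas==2.0.3"` applies the temporary fix described above.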
