[GH-ISSUE #690] [WhoScored] La Liga 2010/11 causes whoscored.read_season_stages() to fail #145

Closed
opened 2026-03-02 15:56:11 +03:00 by kerem · 2 comments

Originally created by @lehoff on GitHub (Aug 22, 2024).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/690

  >>> whoscored.read_season_stages()
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/Users/lehoff/Library/Caches/pypoetry/virtualenvs/footballdata-bjXxG6Tz-py3.10/lib/python3.10/site-packages/soccerdata/whoscored.py", line 293, in read_season_stages
      fixtures_url = tree.xpath("//a[text()='Fixtures']/@href")[0]
  IndexError: list index out of range

I have run read_schedule() (which calls read_season_stages()) for all seasons since 2000/01 for all of the BIG-5 leagues and this is the only season that has this issue.

I am in the process of trying to make the calls in read_season_stages() step by step, but it is not all that easy, so I will try a basic Selenium scrape of the page.

This is some of the context needed to reproduce, but I am not quite there:

SPAIN_1011_URL = 'https://www.whoscored.com/Regions/206/Tournaments/4/Seasons/2596/Spain-LaLiga'
SPAIN_1011_FILEMASK = 'seasons/ESP-La Liga_1011.html'
"""
from lxml import html
reader = whoscored.get(fd.SPAIN_1011_URL, whoscored.data_dir / fd.SPAIN_1011_FILEMASK, var=None)
tree = html.parse(reader)
fixtures_url = tree.xpath("//a[text()='Fixtures']/@href")[0]
"""

I will update this issue when I have had time to investigate it further.
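
For reference, the IndexError above comes from indexing `[0]` into an empty XPath result, i.e. the page that was parsed contained no link whose text is exactly "Fixtures". Below is a minimal sketch of a guarded lookup. It is not soccerdata's code: `find_fixtures_href` is a hypothetical helper, the markup and hrefs are made up, and the standard library's ElementTree stands in for lxml so the snippet runs without extra dependencies (ElementTree only handles well-formed markup, unlike lxml's HTML parser).

```python
from typing import Optional
import xml.etree.ElementTree as ET


def find_fixtures_href(markup: str) -> Optional[str]:
    """Return the href of the first <a> whose text is 'Fixtures', or None.

    Hypothetical helper: it checks the match list before indexing,
    instead of indexing [0] and raising IndexError on an empty result.
    """
    root = ET.fromstring(markup)
    # ElementTree's limited XPath: [.='Fixtures'] matches on the element text.
    links = root.findall(".//a[.='Fixtures']")
    if not links:
        return None
    return links[0].get("href")


# Made-up, well-formed snippets standing in for a season page.
page_with_link = "<div><a href='/Fixtures/123'>Fixtures</a></div>"
page_without_link = "<div><a href='/Summary/123'>Summary</a></div>"

print(find_fixtures_href(page_with_link))
print(find_fixtures_href(page_without_link))
```

A `None` result here would point at the page content (for example a cached or partially loaded page without the link) rather than at the XPath expression itself.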

kerem 2026-03-02 15:56:11 +03:00
  • closed this issue
  • added the
    WhoScored
    label

@probberechts commented on GitHub (Aug 23, 2024):

The following actually worked fine for me:

import soccerdata as sd
ws = sd.WhoScored(leagues="ESP-La Liga", seasons="2010/11", no_cache=True)
ws.read_season_stages()

@lehoff commented on GitHub (Aug 23, 2024):

Colour me confused… it now works at my end (albeit with "2010-11") so I am not sure why I had it failing so many times.

Anyway, thanks for trying it out - I consider this issue closed.
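
As a side note on the two season spellings used in this thread ("2010/11" and "2010-11"): both appear to refer to the same season, and the cache file name quoted earlier uses the short form "1011". The sketch below shows one way such strings could be mapped to that short form. `season_to_short_code` is a hypothetical helper written for illustration, not soccerdata's own season handling, and it only assumes the two formats seen above.

```python
import re


def season_to_short_code(season: str) -> str:
    """Convert '2010/11' or '2010-11' to '1011' (hypothetical helper)."""
    match = re.fullmatch(r"(\d{4})[/-](\d{2})", season.strip())
    if match is None:
        raise ValueError(f"Unrecognised season format: {season!r}")
    start_year, end_suffix = match.groups()
    # Last two digits of the start year followed by the two-digit end year.
    return start_year[2:] + end_suffix


print(season_to_short_code("2010/11"))
print(season_to_short_code("2010-11"))
```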
