[GH-ISSUE #596] [Whoscored] Broken read_schedule method #107

Closed
opened 2026-03-02 15:55:51 +03:00 by kerem · 2 comments
Owner

Originally created by @joaomcalves on GitHub (May 27, 2024).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/596

First of all, congrats on this awesome repo!

I have been using the WhoScored scraper without problems for the last few months, but in the last few days I have been having issues when scraping this season's data.

For example, if I run:
ws = sd.WhoScored(leagues="ENG-Premier League", seasons=2223)
epl_schedule = ws.read_schedule()
it works well. But if I run:
ws = sd.WhoScored(leagues="ENG-Premier League", seasons=2324)
epl_schedule = ws.read_schedule()

I get this error:
`TimeoutException Traceback (most recent call last)
Cell In[16], line 1
----> 1 epl_schedule = ws.read_schedule()
2 epl_schedule

File ~/Desktop/football/football_analytics/venv/lib/python3.9/site-packages/soccerdata/whoscored.py:390, in WhoScored.read_schedule(self, force_cache)
387 self._driver.get(url)
389 # Check if season consists of multiple stages
--> 390 stages = self._parse_season_stages()
392 # Handle a multi-stage season
393 if len(stages) > 0:

File ~/Desktop/football/football_analytics/venv/lib/python3.9/site-packages/soccerdata/whoscored.py:282, in WhoScored._parse_season_stages(self)
278 def _parse_season_stages(self) -> List[Dict]:
279 match_selector = (
280 "//div[contains(@id,'tournament-fixture')]//div[contains(@class,'divtable-row')]"
281 )
--> 282 WebDriverWait(self._driver, 30, poll_frequency=1).until(
283 ec.presence_of_element_located((By.XPATH, match_selector))
284 )
285 node_stages_selector = "//select[contains(@id,'stages')]/option"
286 node_stages = self._driver.find_elements(By.XPATH, node_stages_selector)

File ~/Desktop/football/football_analytics/venv/lib/python3.9/site-packages/selenium/webdriver/support/wait.py:105, in WebDriverWait.until(self, method, message)
103 if time.monotonic() > end_time:
104 break
--> 105 raise TimeoutException(message, screen, stacktrace)

TimeoutException: Message:
Stacktrace:
0 undetected_chromedriver 0x00000001008d66c8 undetected_chromedriver + 6149832
1 undetected_chromedriver 0x00000001008cdcea undetected_chromedriver + 6114538
2 undetected_chromedriver 0x000000010035ad5c undetected_chromedriver + 400732
3 undetected_chromedriver 0x00000001003a7aa5 undetected_chromedriver + 715429
4 undetected_chromedriver 0x00000001003a7bf1 undetected_chromedriver + 715761
5 undetected_chromedriver 0x00000001003ecdd4 undetected_chromedriver + 998868
6 undetected_chromedriver 0x00000001003cacdd undetected_chromedriver + 859357
7 undetected_chromedriver 0x00000001003ea0db undetected_chromedriver + 987355
8 undetected_chromedriver 0x00000001003caa53 undetected_chromedriver + 858707
9 undetected_chromedriver 0x000000010039a6d5 undetected_chromedriver + 661205
10 undetected_chromedriver 0x000000010039af6e undetected_chromedriver + 663406
11 undetected_chromedriver 0x0000000100897d00 undetected_chromedriver + 5893376
12 undetected_chromedriver 0x000000010089d4cc undetected_chromedriver + 5915852
13 undetected_chromedriver 0x00000001008798c4 undetected_chromedriver + 5769412
14 undetected_chromedriver 0x000000010089df99 undetected_chromedriver + 5918617
15 undetected_chromedriver 0x000000010086aed4 undetected_chromedriver + 5709524
16 undetected_chromedriver 0x00000001008be018 undetected_chromedriver + 6049816
17 undetected_chromedriver 0x00000001008be1d7 undetected_chromedriver + 6050263
18 undetected_chromedriver 0x00000001008cd89e undetected_chromedriver + 6113438
19 libsystem_pthread.dylib 0x00007ff80bb171d3 _pthread_start + 125
20 libsystem_pthread.dylib 0x00007ff80bb12bd3 thread_start + 15`

Any idea of how I can solve this issue?
Thanks!
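
For readers following the traceback: the exception is raised by a polling wait that gives up when the XPath for the fixture rows never matches within 30 seconds. The sketch below reproduces that pattern in plain Python so it can be run without a browser. `wait_until` and `WaitTimeout` are hypothetical names chosen for illustration; they are not Selenium's or soccerdata's API, and the loop only approximates the `WebDriverWait.until` frames shown above.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class WaitTimeout(Exception):
    """Hypothetical stand-in for selenium's TimeoutException."""


def wait_until(condition: Callable[[], Optional[T]], timeout: float, poll_frequency: float) -> T:
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    end_time = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        time.sleep(poll_frequency)
        if time.monotonic() > end_time:
            break
    raise WaitTimeout("condition not met before timeout")


# A condition that never succeeds, like a selector that no longer matches the page.
try:
    wait_until(lambda: None, timeout=0.05, poll_frequency=0.01)
    timed_out = False
except WaitTimeout:
    timed_out = True

# A condition that succeeds immediately, like a selector that matches.
found = wait_until(lambda: "row", timeout=0.05, poll_frequency=0.01)
```

If this reading is right, raising the timeout would not help: a selector that never matches fails no matter how long the wait runs.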

kerem 2026-03-02 15:55:51 +03:00
Author
Owner

@probberechts commented on GitHub (May 27, 2024):

Most likely this is related to #581. For previous seasons, the schedule is probably retrieved from the cache.
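
A minimal sketch of why a cache can hide the failure for older seasons, assuming a simple cache-first read. `read_with_cache`, `broken_fetch`, and the JSON file layout are hypothetical and only illustrate the idea; they are not soccerdata's actual implementation.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def read_with_cache(cache_file: Path, fetch: Callable[[], dict]) -> dict:
    """Return cached data if present; otherwise fetch it, store it, and return it."""
    if cache_file.exists():
        # The live site is never contacted on this path.
        return json.loads(cache_file.read_text())
    data = fetch()  # may raise if the live page cannot be scraped
    cache_file.write_text(json.dumps(data))
    return data


def broken_fetch() -> dict:
    # Stand-in for a scrape that fails because the expected element never appears.
    raise TimeoutError("element not found before timeout")


cache_dir = Path(tempfile.mkdtemp())
# Pretend the 2223 season was scraped and cached some time ago.
(cache_dir / "2223.json").write_text(json.dumps({"season": "2223"}))

# Cached season: succeeds even though fetching is broken.
old_season = read_with_cache(cache_dir / "2223.json", broken_fetch)

# Uncached season: has to hit the live site, so the failure surfaces.
try:
    read_with_cache(cache_dir / "2324.json", broken_fetch)
    new_season_failed = False
except TimeoutError:
    new_season_failed = True
```

Under this assumption, the 2223 call never reaches the scraper at all, which would explain why only 2324 raises.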

Author
Owner

@joaomcalves commented on GitHub (May 27, 2024):

Oh, thanks @probberechts! That was a fast response, haha. I will test the new version.
