[GH-ISSUE #65] [WhoScored] Unable to dismiss cookies banner #10

Closed
opened 2026-03-02 15:55:00 +03:00 by kerem · 1 comment
Owner

Originally created by @giochi99 on GitHub (Jul 19, 2022).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/65

Which Python version are you using?

Python 3.10.5

Which version of soccerdata are you using?

1.0.2

What did you do?

import soccerdata as sd

ws = sd.WhoScored(leagues="ENG-Premier League", seasons="19-20", proxy="tor")

pl_1920_events = ws.read_events()
pl_1920_events.head()

What did you expect to see?

Downloaded event data

What did you see instead?

TimeoutException                          Traceback (most recent call last)
Input In [13], in <cell line: 1>()
----> 1 pl_1920_events = ws.read_events()
        2 pl_1920_events.head()

File ~/.local/lib/python3.10/site-packages/soccerdata/whoscored.py:552, in WhoScored.read_events(self, match_id, force_cache, live, output_fmt)
     549 urlmask = WHOSCORED_URL + "/Matches/{}/Live"
     550 filemask = "events/{}_{}/{}.json"
--> 552 df_schedule = self.read_schedule(force_cache).reset_index()
     553 if match_id is not None:
     554     iterator = df_schedule[
     555         df_schedule.game_id.isin([match_id] if isinstance(match_id, int) else match_id)
     556     ]

File ~/.local/lib/python3.10/site-packages/soccerdata/whoscored.py:287, in WhoScored.read_schedule(self, force_cache)
    285 time.sleep(random.random() * 5)
    286 self._driver.get(url)
--> 287 stages = self._parse_season_stages()
    288 if len(stages) > 0:
    289     for stage in stages:

File ~/.local/lib/python3.10/site-packages/soccerdata/whoscored.py:182, in WhoScored._parse_season_stages(self)
    178 match_selector = (
    179     "//div[contains(@id,'tournament-fixture')]//div[contains(@class,'divtable-row')]"
    180 )
    181 time.sleep(5 + random.random() * 5)
--> 182 WebDriverWait(self._driver, 30, poll_frequency=1).until(
    183     ec.presence_of_element_located((By.XPATH, match_selector))
    184 )
    185 stages = []
    186 node_stages_selector = "//select[contains(@id,'stages')]/option"

File ~/.local/lib/python3.10/site-packages/selenium/webdriver/support/wait.py:89, in WebDriverWait.until(self, method, message)
     87     if time.monotonic() > end_time:
     88         break
---> 89 raise TimeoutException(message, screen, stacktrace)

TimeoutException: Message: 
Stacktrace:
#0 0x5616b4e39b13 <unknown>
#1 0x5616b4c40688 <unknown>
#2 0x5616b4c77cc7 <unknown>
#3 0x5616b4c77e91 <unknown>
#4 0x5616b4caae34 <unknown>
#5 0x5616b4c958dd <unknown>
#6 0x5616b4ca8b94 <unknown>
#7 0x5616b4c957a3 <unknown>
#8 0x5616b4c6b0ea <unknown>
#9 0x5616b4c6c225 <unknown>
#10 0x5616b4e812dd <unknown>
#11 0x5616b4e852c7 <unknown>
#12 0x5616b4e6b22e <unknown>
#13 0x5616b4e860a8 <unknown>
#14 0x5616b4e5fbc0 <unknown>
#15 0x5616b4ea26c8 <unknown>
#16 0x5616b4ea2848 <unknown>
#17 0x5616b4ebcc0d <unknown>
#18 0x7f669b48c54d <unknown>
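
For readers following the traceback: the failure surfaces in Selenium's `WebDriverWait.until`, which polls a condition until a deadline passes and then raises `TimeoutException`. If something such as a cookies banner keeps the fixture rows from appearing, the condition never becomes true and the wait expires. The sketch below is a simplified, standard-library-only illustration of that polling pattern, not Selenium's actual code; `wait_until` and `WaitTimeout` are hypothetical names introduced here.

```python
import time


class WaitTimeout(Exception):
    """Hypothetical stand-in for selenium's TimeoutException."""


def wait_until(condition, timeout=30.0, poll_frequency=1.0):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Simplified illustration of the wait-and-poll pattern; the real
    WebDriverWait also passes the driver to the condition and can
    ignore selected exceptions while polling.
    """
    end_time = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() > end_time:
            break
        time.sleep(poll_frequency)
    raise WaitTimeout(f"condition not met within {timeout} seconds")
```

With a condition that never succeeds, such as an element that stays hidden, the call ends in the timeout error rather than returning.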
kerem 2026-03-02 15:55:00 +03:00
  • closed this issue
  • added the bug label

@probberechts commented on GitHub (Jul 20, 2022):

This should be fixed in v1.0.3.
Note that you might also have to run the scraper in non-headless mode to avoid bot detection:

ws = sd.WhoScored(leagues="ENG-Premier League", seasons='21-22', proxy='tor', headless=False)
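
One way to apply this suggestion without always opening a browser window is to try headless mode first and retry in non-headless mode only when the read fails. The helper below is a minimal sketch of that idea; `read_with_headless_fallback`, `make_scraper`, and `read` are hypothetical names, not part of soccerdata, and the only soccerdata feature assumed is the `headless` argument shown above.

```python
def read_with_headless_fallback(make_scraper, read, errors=(Exception,)):
    """Try a headless scraper first, then retry with a visible browser.

    make_scraper: callable accepting a `headless` keyword and returning a scraper
                  (hypothetical factory supplied by the caller).
    read:         callable that takes the scraper and returns the data.
    errors:       exception types that should trigger the non-headless retry.
    """
    last_error = None
    for headless in (True, False):
        scraper = make_scraper(headless=headless)
        try:
            return read(scraper)
        except errors as exc:
            # Remember the failure and fall through to the next mode.
            last_error = exc
    # Both modes failed: re-raise the most recent error.
    raise last_error


# Possible usage (requires soccerdata and selenium; not executed here):
# import soccerdata as sd
# from selenium.common.exceptions import TimeoutException
# events = read_with_headless_fallback(
#     lambda headless: sd.WhoScored(
#         leagues="ENG-Premier League", seasons="21-22",
#         proxy="tor", headless=headless,
#     ),
#     lambda ws: ws.read_events(),
#     errors=(TimeoutException,),
# )
```

Passing the factory and the read step as callables keeps the retry logic independent of the scraper class, so it can be exercised with a stub before running against the live site.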