mirror of
https://github.com/probberechts/soccerdata.git
synced 2026-04-26 10:35:53 +03:00
[GH-ISSUE #23] [General] Selenium fails with SOCKS proxy (for tor) with WebDriverException: Message: unknown error: net::ERR_PROXY_CONNECTION_FAILED #1
Originally created by @tonyelhabr on GitHub (Mar 19, 2022).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/23
I tried to set `use_tor=True` for downloading events for a match with tor running in the background, but `read_events` ended with an error indicating that the proxy connection failed.
Here's what my terminal looks like with tor running (prior to calling `read_events()`).
I've opened my browser to the port to verify that something is running, although this is using an HTTP proxy, so the warning here is expected.
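A quick way to confirm that something is listening on the Tor SOCKS port, without going through a browser, is a plain TCP connection attempt. The sketch below assumes Tor's default SOCKS port 9050 on 127.0.0.1 (the thread never states the port), and `port_is_open` is a hypothetical helper, not part of soccerdata. It only shows that the port accepts connections, not that traffic is actually routed through Tor.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 9050, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 9050 is an assumed default; Tor Browser's bundled Tor listens on 9150 instead.
    print("Tor SOCKS port reachable:", port_is_open())
```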
@probberechts commented on GitHub (Mar 19, 2022):
I can't reproduce this and I've got no clue what could be the problem here.
Could you check the following:
Launch Chrome with
and check whether Tor works by browsing to https://check.torproject.org/
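The command itself did not survive in this copy of the thread. As a rough sketch only: Chromium documents the `--proxy-server` and `--host-resolver-rules` switches for SOCKS proxies, and the snippet below assembles such a command in Python. The `google-chrome` binary name and the 127.0.0.1:9050 address (Tor's default SOCKS port) are assumptions, and `chrome_tor_command` is a hypothetical helper.

```python
import shlex


def chrome_tor_command(binary="google-chrome", host="127.0.0.1", port=9050,
                       url="https://check.torproject.org/"):
    """Build a Chrome command line that sends traffic through a SOCKS5 proxy."""
    return [
        binary,
        f"--proxy-server=socks5://{host}:{port}",
        # Block local DNS lookups for every host except the proxy host itself.
        f"--host-resolver-rules=MAP * ~NOTFOUND , EXCLUDE {host}",
        url,
    ]


if __name__ == "__main__":
    # Prints a shell-quoted command; pass the list to subprocess.run to launch it.
    print(shlex.join(chrome_tor_command()))
```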
@tonyelhabr commented on GitHub (Mar 20, 2022):
`google-chrome` command
I've tried setting `path_to_browser` to my chromedriver and the normal chrome executable. I've also tried not setting it. All result in the same error 🤷
I'm not super familiar with python debugging. Is there a good way for me to stop the execution somewhere in the selenium call for `self.execute(Command.GET, {'url': url})`? This seems to be where the error handling dispatches.
@probberechts commented on GitHub (Mar 20, 2022):
Ok. Tor clearly functions properly. That means it has to be an issue with selenium / undetected_chromedriver.
I use undetected_chromedriver, which is a patched version of the original chromedriver to avoid detection by bot mitigation systems. I would first make sure it is not this patched version that causes your problem by running the code below. You'll first have to download the appropriate chromedriver version for your system from https://chromedriver.chromium.org/downloads.
If this does not work, you could try with some additional arguments (Google for "windows selenium tor proxy") or create an issue in the selenium repo.
If it works and the code below does not (it shouldn't as this snippet is copied from soccerdata's source code), it is an issue with undetected-chromedriver and you should create an issue here.
@tonyelhabr commented on GitHub (Mar 20, 2022):
The major thing I had to change with your snippets is replace `myproxy` with the actual value of the proxy `127.0.0.1`. Is that supposed to be an environment variable?
The first worked for me on my second try. My first try was blocked, so I see why you might prefer `undetected_chromedriver`.
The second snippet with `undetected_chromedriver` worked, after the replacement of `myproxy`.
@tonyelhabr commented on GitHub (Mar 20, 2022):
This gist seems to indicate that we need the value of the proxy specified in `resolver_rules`
@probberechts commented on GitHub (Mar 20, 2022):
Oh yes, that makes sense! I copy-pasted the resolver rules and forgot to change `myproxy` to `127.0.0.1`. Actually, it is odd that it works on my system.
Thanks for debugging this! I'll push a fix in a couple of minutes.
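The mismatch described above can be avoided by deriving both Chrome switches from the same host value. The following is a minimal sketch, not soccerdata's actual code: `tor_chrome_arguments` and `make_driver` are hypothetical names, port 9050 is Tor's default SOCKS port and is assumed here, and the Selenium import is deferred so the argument builder can be used on its own.

```python
def tor_chrome_arguments(host="127.0.0.1", port=9050):
    """Return Chrome switches for a SOCKS5 proxy, using one host value for both."""
    return [
        f"--proxy-server=socks5://{host}:{port}",
        # The excluded host should be the proxy itself. With a leftover placeholder
        # such as "myproxy", Chrome may be unable to resolve the proxy address.
        f"--host-resolver-rules=MAP * ~NOTFOUND , EXCLUDE {host}",
    ]


def make_driver(host="127.0.0.1", port=9050):
    """Start Chrome through Selenium with the proxy switches applied."""
    from selenium import webdriver  # lazy import; requires selenium to be installed

    options = webdriver.ChromeOptions()
    for argument in tor_chrome_arguments(host, port):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

Calling `make_driver().get("https://check.torproject.org/")` would then be one way to check that the browser session is routed through Tor.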
@tonyelhabr commented on GitHub (Mar 20, 2022):
happy to help!