[GH-ISSUE #383] [FEATURE] Multiple Proxies #258

Open
opened 2026-02-25 20:35:16 +03:00 by kerem · 9 comments

Originally created by @kitchenutensils778 on GitHub (Aug 5, 2021).
Original GitHub issue: https://github.com/benbusby/whoogle-search/issues/383

Is it possible to add multiple proxies and select the lowest-latency proxy in real time?
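For illustration only, here is a minimal sketch of what "select the lowest-latency proxy" could mean in practice. It is not Whoogle code: `fastest_proxy` and the `probe` callable are hypothetical names, and `probe` stands in for whatever small request would be sent through each proxy.

```python
import time
from typing import Callable, Iterable, Optional


def fastest_proxy(
    proxies: Iterable[str],
    probe: Callable[[str], None],
) -> Optional[str]:
    """Return the proxy whose probe call finished fastest, or None if all fail."""
    best_proxy, best_latency = None, float("inf")
    for proxy in proxies:
        start = time.perf_counter()
        try:
            # Hypothetical probe: e.g. a small request sent through this proxy.
            probe(proxy)
        except Exception:
            # A proxy that cannot be reached is skipped, not ranked.
            continue
        latency = time.perf_counter() - start
        if latency < best_latency:
            best_proxy, best_latency = proxy, latency
    return best_proxy
```

Probing every proxy before each search would add latency of its own, so a real implementation would probably cache the measurement and refresh it periodically.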


@vacom13 commented on GitHub (Dec 18, 2021):

@benbusby I would like to give this a try. However, I would need guidance on it 😅. I don't know, maybe some resources to refer to?


@vacom13 commented on GitHub (Feb 4, 2022):

@benbusby I will take it up


@benbusby commented on GitHub (Feb 4, 2022):

Thanks @vacom13, and sorry I missed your message from back in December! I think for an initial implementation you could just add support for multiple proxies, and retry requests using a different proxy (if multiple are configured) if one times out. Selecting the least latency proxy can be added later as an improvement.
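The "retry with a different proxy if one times out" idea could be sketched roughly as follows. This is an assumption-laden illustration, not the project's implementation: `fetch_with_failover`, `fetch`, and `retry_on` are hypothetical names, and `fetch` stands in for a function that performs the search request through the given proxy.

```python
from typing import Callable, Sequence, Tuple, Type, TypeVar

T = TypeVar("T")


def fetch_with_failover(
    proxies: Sequence[str],
    fetch: Callable[[str], T],
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
) -> T:
    """Try each proxy in order, moving to the next one on a timeout."""
    last_error = None
    for proxy in proxies:
        try:
            # Hypothetical fetch: performs the request through `proxy`.
            return fetch(proxy)
        except retry_on as exc:
            # Remember the failure and fall through to the next proxy.
            last_error = exc
    raise RuntimeError("all configured proxies timed out") from last_error
```

HTTP client libraries often raise their own timeout exception types rather than the built-in `TimeoutError`, so `retry_on` would need to list whichever exceptions the client in use actually raises.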


@vacom13 commented on GitHub (Feb 8, 2022):

@benbusby no problem. And yes I will look into it


@vacom13 commented on GitHub (Mar 17, 2022):

@benbusby Hey. I have actually been really busy with family and college work. I did think up a way to implement this, I suppose. Correct me if I am wrong. For now I should probably just get a list of working proxies from a site and give the user a checkbox in the configs to select whether they want to use the proxies, right? And then I need to make sure that the search results come back within the given timeframe. After that I can give the user the ability to select by latency, right?


@vacom13 commented on GitHub (Mar 22, 2022):

@benbusby there is this Python library `free-proxies`. It basically scrapes for proxies on `https://www.sslproxies.org/` and gives a string for working proxies. I could use that, and in case it times out, I could try to get another proxy. As it gets the proxy in real time, I suppose there shouldn't be a problem, since according to the documentation it checks whether the proxy is working.


@vacom13 commented on GitHub (Mar 22, 2022):

Or I could just scrape the list myself for the first 5 proxies and then check those out?


@benbusby commented on GitHub (Mar 22, 2022):

Hey @vacom13, I think that's actually a good idea, but potentially a bit out of scope for this issue. I think all this issue should really support is multiple user-specified proxies. So if a user has access to multiple, they can specify them as a comma-separated string (or something along those lines).

So currently `WHOOGLE_PROXY_LOC` only accepts one `IP:PORT` string, but could be updated to accept multiple and cycle through them if one of them returns an error response code. If a user just specified the proxy locations as a comma-separated string such as `IP:4000,IP:4001`, then in the request module we could do something like:

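```py
import os

# Drop empty entries: ''.split(',') yields [''], which is truthy
raw = os.environ.get('WHOOGLE_PROXY_LOC', '')
proxy_paths = [p.strip() for p in raw.split(',') if p.strip()]
if proxy_paths:
    # ...
    for path in proxy_paths:
        # Validate a 200 response and no captcha from search URL
        ...
```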

I think your idea could be an entirely separate issue, but I'd like to personally look into the `free-proxies` library a bit first.
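To make the "validate a 200 response and no captcha" step concrete, the loop could be fleshed out along these lines. This is a sketch under stated assumptions: the helper names are hypothetical, `fetch` stands in for a search request made through one proxy and returning `(status_code, body)`, and the `"captcha"` substring check is a placeholder rather than the detection Whoogle actually uses.

```python
from typing import Callable, List, Optional, Tuple


def parse_proxy_locations(raw: str) -> List[str]:
    """Split a comma-separated IP:PORT string, ignoring blanks and spaces."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def is_usable(status_code: int, body: str, captcha_marker: str = "captcha") -> bool:
    """True for a 200 response whose body lacks the (assumed) captcha marker."""
    return status_code == 200 and captcha_marker.lower() not in body.lower()


def first_usable_proxy(
    raw: str,
    fetch: Callable[[str], Tuple[int, str]],
) -> Optional[str]:
    """Return the first proxy whose response passes is_usable, else None."""
    for path in parse_proxy_locations(raw):
        # Hypothetical fetch: a search request sent through `path`.
        status_code, body = fetch(path)
        if is_usable(status_code, body):
            return path
    return None
```

With `WHOOGLE_PROXY_LOC` set to `IP:4000,IP:4001`, `parse_proxy_locations` would yield two entries, and `first_usable_proxy` would return the first one that answers with a 200 and no captcha page, or `None` if neither does.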


@vacom13 commented on GitHub (May 1, 2022):

@benbusby I have tried to work on this, but not having multiple working proxies available just makes it confusing. I used a couple of free proxies but I ran into internal errors, maybe because the connection keeps timing out. The free proxies did work in a test script I created, but whenever I add the proxy to the Whoogle env and then run it, it always leads to some problem. I did also run into a rate limiting issue with a proxy. Anyway, I will keep at it.
