[GH-ISSUE #789] [BUG] Images in search results are fetched using HTTP, even with HTTPS_ONLY=1 #499

Closed
opened 2026-02-25 20:35:54 +03:00 by kerem · 16 comments
Owner

Originally created by @DUOLabs333 on GitHub (Jun 15, 2022).
Original GitHub issue: https://github.com/benbusby/whoogle-search/issues/789

Describe the bug
See the title: images in search results are fetched over HTTP even when HTTPS_ONLY=1 is set.
To Reproduce
Steps to reproduce the behavior:

  1. Search anything
  2. Look in the network tab
  3. See that images are pulled in using HTTP

Deployment Method

  • [ ] Heroku (one-click deploy)
  • [ ] Docker
  • [x] run executable
  • [ ] pip/pipx
  • [ ] Other: [describe setup]

Version of Whoogle Search

  • [x] Latest build from [source] (i.e. GitHub, Docker Hub, pip, etc)
  • [ ] Version [version number]
  • [ ] Not sure

kerem closed this issue and added the bug label on 2026-02-25 20:35:54 +03:00.

@jacr13 commented on GitHub (Jun 15, 2022):

Could you give a bit more context?


@DUOLabs333 commented on GitHub (Jun 15, 2022):

What do you mean? That is the context.


@jacr13 commented on GitHub (Jun 15, 2022):

Is your root URL of the form http://whoogle.domain.tld instead of https://whoogle.domain.tld?


@DUOLabs333 commented on GitHub (Jun 15, 2022):

I didn't enable root_url (did you mean `WHOOGLE_CONFIG_URL`?).


@jacr13 commented on GitHub (Jun 15, 2022):

What is the form of the URL shown in your configuration under the search bar (Root URL) or in WHOOGLE_CONFIG_URL?


@DUOLabs333 commented on GitHub (Jun 15, 2022):

https://domain.tld


@DUOLabs333 commented on GitHub (Jun 15, 2022):

It seems that the same request is sent twice: once over HTTP and once over HTTPS.


@jacr13 commented on GitHub (Jun 15, 2022):

Are the HTTP requests you see going to your server or to somewhere else on the internet?


@DUOLabs333 commented on GitHub (Jun 15, 2022):

My server.


@jacr13 commented on GitHub (Jun 16, 2022):

It's weird: when I reopened Firefox this morning, I saw similar behavior. Removing the cookies for the Whoogle domain solved the problem.


@DUOLabs333 commented on GitHub (Jun 23, 2022):

I found that this happens in line 128 in filter.py. Commenting it out makes it work. Why are image links proxied through whoogle anyway?


@DUOLabs333 commented on GitHub (Jun 23, 2022):

Figured it out. For some reason, `request.url_root` starts with http. How should we fix this: should we set `self.request.url_root` in `Search` in `search.py` to start with https if `HTTPS_ONLY` is enabled? @benbusby
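
A minimal sketch of the rewrite proposed above, using only the standard library. The function name `enforce_https` and its `https_only` flag are hypothetical; this only shows the string transformation and makes no claim about how `search.py` stores or reads the root URL.

```python
from urllib.parse import urlsplit, urlunsplit


def enforce_https(url_root: str, https_only: bool) -> str:
    """Return url_root with an http scheme upgraded to https when https_only is set."""
    if not https_only:
        return url_root
    parts = urlsplit(url_root)
    if parts.scheme != "http":
        # Already https (or some other scheme): leave it untouched.
        return url_root
    return urlunsplit(parts._replace(scheme="https"))


if __name__ == "__main__":
    print(enforce_https("http://domain.tld/", https_only=True))
```

This treats the symptom only: the application would still believe the original request arrived over HTTP.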


@benbusby commented on GitHub (Jun 24, 2022):

> I found that this happens in line 128 in filter.py. Commenting it out makes it work. Why are image links proxied through whoogle anyway?

Because otherwise the user's IP is sent by the browser when fetching the image. Most images returned by the Google search results page are proxied through Google servers, so if we don't proxy image requests, it would defeat the purpose a bit.

> Figured it out. For some reason, the request.url_root starts with http. How should we fix this: should we set the self.request.url_root in Search in search.py to start with https if HTTPS_ONLY is enabled?

That should work, but it's kinda strange that Flask is changing the protocol used for the original request when fetching the url root. Feel free to open a PR if that's working for you (and if you want to), otherwise I'll probably mess around with it later today or tomorrow and push an update.
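
To illustrate why an http root URL shows up in the image requests, here is a rough sketch of building a proxied image link from the instance's root URL. The `element` path and the `url` query parameter are hypothetical names chosen for illustration, not confirmed from `filter.py`.

```python
from urllib.parse import parse_qs, urlencode, urljoin, urlsplit


def proxied_image_url(root_url: str, image_src: str) -> str:
    """Point an image at this instance so the browser never contacts the image host."""
    # Hypothetical endpoint name and query parameter.
    return urljoin(root_url, "element") + "?" + urlencode({"url": image_src})


def original_image_url(proxied: str) -> str:
    """Recover the upstream image URL, as the proxy endpoint would before fetching it."""
    return parse_qs(urlsplit(proxied).query)["url"][0]


if __name__ == "__main__":
    # The scheme of the generated link is inherited from root_url.
    print(proxied_image_url("http://domain.tld/", "https://images.example/thumb?q=abc"))
```

Because the link inherits its scheme from the root URL, a root URL that starts with http yields http image requests to the Whoogle server, which matches what was observed in the network tab.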


@benbusby commented on GitHub (Jun 24, 2022):

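Could also use `url_for` instead of `request.url_root` to get a root URL with a defined scheme:

```py
from flask import url_for

# Build an absolute URL for the index route with an explicit scheme.
url_for('.index',
        _external=True,
        _scheme='https')
```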

@benbusby commented on GitHub (Jun 24, 2022):

I think you could just replace the usage of `request.url_root` in `search.py` with the call to `url_for`. Requesting an external link to the index achieves the same thing as `request.url_root`, but with a more reliable way of enforcing the scheme. The scheme should depend on the `HTTPS_ONLY` var, obviously. I haven't tested it yet, though.
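
A small sketch of that idea, assuming Flask is installed. The app, the `index` route, and the helpers `https_only_enabled` and `root_url` are hypothetical stand-ins, and the set of values treated as "enabled" for `HTTPS_ONLY` is an assumption as well.

```python
import os

from flask import Flask, url_for

app = Flask(__name__)


@app.route("/")
def index():
    return "ok"


def https_only_enabled() -> bool:
    # Assumption: treat "1", "true" and "yes" as enabled.
    return os.environ.get("HTTPS_ONLY", "").strip().lower() in {"1", "true", "yes"}


def root_url() -> str:
    """Absolute URL of the index route, forced to https when HTTPS_ONLY is set."""
    # _scheme=None keeps whatever scheme Flask sees on the current request.
    scheme = "https" if https_only_enabled() else None
    return url_for("index", _external=True, _scheme=scheme)


if __name__ == "__main__":
    os.environ["HTTPS_ONLY"] = "1"
    with app.test_request_context("/", base_url="http://whoogle.example/"):
        print(root_url())
```

`url_for` needs an active request or application context, which is why the usage example wraps the call in `test_request_context`.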


@DUOLabs333 commented on GitHub (Jun 27, 2022):

Fixed it by adding `proxy_set_header X-Forwarded-Proto $scheme;` to the nginx config.
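
For context on why that header helps: behind a TLS-terminating reverse proxy, the application only sees the plain HTTP hop from the proxy, so it reports http unless something maps `X-Forwarded-Proto` back onto the WSGI scheme. Below is a standard-library sketch of that mapping. `ForwardedProtoMiddleware` and `echo_root` are hypothetical names, and the sketch assumes exactly one trusted proxy. Werkzeug ships `werkzeug.middleware.proxy_fix.ProxyFix` for real deployments; whether Whoogle already wires it up is not confirmed in this thread.

```python
from wsgiref.util import application_uri, setup_testing_defaults


class ForwardedProtoMiddleware:
    """Copy X-Forwarded-Proto into wsgi.url_scheme (assumes one trusted proxy)."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        header = environ.get("HTTP_X_FORWARDED_PROTO", "")
        proto = header.split(",")[0].strip().lower()
        if proto in ("http", "https"):
            environ["wsgi.url_scheme"] = proto
        return self.app(environ, start_response)


def echo_root(environ, start_response):
    """Toy app that returns the root URL as the WSGI layer sees it."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [application_uri(environ).encode()]


if __name__ == "__main__":
    environ = {"HTTP_HOST": "domain.tld", "HTTP_X_FORWARDED_PROTO": "https"}
    setup_testing_defaults(environ)  # fills in wsgi.url_scheme = "http", etc.
    print(ForwardedProtoMiddleware(echo_root)(environ, lambda status, headers: None))
```

The header should only be trusted when the proxy sets it on every request; otherwise a client could send it directly.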
