[GH-ISSUE #2018] Crawler ignores NO_PROXY for internal browser connection when HTTP_PROXY is set #1257

Closed
opened 2026-03-02 11:56:06 +03:00 by kerem · 1 comment
Owner

Originally created by @BND-1 on GitHub (Oct 7, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/2018

Describe the Bug

Hello Karakeep Team,
I've encountered a critical networking issue when running Karakeep in a Docker environment that requires an external HTTP proxy for outbound internet access.
The core problem is that the crawler process does not respect the NO_PROXY environment variable for internal Docker network connections, specifically when trying to connect to the headless Chrome container.

Environment
Karakeep Version: 0.27.1
Deployment Method: Docker Compose
Network Environment: A setup where the host machine must use an HTTP/HTTPS proxy to access the internet. Internal Docker services (web, chrome, meilisearch) are on the same Docker bridge network.

Steps to Reproduce

  1. Set up Karakeep using the standard docker-compose.yml (with the crawler running inside the web service).
  2. In the docker-compose.yml file for the web service, configure the environment section with a proxy:
       environment:
         # ... other variables
         HTTP_PROXY: "http://my-external-proxy.example.com:8080"
         HTTPS_PROXY: "http://my-external-proxy.example.com:8080"
         NO_PROXY: "localhost,127.0.0.1,chrome,meilisearch"
  3. Start the services using docker-compose up -d.
  4. Verify that the environment variables are correctly set inside the web container by running docker-compose exec web env.
  5. In the Karakeep UI, add a bookmark for a URL that requires full browser rendering, such as a link to X.com (Twitter) or a heavily JavaScript-driven site.
  6. Observe the logs of the web container using docker-compose logs -f web.

Expected Behaviour

The crawler process should identify that the target chrome is listed in the NO_PROXY variable. It should therefore establish a direct connection to http://chrome:9222 within the internal Docker network, bypassing the external proxy.

Screenshots or Additional Context

The crawler process ignores the NO_PROXY variable and attempts to connect to http://chrome:9222 through the configured HTTP_PROXY. The request is misrouted and fails with a connection refused error, as the proxy tries to connect to itself or localhost.
The following error is consistently produced in the web-1 logs:
web-1 | 2025-10-07T13:09:35.050Z info: [Crawler] Connecting to existing browser instance: http://chrome:9222
web-1 | 2025-10-07T13:09:35.051Z info: [Crawler] Successfully resolved IP address, new address: http://172.25.0.3:9222/
web-1 | 2025-10-07T13:09:35.055Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs: browserType.connectOverCDP: connect ECONNREFUSED 127.0.0.1:7890
web-1 | Call log:
web-1 | - <ws preparing> retrieving websocket url from http://172.25.0.3:9222/
web-1 |
web-1 | at Proxy.connectOverCDP (/app/apps/workers/node_modules/.pnpm/playwright-extra@4.3.6_playwright-core@1.53.1_playwright@1.53.1/node_modules/playwright-extra/dist/index.cjs.js:772:63)
web-1 | at async startBrowserInstance (/app/apps/workers/dist/index.js:43152:10)
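
One detail in the log above may be relevant: the crawler resolves chrome to 172.25.0.3 before connecting, so the request that reaches the HTTP layer targets the IP address rather than the hostname. The sketch below is a hypothetical helper (shouldBypassProxy is not part of Karakeep or Playwright) implementing one common NO_PROXY convention, exact or domain-suffix matching on the hostname. NO_PROXY has no single standard, so real clients may behave differently; this only illustrates why a hostname-only list might not cover a pre-resolved address.

```typescript
// Hypothetical helper for illustration only; not Karakeep's or Playwright's code.
// Convention assumed here: an entry matches if it equals the host, is a
// parent domain of the host, or is the wildcard "*".
function shouldBypassProxy(targetUrl: string, noProxy: string | undefined): boolean {
  if (!noProxy) return false;
  const host = new URL(targetUrl).hostname.toLowerCase();
  const entries = noProxy
    .split(",")
    .map((entry) => entry.trim().toLowerCase())
    .filter((entry) => entry.length > 0);
  return entries.some((entry) => {
    if (entry === "*") return true;
    // Treat ".example.com" and "example.com" the same way.
    const bare = entry.startsWith(".") ? entry.slice(1) : entry;
    return host === bare || host.endsWith("." + bare);
  });
}

// Value taken from the reproduction steps in this issue.
const noProxy = "localhost,127.0.0.1,chrome,meilisearch";

console.log(shouldBypassProxy("http://chrome:9222", noProxy)); // true: hostname is listed
console.log(shouldBypassProxy("http://172.25.0.3:9222/", noProxy)); // false: resolved IP is not listed
```

Under this convention, adding the Docker subnet's addresses to NO_PROXY would be needed for the resolved URL to bypass the proxy. Whether that is what happens inside the crawler is not confirmed by the log.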

Device Details

No response

Exact Karakeep Version

0.27.1

Have you checked the troubleshooting guide?

  • I have checked the troubleshooting guide and I haven't found a solution to my problem
kerem 2026-03-02 11:56:06 +03:00
Author
Owner

@MohamedBassem commented on GitHub (Oct 12, 2025):

If you want to configure a proxy, you should use Karakeep's own proxy env variables: https://docs.karakeep.app/configuration#proxy-configuration. Setting a global one will cause all sorts of issues.
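
A minimal sketch of why an application-scoped proxy avoids the problem: the proxy is attached only to the browser context that loads pages, so the connection to the Chrome container itself is never proxied. This is not Karakeep's actual implementation. The variable names CRAWLER_HTTP_PROXY, CRAWLER_HTTPS_PROXY and CRAWLER_NO_PROXY are assumptions here and should be checked against the linked documentation; crawlerProxyFromEnv is a hypothetical helper. The returned object follows the shape of the proxy option (server, bypass) accepted by Playwright's browser.newContext().

```typescript
// Same fields as the `proxy` option of Playwright's browser.newContext().
interface ContextProxy {
  server: string;
  bypass?: string; // comma-separated hosts that skip the proxy
}

// Hypothetical helper: builds a context-level proxy from app-specific
// variables instead of the global HTTP_PROXY / HTTPS_PROXY / NO_PROXY.
// The variable names are assumptions; see the linked documentation.
function crawlerProxyFromEnv(
  env: Record<string, string | undefined>,
): ContextProxy | undefined {
  const server = env["CRAWLER_HTTPS_PROXY"] || env["CRAWLER_HTTP_PROXY"];
  if (!server) return undefined; // no proxy configured: connect directly
  const bypass = env["CRAWLER_NO_PROXY"];
  return bypass ? { server, bypass } : { server };
}

const proxy = crawlerProxyFromEnv({
  CRAWLER_HTTP_PROXY: "http://my-external-proxy.example.com:8080",
  CRAWLER_NO_PROXY: "localhost,127.0.0.1",
});
console.log(proxy);
```

With Playwright, the result could then be passed as browser.newContext({ proxy }) after connecting over CDP, leaving the global proxy variables unset in the container.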
