[GH-ISSUE #193] [Crawler] Failed to connect to the browser instance, will retry in 5 secs #147

Closed
opened 2026-03-02 11:47:05 +03:00 by kerem · 3 comments

Originally created by @XiaoSiHwang on GitHub (Jun 4, 2024).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/193

The Hoarder crawler doesn't work.
hoarder-workers:
2024-06-04T16:07:30.401Z info: [Crawler] Successfully resolved IP address, new address: http://172.31.0.4:9333/
2024-06-04T16:07:31.931Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
2024-06-04T16:07:36.932Z info: [Crawler] Connecting to existing browser instance: http://chrome:9333
2024-06-04T16:07:36.933Z info: [Crawler] Successfully resolved IP address, new address: http://172.31.0.4:9333/
2024-06-04T16:07:38.500Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
2024-06-04T16:07:43.501Z info: [Crawler] Connecting to existing browser instance: http://chrome:9333
2024-06-04T16:07:43.502Z info: [Crawler] Successfully resolved IP address, new address: http://172.31.0.4:9333/
2024-06-04T16:07:45.073Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
2024-06-04T16:07:50.075Z info: [Crawler] Connecting to existing browser instance: http://chrome:9333
2024-06-04T16:07:50.075Z info: [Crawler] Successfully resolved IP address, new address: http://172.31.0.4:9333/
2024-06-04T16:07:51.625Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
2024-06-04T16:07:56.627Z info: [Crawler] Connecting to existing browser instance: http://chrome:9333
2024-06-04T16:07:56.628Z info: [Crawler] Successfully resolved IP address, new address: http://172.31.0.4:9333/
2024-06-04T16:07:58.200Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs

Chrome:
[0604/160227.059822:ERROR:bus.cc(407)] Failed to connect to the bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
[0604/160227.060817:ERROR:bus.cc(407)] Failed to connect to the bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
[0604/160227.060883:ERROR:bus.cc(407)] Failed to connect to the bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
[0604/160227.088938:WARNING:sandbox_linux.cc(420)] InitializeSandbox() called with multiple threads in process gpu-process.
[0604/160227.098655:INFO:config_dir_policy_loader.cc(118)] Skipping mandatory platform policies because no policy file was found at: /etc/chromium/policies/managed
[0604/160227.098690:INFO:config_dir_policy_loader.cc(118)] Skipping recommended platform policies because no policy file was found at: /etc/chromium/policies/recommended
[0604/160227.102479:WARNING:bluez_dbus_manager.cc(248)] Floss manager not present, cannot set Floss enable/disable.
DevTools listening on ws://0.0.0.0:9333/devtools/browser/6fe21986-5c5d-4c77-8acc-5deeed0aa8b9

Docker-compose:

version: "3.8"
services:
  web:
    image: ghcr.io/hoarder-app/hoarder-web:${HOARDER_VERSION:-release}
    container_name: hoarder-web
    restart: unless-stopped
    volumes:
      - ./web_data:/data
    ports:
      - 9025:3000
    env_file:
      - .env
    environment:
      REDIS_HOST: redis
      MEILI_ADDR: http://meilisearch:7700
      DATA_DIR: /data
  redis:
    container_name: hoarder-redis
    image: redis:7.2-alpine
    restart: unless-stopped
    volumes:
      - ./redis:/data
  chrome:
    container_name: hoarder-chrome
    image: gcr.io/zenika-hub/alpine-chrome:124
    restart: unless-stopped
    command:
      - --no-sandbox
      - --disable-gpu
      - --disable-dev-shm-usage
      - --remote-debugging-address=0.0.0.0
      - --remote-debugging-port=9333
      - --hide-scrollbars
  meilisearch:
    container_name: hoarder-meilisearch
    image: getmeili/meilisearch:v1.6
    restart: unless-stopped
    env_file:
      - .env
    environment:
      MEILI_NO_ANALYTICS: "true"
    volumes:
      - ./meilisearch:/meili_data
  workers:
    container_name: hoarder-workers
    image: ghcr.io/hoarder-app/hoarder-workers:${HOARDER_VERSION:-release}
    restart: unless-stopped
    volumes:
      - ./web_data:/data
    env_file:
      - .env
    environment:
      REDIS_HOST: redis
      MEILI_ADDR: http://meilisearch:7700
      BROWSER_WEB_URL: http://chrome:9333
      DATA_DIR: /data
    depends_on:
      web:
        condition: service_started

kerem closed this issue and added the question label on 2026-03-02 11:47:05 +03:00.

@kamtschatka commented on GitHub (Jun 4, 2024):

My guess is that this happens because you have added `container_name: hoarder-chrome`, which is not present in the original docker_compose.yaml.

From what I can see, the `container_name` will also be used as a hostname in Docker, but your worker container is still referencing "chrome".
So I think it will work if you simply change `BROWSER_WEB_URL: http://chrome:9333` to `BROWSER_WEB_URL: http://hoarder-chrome:9333`?
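
A minimal sketch of that suggested change, assuming the rest of the `workers` service stays exactly as posted in the issue. Only the `BROWSER_WEB_URL` value differs; the thread does not confirm that the hostname was the actual cause, so treat this as the proposed fix rather than a verified one:

```yaml
services:
  workers:
    environment:
      REDIS_HOST: redis
      MEILI_ADDR: http://meilisearch:7700
      # Suggested change: reference the chrome container by its container_name
      # (hoarder-chrome) instead of the service name (chrome).
      BROWSER_WEB_URL: http://hoarder-chrome:9333
      DATA_DIR: /data
```

The workers container has to be recreated after the edit (for example with `docker compose up -d`) before the new value takes effect.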


@XiaoSiHwang commented on GitHub (Jun 8, 2024):

Thank you very much, I have solved this problem!

> my guess is because you have added a `container_name: hoarder-chrome`, which is not available in the original docker_compose.yaml.
>
> From what I can see the `container_name` will also be used as a hostname in docker, but your worker container is still referencing "chrome". So I think if you simply change `BROWSER_WEB_URL: http://chrome:9333` to `BROWSER_WEB_URL: http://hoarder-chrome:9333`, it will work?


@hongruilin commented on GitHub (Jul 27, 2024):

> Thank you very much, I have solved this problem!
>
> > my guess is because you have added a `container_name: hoarder-chrome`, which is not available in the original docker_compose.yaml.
> >
> > From what I can see the `container_name` will also be used as a hostname in docker, but your worker container is still referencing "chrome". So I think if you simply change `BROWSER_WEB_URL: http://chrome:9333` to `BROWSER_WEB_URL: http://hoarder-chrome:9333`, it will work?

Hello, how did you solve it in the end?
