[GH-ISSUE #1683] Bug: HTTPSConnectionPool(host='...', port=443): Max retries exceeded with url #4019

Open
opened 2026-03-15 01:18:34 +03:00 by kerem · 0 comments
Owner

Originally created by @prototyperspective on GitHub (May 22, 2025).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1683

Originally assigned to: @pirate on GitHub.

Provide a screenshot and describe the bug

How can I find out what causes this? Is it a DNS problem, or do I need to allow something in the firewall?

First of all, I would need an error message, assuming you don't already know the problem. I can't see one or find out how to get one: the logfile in data/logs/ contains only the command I used, with no error output. If the error is shown in the Web UI, I currently can't open it. After `docker compose up` finishes starting, http://127.0.0.1:8000/ never finishes loading.

Steps to reproduce

1. Running `docker compose run archivebox add --depth=1 url` shows:

100.0% (60/60sec)[!] Failed to download url

     HTTPSConnectionPool(host='url', port=443): Max retries exceeded with url: site (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at  >: Failed to establish a new connection: [Errno 101] Network is unreachable'))
[!] Failed to get contents of URL {new_link.url} HTTPSConnectionPool(host='url', port=443): Max retries exceeded with url: /site (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at >: Failed to establish a new connection: [Errno 101] Network is unreachable'))
    > Found 1 new URLs not already in index

2. After it tries to download the favicon, it also shows `Extractor timed out after 60s. Run to see full output:`, as in #1045. When I run those commands myself, the favicon and headers commands print no output, and the DOM command prints messages such as `Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket`. The DOM command does then print the DOM, so that part seems to work, but the crawl does not.
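
The DNS-versus-firewall question raised above can be narrowed down with a small probe. The following is a minimal sketch and not part of ArchiveBox: `classify` and `probe` are hypothetical helper names, the category labels are illustrative, and only Python's standard `socket` module is used. On Linux, `[Errno 101] Network is unreachable` is `ENETUNREACH`, meaning the kernel had no route to the destination address. That usually points at routing or container networking rather than DNS, but this is an inference, not something confirmed in this report.

```python
import errno
import socket
from typing import List, Tuple


def classify(err: OSError) -> str:
    """Map a socket-level error to the layer that most likely failed."""
    if isinstance(err, socket.gaierror):
        return "dns"        # the hostname could not be resolved
    if isinstance(err, (socket.timeout, TimeoutError)):
        return "timeout"    # no answer at all; packets may be dropped by a firewall
    if err.errno == errno.ENETUNREACH:
        return "no-route"   # Errno 101 on Linux: no route to that network
    if err.errno == errno.ECONNREFUSED:
        return "refused"    # host reachable, but the port rejected the connection
    return "other"


def probe(host: str, port: int = 443, timeout: float = 5.0) -> List[Tuple[str, str]]:
    """Resolve `host`, then attempt a TCP connect to every returned address."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except OSError as err:
        # Resolution itself failed, so there are no addresses to try.
        return [(host, classify(err))]

    results = []
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results.append((sockaddr[0], "ok"))
        except OSError as err:
            # Record the failure per address; IPv4 and IPv6 can differ.
            results.append((sockaddr[0], classify(err)))
    return results
```

Calling `probe("example.com")` once on the host and once inside the container, then comparing the two lists, should indicate whether the failure is specific to the container's network. A result where only the IPv6 addresses report `no-route` would suggest a missing IPv6 route rather than a general outage.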

Logs or errors


ArchiveBox Version

ArchiveBox v0.7.3 COMMIT_HASH=069aabc BUILD_TIME=2024-12-15 09:54:03 1734256443
IN_DOCKER=True IN_QEMU=False ARCH=x86_64 OS=Linux PLATFORM=Linux-6.1.0-34-amd64-x86_64-with-glibc2.36 PYTHON=Cpython
FS_ATOMIC=True FS_REMOTE=True FS_USER=911:911 FS_PERMS=644
DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND=sonic LDAP=False

How did you install the version of ArchiveBox you are using?

Docker (or Podman/LXC/K8s/TrueNAS/Proxmox/etc)

What operating system are you running on?

Linux (Ubuntu/Debian/Arch/Alpine/etc.)

What type of drive are you using to store your ArchiveBox data?

  • some of data/ is on a local SSD or NVMe drive
  • some of data/ is on a spinning hard drive or external USB drive
  • some of data/ is on a network mount (e.g. NFS/SMB/Ceph/GlusterFS/etc.)
  • some of data/ is on a FUSE mount (e.g. SSHFS/RClone/S3/B2/Google Drive/Dropbox/etc.)

Docker Compose Configuration


ArchiveBox Configuration

