[GH-ISSUE #1575] v0.7.2 logs many ConnectionResetError: [Errno 104] Connection reset by peer errors when served behind haproxy #2452

Open
opened 2026-03-01 17:59:08 +03:00 by kerem · 3 comments
Owner

Originally created by @Just4Link on GitHub (Oct 28, 2024).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1575

Hello,

I'm trying to set up ArchiveBox with Docker. Everything works, but the log gets spammed with these exceptions when I navigate through ArchiveBox.

I also have haproxy in front of ArchiveBox as a reverse proxy. Is there a way to get the actual user IP address? I can't find anything in the docs like trusted proxies, only the whitelist for proxy auth, which is not the solution for me.

Can you provide any hints?

2024-10-28T08:51:20.947805346Z Exception occurred during processing of request from ('192.168.10.2', 14679)
2024-10-28T08:51:20.947814526Z ----------------------------------------
2024-10-28T08:51:20.947822288Z Exception occurred during processing of request from ('192.168.10.2', 59650)
2024-10-28T08:51:20.948035396Z Traceback (most recent call last):
2024-10-28T08:51:20.948365101Z File "/usr/local/lib/python3.11/socketserver.py", line 691, in process_request_thread
2024-10-28T08:51:20.948388455Z self.finish_request(request, client_address)
2024-10-28T08:51:20.948399314Z File "/usr/local/lib/python3.11/socketserver.py", line 361, in finish_request
2024-10-28T08:51:20.948437395Z self.RequestHandlerClass(request, client_address, self)
2024-10-28T08:51:20.948444769Z File "/usr/local/lib/python3.11/socketserver.py", line 755, in __init__
2024-10-28T08:51:20.948451843Z self.handle()
2024-10-28T08:51:20.948458395Z File "/usr/local/lib/python3.11/site-packages/django/core/servers/basehttp.py", line 174, in handle
2024-10-28T08:51:20.948465668Z self.handle_one_request()
2024-10-28T08:51:20.948477661Z Traceback (most recent call last):
2024-10-28T08:51:20.948630444Z File "/usr/local/lib/python3.11/socketserver.py", line 691, in process_request_thread
2024-10-28T08:51:20.948642938Z self.finish_request(request, client_address)
2024-10-28T08:51:20.948650542Z File "/usr/local/lib/python3.11/socketserver.py", line 361, in finish_request
2024-10-28T08:51:20.948657846Z self.RequestHandlerClass(request, client_address, self)
2024-10-28T08:51:20.948665159Z File "/usr/local/lib/python3.11/socketserver.py", line 755, in __init__
2024-10-28T08:51:20.948672703Z self.handle()
2024-10-28T08:51:20.948679586Z File "/usr/local/lib/python3.11/site-packages/django/core/servers/basehttp.py", line 174, in handle
2024-10-28T08:51:20.948686700Z self.handle_one_request()
2024-10-28T08:51:20.948693603Z File "/usr/local/lib/python3.11/site-packages/django/core/servers/basehttp.py", line 182, in handle_one_request
2024-10-28T08:51:20.948701187Z self.raw_requestline = self.rfile.readline(65537)
2024-10-28T08:51:20.948707859Z ^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-28T08:51:20.948714682Z File "/usr/local/lib/python3.11/site-packages/django/core/servers/basehttp.py", line 182, in handle_one_request
2024-10-28T08:51:20.948721886Z self.raw_requestline = self.rfile.readline(65537)
2024-10-28T08:51:20.948728839Z ^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-10-28T08:51:20.948735421Z File "/usr/local/lib/python3.11/socket.py", line 706, in readinto
2024-10-28T08:51:20.948742404Z return self._sock.recv_into(b)
2024-10-28T08:51:20.948749127Z ^^^^^^^^^^^^^^^^^^^^^^^
2024-10-28T08:51:20.948755718Z ConnectionResetError: [Errno 104] Connection reset by peer
2024-10-28T08:51:20.948762501Z ----------------------------------------
2024-10-28T08:51:20.948774473Z File "/usr/local/lib/python3.11/socket.py", line 706, in readinto
2024-10-28T08:51:20.948801173Z return self._sock.recv_into(b)
2024-10-28T08:51:20.948807986Z ^^^^^^^^^^^^^^^^^^^^^^^
2024-10-28T08:51:20.948814438Z ConnectionResetError: [Errno 104] Connection reset by peer
2024-10-28T08:51:20.948969511Z ----------------------------------------
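
The client-IP question above is not answered in this thread. The following is a generic sketch of the usual "trusted proxies" approach, not an ArchiveBox feature or setting. It assumes haproxy is configured to add an `X-Forwarded-For` header (for example with `option forwardfor`) and that the proxy's address is known. The function name `resolve_client_ip` and its parameters are hypothetical.

```python
from ipaddress import ip_address, ip_network


def resolve_client_ip(remote_addr, x_forwarded_for, trusted_proxies):
    """Return the most likely real client IP for a request.

    remote_addr      -- address of the direct TCP peer (the proxy, if any)
    x_forwarded_for  -- raw X-Forwarded-For header value, or None
    trusted_proxies  -- iterable of IPs or CIDR ranges we accept the header from
    """
    networks = [ip_network(p) for p in trusted_proxies]

    def is_trusted(addr):
        try:
            ip = ip_address(addr)
        except ValueError:
            return False  # not a parseable IP, so never trusted
        return any(ip in net for net in networks)

    # Only honor the header when the direct peer is a proxy we trust;
    # otherwise any client could spoof its own address.
    if not x_forwarded_for or not is_trusted(remote_addr):
        return remote_addr

    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Walk from the hop nearest to us outward and stop at the first
    # address that is not one of our own proxies.
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    return hops[0] if hops else remote_addr


# Example using the proxy address seen in the log above as the trusted peer.
print(resolve_client_ip("192.168.10.2", "203.0.113.7", ["192.168.10.2"]))
```

In a Django application the two inputs would typically come from `request.META["REMOTE_ADDR"]` and `request.META.get("HTTP_X_FORWARDED_FOR")`. Where such a lookup would be wired into ArchiveBox is not covered here.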

Author
Owner

@pirate commented on GitHub (Oct 28, 2024):

Please provide the output of `docker compose run archivebox version`.

Author
Owner

@Just4Link commented on GitHub (Oct 30, 2024):

Sorry for the delay. Here is the version output

$ archivebox --version
0.7.2
ArchiveBox v0.7.2 COMMIT_HASH=315c9f3 BUILD_TIME=2024-04-24 22:47:02 1713998822
IN_DOCKER=True IN_QEMU=False ARCH=x86_64 OS=Linux PLATFORM=Linux-6.8.0-47-generic-x86_64-with-glibc2.36 PYTHON=Cpython
FS_ATOMIC=True FS_REMOTE=True FS_USER=911:911 FS_PERMS=644
DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND=sonic LDAP=False

[i] Dependency versions:
√ PYTHON_BINARY v3.11.9 valid /usr/local/bin/python3.11
√ SQLITE_BINARY v2.6.0 valid /usr/local/lib/python3.11/sqlite3/dbapi2.py
√ DJANGO_BINARY v3.1.14 valid /usr/local/lib/python3.11/site-packages/django/__init__.py
√ ARCHIVEBOX_BINARY v0.7.2 valid /usr/local/bin/archivebox

√ CURL_BINARY v8.5.0 valid /usr/bin/curl
√ WGET_BINARY v1.21.3 valid /usr/bin/wget
√ NODE_BINARY v20.12.2 valid /usr/bin/node
√ SINGLEFILE_BINARY v1.1.46 valid /app/node_modules/single-file-cli/single-file
√ READABILITY_BINARY v0.0.11 valid /app/node_modules/readability-extractor/readability-extractor
√ MERCURY_BINARY v1.0.0 valid /app/node_modules/@postlight/parser/cli.js
√ GIT_BINARY v2.39.2 valid /usr/bin/git
√ YOUTUBEDL_BINARY v2023.12.30 valid /usr/local/bin/yt-dlp
√ CHROME_BINARY v124.0.6367.29 valid /browsers/chromium-1112/chrome-linux/chrome
√ RIPGREP_BINARY v13.0.0 valid /usr/bin/rg

[i] Source-code locations:
√ PACKAGE_DIR 23 files valid /app/archivebox
√ TEMPLATES_DIR 3 files valid /app/archivebox/templates

- CUSTOM_TEMPLATES_DIR - disabled None

[i] Secrets locations:
√ CHROME_USER_DATA_DIR 35 files valid /browser_profiles/personas/Default/chrome_profile
√ COOKIES_FILE 3.2 KB valid /browser_cookies/chrome_profile/Default/cookies.txt

[i] Data locations:
√ OUTPUT_DIR 5 files @ valid /data
√ SOURCES_DIR 30 files valid ./sources
√ LOGS_DIR 1 files valid ./logs
√ ARCHIVE_DIR 0 files valid ./archive
√ CONFIG_FILE 600.0 Bytes valid ./ArchiveBox.conf
√ SQL_INDEX 228.0 KB valid ./index.sqlite3

Author
Owner

@pirate commented on GitHub (Dec 5, 2024):

I think these errors are mostly harmless, even though they're annoying. They should go away in >=v0.8.5 because we switched to daphne, which complains less when an upstream connection is interrupted.
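
For context on why the log looks the way it does: the "Exception occurred during processing of request from ..." banner and the dashed lines are what Python's `socketserver.BaseServer.handle_error` prints by default when a request handler raises. The sketch below shows one generic way a `socketserver`-based server could log early client disconnects quietly instead. It is an illustration only, not ArchiveBox's or Django's code, and the names `QuietDisconnectMixin`, `QuietThreadingTCPServer`, and `is_client_disconnect` are hypothetical.

```python
import logging
import socketserver
import sys

logger = logging.getLogger("quiet_server")

# Exceptions that usually mean the peer (here, possibly the reverse proxy)
# closed or reset the connection before the request was fully read or answered.
CLIENT_DISCONNECT_ERRORS = (
    BrokenPipeError,
    ConnectionResetError,
    ConnectionAbortedError,
)


def is_client_disconnect(exc):
    """Return True if exc looks like a benign early disconnect."""
    return isinstance(exc, CLIENT_DISCONNECT_ERRORS)


class QuietDisconnectMixin:
    """Log early disconnects at INFO level instead of dumping a traceback."""

    def handle_error(self, request, client_address):
        # socketserver calls handle_error() from inside an except block,
        # so the active exception is available through sys.exc_info().
        exc = sys.exc_info()[1]
        if exc is not None and is_client_disconnect(exc):
            logger.info("Client %s disconnected early: %s", client_address, exc)
        else:
            # Anything else still gets the default full traceback.
            super().handle_error(request, client_address)


class QuietThreadingTCPServer(QuietDisconnectMixin, socketserver.ThreadingTCPServer):
    """A threading TCP server with the quieter error handling mixed in."""
```

The mixin must come before the server class in the base list so that its `handle_error` takes precedence. Real errors are still passed to the default handler, so only the disconnect noise is reduced.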
