[GH-ISSUE #1181] Bug: Screenshots and DOM always fail after a while in v0.6.3 in Docker #2244

Closed
opened 2026-03-01 17:57:37 +03:00 by kerem · 10 comments
Owner

Originally created by @melyux on GitHub (Jul 14, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1181

Describe the bug

After running for a while and/or snapshotting a certain number of URLs, the Chromium call that produces the screenshot and DOM starts failing. When I exec into the container and run the same chromium command manually, it always works, but "Pull" from the web UI always fails. It only starts working again if I stop the container, docker rm it, and then start it again.

It's impossible from the log to see the exact Chromium error when this happens, because every single Chromium call is always prefixed by these error lines (and the ArchiveBox log only shows the first 5 lines):

find: ‘/root/.config/chromium/Crash Reports/pending/’: No such file or directory
[574:599:0714/201021.258704:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:603:0714/201021.766940:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:603:0714/201021.767034:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:599:0714/201021.770660:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[574:599:0714/201021.771017:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[574:599:0714/201021.771115:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[574:599:0714/201021.771276:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[574:574:0714/201021.783527:ERROR:chrome_browser_cloud_management_controller.cc(162)] Cloud management controller initialization aborted as CBCM is not enabled.
[574:599:0714/201021.800785:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[574:599:0714/201021.800842:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[609:609:0714/201021.834450:ERROR:angle_platform_impl.cc(43)] Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
ERR: Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
[609:609:0714/201021.834540:ERROR:gl_display.cc(520)] EGL Driver message (Critical) eglInitialize: Could not open the default X display.
[609:609:0714/201021.834551:ERROR:gl_display.cc(790)] eglInitialize Default failed with error EGL_NOT_INITIALIZED
[609:609:0714/201021.834564:ERROR:gl_display.cc(824)] Initialization of all EGL display types failed.
[609:609:0714/201021.834578:ERROR:gl_ozone_egl.cc(26)] GLDisplayEGL::Initialize failed.
[609:609:0714/201021.834653:ERROR:angle_platform_impl.cc(43)] Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
ERR: Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
[609:609:0714/201021.834676:ERROR:gl_display.cc(520)] EGL Driver message (Critical) eglInitialize: Could not open the default X display.
[609:609:0714/201021.834682:ERROR:gl_display.cc(790)] eglInitialize Default failed with error EGL_NOT_INITIALIZED
[609:609:0714/201021.834691:ERROR:gl_display.cc(824)] Initialization of all EGL display types failed.
[609:609:0714/201021.834698:ERROR:gl_ozone_egl.cc(26)] GLDisplayEGL::Initialize failed.
[609:609:0714/201021.836399:ERROR:viz_main_impl.cc(186)] Exiting GPU process due to errors during initialization
[574:574:0714/201021.876529:ERROR:object_proxy.cc(590)] Failed to call method: org.freedesktop.portal.Settings.Read: object_path= /org/freedesktop/portal/desktop: unknown error type: 
[574:651:0714/201021.894184:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:651:0714/201021.894302:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:651:0714/201021.894554:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:651:0714/201021.894618:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[574:651:0714/201021.894702:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[650:650:0714/201021.930958:ERROR:angle_platform_impl.cc(43)] Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
ERR: Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
[650:650:0714/201021.930990:ERROR:gl_display.cc(520)] EGL Driver message (Critical) eglInitialize: Could not open the default X display.
[650:650:0714/201021.930994:ERROR:gl_display.cc(790)] eglInitialize Default failed with error EGL_NOT_INITIALIZED
[650:650:0714/201021.931003:ERROR:gl_display.cc(824)] Initialization of all EGL display types failed.
[650:650:0714/201021.931012:ERROR:gl_ozone_egl.cc(26)] GLDisplayEGL::Initialize failed.
[650:650:0714/201021.931059:ERROR:angle_platform_impl.cc(43)] Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
ERR: Display.cpp:1023 (initialize): ANGLE Display::initialize error 12289: Could not open the default X display.
[650:650:0714/201021.931071:ERROR:gl_display.cc(520)] EGL Driver message (Critical) eglInitialize: Could not open the default X display.
[650:650:0714/201021.931076:ERROR:gl_display.cc(790)] eglInitialize Default failed with error EGL_NOT_INITIALIZED
[650:650:0714/201021.931079:ERROR:gl_display.cc(824)] Initialization of all EGL display types failed.
[650:650:0714/201021.931084:ERROR:gl_ozone_egl.cc(26)] GLDisplayEGL::Initialize failed.
[650:650:0714/201021.932311:ERROR:viz_main_impl.cc(186)] Exiting GPU process due to errors during initialization

Despite these lines, the Screenshot and DOM still work manually. But they're preventing me from seeing what's going on when Chromium does fail to produce the Screenshot and DOM during the original run.
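
One way to get past the noise when re-running the command by hand is to filter out the lines that appear on every run, working or not. This is a minimal sketch, not part of ArchiveBox: the function name `filter_chromium_noise` is made up for illustration, and the patterns are taken only from the log lines quoted above, so they may also hide a line that turns out to matter.

```shell
# Drop the Chromium stderr lines that the log above shows on every run,
# so that whatever is left is more likely to be the actual failure.
# Patterns are an assumption based only on the lines quoted in this issue.
filter_chromium_noise() {
    grep -v -E \
        -e 'ERROR:bus\.cc\(' \
        -e 'ERROR:(gl_display|gl_ozone_egl|angle_platform_impl|viz_main_impl)\.cc\(' \
        -e '^ERR: Display\.cpp' \
        -e 'ERROR:chrome_browser_cloud_management_controller\.cc\(' \
        -e 'ERROR:object_proxy\.cc\(' \
        -e 'Crash Reports/pending' \
        || true   # grep exits 1 when every line was filtered; treat that as success
}

# Usage (inside the container): append this to the command that ArchiveBox
# prints under "Run to see full output":
#   ... 2>&1 | filter_chromium_noise
```

Note that with `--dump-dom` the page HTML goes to stdout, so merging stderr with `2>&1` mixes it into the same stream; for that case it may be easier to redirect stdout to a file first.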

Steps to reproduce

  1. Add a bunch of URLs
  2. After roughly 100 or more, all screenshot and DOM saving starts to fail.

Screenshots or log output

When re-running "Pull" on the failed snapshots, it always fails again and produces this output:

archivebox  | [▶] [2023-07-14 19:21:50] Starting archiving of 1 snapshots in index...
archivebox  | 
archivebox  | [√] [2023-07-14 19:21:50] "Website Name and Title"
archivebox  |     https://domain.name.here/
archivebox  |     √ ./archive/1689326989.501584
archivebox  |       > screenshot
archivebox  |         Extractor failed:
archivebox  |              Failed to save screenshot
archivebox  |             [307222:307247:0714/192151.111727:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
archivebox  |             [307222:307250:0714/192151.131872:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
archivebox  |             [307222:307250:0714/192151.131911:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
archivebox  |             [307222:307247:0714/192151.138760:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
archivebox  |             [307222:307247:0714/192151.138842:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
archivebox  |         Run to see full output:
archivebox  |             cd /data/archive/1689326989.501584;
archivebox  |             /usr/bin/chromium --headless=new --no-sandbox --no-zygote --disable-dev-shm-usage --disable-software-rasterizer --run-all-compositor-stages-before-draw --hide-scrollbars --window-size=1440,2000 --autoplay-policy=no-user-gesture-required --no-first-run --use-fake-ui-for-media-stream --use-fake-device-for-media-stream --disable-sync "--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/605.1.15 ArchiveBox/{VERSION} (+https://github.com/ArchiveBox/ArchiveBox/)" --window-size=1440,2000 --screenshot https://domain.name.here/
archivebox  | 
archivebox  |       > dom
archivebox  |         Extractor failed:
archivebox  |              Failed to save DOM
archivebox  |             [307272:307298:0714/192151.313489:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
archivebox  |             [307272:307301:0714/192151.318865:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
archivebox  |             [307272:307301:0714/192151.318894:ERROR:bus.cc(399)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
archivebox  |             [307272:307298:0714/192151.320545:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
archivebox  |             [307272:307298:0714/192151.320601:ERROR:bus.cc(399)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
archivebox  |         Run to see full output:
archivebox  |             cd /data/archive/1689326989.501584;
archivebox  |             /usr/bin/chromium --headless=new --no-sandbox --no-zygote --disable-dev-shm-usage --disable-software-rasterizer --run-all-compositor-stages-before-draw --hide-scrollbars --window-size=1440,2000 --autoplay-policy=no-user-gesture-required --no-first-run --use-fake-ui-for-media-stream --use-fake-device-for-media-stream --disable-sync "--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/605.1.15 ArchiveBox/{VERSION} (+https://github.com/ArchiveBox/ArchiveBox/)" --window-size=1440,2000 --dump-dom https://domain.name.here/
archivebox  | 
archivebox  |         192 files (5.4 MB) in 0:00:00s 
archivebox  | 
archivebox  | [√] [2023-07-14 19:21:51] Update of 1 pages complete (1.23 sec)

ArchiveBox version

find: '/.config/chromium/Crash Reports/pending/': No such file or directory
0.6.3
ArchiveBox v0.6.3 40ddd33 Cpython Linux Linux-6.1.0-10-amd64-x86_64-with-glibc2.31 x86_64
DEBUG=False IN_DOCKER=True IS_TTY=True TZ=UTC FS_ATOMIC=True FS_REMOTE=True FS_PERMS=644 0:0 SEARCH_BACKEND=ripgrep

[i] Dependency versions:
 √  PYTHON_BINARY         v3.11.4         valid     /usr/local/bin/python3.11                                                   
 √  SQLITE_BINARY         v2.6.0          valid     /usr/local/lib/python3.11/sqlite3/dbapi2.py                                 
 √  DJANGO_BINARY         v3.1.14         valid     /usr/local/lib/python3.11/site-packages/django/__init__.py                  
 √  ARCHIVEBOX_BINARY     v0.6.3          valid     /usr/local/bin/archivebox                                                   

 √  CURL_BINARY           v7.74.0         valid     /usr/bin/curl                                                               
 √  WGET_BINARY           v1.21           valid     /usr/bin/wget                                                               
 √  NODE_BINARY           v18.16.1        valid     /usr/bin/node                                                               
 √  SINGLEFILE_BINARY     v0.3.16         valid     /node/node_modules/single-file/cli/single-file                              
 √  READABILITY_BINARY    v0.0.2          valid     /node/node_modules/readability-extractor/readability-extractor              
 √  MERCURY_BINARY        v1.0.0          valid     /node/node_modules/@postlight/mercury-parser/cli.js                         
 -  GIT_BINARY            -               disabled  /usr/bin/git                                                                
 √  YOUTUBEDL_BINARY      v2023.07.06     valid     /usr/local/bin/yt-dlp                                                       
 √  CHROME_BINARY         v114.0.5735.198  valid     /usr/bin/chromium                                                           
 √  RIPGREP_BINARY        v12.1.1         valid     /usr/bin/rg                                                                 

[i] Source-code locations:
 √  PACKAGE_DIR           23 files        valid     /app/archivebox                                                             
 √  TEMPLATES_DIR         3 files         valid     /app/archivebox/templates                                                   
 -  CUSTOM_TEMPLATES_DIR  -               disabled                                                                              

[i] Secrets locations:
 -  CHROME_USER_DATA_DIR  -               disabled                                                                              
 -  COOKIES_FILE          -               disabled                                                                              

[i] Data locations:
 √  OUTPUT_DIR            9 files @       valid     /data                                                                       
 √  SOURCES_DIR           325 files       valid     ./sources                                                                   
 √  LOGS_DIR              2 files         valid     ./logs                                                                      
 √  ARCHIVE_DIR           1150 files      valid     ./archive                                                                   
 √  CONFIG_FILE           133.0 Bytes     valid     ./ArchiveBox.conf                                                           
 √  SQL_INDEX             11.2 MB         valid     ./index.sqlite3   

@melyux commented on GitHub (Jul 15, 2023):

An interesting point: simply restarting the container doesn't help. After a restart you're hit with the exact same problem. I have to stop the container, run docker compose rm -f archivebox, and then bring it back up. Not sure what this indicates...
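
The stop, remove, and start sequence described here can be sketched as a small helper. The function name `recreate_service`, the `DOCKER` override, and the `-d` flag on `up` are assumptions added for illustration; only `docker compose rm -f archivebox` is quoted from the comment.

```shell
# Stop, remove, and recreate one Compose service.
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the commands.
recreate_service() {
    service="${1:-archivebox}"
    "${DOCKER:-docker}" compose stop "$service" &&
    "${DOCKER:-docker}" compose rm -f "$service" &&
    "${DOCKER:-docker}" compose up -d "$service"
}

# Usage, from the directory containing the compose file:
#   recreate_service archivebox
```

A plain restart keeps the container's writable filesystem layer, while `rm` discards it. One possible reading, not confirmed in this thread, is that some leftover state lives inside the container rather than in the mounted /data volume.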


@msalmasi commented on GitHub (Jul 19, 2023):

Hi @melyux. I ran into the same issue. Are you running on the dev branch of ArchiveBox?

I was able to "fix" this by:

  1. Restarting dbus manually:

sudo docker exec -u root -it archivebox service dbus start

  2. Removing a "SingletonLock" lock file that had been generated in my Chrome profile folder. The fact this lock file exists is why even restarting the container didn't work for me.

I'm not sure if the two issues are related or if I was just experiencing two simultaneous issues. Either way, if we are both seeing the same thing, there appears to be some instability in the dbus service on the ArchiveBox dev branch. The dbus problem might be caused by the profile lock, or it might be what prevents the lock file from being deleted as it should be. Not sure what exactly is causing this.
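
The lock removal in the second step could look roughly like the following. It is a hedged sketch: `remove_singleton_lock` is a hypothetical helper, the profile directory has to be supplied by the caller, and it should only be run when no Chromium process is using that profile.

```shell
# Remove a leftover SingletonLock from a Chromium profile directory.
# Only run this when no Chromium process is using the profile.
remove_singleton_lock() {
    profile_dir="$1"
    lock="$profile_dir/SingletonLock"
    # The lock can be a symlink; -L also catches a dangling one that -e misses.
    if [ -e "$lock" ] || [ -L "$lock" ]; then
        rm -f -- "$lock" && echo "removed $lock"
    else
        echo "no lock at $lock"
    fi
}

# Usage (inside the container), with the profile directory for your setup:
#   remove_singleton_lock "$CHROME_PROFILE_DIR"
```

`CHROME_PROFILE_DIR` in the usage line is a placeholder, not an ArchiveBox setting.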


@melyux commented on GitHub (Jul 19, 2023):

@msalmasi Yes, also on the dev branch. Great find. Seems like the singleton file thing for sure, but you say restarting dbus also temporarily fixes it even if that singleton file is present?


@melyux commented on GitHub (Jul 20, 2023):

@msalmasi I have been experimenting with the latest version of singlefile (manually calling "npm install -g single-file-cli" inside the container and setting SINGLEFILE_BINARY=/usr/bin/single-file for the Docker variable). I haven't heavily tested it like I was doing with the old version, but haven't gotten any failures since then. Can you give it a try and see if it works?
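
For anyone repeating this experiment, it may help to confirm where npm actually placed the binary before setting the variable, since the install prefix can differ between images. The helper below is an illustrative sketch; `singlefile_env_line` is a made-up name, and it assumes the package exposes an executable called `single-file`, as the /usr/bin/single-file path above suggests.

```shell
# After running `npm install -g single-file-cli` inside the container,
# print the SINGLEFILE_BINARY line that matches where the binary landed.
singlefile_env_line() {
    path=$(command -v single-file) || {
        echo "single-file not found on PATH" >&2
        return 1
    }
    printf 'SINGLEFILE_BINARY=%s\n' "$path"
}

# Usage (inside the container):
#   singlefile_env_line
```

The printed line can then be copied into the container's environment configuration.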

@msalmasi commented on GitHub (Jul 22, 2023):

@melyux This seems to have fixed or at least improved the problem. I have not experienced the issue since updating to the latest version of singlefile, but I have not extensively tested yet.

Update: I must have jinxed it. Just failed on the last page I tried to grab: (https://www.nytimes.com/2022/07/19/dining/oklahoma-onion-burger-recipe.html)

@melyux commented on GitHub (Jul 22, 2023):

Mine also failed now after hanging on a screenshot. @msalmasi Do you know the exact path to the SingletonLock file?

@msalmasi commented on GitHub (Jul 23, 2023):

@melyux The path for me is /config/.config/chromium/SingletonLock

@melyux commented on GitHub (Jul 23, 2023):

I couldn't find a config folder in the Docker container; /config didn't exist. There was no SingletonLock to be found either, but it was still failing until the container was removed and re-upped. I wonder what's going on.

@msalmasi commented on GitHub (Jul 23, 2023):

It should just be in your Chrome user data folder. If you haven't specifically mounted this into your docker container then it might be in your /data folder.
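
If the location is unclear, one way to look for it is to search the likely directories by file name. This is a rough sketch only: `find_singleton_locks` is a hypothetical helper, and the example roots are simply the paths that have come up in this issue (`/data`, `/config`, `/root/.config`), which may not match your mounts.

```shell
#!/bin/sh
# Hypothetical helper: list every SingletonLock found under the given directories.
# Usage: find_singleton_locks DIR [DIR...]
find_singleton_locks() {
    for root in "$@"; do
        # Skip roots that do not exist (e.g. /config is absent in some containers).
        [ -d "$root" ] || continue
        # The lock may be a symlink, so match on the name only and do not follow links.
        find "$root" -name SingletonLock 2>/dev/null
    done
}

# Example, run inside the container (adjust the paths to your own setup):
#   find_singleton_locks /data /config /root/.config
```

An empty result means no file with that name exists under the searched roots, which would match the report above of failures without any SingletonLock present.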

@pirate commented on GitHub (Dec 17, 2023):

I'm going to close this as stale for now, as there have been many changes and improvements to Chrome in the ArchiveBox Docker image since the release OP is referring to (e.g. we now use a different Chrome install method managed by Playwright instead of by Apt).

If anyone is still experiencing issues running Chrome on >=0.7.1 please open a new issue with a screenshot of the error and the full output of `docker compose run archivebox version`. Thanks!
