mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-26 01:26:00 +03:00
[GH-ISSUE #1181] Bug: Screenshots and DOM always fail after a while in v0.6.3 in Docker #2244
Originally created by @melyux on GitHub (Jul 14, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1181
Describe the bug
After running for a while and/or snapshotting a certain number of URLs, the Chromium call that produces screenshots and DOM dumps starts failing. When I exec into the container and run the chromium command again manually, it always works. But when doing "Pull" from the web UI, it always fails. It only starts working again if I stop the container, docker rm it, and then start it again.
It's impossible to see the exact Chromium error in the log when this happens, because every single Chromium call is prefixed by these error lines (and the ArchiveBox log only shows the first 5 lines):
Despite these lines, the Screenshot and DOM still work manually. But they're preventing me from seeing what's going on when Chromium does fail to produce the Screenshot and DOM during the original run.
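The recovery sequence described above (stop the container, remove it, start it again) can be sketched as a small shell helper. This is only an illustration under assumptions: the function name recreate_archivebox is hypothetical, and it assumes a docker compose project whose service is called archivebox.

```shell
#!/bin/sh
# Hypothetical helper: recreate the container instead of only restarting it.
# Assumes a docker compose project with a service named "archivebox".
recreate_archivebox() {
    service="${1:-archivebox}"
    # Stop the running container, remove it, then create a fresh one.
    docker compose stop "$service" &&
    docker compose rm -f "$service" &&
    docker compose up -d "$service"
}
```

Calling recreate_archivebox from the directory that holds the compose file runs the three steps in order and stops at the first one that fails.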
Steps to reproduce
Screenshots or log output
When re-running "Pull" on the failed snapshots, it always fails again and produces this output:
ArchiveBox version
@melyux commented on GitHub (Jul 15, 2023):
Interesting point is that simply restarting the container doesn't work. If you restart the container, you're hit with the exact same problem. I have to stop the container, do docker compose rm -f archivebox, and then bring it back up. Not sure what this indicates...
@msalmasi commented on GitHub (Jul 19, 2023):
Hi @melyux. I ran into the same issue. Are you running on the dev branch of ArchiveBox?
I was able to "fix" this by:
sudo docker exec -u root -it archivebox service dbus start
I'm not sure if the 2 issues are related or if I was just experiencing two simultaneous issues. Either way, if we are both experiencing the same issue, there appears to be some instability in the dbus service on the ArchiveBox dev branch. This might be caused by the profile lock, or it might be what keeps the profile lock file from being deleted as it should be. Not sure what exactly is causing this.
@melyux commented on GitHub (Jul 19, 2023):
@msalmasi Yes, also on the dev branch. Great find. Seems like the singleton file thing for sure, but you say restarting dbus also temporarily fixes it even if that singleton file is present?
@melyux commented on GitHub (Jul 20, 2023):
@msalmasi I have been experimenting with the latest version of singlefile (manually calling "npm install -g single-file-cli" inside the container and setting SINGLEFILE_BINARY=/usr/bin/single-file for the Docker variable). I haven't heavily tested it like I was doing with the old version, but haven't gotten any failures since then. Can you give it a try and see if it works?
@msalmasi commented on GitHub (Jul 22, 2023):
@melyux This seems to have fixed or at least improved the problem. I have not experienced the issue since updating to the latest version of singlefile, but I have not extensively tested yet.
Update: I must have jinxed it. Just failed on the last page I tried to grab: (https://www.nytimes.com/2022/07/19/dining/oklahoma-onion-burger-recipe.html)
@melyux commented on GitHub (Jul 22, 2023):
Mine also failed now after hanging on a screenshot. @msalmasi Do you know the exact path to the SingletonLock file?
@msalmasi commented on GitHub (Jul 23, 2023):
@melyux The path for me is /config/.config/chromium/SingletonLock
@melyux commented on GitHub (Jul 23, 2023):
I couldn't find a config folder in the Docker container; /config didn't exist. There was no SingletonLock to be found either, but it was still failing until the container was removed and re-upped. I wonder what's going on.
@msalmasi commented on GitHub (Jul 23, 2023):
It should just be in your Chrome user data folder. If you haven't specifically mounted this into your docker container then it might be in your /data folder.
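One way to check the locations mentioned in this thread is a small search helper run inside the container. This is a rough sketch, not part of ArchiveBox: the function name find_chromium_locks is hypothetical, and the directories passed to it are only the candidates named above.

```shell
#!/bin/sh
# Hypothetical helper: print any SingletonLock files under the given directories.
find_chromium_locks() {
    for dir in "$@"; do
        # Skip candidate directories that do not exist in this container.
        [ -d "$dir" ] || continue
        # SingletonLock is usually a symlink, so match by name only, not by file type.
        find "$dir" -name 'SingletonLock' 2>/dev/null
    done
}
```

For example, find_chromium_locks /config/.config/chromium /data searches both candidate locations and prints nothing if no lock file exists.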
@pirate commented on GitHub (Dec 17, 2023):
I'm going to close this as stale for now, as there are many changes and improvements made to Chrome in Docker ArchiveBox since the release OP is referring to (e.g. we now use a different Chrome install method managed by Playwright instead of by Apt).
If anyone is still experiencing issues running Chrome on >=0.7.1, please open a new issue with a screenshot of the error and the full output of docker compose run archivebox version. Thanks!