mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-26 01:26:00 +03:00
[GH-ISSUE #1637] Bug: archivebox doesn't use cookie file and vnc doesn't show anything #3994
Originally created by @orthodoxe on GitHub (Jan 19, 2025).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1637
Originally assigned to: @pirate on GitHub.
Provide a screenshot and describe the bug
Description
I gave ArchiveBox a cookie file (Netscape format), but when I try to archive a page from a site like Reddit or YouTube that has cookie popups (or other types of popups), they still appear in the archive.
If I try to use the noVNC browser to accept cookies, I just see a Debian wallpaper with an (almost) empty taskbar (see screenshot).
Screenshots
Vnc empty desktop
Expected result (reddit)
Result (reddit): onefile, dom
Expected Result (youtube)
Result (youtube): onefile, dom, wget
P.S.
I don't need to be logged into the accounts, I just don't want the cookie popup.
I also can't add a chrome profile as it gives errors (the container can't start if I uncomment the lines regarding the chrome profile in the docker compose):
Steps to reproduce
Logs or errors
ArchiveBox Version
How did you install the version of ArchiveBox you are using?
Docker (or Podman/LXC/K8s/TrueNAS/Proxmox/etc)
What operating system are you running on?
Linux (Ubuntu/Debian/Arch/Alpine/etc.)
What type of drive are you using to store your ArchiveBox data?
data/ is on a local SSD or NVMe drive
data/ is on a spinning hard drive or external USB drive
data/ is on a network mount (e.g. NFS/SMB/Ceph/GlusterFS/etc.)
data/ is on a FUSE mount (e.g. SSHFS/RClone/S3/B2/Google Drive/Dropbox/etc.)
Docker Compose Configuration
ArchiveBox Configuration
@TooManyStacks commented on GitHub (Jan 24, 2025):
I can confirm that with the latest tag, neither the cookies.txt nor the chromium profile is working.
@pirate commented on GitHub (Jan 26, 2025):
OP's post indicates no CHROME_USER_DATA_DIR is set up.
The cookies.txt file only applies to a few of the methods (wget, curl, yt-dlp); for the rest, a CHROME_USER_DATA_DIR must be set up.
@orthodoxe commented on GitHub (Jan 28, 2025):
Thank you for your response.
I have tried to uncomment the lines regarding the chrome profile but when I restart the docker compose stack I get this error from the archivebox main container:
Inside the chrome profile folder I have the profile's data directly, not another folder named "Default" or anything else.
@pirate commented on GitHub (Jan 28, 2025):
If there's no "Default" folder inside, then you're using the wrong folder; pass it the parent of the folder you're using now.
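A minimal sketch of that folder rule, assuming only the layout described here: the directory handed to ArchiveBox should contain a "Default" subfolder, and if you are pointing at the "Default" folder itself you should go one level up. The function name suggest_user_data_dir is hypothetical and is not part of ArchiveBox.

```python
from pathlib import Path
from typing import Optional


def suggest_user_data_dir(candidate: Path) -> Optional[Path]:
    """Return the folder to use as the Chrome user data dir, or None.

    Hypothetical helper illustrating the rule from this thread; it only
    inspects the folder layout and does not call ArchiveBox or Chrome.
    """
    # Correct case: the folder already contains a "Default" subfolder.
    if (candidate / "Default").is_dir():
        return candidate
    # Common mistake: the folder *is* the "Default" folder, so use its parent.
    if candidate.name == "Default" and candidate.is_dir():
        return candidate.parent
    # Neither layout matches, so this is probably not a Chrome profile folder.
    return None
```

Calling it on a folder that holds the profile data directly and is named "Default" returns the parent folder; calling it on a folder with no "Default" inside and a different name returns None, which matches the "wrong folder" case.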
@orthodoxe commented on GitHub (Jan 28, 2025):
I put the chrome profile's data inside a folder named "Default" so now I have:
I no longer get the error, but websites archived with onefile (and other archiving methods that use the chrome profile) still have the cookie popup.
@pirate commented on GitHub (Jan 28, 2025):
It should look like this:
ArchiveBox also has the concept of a "Persona", which is a folder that contains all the state needed to impersonate a human (which can include a chrome profile dir). Don't be confused by Default appearing twice in the path if you're putting the chrome profile in the /data/personas/Default dir: the personas/Default is created by archivebox, but really it can be any path, you don't have to put it in the personas dir. The chrome_profile/Default subdir is created by chrome and cannot be relocated.
CHROME_USER_DATA_DIR=/any/path/to/chrome_profile <- this is ok (this dir should already contain a Default dir inside that's created by chrome)
CHROME_USER_DATA_DIR=/data/personas/Default/chrome_profile <- this is ok (this dir should already contain a Default dir inside that's created by chrome)
CHROME_USER_DATA_DIR=/data/personas/Default/chrome_profile/Default <- this is incorrect, take the /Default off the end (the error you saw was in the help text explaining this)
CHROME_USER_DATA_DIR=/any/path/to/chrome_profile/Default <- this is incorrect (same, take /Default off the end)
For example, to create and use a new chrome profile stored in ~/Desktop/test_profile you'd run:
@orthodoxe commented on GitHub (Jan 29, 2025):
Should I create the profile with google chrome or chromium? I read the docs and, from my understanding, I should use chromium but does archivebox accept google chrome profile too or do I have to find a way of installing chromium?
@pirate commented on GitHub (Jan 31, 2025):
You can use either one (controlled by archivebox config --set CHROME_BINARY=chromium), but whatever you use needs to match. The browser that creates the profile needs to be on the same OS, CPU architecture, and ideally the exact same chromium/chrome binary.
Depending on your OS you can use any of these:
github.com/ArchiveBox/ArchiveBox@12f109b1be/archivebox/pkgs/abx-plugin-chrome/abx_plugin_chrome/binaries.py (L27C1-L51C76)
and probably other chromium based browsers too like brave
@orthodoxe commented on GitHub (Feb 10, 2025):
Hello, sorry to respond so late but life's been busy.
I tried making a chromium profile and it loads without errors but it still doesn't use it.
This is the Default folder inside chrome-profile. The mounted folder is chrome-profile.
I still get the cookie popup on things like singlefile
@pirate commented on GitHub (Feb 11, 2025):
Singlefile not respecting the chrome profile is a known issue on v0.7.3, is it working for screenshot, PDF, and DOM though? Those are the only ones it applies to
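To keep track of which setting affects which archive method, the statements made in this thread can be written down as a small lookup table. This is only a summary of what was said above, not something read from ArchiveBox's code; the names CREDENTIAL_SOURCE, KNOWN_BROKEN and methods_using are hypothetical.

```python
from typing import Dict, List

COOKIES_TXT = "cookies.txt"
CHROME_PROFILE = "CHROME_USER_DATA_DIR"

# Which credential source each method relies on, as stated in this thread.
CREDENTIAL_SOURCE: Dict[str, str] = {
    "wget": COOKIES_TXT,
    "curl": COOKIES_TXT,
    "yt-dlp": COOKIES_TXT,
    "screenshot": CHROME_PROFILE,
    "pdf": CHROME_PROFILE,
    "dom": CHROME_PROFILE,
    "singlefile": CHROME_PROFILE,
}

# Methods reported in this thread as not respecting their source, by version.
KNOWN_BROKEN: Dict[str, str] = {"singlefile": "v0.7.3"}


def methods_using(source: str) -> List[str]:
    """List the methods that depend on the given credential source."""
    return sorted(m for m, s in CREDENTIAL_SOURCE.items() if s == source)
```

For instance, methods_using(COOKIES_TXT) lists the methods a cookies.txt file can influence, and anything also present in KNOWN_BROKEN should be expected to still show the popup on the listed version.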
@orthodoxe commented on GitHub (Feb 11, 2025):
Yes, it is working for pdf, screenshot and dom, I don't see the cookie popup.
Then I'll wait for some bug fixes and maybe help with the project myself.
Anyways, thanks for all the help.