[GH-ISSUE #1247] Question: Is there a way around permission checks? #766

Closed
opened 2026-03-01 14:46:10 +03:00 by kerem · 11 comments
Owner

Originally created by @zblesk on GitHub (Oct 18, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1247

Because of space constraints, I had to move my big archive from the server to a NAS. I've then mounted the archive via NFS and pointed Archivebox to it.
I haven't been able to get it running since.

It always only gets to:

archivebox_1  | Change in ownership detected, please be patient while we chown existing files
archivebox_1  | This could take some time...

The archive is ~300GB big and because of how it's structured, all FS-level operations take a lot of time.

I've tried chowning it myself, which took over 4 hours.
I apparently didn't hit the correct perm combo, because the message just appears again.

I've let the container run for ~17 hours, which should be more than enough, but nothing changed.
(I'm currently running a different attempt at a manual fix, but I'm not holding out much hope.)

Is there any way around this? Overriding which user is used (UID and GID in Compose don't seem to do anything), or maybe letting it run as root and skipping the permission checks, or something?

(I have also temporarily stopped and disabled Sonic, just to be sure.)

Thanks.

kerem closed this issue 2026-03-01 14:46:10 +03:00

@pirate commented on GitHub (Oct 18, 2023):

I am thinking of removing these checks entirely. They were added to help beginner users with common permission issues when mounting via Docker Desktop on macOS/Windows, but they cause problems for advanced users on network filesystems or who want to manage permissions more granularly.

~~Until I release that change, as a temporary workaround, you can download and mount this script into docker and remove the permissions check lines here: https://github.com/ArchiveBox/ArchiveBox/blob/dev/bin/docker_entrypoint.sh#L16~~

~~This will override the entrypoint that runs when docker starts and should let you change it without needing to rebuild an entire custom docker image.~~


@pirate commented on GitHub (Oct 19, 2023):

Ok I removed the forced `chown` and replaced it with a warning telling the user how to fix it manually: https://github.com/ArchiveBox/ArchiveBox/commit/4f655fc4a1fc21759fe21f13b354d940ca53b4a2


@zblesk commented on GitHub (Oct 20, 2023):

Great, thanks! Do you plan to push the change to Docker Hub anytime soon? 🙏🏻


@zblesk commented on GitHub (Oct 24, 2023):

@pirate I've tried pulling the :dev image, and I'm seeing some weird behavior.
All files were owned by my default user (1000:1000). But when I start the container, it starts changing the permissions - it sets ArchiveBox.conf to 999:999, then crashes with PermissionError: [Errno 13] Permission denied: '/data/ArchiveBox.conf'.
I've tried chowning it and starting the container again, but it just sets it to 999 again and crashes.


@pirate commented on GitHub (Oct 26, 2023):

Hmm strange ok I'll remove the top-level chown. What OS are you using on the host?


@p0n1 commented on GitHub (Oct 26, 2023):

> @pirate I've tried pulling the :dev image, and I'm seeing some weird behavior. All files were owned by my default user (1000:1000). But when I start the container, it starts changing the permissions - it sets ArchiveBox.conf to 999:999, then crashes with `PermissionError: [Errno 13] Permission denied: '/data/ArchiveBox.conf'`. I've tried chowning it and starting the container again, but it just sets it to 999 again and crashes.

@pirate

Similar issue here.

I used to use the image with the `latest` tag and it worked perfectly. I just switched to the latest `dev` tag today for the `SINGLEFILE_ARGS` feature.

I got the errors below:

2023-10-26T13:07:51.804235828Z touch: cannot touch '/data/archive/.permissions_test_safe_to_delete': Permission denied
2023-10-26T13:07:51.804394367Z /app/bin/docker_entrypoint.sh: line 25: 2: Permission denied
2023-10-26T13:07:51.804630917Z /app/bin/docker_entrypoint.sh: line 26: 2: Permission denied
2023-10-26T13:07:51.804882554Z /app/bin/docker_entrypoint.sh: line 27: 2: Permission denied

I just deployed a new instance for testing. If I mount an empty folder, the `dev` version starts correctly. But if I then restart the container, it fails to start with permission errors like the ones above. I'm using a folder mounted via NFS.

Update:

If I mount a local folder instead, everything works perfectly without any permission issues. So this is probably related to some strange behavior of the NFS mount.
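
The log above shows the entrypoint touching a test file inside the data directory. To check a mount by hand before starting the container, a similar probe can be sketched as below. This is only an illustration and not the actual `docker_entrypoint.sh` code: the function name `can_write_dir`, the probe file name, and the `DATA_DIR` variable are all hypothetical.

```bash
#!/bin/sh
# Hypothetical helper (not part of ArchiveBox): succeeds only if the
# current user can create and then delete a file inside directory "$1".
can_write_dir() {
    dir="$1"
    probe="$dir/.write_probe_$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# DATA_DIR is an assumed variable name; it defaults to the current directory.
DATA_DIR="${DATA_DIR:-.}"
if can_write_dir "$DATA_DIR"; then
    echo "writable: $DATA_DIR"
else
    echo "NOT writable: $DATA_DIR (check NFS export options and ownership)" >&2
fi
```

Running it once as the regular host user and once as root (for example with `sudo`) on the NFS client should show whether the export treats the two differently.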


@pirate commented on GitHub (Oct 30, 2023):

@zblesk and @p0n1 both of your issues should be fixed by passing the PUID & PGID environment variables to the container containing the UID & GID you want the files to be owned by. This matches the behavior of all LinuxServer.io docker images, and is a general common standard for many containers. I'm moving away from force-chowning the data dir in favor of this standard as it's a better practice to ask the user what the ownership should be rather than reset it by force. https://docs.linuxserver.io/general/understanding-puid-and-pgid

I also made some changes to the entrypoint script. Can you try setting those variables, pulling/building the latest :dev image, and running it again?
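
As a rough sketch of what passing these variables can look like with plain `docker run`: the helper below only prints the command instead of running it, so it can be inspected first. The function name `archivebox_cmd` is hypothetical, and defaulting to the invoking user's IDs via `id -u` / `id -g` is an assumption of this example, not something the image does.

```bash
#!/bin/sh
# Hypothetical helper: build (but do not run) a docker command that passes
# PUID/PGID explicitly. Defaults to the invoking user's IDs; export PUID and
# PGID beforehand to use different ones.
archivebox_cmd() {
    data_dir="$1"; shift
    puid="${PUID:-$(id -u)}"
    pgid="${PGID:-$(id -g)}"
    printf 'docker run -it -e PUID=%s -e PGID=%s -v %s:/data archivebox/archivebox %s\n' \
        "$puid" "$pgid" "$data_dir" "$*"
}

# Print the command for the current directory and the "version" subcommand.
archivebox_cmd "$PWD" version
```

With Docker Compose, the same two variables would go under the service's `environment:` key.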


@mAAdhaTTah commented on GitHub (Oct 30, 2023):

@pirate the `dev` image isn't building; it was last published 11 days ago (https://hub.docker.com/r/archivebox/archivebox/tags).


@p0n1 commented on GitHub (Oct 31, 2023):

Thanks for replying. @pirate I just deployed a new instance for testing again. I set the `PUID` and `GUID` to `999`, which is the same as on the host. I got the same error I posted before.

As @mAAdhaTTah pointed out, maybe I'm not using the latest `dev` image you mentioned.


@pirate commented on GitHub (Nov 1, 2023):

My apologies, I just fixed the build and pushed `dev` (aka `0.7.0`, aka `latest`) for `amd64`, `arm64`, and `arm/v7` (run `./bin/build_docker.sh` to build these yourself).

  • fixed an issue where error output was being written to `>./2` instead of `>&2`
  • massively increased the speed of all the Docker builds (>20x) by adding granular shared cache mounts for reusable apt and pip caches (`--mount=type=cache...`)
  • fixed multi-arch docker builds for `amd64`, `arm64`, `arm/v7` and updated the Dockerfile and `./bin/build_docker.sh` with the latest config and the `PDM` package manager
  • the entrypoint now attempts to autodetect `PUID` and `PGID` based on the ownership of existing files within `./data`, which should prevent unnecessary `chown`s and remove the need to pass `PUID`/`PGID` explicitly in many cases
  • added more helpful instructions to clearly show when a user has hit an edge case that requires passing explicit `PUID` & `PGID` (like non-compliant NFS/Samba/FUSE mounts or Docker Desktop on some platforms)
  • updated the base image to `python:3.11-bookworm-slim` with `bookworm-backports`, `node v21`, `yt-dlp 2023.10.13`, `chrome v119`, and upgraded all other `apt`, `pip`, and `npm` dependency versions
  • with a lot of research, yak-shaving, and trial and error I've managed to get the Docker image down to 500~630 MB for all platforms (it was >1 GB in the previous version), but I can't promise it'll stay this small forever. It's a lot of work to keep it as small as possible for all architectures!
  • added a new `/VERSION.txt` file containing a build summary; check it on your platform for lots of juicy details about what version you're running
# pull the :latest image from docker hub (also published to ghcr.io/archivebox/archivebox)
docker pull archivebox/archivebox

# get detailed info about the build and dependency versions
docker run archivebox/archivebox cat /VERSION.txt
docker run -it archivebox/archivebox version

# run update to retry any previously failed downloads using the latest extractor versions
docker run -it -v $PWD:/data archivebox/archivebox update
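
For readers unfamiliar with the `>./2` versus `>&2` fix mentioned in the list above, the difference can be reproduced in a few lines of shell. The function names below are made up for this illustration and are not taken from the entrypoint script.

```bash
#!/bin/sh
# Correct form: ">&2" sends the output to file descriptor 2 (stderr).
log_err() {
    printf '%s\n' "$*" >&2
}

# Buggy form, kept for comparison: ">./2" opens a regular file named "2"
# in the current directory, so nothing reaches stderr, and the redirect
# itself fails if the current directory is not writable.
log_err_buggy() {
    printf '%s\n' "$*" >./2
}

log_err "this line goes to stderr"
```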

https://hub.docker.com/layers/archivebox/archivebox/dev/images/sha256-afa301d566a01cab8934aa1141b5f6133a4585ee5b449304c659c765e240575d?context=explore
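
The ownership autodetection described in the list above can be approximated by hand to see which IDs would be inferred for a given data directory. This is a minimal sketch, not the entrypoint's real logic: `detect_owner` and `DATA_DIR` are hypothetical names, and `stat -c` assumes GNU coreutils (BSD and macOS `stat` use different flags).

```bash
#!/bin/sh
# Hypothetical helper: print "UID:GID" of the owner of path "$1".
# Assumes GNU coreutils stat.
detect_owner() {
    stat -c '%u:%g' "$1"
}

# DATA_DIR is an assumed variable name; it defaults to the current directory.
DATA_DIR="${DATA_DIR:-.}"
owner="$(detect_owner "$DATA_DIR")"

# Split "UID:GID" into the two values one would pass as PUID and PGID.
echo "PUID=${owner%%:*} PGID=${owner##*:}"
```

On an NFS mount with ID mapping, the values reported on the client may differ from the owner set on the server, which is one of the edge cases where passing `PUID`/`PGID` explicitly is still needed.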


@p0n1 commented on GitHub (Nov 1, 2023):

Thanks for the update @pirate. I just tested the latest `dev` tag with `PUID=1000` and `PGID=1000`. Everything works now.

I also figured out why I was getting so many permission errors in my case, and why I couldn't even start a new container instance when mounting an NFS folder already filled with ArchiveBox archives.

It relates to the following code:

https://github.com/ArchiveBox/ArchiveBox/blob/c6e5a565c0b040358ab326d59f329c549e7f6ce0/bin/docker_entrypoint.sh#L21-L27

The script couldn't write into the `/data` folder as the `root` user on NFS mounts. After I set the `mapall` field of the NFS share to the original owner user on the host machine, this check passes.

Ref: `mapall` in https://man.freebsd.org/cgi/man.cgi?exports(5).
