[GH-ISSUE #975] Bug: Can't use PUID=33 in docker #606

Open
opened 2026-03-01 14:44:56 +03:00 by kerem · 2 comments

Originally created by @HHousen on GitHub (May 15, 2022).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/975

Describe the bug

Creating an archivebox docker container using the PUID and PGID variables works correctly except when using PUID=33 and PGID=33 (the www-data id). With the docker-compose configuration below, after running docker compose run archivebox init --setup, the data directory (/opt/archivebox) is owned by 999:docker instead of www-data:www-data.

Steps to reproduce

Use the following docker-compose configuration

  archivebox:
    container_name: archivebox
    image: archivebox/archivebox:latest
    restart: always
    command: server --quick-init 0.0.0.0:8000
    ports:
      - 8000:8000
    environment:
      - ALLOWED_HOSTS=*
      - MEDIA_MAX_SIZE=750m
      - PUID=33
      - PGID=33
    volumes:
      - /opt/archivebox:/data

I am able to use the id 33 in linuxserver.io images with the same PUID and PGID options. Also, using a different id like 876 works as intended with the data directory being owned by 876:876.

ArchiveBox version

ArchiveBox v0.6.3
Cpython Linux Linux-5.4.0-110-generic-x86_64-with-glibc2.31 x86_64
IN_DOCKER=True DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND_ENGINE=ripgrep

[i] Dependency versions:
 √  ARCHIVEBOX_BINARY     v0.6.3          valid     /usr/local/bin/archivebox                                                   
 √  PYTHON_BINARY         v3.10.4         valid     /usr/local/bin/python3.10                                                   
 √  DJANGO_BINARY         v3.1.14         valid     /usr/local/lib/python3.10/site-packages/django/bin/django-admin.py          
 √  CURL_BINARY           v7.74.0         valid     /usr/bin/curl                                                               
 √  WGET_BINARY           v1.21           valid     /usr/bin/wget                                                               
 √  NODE_BINARY           v17.9.0         valid     /usr/bin/node                                                               
 √  SINGLEFILE_BINARY     v0.3.16         valid     /node/node_modules/single-file/cli/single-file                              
 √  READABILITY_BINARY    v0.0.2          valid     /node/node_modules/readability-extractor/readability-extractor              
 √  MERCURY_BINARY        v1.0.0          valid     /node/node_modules/@postlight/mercury-parser/cli.js                         
 √  GIT_BINARY            v2.30.2         valid     /usr/bin/git                                                                
 √  YOUTUBEDL_BINARY      v2022.04.08     valid     /usr/local/bin/yt-dlp                                                       
 √  CHROME_BINARY         v101.0.4951.41  valid     /usr/bin/chromium                                                           
 √  RIPGREP_BINARY        v12.1.1         valid     /usr/bin/rg                                                                 

[i] Source-code locations:
 √  PACKAGE_DIR           24 files        valid     /app/archivebox                                                             
 √  TEMPLATES_DIR         4 files         valid     /app/archivebox/templates                                                   
 -  CUSTOM_TEMPLATES_DIR  -               disabled                                                                              

[i] Secrets locations:
 -  CHROME_USER_DATA_DIR  -               disabled                                                                              
 -  COOKIES_FILE          -               disabled                                                                              


[i] Data locations:

@pirate commented on GitHub (Jun 9, 2022):

I've added some more debug output to the version command to help track down the bug. Can you install from dev (https://github.com/ArchiveBox/ArchiveBox#install-and-run-a-specific-github-branch) and post the output of archivebox version again?

@pirate commented on GitHub (Jan 4, 2024):

It seems like 33 conflicts with an existing UID in the container. Are you able to use a different PUID? If so you can keep 33 as the PGID so that the files are accessible by www-data on the host.

Also please try with the latest archivebox/archivebox:0.7.2 build as I've added many helpful error messages and improved Docker permissions handling in general.
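
For reference, the suggestion above might look like the following when applied to the compose file from the report. This is only a sketch: 876 is simply the alternative id the reporter said worked, the image tag is the 0.7.2 build mentioned above, and whether this combination actually fixes the ownership problem is not confirmed anywhere in this thread.

```yaml
  archivebox:
    container_name: archivebox
    image: archivebox/archivebox:0.7.2   # build suggested above instead of :latest
    restart: always
    command: server --quick-init 0.0.0.0:8000
    ports:
      - 8000:8000
    environment:
      - ALLOWED_HOSTS=*
      - MEDIA_MAX_SIZE=750m
      - PUID=876   # assumption: any UID not already taken inside the container
      - PGID=33    # keep the www-data group so host-side www-data can read the files
    volumes:
      - /opt/archivebox:/data
```

With this layout the files would be owned by user 876 and group 33, so access for www-data on the host depends on the group permission bits of the data directory.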
