[GH-ISSUE #1530] Bug: Configuration like SAVE_*, USE_CHROME, and YOUTUBEDL_ARGS are inconsistently read on v0.8.5rc3 #905

Open
opened 2026-03-01 14:47:12 +03:00 by kerem · 1 comment
Owner

Originally created by @nguyenmp on GitHub (Oct 7, 2024).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1530

Describe the bug

In the old version v0.7.2, I set all my config through the environment variables on my docker container (defined in compose.yaml). I'm trying to upgrade to v0.8.5rc3 and am discovering some very strange behavior.

The easiest one I can define is that YOUTUBEDL_ARGS no longer seems to work. It seems like, under the hood, we've switched to yt-dlp and the code is now using YTDLP_EXTRA_ARGS. However, I am unable to set either of these in environment variables OR through the CLI. The command spits out a fat stack trace saying it can't find the right section for this config. I think the problem is that this extractor is now a "plugin" so configuring it is different now?

docker compose run archivebox config --set "YTDLP_EXTRA_ARGS=["--write-description", "--skip-download", "--write-subs"]"
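
A note on the quoting in the command above, independent of ArchiveBox: in a POSIX shell the inner double quotes close and reopen the outer string, so they are stripped before the program sees the argument. The sketch below only demonstrates that shell behavior. The variable names `nested` and `quoted` are illustrative, and whether v0.8.5rc3 accepts the single-quoted form is not confirmed anywhere in this issue.

```shell
# Hypothetical variable names; this only demonstrates POSIX shell quoting.

# Same quoting as the reported command: the inner double quotes end and
# reopen the string, so they never reach the program.
nested="YTDLP_EXTRA_ARGS=["--write-description", "--skip-download", "--write-subs"]"
printf '%s\n' "$nested"

# Single quotes around the whole assignment preserve the inner double quotes.
quoted='YTDLP_EXTRA_ARGS=["--write-description", "--skip-download", "--write-subs"]'
printf '%s\n' "$quoted"
```

The first line printed has no quotes around the individual flags, so the value is no longer a JSON-style list of strings. The second keeps them. This may or may not be related to the stack trace described here, but it is worth ruling out when retrying the command.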

On the other hand, USE_CHROME seems to be completely ignored now. Reading through the code, I think the code that handled it was accidentally lost.

I tried to switch to SAVE_* variables, but they seem to be read ONLY from the ArchiveBox.conf file, which I finally figured out how to set from the CLI:

docker compose run archivebox config --set SAVE_PDF=false
docker compose run archivebox config --set SAVE_SCREENSHOT=false
docker compose run archivebox config --set SAVE_DOM=false
docker compose run archivebox config --set SAVE_SINGLEFILE=false

But that leaves me with the question: are environment variables still going to be supported?

Steps to reproduce

Run docker compose up with files that

Screenshots or log output

Let me know, happy to add more info here but this seems like a pervasive issue rather than a specific one.

ArchiveBox version

0.8.5rc3
ArchiveBox v0.8.5rc3 COMMIT_HASH=7a895d9 BUILD_TIME=2024-10-05 23:45:50 1728171950
IN_DOCKER=True IN_QEMU=False ARCH=aarch64 OS=Linux PLATFORM=Linux-6.10.4-linuxkit-aarch64-with-glibc2.36 PYTHON=Cpython
FS_ATOMIC=True FS_REMOTE=True FS_USER=911:0 FS_PERMS=644
DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND=sonic LDAP=False

 Dependency versions:
 √  node                  22.9.0       apt        /usr/bin/node
 √  npm                   10.9.0       apt        /usr/bin/npm
 √  pip                   24.0.0       sys_pip    /usr/local/bin/pip
 √  python                3.11.10      sys_pip    /usr/local/bin/python3.11
 √  sqlite                2.6.0        venv_pip   /usr/local/lib/python3.11/sqlite3/dbapi2.py
 √  django                5.1.1        venv_pip   /usr/local/lib/python3.11/site-packages/django/__init__.py
 √  playwright            1.47.0       sys_pip    /usr/local/bin/playwright
 √  puppeteer             23.5.0       lib_npm    /app/lib/npm/node_modules/.bin/puppeteer
 √  ldap                  3.4.4        venv_pip   /usr/local/lib/python3.11/site-packages/ldap/__init__.py
 √  rg                    13.0.0       apt        /usr/bin/rg
 √  sonic                 1.4.9        env        /usr/local/bin/sonic
 √  chrome                129.0.6668   env        /usr/bin/chromium-browser
 √  curl                  8.10.1       apt        /usr/bin/curl
 √  git                   2.39.5       apt        /usr/bin/git
 √  postlight-parser      2.2.3        lib_npm    /app/lib/npm/node_modules/.bin/postlight-parser
 √  readability-extractor 0.0.11       lib_npm    /app/lib/npm/node_modules/.bin/readability-extractor
 √  single-file           1.1.54       lib_npm    /app/lib/npm/node_modules/.bin/single-file
 √  wget                  1.21.3       apt        /usr/bin/wget
 √  yt-dlp                2024.9.27    apt        /usr/bin/yt-dlp
 √  ffmpeg                5.1.6        env        /usr/bin/ffmpeg

 Source-code locations:
 √  PACKAGE_DIR           34 files        valid     /app/archivebox                                                             
 √  TEMPLATES_DIR         4 files         valid     /app/archivebox/templates                                                   
 √  LIB_DIR               3 files         valid     /app/lib                                                                    
 √  TMP_DIR               2 files         valid     /tmp/archivebox                                                             

 Data locations:
 √  DATA_DIR              14 files @      valid     /data                                                                       
 √  CONFIG_FILE           240.0 Bytes     valid     ./ArchiveBox.conf                      
 √  SQL_INDEX             424.0 KB        valid     ./index.sqlite3                        
 √  QUEUE_DATABASE        92.0 KB         valid     ./queue.sqlite3                        
 √  ARCHIVE_DIR           2 files         valid     ./archive                              
 √  SOURCES_DIR           2 files         valid     ./sources                              
 √  LOGS_DIR              5 files         valid     ./logs                                 
 √  PERSONAS_DIR          1 files         valid     ./personas                             
 -  CUSTOM_TEMPLATES_DIR  missing         unused    ./user_templates                       
 -  USER_PLUGINS_DIR      missing         unused    ./user_plugins                         

@pirate commented on GitHub (Oct 7, 2024):

The ArchiveBox config command for writing changes to config is not yet implemented in v0.8.5. Hang tight, I'm still working on it.

Config got a huge overhaul and I haven't matched up all the old aliases yet.

(The code for it wasn't lost; it's just been broken out into plugins and changed dramatically.)

Reference
starred/ArchiveBox#905