[GH-ISSUE #1190] Feature Request: Add ability to choose config options e.g. YOUTUBEDL_ARGS for a specific add command #738

Closed
opened 2026-03-01 14:45:58 +03:00 by kerem · 2 comments

Originally created by @melyux on GitHub (Jul 22, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1190

Type

  • [ ] General question or discussion
  • [ ] Propose a brand new feature
  • [x] Request modification of existing behavior or design

What is the problem that your feature request solves

For some feeds, I only care about archiving at the smallest media size; for others, I want to archive at maximum quality. Currently there's no way to choose different options for different things I add. This is especially important for scheduled pulls, where I can't go in and change the setting before each import and then change it back.

Describe the ideal specific solution you'd want, and whether it fits into any broader scope of changes

A new flag for the add command (--config) that would take a JSON list. These options would be set before this job and then set back to the old ones after the job.

(Would only be a problem if multiple jobs are running at the same time... not sure if that's a possible scenario.)

What hacks or alternative solutions have you tried to solve the problem?

When adding manually, I can set the config options before the job and unset them afterwards, but with scheduled jobs this is not possible as far as I know (and even if it were, I'd have to keep track by hand of the default options and of any options I set in the ENV when unsetting).

How badly do you want this new feature?

  • [ ] It's an urgent deal-breaker, I can't live without it
  • [x] It's important to add it in the near-mid term future
  • [ ] It would be nice to have eventually

  • [x] I'm willing to contribute dev time / money to fix this issue
  • [ ] I like ArchiveBox so far / would recommend it to a friend
  • [ ] I've had a lot of difficulty getting ArchiveBox set up
kerem closed this issue 2026-03-01 14:45:58 +03:00

@melyux commented on GitHub (Jul 22, 2023):

I was able to get this working, even in the scheduler crontab! Like this:

@daily cd /data && YOUTUBEDL_ARGS='["--write-description","--write-info-json","--write-annotations","--write-thumbnail","--no-call-home","--write-sub","--all-subs","--convert-subs=srt","--yes-playlist","--continue","--no-abort-on-error","--ignore-errors","--geo-bypass","--add-metadata","-S","res:480","--write-comments"]' /usr/local/bin/archivebox add --parser json "$(/data/feedparse.py 'https://www.youtube.com/feeds/videos.xml?channel_id=ALSJFDLAFASDFASDF')" >> /data/logs/schedule.log 2>&1 # archivebox_schedule

So you just put your configuration variables right after the && and right before the /usr/local/bin/archivebox in the command. Another example:

@daily cd /data && SAVE_PDF=True SAVE_SCREENSHOT=False /usr/local/bin/archivebox add --parser json "$(/data/feedparse.py 'https://dustinspecker.com/rss.xml')" >> /data/logs/schedule.log 2>&1 # archivebox_schedule

Can confirm that the env variables set in one line don't affect the other lines; they only apply to the single add command on the line.
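
A quick way to check this scoping outside ArchiveBox is a throwaway shell snippet. This is only a sketch: `DEMO_VAR` is a made-up variable name, and `sh -c` stands in for the `archivebox add` call.

```shell
# DEMO_VAR is a hypothetical name used only for this illustration.
unset DEMO_VAR

# The assignment prefix applies only to the environment of this one command.
inner=$(DEMO_VAR=temporary sh -c 'printf %s "${DEMO_VAR:-unset}"')

# Back in the calling shell, the variable was never set.
outer=${DEMO_VAR:-unset}

echo "inside the command: $inner"
echo "after the command: $outer"
```

The child process sees the temporary value, while the calling shell is left untouched, which matches the per-line behavior observed in the crontab.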

This should probably be documented somewhere, but it works great. Would be nice to have an option for this on the web UI too.


@pirate commented on GitHub (Jul 27, 2023):

This is equivalent to doing env SOME_VAR="some temp value" archivebox ... which is documented in a few places, including the wiki Configuration and CLI Usage pages.
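
For comparison, here is a small sketch of the two spellings side by side. `SOME_VAR` comes from the comment above; `sh -c` is only a stand-in for the `archivebox` command, so nothing here depends on ArchiveBox being installed.

```shell
# Shell assignment prefix: handled by the shell itself.
via_prefix=$(SOME_VAR="some temp value" sh -c 'printf %s "$SOME_VAR"')

# env utility: a separate program that sets the variable, then runs the command.
via_env=$(env SOME_VAR="some temp value" sh -c 'printf %s "$SOME_VAR"')

echo "prefix: $via_prefix"
echo "env: $via_env"
```

Both forms hand the same value to the child process. The `env` form can be handy where no shell parses the line, since `env` is an ordinary executable.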
