[PR #506] [MERGED] feat: Add config for extractor args #2706

Closed
opened 2026-03-01 18:00:29 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/506
Author: @cdvv7788
Created: 10/14/2020
Status: Merged
Merged: 10/22/2020
Merged by: @cdvv7788

Base: master ← Head: extractors-dependencies


📝 Commits (8)

  • a286bfb feat: Add config for youtubedl (YOUTUBEDL_ARGS)
  • 601779d refactor: Use json.loads instead of split for list arguments
  • 20fd004 feat: Add WGET_ARGS to control wget arguments
  • 86ee658 feat: Add CURL_ARGS to control curl arguments
  • 8884b55 feat: Use CURL_ARGS on header extractor
  • ab5972f feat: Use CURL_ARGS in favicon extractor
  • e2135ea feat: Use CURL_ARGS on title extractor
  • 7d33f60 refactor: Change typing for new stubs
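
Commit 601779d replaces string splitting with json.loads for list-valued arguments. The sketch below illustrates why that can matter; it is not the code from the PR. The comma delimiter, both function names, and the sample values are assumptions made for illustration, since this summary does not show the original parsing code.

```python
import json
from typing import List


def parse_args_split(raw: str) -> List[str]:
    """Naive approach (assumed comma delimiter): split a string into arguments."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def parse_args_json(raw: str) -> List[str]:
    """Parse a JSON array of strings and reject anything else."""
    value = json.loads(raw)
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError("expected a JSON array of strings")
    return value


# Hypothetical values: one argument contains the delimiter itself.
raw_json = '["--user-agent", "Mozilla/5.0 (X11, Linux)", "--max-time", "10"]'
raw_csv = "--user-agent,Mozilla/5.0 (X11, Linux),--max-time,10"

json_args = parse_args_json(raw_json)  # the user-agent stays one argument
csv_args = parse_args_split(raw_csv)   # the user-agent is cut in two at its comma

print(json_args)
print(csv_args)
```

With a JSON array, each argument keeps its own quoting, so values that contain spaces or delimiter characters survive intact.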

📊 Changes

9 files changed (+58 additions, -45 deletions)

View changed files

📝 archivebox/config/__init__.py (+37 -3)
📝 archivebox/config/stubs.py (+6 -2)
📝 archivebox/extractors/archive_org.py (+2 -3)
📝 archivebox/extractors/favicon.py (+2 -3)
📝 archivebox/extractors/git.py (+2 -1)
📝 archivebox/extractors/headers.py (+3 -4)
📝 archivebox/extractors/media.py (+2 -18)
📝 archivebox/extractors/title.py (+2 -3)
📝 archivebox/extractors/wget.py (+2 -8)

📄 Description

Summary

Create config options to consolidate extractor args in a single key
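
As a rough illustration of what consolidating extractor arguments under a single key could look like, the sketch below reads one list-valued key and reuses it when building a command. Only the key name CURL_ARGS comes from this PR; the default flags, the `load_args` and `build_curl_command` helpers, and the environment-variable lookup are hypothetical and may differ from the actual implementation.

```python
import json
import os
from typing import Dict, List, Mapping, Optional, Sequence

# Hypothetical defaults; the real default values are not listed in this summary.
DEFAULT_ARGS: Dict[str, List[str]] = {
    "CURL_ARGS": ["--silent", "--location", "--compressed"],
}


def load_args(key: str, environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the argument list for `key`, preferring a JSON array override."""
    environ = os.environ if environ is None else environ
    raw = environ.get(key)
    if raw is None:
        return list(DEFAULT_ARGS[key])
    value = json.loads(raw)
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError(f"{key} must be a JSON array of strings")
    return value


def build_curl_command(
    url: str,
    environ: Optional[Mapping[str, str]] = None,
    extra: Sequence[str] = (),
) -> List[str]:
    """Assemble a curl command: shared args first, then extractor-specific ones."""
    return ["curl", *load_args("CURL_ARGS", environ), *extra, url]


# Defaults apply when the key is not set.
default_cmd = build_curl_command("https://example.com", environ={})

# One override changes the shared arguments for every caller of the helper.
override_cmd = build_curl_command(
    "https://example.com",
    environ={"CURL_ARGS": '["--silent", "--max-time", "5"]'},
    extra=["--head"],
)
```

Keeping the shared flags in one key means several extractors that call the same tool can pick up a change from a single setting, while each can still append its own flags.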

Related issues

#503

Changes these areas

  • [ ] Bugfixes
  • [ ] Feature behavior
  • [ ] Command line interface
  • [X] Configuration options
  • [ ] Internal architecture
  • [ ] Snapshot data layout on disk

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
