[GH-ISSUE #271] Feature Request: Archive only from specific domain #3213

Closed
opened 2026-03-14 21:38:29 +03:00 by kerem · 1 comment
Owner

Originally created by @LaserWires on GitHub (Sep 19, 2019).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/271

It would be ideal to let users restrict archiving to a specific domain, so that archivebox does not archive pages from any domain other than that of the URI it was given. This should not interfere with multimedia objects such as videos, which are typically hosted on domains other than that of the URI passed to archivebox.


@pirate commented on GitHub (Sep 20, 2019):

You can pull this off using the https://github.com/pirate/ArchiveBox/wiki/Configuration#url_blacklist feature.

Archiving pages on one domain but media on any domain is not as simple as it sounds: there are potentially dozens of possible behaviors depending on how edge cases are handled. But if you can express exactly the behavior you want in regex form, `URL_BLACKLIST` should work fine.
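
As a rough illustration of "expressing the behavior in regex form", here is a minimal sketch of a pattern that matches every URL except those on one allowed domain and its subdomains. The domain `example.com` and the helper `is_blacklisted` are hypothetical and exist only for this example. The pattern is tested with Python's `re` module and anchored with `^` so it behaves the same under `search` and `match`. How ArchiveBox applies `URL_BLACKLIST` internally is not shown in this thread, so check any pattern against your own URLs first.

```python
import re

# Hypothetical target domain; replace with the domain you want to keep.
ALLOWED_DOMAIN = "example.com"

# Negative lookahead: the pattern matches (i.e. blacklists) any http(s) URL
# whose host is NOT the allowed domain or one of its subdomains.
URL_BLACKLIST = (
    r"^https?://(?!([^/?#]+\.)?"
    + re.escape(ALLOWED_DOMAIN)
    + r"(:\d+)?([/?#]|$))"
)


def is_blacklisted(url: str) -> bool:
    """Return True if the URL matches the blacklist pattern (illustrative helper)."""
    return re.search(URL_BLACKLIST, url, re.IGNORECASE) is not None


if __name__ == "__main__":
    for url in (
        "https://example.com/page",
        "https://blog.example.com/post",
        "https://youtube.com/watch?v=1",
        "https://example.com.evil.org/",
    ):
        print(url, "->", "skip" if is_blacklisted(url) else "archive")
```

This sketch deliberately treats every off-domain URL the same way, so it does not carve out an exception for media hosted elsewhere. That exception is exactly the edge-case territory mentioned above, and it would have to be written into the regex explicitly.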
