mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 17:16:00 +03:00
[PR #1195] [MERGED] Add method-specific URL allow/deny lists #1341
📋 Pull Request Information
Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/1195
Author: @overhacked
Created: 7/31/2023
Status: ✅ Merged
Merged: 10/28/2023
Merged by: @pirate
Base: dev ← Head: method_allow_deny
📝 Commits (4)
46e80dd Rename URL_(WHITE|BLACK)LIST to URL_(ALLOW|DENY)LIST
b44f7e6 Add URL-specific method allow/deny lists
2076474 Drop use of TypeAlias to maintain Python 3.9 compat
63ad43f Merge branch 'dev' into method_allow_deny
📊 Changes
6 files changed (+96 additions, -24 deletions)
📝 archivebox/config.py (+13 -5)
📝 archivebox/config_stubs.py (+1 -1)
📝 archivebox/core/forms.py (+1 -1)
📝 archivebox/extractors/__init__.py (+36 -11)
📝 archivebox/index/__init__.py (+4 -4)
📝 tests/test_extractors.py (+41 -2)
📄 Description
Summary
This adds the ability to toggle extractors (aka methods, aka outputs) on a URL-specific basis. This is useful for sites on which singlepage, for example, does not provide a usable snapshot, or for cases in which you want to download only the media for a URL and nothing else.
This PR also includes a commit to rename URL_(WHITE|BLACK)LIST to URL_(ALLOW|DENY)LIST as proposed in the documentation. The old names are preserved as aliases. I included this change in this PR so as not to have to name the new configuration parameters with the deprecated terms.
Config Example
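A minimal sketch of the two ideas in the summary, not the PR's actual code or configuration format. It assumes the per-URL lists are dictionaries mapping a URL regex to a list of method names. The names METHOD_ALLOWLIST, METHOD_DENYLIST, ALL_METHODS, methods_for_url, and resolve_setting are hypothetical and used only for illustration, as are the example.com patterns.

```python
import re
from typing import Any, Dict, List, Optional, Set

# Hypothetical per-URL lists: regex pattern -> method (extractor) names.
# The names and the dict shape are assumptions, not ArchiveBox's real config.
METHOD_ALLOWLIST: Dict[str, List[str]] = {
    r"^https?://video\.example\.com/": ["media"],
}
METHOD_DENYLIST: Dict[str, List[str]] = {
    r"^https?://news\.example\.com/": ["singlepage"],
}

# Hypothetical list of every available method, in run order.
ALL_METHODS: List[str] = ["title", "singlepage", "pdf", "media"]


def methods_for_url(
    url: str,
    all_methods: List[str] = ALL_METHODS,
    allowlist: Dict[str, List[str]] = METHOD_ALLOWLIST,
    denylist: Dict[str, List[str]] = METHOD_DENYLIST,
) -> List[str]:
    """Return the methods to run for `url`, preserving the original order."""
    # If any allowlist pattern matches, only the listed methods may run.
    allowed: Optional[Set[str]] = None
    for pattern, methods in allowlist.items():
        if re.search(pattern, url):
            allowed = (allowed or set()) | set(methods)

    # Every matching denylist pattern removes its listed methods.
    denied: Set[str] = set()
    for pattern, methods in denylist.items():
        if re.search(pattern, url):
            denied |= set(methods)

    return [
        m for m in all_methods
        if (allowed is None or m in allowed) and m not in denied
    ]


def resolve_setting(
    config: Dict[str, Any], new_name: str, old_name: str, default: Any = None
) -> Any:
    """Prefer the new setting name, falling back to the deprecated alias."""
    if new_name in config:
        return config[new_name]
    return config.get(old_name, default)


print(methods_for_url("https://video.example.com/watch?v=1"))
print(methods_for_url("https://news.example.com/story"))
print(resolve_setting({"URL_BLACKLIST": r"\.exe$"}, "URL_DENYLIST", "URL_BLACKLIST"))
```

In this sketch a URL that matches no pattern keeps every method, an allowlist match narrows the set, and a denylist match removes methods from whatever remains. The fallback lookup shows one way an old name such as URL_BLACKLIST could keep working after the rename to URL_DENYLIST.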
Documentation
Glad to share some Wiki commits if you'd like to move forward with this PR. You can't PR a wiki, right?
Related issues
None found
Changes these areas
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.