mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 09:06:02 +03:00
[GH-ISSUE #1475] Feature Request: Automatic rate limiting for multiple entries of the same domain #3892
Originally created by @maxiride on GitHub (Jul 31, 2024).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1475
Type
What is the problem that your feature request solves
When submitting multiple pages of the same website (domain), it may trigger a captcha or another rate-limiting measure.
Describe the ideal specific solution you'd want, and whether it fits into any broader scope of changes
The tool should self-limit when processing entries of the same domain. Optionally, the delay should be configurable.
What hacks or alternative solutions have you tried to solve the problem?
Nothing. On the other hand, I discovered too late that many pages weren't archived, and now they are no longer available on any platform.
How badly do you want this new feature?
@pirate commented on GitHub (Aug 12, 2024):
We discussed this as a sub-requirement of recursive archiving (https://github.com/ArchiveBox/ArchiveBox/issues/191) but I agree it deserves a separate issue.
The (currently closed-source) tooling I provide on top of ArchiveBox for paying clients does implement fairly advanced rate-limiting on a per-local-host, per-remote-IP, per-domain, and per-chrome-profile basis, but it's not yet public.
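For illustration only, the per-domain limiting requested in this issue could be sketched roughly as follows. This is not ArchiveBox code and not the closed-source tooling mentioned above: the `DomainRateLimiter` class, its `min_delay` parameter, and the `wait()` method are all hypothetical names, and the sketch assumes a single-process, sequential archiving loop.

```python
import time
from urllib.parse import urlsplit


class DomainRateLimiter:
    """Hypothetical helper: enforce a minimum delay between hits to one domain."""

    def __init__(self, min_delay=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay      # seconds between requests to a domain (assumed configurable)
        self._clock = clock             # injectable so the limiter can be tested without real waiting
        self._sleep = sleep
        self._next_allowed = {}         # domain -> earliest time the next request may start

    def wait(self, url):
        """Block until the URL's domain may be requested again; return seconds slept."""
        domain = (urlsplit(url).hostname or "").lower()
        now = self._clock()
        # A domain never seen before is allowed immediately.
        delay = max(0.0, self._next_allowed.get(domain, now) - now)
        if delay > 0:
            self._sleep(delay)
        # The next request to this domain must start min_delay after this one.
        self._next_allowed[domain] = now + delay + self.min_delay
        return delay
```

Calling `limiter.wait(url)` before each snapshot would space out entries of the same domain while letting URLs from other domains proceed without delay. Limiting per remote IP or per Chrome profile, as mentioned in the comment above, would need additional keys beyond the hostname.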