mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 17:16:00 +03:00
[GH-ISSUE #553] queuing for add #349
Originally created by @shepner on GitHub (Nov 28, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/553
Type
What is the problem that your feature request solves
`archivebox add` can be slow, and I typically just want a quick "fire and forget" way to submit new URLs. I'd also like this to be a multi-threaded process.
Describe the ideal specific solution you'd want, and whether it fits into any broader scope of changes
Implement a command (i.e. `archivebox queue`) which parses the input (similar to `archivebox add`) and places each URL into a message queue. On the other side of the message queue, have a process which will kick off an `archivebox add` command in the background per CPU available.
What hacks or alternative solutions have you tried to solve the problem?
While "doing it right" is a rather involved process, this could also be done external to `archivebox` itself as scripting within the Docker container. I've done "quick and dirty" variants similar to this a few times over the years with Python (and Perl) scripts.
In the simplest form, the message queue could just be a list or even a file. Running multiple threads can be as simple as just watching to ensure no more than N instances are running at any given time and pulling more entries from the queue when there are more slots open.
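A minimal sketch of the "simplest form" described above, for illustration only: a plain text file acts as the queue (one URL per line), and a loop keeps at most N `archivebox add <url>` processes running, starting a new one whenever a slot opens. The function name `drain_queue`, the queue file layout, and the parameter names are hypothetical and not part of ArchiveBox. Whether several concurrent `archivebox add` runs are safe against the same data directory is an assumption this thread does not confirm.

```python
import subprocess
import time
from pathlib import Path


def drain_queue(queue_file, max_workers=4, command=("archivebox", "add"), poll_seconds=0.5):
    """Run `command <url>` once per URL in queue_file, at most max_workers at a time.

    Hypothetical helper: the queue is simply a text file with one URL per line.
    Returns a dict mapping each URL to the exit code of its process.
    """
    if max_workers < 1:
        raise ValueError("max_workers must be at least 1")

    # Read the whole queue up front; blank lines are ignored.
    lines = Path(queue_file).read_text().splitlines()
    pending = [line.strip() for line in lines if line.strip()]

    running = []   # (url, Popen) pairs that have not exited yet
    results = {}   # url -> exit code

    while pending or running:
        # Collect finished processes so their slots become free.
        still_running = []
        for url, proc in running:
            code = proc.poll()
            if code is None:
                still_running.append((url, proc))
            else:
                results[url] = code
        running = still_running

        # Pull more entries from the queue while slots are open.
        while pending and len(running) < max_workers:
            url = pending.pop(0)
            running.append((url, subprocess.Popen([*command, url])))

        # Avoid a busy loop while waiting for processes to exit.
        if running:
            time.sleep(poll_seconds)

    return results
```

Under these assumptions, something like `drain_queue("urls.txt", max_workers=os.cpu_count() or 1)` would approximate the "one background `archivebox add` per CPU" behaviour requested above. A duplicate URL in the file is run twice, but only its last exit code is kept in the result.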
How badly do you want this new feature?
@cdvv7788 commented on GitHub (Nov 28, 2020):
@pirate this is related to the huey implementation, right?
@pirate commented on GitHub (Nov 28, 2020):
The message queue-style implementation is coming soon with Huey, but the behavior you want can already be achieved with:
Going to close this for now because the Huey implementation is already a long-running dev task we're tracking in other issues.
Feel free to reply if you still have questions / want help though and I'll continue answering here.