mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 17:16:00 +03:00
[GH-ISSUE #1646] Bug: All timers/progress bars timeout when archiving from REST API #2497
Originally created by @benmuth on GitHub (Feb 1, 2025).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1646
Originally assigned to: @pirate on GitHub.
Provide a screenshot and describe the bug
When using the extension (in the `redesign` branch, using the REST API) to archive any URL, all timers/progress bars take the maximum amount of time. Parsers take 240 seconds, most extractors take 60 seconds, the media extractor takes an hour (I think, I didn't wait around to find out). This doesn't happen when running `archivebox add` through the CLI, only the REST API.

I kind of figured out what's going on, but not why. In `archivebox/logging_util.py`, `TimedProgress.end()` tries to `terminate` the `progress_bar` process. For some reason (busy writing to stdout?), the process ignores the terminate, then the `join()` call after the terminate just blocks until the `progress_bar` function finishes execution.

Adding an explicit signal handler to the beginning of the `progress_bar` function seems to fix the problem:

That works, but I'm not sure if there's a better solution. Should I open a PR with this fix against dev?
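The snippet referred to in the description is not shown here. As a rough illustration only, the following Python sketch shows the general idea: a child process that installs its own SIGTERM handler exits promptly on `terminate()`, while one that ignores SIGTERM keeps `join()` blocked until its loop finishes. This is not ArchiveBox's code. `progress_bar` here is a simplified stand-in, and `_exit_on_sigterm`, `stop_after_start`, the `install_handler` flag, and the `ready` event are hypothetical names. To simulate a child that ignores the terminate, the sketch has the parent ignore SIGTERM just before forking so the child inherits that disposition; that is an assumption made for the demo, not a diagnosis of the actual cause. It also assumes a POSIX system where the `fork` start method is available.

```python
import multiprocessing as mp
import os
import signal
import time


def _exit_on_sigterm(signum, frame):
    # Hypothetical handler: leave immediately when SIGTERM arrives.
    os._exit(0)


def progress_bar(seconds, install_handler, ready):
    """Simplified stand-in for a progress-bar loop that runs for `seconds`."""
    if install_handler:
        # The kind of fix described in the issue: handle SIGTERM explicitly
        # at the very beginning of the function.
        signal.signal(signal.SIGTERM, _exit_on_sigterm)
    ready.set()  # tell the parent the child is up and running
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        time.sleep(0.05)


def stop_after_start(install_handler, seconds=2.0):
    """Start the child, terminate it, and return how long join() blocked."""
    ctx = mp.get_context("fork")  # assumption: POSIX with fork available
    ready = ctx.Event()
    # Assumption for the demo: make the child inherit "ignore SIGTERM".
    previous = signal.signal(signal.SIGTERM, signal.SIG_IGN)
    try:
        proc = ctx.Process(
            target=progress_bar, args=(seconds, install_handler, ready)
        )
        proc.start()
    finally:
        # Restore the parent's own SIGTERM handling.
        signal.signal(signal.SIGTERM, previous)
    ready.wait(5)
    started = time.monotonic()
    proc.terminate()  # sends SIGTERM to the child
    proc.join()       # blocks until the child has actually exited
    return time.monotonic() - started
```

Calling `stop_after_start(True)` should return almost immediately, while `stop_after_start(False)` should block for roughly the remaining duration of the loop, which mirrors the behaviour described above. Another option, not discussed in the issue, would be to call `join()` with a timeout and fall back to `proc.kill()` if the process is still alive.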
Steps to reproduce
Logs or errors
ArchiveBox Version
How did you install the version of ArchiveBox you are using?
Other
What operating system are you running on?
macOS (including Docker on macOS)
What type of drive are you using to store your ArchiveBox data?
- `data/` is on a local SSD or NVMe drive
- `data/` is on a spinning hard drive or external USB drive
- `data/` is on a network mount (e.g. NFS/SMB/Ceph/GlusterFS/etc.)
- `data/` is on a FUSE mount (e.g. SSHFS/RClone/S3/B2/Google Drive/Dropbox/etc.)
Docker Compose Configuration
ArchiveBox Configuration
@pirate commented on GitHub (Feb 1, 2025):
Sure I'll take the PR :)
thanks