mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 17:16:00 +03:00
[GH-ISSUE #702] Question: Performance Comparison docker based vs bare metal #3459
Originally created by @asitemade4u on GitHub (Apr 11, 2021).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/702
Hi,
Thank you for Archive Box, which is outstanding -- basically exactly what I was looking for.
My only concern is ArchiveBox performance when executed within a Docker environment. Notably, it does not seem to use all of the available CPU or memory.
Is there a better track record with bare-metal installations? Can ArchiveBox take full advantage of all processing power available and work in parallel?
Best,
Stephen
@pirate commented on GitHub (Apr 12, 2021):
Going to close this in favor of our existing issue about parallel archiving / performance: https://github.com/ArchiveBox/ArchiveBox/issues/91
Please subscribe to that one if you want updates.
There is no significant difference in performance between Docker and non-Docker setups. The main bottleneck is blocking IO and network access caused by single-threaded extractor execution (which will be removed when we move to a message-passing based worker queue architecture).
The summary is: parallel archiving is semi-doable and safe already right now, but not perfect (you might encounter "database locked" errors https://github.com/ArchiveBox/ArchiveBox/issues/601 if you try too many threads). Just run multiple archivebox add commands at once.
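As a rough illustration of "run multiple archivebox add commands at once", here is a minimal Python sketch that splits a URL list into a few batches and launches one archivebox add process per batch. The helper names (chunk, build_command, add_in_parallel), the default worker count, and the data_dir parameter are assumptions made for this example and are not part of ArchiveBox. It also assumes the archivebox executable is on PATH and that data_dir points at an already initialized ArchiveBox data folder.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def chunk(urls, n_workers):
    """Split urls round-robin into at most n_workers non-empty batches."""
    batches = [urls[i::n_workers] for i in range(n_workers)]
    return [batch for batch in batches if batch]


def build_command(batch):
    """Build one `archivebox add` invocation, passing URLs as positional args."""
    return ["archivebox", "add", *batch]


def add_in_parallel(urls, n_workers=2, data_dir=".", run=subprocess.run):
    """Launch one `archivebox add` process per batch and wait for all of them.

    Keep n_workers small: per the comment above, too many concurrent
    writers may trigger "database locked" errors.
    `run` is injectable (hypothetical hook) so the function can be tried
    out without ArchiveBox installed.
    """
    batches = chunk(urls, n_workers)
    if not batches:
        return []
    # Threads only wait on the child processes; the actual work happens
    # in the separate archivebox processes.
    with ThreadPoolExecutor(max_workers=len(batches)) as pool:
        return list(pool.map(lambda b: run(build_command(b), cwd=data_dir), batches))
```

Calling add_in_parallel(["https://example.com", "https://example.org"], n_workers=2, data_dir="/path/to/archivebox/data") would start two concurrent archivebox add processes, one per URL. If "database locked" errors show up, lowering n_workers is the first thing to try.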