mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 17:16:00 +03:00
[GH-ISSUE #305] Question: Comparison between this and other archival products #3245
Originally created by @DonaldTsang on GitHub (Dec 25, 2019).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/305
See:
@crisdosaygo commented on GitHub (Dec 31, 2019):
22120 archives everything you browse by hooking directly into the browser. Later, when you're offline, you can use your browser as normal, and it still works as if you were online for the pages you've already browsed.
The archive format in 22120 is simple JSON files, organized into directories that each represent an origin. There is one JSON file for every resource the origin serves you.
You can zip your archive folder and copy it around, and you can create multiple archive folders. You can also specify domain patterns to exclude from your archive.
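As a rough illustration of the layout described above, here is a minimal Python sketch that groups JSON files by origin directory and zips the archive folder. It assumes each top-level directory under the archive root corresponds to one origin. The function names `group_resources_by_origin` and `zip_archive` are hypothetical and are not part of 22120, and the real on-disk naming may differ, so check the README before relying on this.

```python
import shutil
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def group_resources_by_origin(archive_root: Path) -> Dict[str, List[Path]]:
    """Map each top-level directory name (assumed to be one origin) to its JSON files."""
    grouped = defaultdict(list)
    for json_file in sorted(archive_root.rglob("*.json")):
        relative = json_file.relative_to(archive_root)
        if len(relative.parts) < 2:
            continue  # skip JSON files sitting directly in the archive root
        grouped[relative.parts[0]].append(json_file)
    return dict(grouped)


def zip_archive(archive_root: Path, destination_stem: Path) -> Path:
    """Pack the whole archive folder into <destination_stem>.zip so it can be copied around."""
    zip_name = shutil.make_archive(str(destination_stem), "zip", root_dir=str(archive_root))
    return Path(zip_name)
```

Calling `group_resources_by_origin(Path("my-archive"))` would give a quick count of stored resources per origin, and `zip_archive(Path("my-archive"), Path("my-archive-backup"))` would write `my-archive-backup.zip` next to it. Both paths are placeholders.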
More information is in the README.
See https://github.com/dosyago/22120 for instructions on installing from source or from npm. The latest binaries are here:
https://github.com/dosyago/22120/releases/latest
@DonaldTsang commented on GitHub (Jan 1, 2020):
@crislin2046 hold on, I am asking for a comparison, not just a description of what each tool does, since ArchiveBox and the others are so similar.
@crisdosaygo commented on GitHub (Jan 1, 2020):
I don't know that much about ArchiveBox so my contribution to the comparison is to share what I know. Other people can then compare with ArchiveBox based on that, using what they know! 😄
@pirate commented on GitHub (Jan 5, 2020):
https://github.com/pirate/ArchiveBox#comparison-to-other-projects
https://github.com/pirate/ArchiveBox/wiki/Web-Archiving-Community#Web-Archiving-Projects
@DonaldTsang commented on GitHub (Jan 6, 2020):
@pirate thanks for the explanation, but could you describe how WARC differs from the native ArchiveBox format?
@pirate commented on GitHub (Jan 7, 2020):
There isn't really an "ArchiveBox format" (yet). It does produce some JSON index files, but the main output is really the output of the tools that it calls to archive each site.
You can find more info here: https://github.com/pirate/ArchiveBox/wiki/Usage#Disk-Layout