mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-26 01:26:00 +03:00
[GH-ISSUE #764] Question: I accidentally used a depth setting of one and didn't choose an archive type. I ended up getting about 747 archive results in my log. What should I do? #1995
Originally created by @NylaTheWolf on GitHub (Jun 6, 2021).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/764
So I was an idiot and decided not to choose which archive types I wanted to save. I even set the depth to 1. I canceled the process when I realized what was happening after I got back from a shower, and I now have 1.01 GB of files on my computer (some of the processes failed, and maybe some weren't even fully downloaded). There's a part of me that wants to delete the extraneous files (like favicons, headers, wgets(?) etc.), but I wonder if maybe it'd be a good idea to back it up, or even upload everything to archive.org. I also know that I could output things into a JSON or HTML list, and I'm guessing that gives all the links of what I archived?
I know there is a delete command, but I was wondering if I could delete certain types of files or something?
@pirate commented on GitHub (Jun 7, 2021):
You can delete any specified files you want from the Log page, or entire snapshots from the Snapshot page, or use the archivebox remove command from the CLI to do the same thing. Up to you if you want to back it up or not. If you didn't change the default settings then it also saved everything to Archive.org.
Using all the archive types is good, you should generally continue doing that, just be careful what you use depth=1 for as it will grow quickly on pages with many URLs within.
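Before deleting anything, it may help to see which kinds of output actually take up the space. The following is a minimal sketch, not part of ArchiveBox itself. It assumes the data folder contains an archive directory with one subfolder per snapshot, and that each output (a favicon, headers file, WARC folder and so on) sits at the top level of its snapshot folder. The function names usage_by_output and format_report are hypothetical helpers made up for this illustration.

```python
from collections import defaultdict
from pathlib import Path


def usage_by_output(archive_dir):
    """Sum bytes per top-level entry name across all snapshot folders.

    Assumption: archive_dir holds one subfolder per snapshot, and each
    output type uses the same file or folder name in every snapshot.
    """
    totals = defaultdict(int)
    for snapshot in Path(archive_dir).iterdir():
        if not snapshot.is_dir():
            continue  # skip stray files next to the snapshot folders
        for entry in snapshot.iterdir():
            if entry.is_dir():
                # Count every regular file below an output folder.
                size = sum(f.stat().st_size for f in entry.rglob("*") if f.is_file())
            else:
                size = entry.stat().st_size
            totals[entry.name] += size
    return dict(totals)


def format_report(totals):
    """Render the totals as text lines, largest output type first."""
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return "\n".join(f"{size / 1_000_000:10.2f} MB  {name}" for name, size in ranked)
```

Calling print(format_report(usage_by_output("archive"))) from inside the data folder would list each output name with its combined size. The script only reads file sizes and deletes nothing, so the actual cleanup can still be done through the Log page, the Snapshot page, or archivebox remove as described above.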
@NylaTheWolf commented on GitHub (Jun 11, 2021):
Alright! I'll try that out when I have the time!
Would it be risky for me to upload the archived webpages to archive.org? Would I be compromising any personal information?
@pirate commented on GitHub (Jun 11, 2021):
They are already saved to Archive.org by ArchiveBox, no need to upload them again.
@NylaTheWolf commented on GitHub (Jun 11, 2021):
Oh okay! I was just wondering if I should make sure for the other kinds of files that were saved aha
@pirate commented on GitHub (Jun 11, 2021):
Not sure Archive.org will take them, as they already have their own ways of saving most of the file types we save, except for maybe the PDF and screenshot copies, which could be uploaded for redundancy.