mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 09:06:02 +03:00
[GH-ISSUE #265] JSONDecodingError while archiving a specific website #1698
Originally created by @phretor on GitHub (Sep 12, 2019).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/265
Describe the bug
I'm getting this JSONDecodingError on a specific website.
Steps to reproduce
@pirate commented on GitHub (Sep 19, 2019):
Looks like your index got corrupted somehow. Can you look inside the index.json file and see if it got truncated at line 5283? If so, don't worry, you haven't lost any data; you'll just have to rebuild the index with ArchiveBox v0.4.x, which has a new automatic index-rebuilding feature. This has happened in the past to users of older versions that lacked the atomic-writing index save feature, which prevents corrupted indexes.
@logistic-bot commented on GitHub (Oct 7, 2019):
Getting a similar error:
Temporarily solved it by deleting the output/archive/1570023650 folder.
@gjedeer commented on GitHub (Oct 16, 2019):
I'm having a malformed index, how do I rebuild it? I tried doing it manually but it's FUBAR.
As you can see I'm now running the latest version and I'm still getting the exception:
@gjedeer commented on GitHub (Oct 16, 2019):
I've found out that development now happens in branches, so I cloned the latest v0.5.0 and built a Docker image from it: same problem.
@pirate commented on GitHub (Oct 17, 2019):
@gjedeer I'm removing the JSON index entirely and sticking to SQLite for the final release. It's too hard to incrementally add JSON entries efficiently without corrupting the index during power outages or causing huge read/write spikes for no reason. Instead, it will use SQLite for the core index and export a JSON index only when the user manually requests it or once an archiving import process has completely finished.
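For illustration only, the design described above (SQLite as the core index, JSON written out as a whole-file export) might look roughly like the sketch below. The table name `snapshots`, the `links` key, and the function names are assumptions made for this example and are not taken from the ArchiveBox source. The export writes to a temporary file and renames it into place, which is one common way to get the kind of atomic write mentioned earlier in the thread.

```python
import json
import os
import sqlite3
import tempfile


def open_index(db_path):
    """Open (or create) a hypothetical SQLite index; the schema is illustrative only."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        "url TEXT PRIMARY KEY, timestamp TEXT NOT NULL)"
    )
    return conn


def add_snapshot(conn, url, timestamp):
    """Add one entry. The 'with conn' block commits on success and rolls back on error."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO snapshots (url, timestamp) VALUES (?, ?)",
            (url, timestamp),
        )


def export_json(conn, path):
    """Export the whole index to JSON via a temp file plus rename."""
    rows = conn.execute(
        "SELECT url, timestamp FROM snapshots ORDER BY timestamp"
    ).fetchall()
    data = {"links": [{"url": u, "timestamp": t} for u, t in rows]}

    # The temp file lives in the target directory so the rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old file or the complete new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this shape, adding an entry never rewrites the JSON file, and an interrupted export leaves the previous index.json untouched.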
@gjedeer commented on GitHub (Oct 18, 2019):
Cool, and I've just extracted the individual archived page URLs from $subdir/index.js, removed the corrupted main index.js file, and archived the URLs again. Fortunately, they were still available.
BTW, there was no power loss or anything like that. I saw in the sources that you had code that modified the JSON in place, so yeah, I agree that SQLite is a better solution.
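A minimal sketch of that recovery approach, for anyone in the same situation: walk the per-snapshot folders, read each small index file, and collect the URLs so they can be archived again. The file name `index.json` and the top-level `url` key are assumptions here (adjust them to whatever your snapshot folders actually contain), and `collect_urls` is a hypothetical helper, not an ArchiveBox command.

```python
import json
from pathlib import Path


def collect_urls(archive_dir, index_name="index.json"):
    """Gather URLs from per-snapshot index files, skipping any that fail to parse.

    Assumes each snapshot folder holds a JSON object with a "url" key.
    Returns (urls, unreadable), where unreadable lists (path, error message) pairs.
    """
    urls, unreadable = [], []
    for index_path in sorted(Path(archive_dir).glob(f"*/{index_name}")):
        try:
            entry = json.loads(index_path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError) as err:
            # A truncated or corrupted file ends up here; the message includes
            # the line and column where parsing stopped.
            unreadable.append((index_path, str(err)))
            continue
        url = entry.get("url") if isinstance(entry, dict) else None
        if url:
            urls.append(url)
    return urls, unreadable
```

The returned `unreadable` list shows which snapshot folders have a damaged index of their own, so they can be inspected before anything is deleted.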
@pirate commented on GitHub (May 9, 2020):
Closing this in favor of #234