[GH-ISSUE #513] Move ArchiveResults into the SQL db to avoid checking the filesystem for every Snapshot output status #333

Closed
opened 2026-03-01 14:42:34 +03:00 by kerem · 2 comments
Owner

Originally created by @pirate on GitHub (Oct 24, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/513

Originally assigned to: @cdvv7788 on GitHub.

Right now archivebox.zervice.io is taking 6s+ to load because it has to check the filesystem many times for each row on the page during the template rendering process.

We can load the ArchiveResults entries from the archive/<timestamp>/index.json files into the sqlite3 db. This will eliminate checking each output path on the filesystem every time we want to see whether it completed or not.
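A minimal sketch of that import step, using only the Python standard library. It assumes each `index.json` carries a `history` mapping from extractor name to a list of result entries with `status` and `output` fields; that layout, the `archive_result` table name, and the `load_archive_results` helper are assumptions for illustration, not ArchiveBox's actual schema or API.

```python
import json
import sqlite3
from pathlib import Path


def load_archive_results(conn, archive_dir):
    """Copy per-snapshot results from archive/<timestamp>/index.json into one SQL table."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS archive_result (
            timestamp TEXT NOT NULL,   -- snapshot directory name
            extractor TEXT NOT NULL,
            status    TEXT NOT NULL,
            output    TEXT
        )
        """
    )
    rows = []
    for index_path in sorted(Path(archive_dir).glob("*/index.json")):
        index = json.loads(index_path.read_text(encoding="utf-8"))
        # Assumed layout: {"history": {"<extractor>": [{"status": ..., "output": ...}]}}
        for extractor, results in index.get("history", {}).items():
            for result in results:
                rows.append(
                    (
                        index_path.parent.name,
                        extractor,
                        result.get("status", "unknown"),
                        result.get("output"),
                    )
                )
    conn.executemany("INSERT INTO archive_result VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Once the rows are in the database, the page can read every output status with a query instead of touching the filesystem for each row.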

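![Screen Shot 2020-10-24 at 3 17 45 PM](https://user-images.githubusercontent.com/511499/97091787-6a134b00-160c-11eb-9d8c-c2beb329c468.png)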


@cdvv7788 commented on GitHub (Oct 26, 2020):

I will create a model and associate its instances with the Snapshot via foreign keys. Is that what you have in mind? @pirate
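The foreign-key relationship proposed here can be sketched in plain `sqlite3` rather than as a Django model. The table names, column names, and the `create_schema` and `statuses_for_page` helpers are hypothetical and only illustrate the idea: one query fetches the statuses for every snapshot on a page.

```python
import sqlite3


def create_schema(conn):
    """Create a snapshot table and a result table that points back to it."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(
        """
        CREATE TABLE snapshot (
            id        INTEGER PRIMARY KEY,
            url       TEXT NOT NULL,
            timestamp TEXT NOT NULL UNIQUE
        );
        CREATE TABLE archive_result (
            id          INTEGER PRIMARY KEY,
            snapshot_id INTEGER NOT NULL REFERENCES snapshot(id) ON DELETE CASCADE,
            extractor   TEXT NOT NULL,
            status      TEXT NOT NULL,
            output      TEXT
        );
        CREATE INDEX archive_result_snapshot ON archive_result (snapshot_id);
        """
    )


def statuses_for_page(conn, snapshot_ids):
    """Return {snapshot_id: {extractor: status}} for one page, using a single query."""
    snapshot_ids = list(snapshot_ids)
    by_snapshot = {sid: {} for sid in snapshot_ids}
    if not snapshot_ids:
        return by_snapshot
    placeholders = ",".join("?" * len(snapshot_ids))
    query = (
        "SELECT snapshot_id, extractor, status FROM archive_result "
        f"WHERE snapshot_id IN ({placeholders})"
    )
    for sid, extractor, status in conn.execute(query, snapshot_ids):
        by_snapshot[sid][extractor] = status
    return by_snapshot
```

With this shape, rendering a page of rows costs one indexed lookup instead of several filesystem checks per row.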


@cdvv7788 commented on GitHub (Dec 5, 2020):

@pirate can this be closed?
