mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 17:16:00 +03:00
[PR #452] [MERGED] Replace index.json with index.sql as the main index #1172
📋 Pull Request Information
Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/452
Author: @cdvv7788
Created: 8/18/2020
Status: ✅ Merged
Merged: 9/15/2020
Merged by: @cdvv7788
Base: master ← Head: sql_index
📝 Commits (10+)
ecf7476 feat: Replace index.json with index.sql as the main index in init
11c3e69 feat: Update status command to consider sql as the main index
b94e651 feat: Update extractors and add command to use sql index as source of truth
f028999 feat: Remove patch_main_index
2d05462 feat: Update data folder check
58d79b9 feat: Save static indexes at the end of init
bb8bbe1 feat: Add flag to list command to support index like output
3ba5ad1 feat: Add html export to list command
1509fb6 feat: list command fails when --index is used without --json or --html
97215f6 feat: load_main_index returns a queryset now
📊 Changes
17 files changed (+629 additions, -355 deletions)
📝 archivebox/cli/archivebox_list.py (+20 -1)
📝 archivebox/cli/archivebox_oneshot.py (+1 -1)
📝 archivebox/config/__init__.py (+2 -3)
📝 archivebox/core/admin.py (+3 -3)
📝 archivebox/core/models.py (+4 -0)
📝 archivebox/extractors/__init__.py (+21 -19)
📝 archivebox/index/__init__.py (+117 -158)
📝 archivebox/index/html.py (+3 -2)
📝 archivebox/index/sql.py (+22 -12)
📝 archivebox/logging_util.py (+29 -8)
📝 archivebox/main.py (+111 -144)
➕ archivebox/themes/legacy/main_index_minimal.html (+20 -0)
📝 tests/test_add.py (+11 -0)
📝 tests/test_init.py (+68 -0)
➕ tests/test_list.py (+67 -0)
📝 tests/test_remove.py (+103 -4)
➕ tests/test_update.py (+27 -0)
📄 Description
Summary
Once this PR is merged, the JSON index is no longer the main source of truth. index.sqlite3 takes over that role.
index.json is still kept, but it is only written at the end of each process that runs. If the archive is old (no index.sqlite3 is present), running
archivebox init --force
will be necessary to update it to the latest version.
Changes these areas
Roadmap Goals
This is one of the main goals of the 0.5 release.
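The arrangement described in the Summary (SQLite as the source of truth, with a JSON index written out only at the end of a run) can be illustrated with a minimal sketch. This is not ArchiveBox's implementation: the `snapshot` table, its columns, and the helper names `add_snapshot` and `write_json_index` are hypothetical, and only the Python standard library is used.

```python
import json
import sqlite3
from pathlib import Path

# Hypothetical schema for illustration only; ArchiveBox's real tables differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS snapshot (
    url TEXT PRIMARY KEY,
    timestamp TEXT NOT NULL,
    title TEXT
)
"""


def add_snapshot(db_path, url, timestamp, title=None):
    """Record a snapshot in the SQL index, which acts as the source of truth."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(SCHEMA)
            conn.execute(
                "INSERT OR REPLACE INTO snapshot (url, timestamp, title) "
                "VALUES (?, ?, ?)",
                (url, timestamp, title),
            )
    finally:
        conn.close()


def write_json_index(db_path, json_path):
    """Export the SQL index to a static JSON file; returns the row count."""
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
    try:
        rows = conn.execute(
            "SELECT url, timestamp, title FROM snapshot ORDER BY timestamp"
        ).fetchall()
    finally:
        conn.close()
    links = [dict(row) for row in rows]
    Path(json_path).write_text(json.dumps({"links": links}, indent=2))
    return len(links)
```

In this sketch every write goes to the database, and `write_json_index` is called once after all work is finished, so the JSON file is a derived export that can always be regenerated from the SQL index.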
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.