mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-26 01:26:00 +03:00
[GH-ISSUE #697] Bug: Sonic throwing invalid_meta_key exception when indexing snapshots with headers #3457
Originally created by @erob8 on GitHub (Apr 8, 2021).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/697
Describe the bug
I have about 12 iFixit article URLs that I've added snapshots for. The majority of them throw an exception when I run
docker-compose run archivebox update --index-only
These exceptions seem to break the content indexing completely, and even snapshots that didn't throw an exception are not returning search results for their content now.
Steps to reproduce
docker-compose run archivebox update --index-only
[X] The search backend threw an exception=ERR invalid_meta_key(?[mp4a.40.2\",\"mime\":\"video\/mp4\",\"always_generate\":true},\"MP4_592\":{\"column\":\"MP4_592\",\"label\":\"Low\",\"encoding\":\"mp4\",\"width\":592,\"height\":444,\"ma"])
appears in the output.
Screenshots or log output
ArchiveBox version
@erob8 commented on GitHub (Apr 8, 2021):
I was wrong about this part:
> These exceptions seem to break the content indexing completely, and even snapshots that didn't throw an exception are not returning search results for their content now.
The search works as expected from the /admin/core/snapshot search bar, even with content from snapshots that threw an exception during
archivebox update --index-only
which is good. However, searching on content doesn't work from the /public/ search bar, which is not how I'd expect it to function, as I have PUBLIC_INDEX set to True.
Is this part functioning as expected? I would like to enable the same search functionality from the public and admin views.
@pirate commented on GitHub (Apr 9, 2021):
The full-text index is not actually connected to the public search yet. We'll likely push it in the next version after v0.6; I just haven't gotten around to it yet.
Currently the public site only searches the main Snapshot db fields (title, url, timestamp, tags, etc.).
Just added full-text search on the public index in v0.6 (89158d5). Also added a thing to catch the index errors and bail out after 5 failures on files that don't support searching (3093057).
Give it a shot and comment back here if you're still having trouble and I'll reopen the issue.
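For readers wondering what "catch the index errors and bail out after 5 failures" might look like, here is a minimal sketch. It is not the code from commit 3093057: the function name index_texts, the push callback, and the choice to count total (rather than consecutive) failures per snapshot are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Tuple

# Assumed threshold, taken from the "5 failures" mentioned in the comment above.
MAX_INDEX_FAILURES = 5


def index_texts(
    snapshot_id: str,
    texts: Iterable[str],
    push: Callable[[str, str], None],  # hypothetical backend call, e.g. a Sonic push wrapper
    max_failures: int = MAX_INDEX_FAILURES,
) -> Tuple[int, List[str]]:
    """Push each text chunk to a search backend, giving up after repeated errors.

    Returns (number_of_chunks_indexed, list_of_error_messages).
    """
    indexed = 0
    errors: List[str] = []
    for text in texts:
        try:
            push(snapshot_id, text)
            indexed += 1
        except Exception as err:  # e.g. "ERR invalid_meta_key(...)" from the backend
            errors.append(str(err))
            if len(errors) >= max_failures:
                # Stop retrying content the backend evidently cannot index.
                break
    return indexed, errors
```

With something along these lines, a snapshot whose files the backend rejects would log a handful of errors and then be skipped, instead of raising on every chunk during an update --index-only run.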