mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 17:16:00 +03:00
[GH-ISSUE #1193] Bug: Search sometimes shows the same snapshot twice #740
Originally created by @melyux on GitHub (Jul 26, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1193
Describe the bug
When doing a search (sonic), sometimes the same snapshot will show up in two rows right after one another.
Steps to reproduce
Not quite sure.
Screenshots or log output
N/A
ArchiveBox version
@pirate commented on GitHub (Jul 28, 2023):
Thanks for reporting.
I'm not entirely surprised by this given how it works. We augment the default Django search of the db fields with the Sonic results, so it's possible the dedupe step is failing which leads to results showing twice if both their content and metadata match.
PRs welcome, otherwise I'll probably get to this in the next bug-fixing pass after 0.7.0 is released.
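To make the failure mode described above concrete, here is a minimal sketch of merging two result sets with a dedupe step. It is not ArchiveBox's code: the `Snapshot` class, the `merge_results` helper, and the sample data are all hypothetical, and plain lists stand in for the Django and Sonic results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical stand-in for a snapshot row; not ArchiveBox's model.
    id: int
    title: str


def merge_results(db_matches, fulltext_matches):
    """Combine two result lists, keeping the first occurrence of each id."""
    seen = set()
    merged = []
    for snap in [*db_matches, *fulltext_matches]:
        if snap.id in seen:
            continue  # already listed via the other search path
        seen.add(snap.id)
        merged.append(snap)
    return merged


a = Snapshot(1, "Example Domain")
b = Snapshot(2, "Sonic docs")
db_matches = [a]           # matched on metadata (e.g. the title)
fulltext_matches = [a, b]  # matched on indexed page content

naive = db_matches + fulltext_matches
merged = merge_results(db_matches, fulltext_matches)
print([s.id for s in naive])   # [1, 1, 2]
print([s.id for s in merged])  # [1, 2]
```

Without the dedupe step, snapshot 1 appears twice in a row because it matches on both metadata and content, which is the symptom reported in this issue.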
@neel-suthar commented on GitHub (Jan 18, 2024):
@pirate You mean handling dupes in the following function right? We just need to make sure it returns a distinct query set.
Will be happy to work on this. Please let me know.
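For readers unfamiliar with what a distinct query set buys you: Django's `QuerySet.distinct()` makes the generated SQL use `SELECT DISTINCT`. The sketch below shows the effect with the standard-library `sqlite3` module. The two-table schema is a hypothetical simplification, not ArchiveBox's actual schema, and whether a join like this is the real source of the duplicates here is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one snapshot carrying two tags that both match the search.
conn.executescript("""
    CREATE TABLE snapshot (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE snapshot_tag (snapshot_id INTEGER, tag TEXT);
    INSERT INTO snapshot VALUES (1, 'https://example.com');
    INSERT INTO snapshot_tag VALUES (1, 'example'), (1, 'example-docs');
""")

# Searching across a joined table yields one row per matching join partner.
query = """
    SELECT {distinct} s.id, s.url
    FROM snapshot AS s
    LEFT JOIN snapshot_tag AS t ON t.snapshot_id = s.id
    WHERE s.url LIKE '%example%' OR t.tag LIKE '%example%'
"""

plain_rows = conn.execute(query.format(distinct="")).fetchall()
distinct_rows = conn.execute(query.format(distinct="DISTINCT")).fetchall()
print(len(plain_rows))     # 2: the same snapshot, once per matching tag
print(len(distinct_rows))  # 1
```

The trade-off is that `DISTINCT` adds work for the database, so it is usually applied only where a query can actually produce duplicate rows.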
@pirate commented on GitHub (Jan 19, 2024):
Yeah, that's the spot! If you wanna open a PR to change it to `return qs.distinct()`, I'll approve + merge it into `dev` (0.7.3-rc). :)
@neel-suthar commented on GitHub (Jan 19, 2024):
@pirate I found another place where we have the same kind of logic but I am not sure if it requires a distinct call or not. Can you please confirm? Here is the code...
@pirate commented on GitHub (Jan 19, 2024):
Good catch @neel-suthar, want to also add `return qs.distinct(), use_distinct` there? Sorry, I saw this after merging your first PR already, so you'll have to open another one.
@neel-suthar commented on GitHub (Jan 19, 2024):
@pirate will take care of it. This time I will try to reproduce the issue as well. Maybe I'll include a clip showing that the issue is fixed. Thanks.
@pirate commented on GitHub (Jan 20, 2024):
Closing as fixed, thanks @neel-suthar! Will be out in the next release, 0.7.3.