[GH-ISSUE #1315] FR: expose Meilisearch's binaryQuantized setting (faster search and lighter db) #840

Open
opened 2026-03-02 11:53:09 +03:00 by kerem · 1 comment
Owner

Originally created by @thiswillbeyourgithub on GitHub (Apr 26, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/1315

Describe the feature you'd like

Hi,

binaryQuantized keeps only 1s and 0s in the stored embeddings, making the db much smaller and the search much faster. Depending on the model and dimensions, the loss in quality can actually be negligible.

In this example (https://emschwartz.me/binary-vector-embeddings-are-so-cool/), binary embeddings use about 3% of the space (x32 compression) and retain 96% of the performance with a x25 speedup.
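To make the x32 figure concrete, here is a minimal sketch of sign-based binarization, assuming float32 embeddings and a hypothetical dimension of 1024. It only illustrates the storage arithmetic (32 bits per value down to 1 bit); it is not necessarily how Meilisearch quantizes vectors internally.

```python
import random
import struct


def binarize(vec):
    """Pack the sign of each value into one bit (1 if > 0, else 0)."""
    out = bytearray((len(vec) + 7) // 8)
    for i, x in enumerate(vec):
        if x > 0:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def hamming(a, b):
    """Count differing bits between two packed binary vectors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


random.seed(0)
dim = 1024  # hypothetical embedding size, not taken from any specific model
vec = [random.gauss(0, 1) for _ in range(dim)]

float_bytes = len(struct.pack(f"{dim}f", *vec))  # float32: 4 bytes per value
binary_bytes = len(binarize(vec))                # 1 bit per value

print(float_bytes, binary_bytes, float_bytes // binary_bytes)  # 4096 128 32
```

One bit instead of 32 gives 1/32 of the size, roughly the 3% quoted above. Comparing two binary vectors reduces to an XOR plus a bit count, which is where the speedup comes from.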

I'm not saying it should default to true but exposing the setting would be a good start!

Here's the doc: https://www.meilisearch.com/docs/reference/api/settings#binaryquantized
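As a rough sketch of what exposing the setting could mean, this builds the embedder settings object with `binaryQuantized` turned on. The embedder name `default`, the `userProvided` source, the 1024 dimensions and the `embedder_settings` helper are all assumptions for illustration; none of this is Karakeep's actual configuration code.

```python
import json


def embedder_settings(name, source, dimensions, binary_quantized=False):
    """Build a Meilisearch `embedders` settings object (hypothetical helper)."""
    embedder = {"source": source, "dimensions": dimensions}
    if binary_quantized:
        # Only send the flag when explicitly requested, so the default
        # behaviour stays unchanged.
        embedder["binaryQuantized"] = True
    return {name: embedder}


# Assumed values for illustration only.
payload = embedder_settings("default", "userProvided", 1024, binary_quantized=True)
print(json.dumps(payload, indent=2))
```

As far as I understand the docs, this object would go to the index's embedders settings route, and turning the option on cannot be undone for an existing index, which is another reason to keep it opt-in.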

Describe the benefits this would bring to existing Karakeep users

Way faster search, way lower db size, usable on cheaper hardware.

Can the goal of this request already be achieved via other means?

No

Have you searched for an existing open/closed issue?

  • I have searched for existing issues and none cover my fundamental request

Additional context

No response

Author
Owner

@thiswillbeyourgithub commented on GitHub (Aug 20, 2025):

I gave this some more thought and now understand that some models retain performance after binarization better than others. Hence I believe that exposing that setting is a good thing but should include a warning that it's experimental. I believe this can be indispensable in some setups and unwelcome in others.
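One way to check how a given model holds up is to compare the top-k results from full-precision cosine similarity with the top-k from Hamming distance on sign bits. This is a minimal sketch under stated assumptions: random vectors stand in for real embeddings, sign thresholding stands in for whatever quantization the engine applies, and `overlap_at_k` is a hypothetical helper.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def sign_bits(vec):
    """Binarize by sign: True where the value is positive."""
    return [x > 0 for x in vec]


def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def overlap_at_k(query, docs, k):
    """Fraction of the exact top-k that the binary ranking also returns."""
    by_cosine = sorted(range(len(docs)), key=lambda i: -cosine(query, docs[i]))
    q_bits = sign_bits(query)
    doc_bits = [sign_bits(d) for d in docs]
    by_hamming = sorted(range(len(docs)), key=lambda i: hamming(q_bits, doc_bits[i]))
    return len(set(by_cosine[:k]) & set(by_hamming[:k])) / k


random.seed(0)
# Placeholder data; swap in embeddings produced by the model under test.
docs = [[random.gauss(0, 1) for _ in range(256)] for _ in range(200)]
query = [random.gauss(0, 1) for _ in range(256)]
print(overlap_at_k(query, docs, k=10))
```

Running this on real embeddings from the configured model would give a rough idea of whether binarization is a good fit for a given setup before enabling it.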
