[GH-ISSUE #1230] FR: semantic visualization #799

Closed
opened 2026-03-02 11:52:50 +03:00 by kerem · 1 comment

Originally created by @thiswillbeyourgithub on GitHub (Apr 10, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/1230

Describe the feature you'd like

Hi,

I am very happy with Karakeep but wanted to mention a project that could be of interest.

BERTopic (https://github.com/MaartenGr/BERTopic) is a Python library that makes it extremely easy to create a semantic, embedding-based map of a set of texts. I don't know exactly what technical difficulties you would run into adding this feature to Karakeep, but I think it would really help with managing a huge amount of hoarded data.

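[Image: example BERTopic topic visualization, https://raw.githubusercontent.com/MaartenGr/BERTopic/refs/heads/master/images/topic_visualization.gif]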

I am thinking a viable technical approach would be to generate the visualization with a small external Python script, output it as something like HTML, and then load that from your TypeScript backend. But really, I don't know anything about TypeScript.
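To make the idea concrete, here is a minimal sketch of such an external script, not a tested integration. It assumes the `bertopic` package is installed and that the bookmarks have already been exported as dictionaries; the `title` and `text` keys and both function names are hypothetical. BERTopic generally needs a fairly large number of documents before the topic map is meaningful.

```python
from pathlib import Path


def bookmark_to_document(bookmark: dict) -> str:
    """Flatten one bookmark into a single text for topic modelling.

    The "title" and "text" keys are hypothetical; adapt them to whatever
    your export actually contains.
    """
    parts = [bookmark.get("title") or "", bookmark.get("text") or ""]
    return "\n".join(part.strip() for part in parts if part.strip())


def write_topic_map(docs: list[str], out_path: str = "topic_map.html") -> Path:
    """Fit BERTopic on the documents and save the topic map as HTML."""
    # Imported lazily so the helper above works without the heavy dependency.
    from bertopic import BERTopic  # third-party: pip install bertopic

    topic_model = BERTopic()
    topic_model.fit_transform(docs)
    # visualize_topics() returns a Plotly figure, which can write itself
    # out as a standalone HTML page.
    fig = topic_model.visualize_topics()
    fig.write_html(out_path)
    return Path(out_path)
```

The resulting HTML file is self-contained, so a web frontend could in principle serve or embed it without knowing anything about Python.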

Personally, if you don't include it, I intend to write my own scripts on the side to help organize my reading queue by semantic topic.

Describe the benefits this would bring to existing Hoarder users

The ability to visualize your entire collection.
A useful way to use the pre-computed text embeddings beyond search.
A way to organize your reading queue based on semantic links.
It's super cool.
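As an illustration of the reading-queue point, the sketch below orders items so that each one is followed by the most similar unread item, using cosine similarity on embedding vectors. It uses only the standard library; the function names and the idea of a greedy nearest-neighbour ordering are assumptions made for this example, not something Karakeep provides.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def order_reading_queue(embeddings: dict, start: str) -> list:
    """Greedy ordering: after each item, read the most similar unread one.

    `embeddings` maps a bookmark id to its embedding vector.
    """
    remaining = set(embeddings) - {start}
    order = [start]
    while remaining:
        current = embeddings[order[-1]]
        # sorted() makes tie-breaking deterministic.
        nxt = max(
            sorted(remaining),
            key=lambda key: cosine_similarity(current, embeddings[key]),
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

A greedy walk like this is quadratic in the number of bookmarks, which is fine for a personal queue but would need an approximate nearest-neighbour index for very large collections.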

Can the goal of this request already be achieved via other means?

Maybe: external scripts that use your project's API could provide some of these benefits.
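A rough sketch of what such an external script might start with is shown below. The endpoint path `/api/v1/bookmarks`, the bearer-token header, and the `bookmarks` / `nextCursor` response fields are assumptions about the API and should be checked against the project's API documentation; the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_bookmarks_request(base_url, api_key, cursor=None):
    """Build the HTTP request for one page of bookmarks."""
    # Assumption: the list endpoint lives at /api/v1/bookmarks and
    # authenticates with a bearer token.
    url = base_url.rstrip("/") + "/api/v1/bookmarks"
    if cursor:
        url += "?" + urllib.parse.urlencode({"cursor": cursor})
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )


def fetch_all_bookmarks(base_url, api_key):
    """Follow pagination cursors until every bookmark has been collected."""
    bookmarks, cursor = [], None
    while True:
        request = build_bookmarks_request(base_url, api_key, cursor)
        with urllib.request.urlopen(request) as response:
            page = json.load(response)
        # Assumption: each page looks like {"bookmarks": [...], "nextCursor": ...}.
        bookmarks.extend(page.get("bookmarks", []))
        cursor = page.get("nextCursor")
        if not cursor:
            return bookmarks
```

The returned list could then be fed to whatever topic-modelling or clustering step the script uses.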

Have you searched for an existing open/closed issue?

  • I have searched for existing issues and none cover my fundamental request

Additional context

No response


@MohamedBassem commented on GitHub (May 10, 2025):

Hey, this is indeed cool, and future releases will actually have embeddings for the bookmarks, to be used for search, etc. Once the embeddings are there, we can consider adding a visualization like this, but for now, given how involved this will be, I think it is better off living as a community project rather than being integrated into Karakeep itself. Thanks!
