[GH-ISSUE #1266] Need help dealing with way too many tags #815

Open
opened 2026-03-02 11:52:58 +03:00 by kerem · 5 comments

Originally created by @handyman5 on GitHub (Apr 14, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/1266

Describe the feature you'd like

The AI-generated-tags feature is very cool and helpful, but it tends to generate tags that apply to only one item, and now I have a zillion one-item tags. This makes interacting with the tags page slow, and probably has other negative performance impacts.

I would like to request some way to clean up these single-item tags. The idea I had is to support a threshold for "unused" tags as at the bottom of the tags page, so that I could say "a tag attached to only 1 or 2 items is 'unused' for cleanup purposes". Then I could reuse the existing "Delete All Unused Tags" button to clean these up as well. However, I'm not married to that suggestion.

Describe the benefits this would bring to existing Karakeep users

This would help mitigate the explosion of low-value AI-generated tags.

Can the goal of this request already be achieved via other means?

I could probably write a script to iterate through all the tags and delete the ones which are attached to only one item, but that would be brittle and more work than I'd prefer to do.
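A rough sketch of what such a script might look like, using only the Python standard library. The endpoint paths (`/api/v1/tags`, `/api/v1/tags/{id}`), the bearer-token header, and the response fields (`tags`, `id`, `name`, `numBookmarks`) are assumptions about the Karakeep REST API and should be checked against the API documentation of your instance. `select_low_use_tags`, `api_call`, and `cleanup` are hypothetical helper names.

```python
import json
import urllib.request


def select_low_use_tags(tags, threshold=1):
    """Return tags attached to at most `threshold` bookmarks."""
    return [t for t in tags if t.get("numBookmarks", 0) <= threshold]


def api_call(base_url, api_key, method, path):
    """Send one authenticated request; return parsed JSON, or None if the body is empty."""
    req = urllib.request.Request(
        f"{base_url}{path}",
        method=method,
        # Assumption: the API accepts a bearer token in this header.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
    return json.loads(body) if body else None


def cleanup(base_url, api_key, threshold=1, dry_run=True):
    """List all tags, then delete the low-use ones unless dry_run is set."""
    # Assumption: the listing endpoint returns {"tags": [...]}.
    tags = api_call(base_url, api_key, "GET", "/api/v1/tags")["tags"]
    doomed = select_low_use_tags(tags, threshold)
    for tag in doomed:
        print("would delete" if dry_run else "deleting", tag["name"])
        if not dry_run:
            api_call(base_url, api_key, "DELETE", f"/api/v1/tags/{tag['id']}")
    return doomed
```

Running `cleanup(...)` with the default `dry_run=True` only prints the candidates, which makes it easy to review the list before deleting anything.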

Have you searched for an existing open/closed issue?

  • I have searched for existing issues and none cover my fundamental request

Additional context

No response


@hasansino commented on GitHub (Apr 15, 2025):

Rather than fighting with symptoms, a better idea is to cure the root cause.

A. Linkwarden has a great option, 'Auto-categorize links to existing tags based on the content of each link.', which restricts the AI to existing tags. You seed the initial set of tags, so to speak. This could be replicated easily.

B. An even easier solution: ask the AI in the prompt to use general-scope tags instead of specific ones.
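To illustrate option A, here is a minimal sketch of restricting AI-suggested tags to an existing vocabulary. It is not Linkwarden's or Karakeep's implementation; `restrict_to_existing` is a hypothetical helper, and matching on case and surrounding whitespace only is an assumption made for the example.

```python
def restrict_to_existing(suggested, existing):
    """Keep only suggested tags that match an existing tag.

    Matching ignores case and surrounding whitespace; the existing
    tag's own spelling is what gets returned.
    """
    canonical = {name.strip().casefold(): name for name in existing}
    kept = []
    for tag in suggested:
        match = canonical.get(tag.strip().casefold())
        if match is not None and match not in kept:
            kept.append(match)
    return kept
```

With this filter, a suggestion that has no counterpart among the seeded tags is simply dropped instead of creating a new one-item tag.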


@handyman5 commented on GitHub (Apr 15, 2025):

There was a conversation (https://github.com/karakeep-app/karakeep/issues/111) about using existing tags. The note there cautions against it, but perhaps I'll try it out and see if it works for my case.


@Eragos commented on GitHub (Apr 20, 2025):

From time to time I clean them up manually (via https://karakeep.domain.com/dashboard/tags). But it would be a nice webhook event option, by the way.


@thiswillbeyourgithub commented on GitHub (May 8, 2025):

> From time to time I clean them up manually (via https://karakeep.domain.com/dashboard/tags). But it would be a nice webhook event option, by the way.

You might be interested in trying my karakeep python api (https://github.com/thiswillbeyourgithub/karakeep_python_api), as it should make it pretty easy to fetch the tags into Python, apply some merging heuristics, and then sync them back up.

It's a very early release though, so please tell me if you run into issues. And I'd be interested in your script too!
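As an illustration of what a merging heuristic could look like once the tags are in Python, here is a small sketch that groups near-duplicate names. It deliberately avoids any calls into the library above; `normalize` and `plan_merges` are hypothetical helpers, and the input is assumed to be a plain mapping of tag name to bookmark count.

```python
import re
from collections import defaultdict


def normalize(name):
    """Collapse case, punctuation, and a trailing plural 's' into a comparison key."""
    folded = name.casefold()
    # Drop everything that is not a letter or digit; fall back to the
    # folded name if that would leave nothing (e.g. a tag like "++").
    key = re.sub(r"[\W_]+", "", folded) or folded
    if len(key) > 3 and key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key


def plan_merges(tag_counts):
    """Map each redundant tag name to the most-used tag sharing its normalized key."""
    groups = defaultdict(list)
    for name in tag_counts:
        groups[normalize(name)].append(name)
    merges = {}
    for names in groups.values():
        # Most-used name wins; ties break alphabetically for determinism.
        names.sort(key=lambda n: (-tag_counts[n], n))
        for other in names[1:]:
            merges[other] = names[0]
    return merges
```

The function only returns a plan (old name to surviving name), so the result can be reviewed by hand before anything is synced back. The plural rule is crude and will occasionally pair unrelated words, which is another reason to inspect the plan first.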


@Eragos commented on GitHub (May 9, 2025):

This https://github.com/karakeep-app/karakeep/discussions/843#discussioncomment-13080110 mostly solved my problem. It's worth trying, IMO.
I'll keep the Python approach in mind. Thanks!

Reference
starred/karakeep#815