[GH-ISSUE #296] [Feature Request] OCR search text in images #195

Closed
opened 2026-03-02 11:47:31 +03:00 by kerem · 11 comments

Originally created by @lethefrost on GitHub (Jul 12, 2024).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/296

It would be especially helpful when you have a lot of screenshots, diagrams, photos of slides, etc., embedded in documents or as standalone image files. Text in images may contain a large amount of information. However, it's not very easy to retrieve them in the traditional ways of file management. It would be greatly appreciated if you could consider making them searchable.


@MohamedBassem commented on GitHub (Jul 13, 2024):

hmmm, OCR is a cool idea indeed. My only concern is finding a good OCR tool that would work with different languages.


@lethefrost commented on GitHub (Jul 13, 2024):

> hmmm, OCR is a cool idea indeed. My only concern is finding a good OCR tool that would work with different languages.

[This](https://github.com/naptha/tesseract.js) might be helpful. I'm thinking we could let each user configure a list of the languages likely to occur in their hoard. These are usually the languages they know, so the list wouldn't be too long (for most people it might be 1-3?). It seems that Tesseract.js supports recognizing multiple languages at the same time when you concatenate the lang codes with `+`?
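
A minimal sketch of the per-user language list idea, assuming the configured codes are Tesseract-style language codes such as `eng` or `jpn`. The helper name `buildOcrLangs` and the `eng` fallback are hypothetical and are not part of Karakeep or Tesseract.js; the sketch only builds the `+`-joined string mentioned above.

```typescript
// Hypothetical helper, for illustration only (not Karakeep or Tesseract.js code).
// Turns a user's configured language list into a "+"-joined string
// such as "eng+jpn", dropping blanks and duplicates.
function buildOcrLangs(configured: string[], fallback: string = "eng"): string {
  const codes: string[] = [];
  for (const raw of configured) {
    const code = raw.trim();
    // Keep the first occurrence of each non-empty code, in the user's order.
    if (code.length > 0 && codes.indexOf(code) === -1) {
      codes.push(code);
    }
  }
  // Assumption: fall back to a single default language when nothing is configured.
  return codes.length > 0 ? codes.join("+") : fallback;
}

const langs = buildOcrLangs(["eng", " jpn ", "eng"]);
console.log(langs); // "eng+jpn"
```

The resulting string would then be handed to the OCR library as its language setting. How exactly that is passed depends on the library version, so that part is left out here.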


@MohamedBassem commented on GitHub (Jul 27, 2024):

[tesseract.js](https://github.com/naptha/tesseract.js) looks cool indeed. We can probably add it to the roadmap at some point.


@akshara-tg commented on GitHub (Sep 1, 2024):

Without OCR (which allows for searching text within images), hoarding images becomes somewhat pointless.


@Arcturuss commented on GitHub (Oct 13, 2024):

+1 for OCR in images.
Personally, I wanted to make a "meme catalog" in Hoarder. A few thoughts about that:

  • In addition to OCR, semantic search is needed, similar to what is suggested in #441 but for images, like Immich does.
  • Maybe an option to enable OCR separately for hoarded single images only, and not for images from webpages.

@MohamedBassem commented on GitHub (Oct 13, 2024):

@Arcturuss OCR for uploaded images is something on our roadmap and I'm definitely planning to do it pretty soon.


@MohamedBassem commented on GitHub (Oct 21, 2024):

OCR is now implemented and will be available in the next release.


@lethefrost commented on GitHub (Oct 21, 2024):

> OCR is now implemented and will be available in the next release.

Thank you! That's great to hear! I appreciate it a lot.


@drycounty commented on GitHub (Nov 16, 2024):

Can you tell me how this is implemented? Do I need to specify any of the ENV variables for it to work? Can't seem to get it to work from photos of pages of text.


@MohamedBassem commented on GitHub (Nov 17, 2024):

@drycounty it's enabled by default. Currently, we don't expose the extracted text; we only index it for search. Try searching for the content of the page and see if it shows up.
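
To illustrate what "indexed for search but not exposed" can mean in practice, here is a small sketch. It is not Karakeep's actual code: the `IndexedImage` and `SearchHit` shapes, the `searchImages` function, and the in-memory array standing in for a search index are all assumptions made for the example.

```typescript
// Hypothetical shapes, for illustration only; not Karakeep's actual schema.
interface IndexedImage {
  id: string;
  title: string;
  ocrText: string; // extracted text: used for matching, never returned
}

interface SearchHit {
  id: string;
  title: string;
}

// Case-insensitive substring match over the title and the OCR text.
// Only id and title are returned, so the extracted text stays hidden.
function searchImages(index: IndexedImage[], query: string): SearchHit[] {
  const q = query.trim().toLowerCase();
  if (q.length === 0) return [];
  return index
    .filter(
      (doc) =>
        doc.title.toLowerCase().indexOf(q) !== -1 ||
        doc.ocrText.toLowerCase().indexOf(q) !== -1
    )
    .map((doc) => ({ id: doc.id, title: doc.title }));
}

const index: IndexedImage[] = [
  { id: "1", title: "slide.png", ocrText: "Quarterly revenue by region" },
  { id: "2", title: "cat.jpg", ocrText: "" },
];
const hits = searchImages(index, "revenue");
console.log(hits.length, hits[0].id); // 1 "1"
```

In this sketch the image titled `slide.png` is found by a word that appears only in its extracted text, which matches the suggestion above to search for the content of the page.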


@radonmiser commented on GitHub (Jan 9, 2025):

Is the extracted text forced into the inference lang? My inference lang is English and I'm unable to find any of my Japanese bookmarks by searching in Japanese.
