[GH-ISSUE #2507] LLM-based OCR Extracted Content #1501

Open

opened 2026-03-02 11:57:43 +03:00 by kerem · 0 comments

Originally created by @folosleg on GitHub (Feb 22, 2026).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/2507

Describe the Bug

With certain custom models (see the configuration below), LLM-based OCR does not fill in the Extracted Content field with the OCR-ed text.
As a result, the text in the image is not searchable.
Tag generation still works correctly.
With the default settings, everything works as expected.

Steps to Reproduce

Set the environment variables like this:
OCR_LANGS=eng,hun
OPENAI_API_KEY=***
INFERENCE_TEXT_MODEL=gpt-5-nano
INFERENCE_IMAGE_MODEL=gpt-5-nano
INFERENCE_CONTEXT_LENGTH=4096
INFERENCE_MAX_OUTPUT_TOKENS=2048
INFERENCE_JOB_TIMEOUT_SEC=60
INFERENCE_USE_MAX_COMPLETION_TOKENS=true
OPENAI_SERVICE_TIER=flex
OCR_USE_LLM=true

Apply the environment settings above.
Upload an image containing text (a screenshot, for example).
Wait for the inference to finish.
Open Edit and check the Extracted Content field.
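
Before reproducing, it may help to confirm that the settings this report depends on are the ones actually in effect. The following is a minimal sketch, assuming the variables are kept in a plain KEY=VALUE env file. The helpers `parse_env` and `llm_ocr_summary` are hypothetical names made up for illustration and are not part of Karakeep.

```python
# Hypothetical helpers for illustration only; not part of Karakeep.

REPRO_ENV = """\
OCR_LANGS=eng,hun
OPENAI_API_KEY=***
INFERENCE_TEXT_MODEL=gpt-5-nano
INFERENCE_IMAGE_MODEL=gpt-5-nano
INFERENCE_CONTEXT_LENGTH=4096
INFERENCE_MAX_OUTPUT_TOKENS=2048
INFERENCE_JOB_TIMEOUT_SEC=60
INFERENCE_USE_MAX_COMPLETION_TOKENS=true
OPENAI_SERVICE_TIER=flex
OCR_USE_LLM=true
"""


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def llm_ocr_summary(settings: dict[str, str]) -> dict[str, object]:
    """Collect the settings from this report that relate to LLM-based OCR."""
    return {
        "llm_ocr_enabled": settings.get("OCR_USE_LLM", "").lower() == "true",
        "image_model": settings.get("INFERENCE_IMAGE_MODEL"),
        "max_output_tokens": int(settings.get("INFERENCE_MAX_OUTPUT_TOKENS", "0")),
    }


if __name__ == "__main__":
    print(llm_ocr_summary(parse_env(REPRO_ENV)))
```

The summary only reflects what the env file says; it does not show what the running container received, so the values should still be compared against the container's actual environment.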

Expected Behaviour

The LLM OCR output should be saved to the Extracted Content field, as it is with Tesseract.

Screenshots or Additional Context

No response

Device Details

No response

Exact Karakeep Version

0.31.0

Environment Details

docker

Debug Logs

No response

Have you checked the troubleshooting guide?

I have checked the troubleshooting guide and I haven't found a solution to my problem