[GH-ISSUE #2014] inference job failed: Error: The text contains a special token that is not allowed: <|endoftext|> #1254

Open
opened 2026-03-02 11:56:05 +03:00 by kerem · 1 comment
Owner

Originally created by @Atomique on GitHub (Oct 6, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/2014

I installed and configured a fresh karakeep in version 0.27.1 and added ollama to it. If I try to summarize an article, it gives me this error message:

"error: [inference][6] inference job failed: Error: The text contains a special token that is not allowed: <|endoftext|>"

Config regarding Ollama:

INFERENCE_CONTEXT_LENGTH: "8096"
INFERENCE_ENABLE_AUTO_TAGGING: "true"
INFERENCE_ENABLE_AUTO_SUMMARIZATION: "true"
INFERENCE_JOB_TIMEOUT_SEC: "60"
INFERENCE_OUTPUT_SCHEMA: "structured"
INFERENCE_LANG: "english"
OLLAMA_BASE_URL: "https://ollama.domain.tld"
INFERENCE_TEXT_MODEL: "gemma3:12b"
INFERENCE_IMAGE_MODEL: "llava"
EMBEDDING_TEXT_MODEL: "embeddinggemma:latest"

Can anyone help me out with this? Maybe I have overseen something in my config and it is no issue.

Thanks a lot!

EDIT: I switched to a setup with an OpenAI API Key and the error persists. Maybe its no problem with Ollama itself.

Originally created by @Atomique on GitHub (Oct 6, 2025). Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/2014 I installed and configured a fresh karakeep in version 0.27.1 and added ollama to it. If I try to summarize an article, it gives me this error message: "error: [inference][6] inference job failed: Error: The text contains a special token that is not allowed: <|endoftext|>" Config regarding Ollama: ``` INFERENCE_CONTEXT_LENGTH: "8096" INFERENCE_ENABLE_AUTO_TAGGING: "true" INFERENCE_ENABLE_AUTO_SUMMARIZATION: "true" INFERENCE_JOB_TIMEOUT_SEC: "60" INFERENCE_OUTPUT_SCHEMA: "structured" INFERENCE_LANG: "english" OLLAMA_BASE_URL: "https://ollama.domain.tld" INFERENCE_TEXT_MODEL: "gemma3:12b" INFERENCE_IMAGE_MODEL: "llava" EMBEDDING_TEXT_MODEL: "embeddinggemma:latest" ``` Can anyone help me out with this? Maybe I have overseen something in my config and it is no issue. Thanks a lot! EDIT: I switched to a setup with an OpenAI API Key and the error persists. Maybe its no problem with Ollama itself.
Author
Owner

@Atomique commented on GitHub (Oct 6, 2025):

I have an important update: I tried to save this link to find out how karakeep works and it seems it was kind of a "hardcore test" for it.

https://blog.steelph0enix.dev/posts/llama-cpp-guide/

The content of the website contains <|endoftext|> which is a problem for the AI feature here as it is a special token that seems not to be allowed here. Is there a way to "ignore" these so that the rest of the website will be summarized and tagged?

<!-- gh-comment-id:3371016784 --> @Atomique commented on GitHub (Oct 6, 2025): I have an important update: I tried to save this link to find out how karakeep works and it seems it was kind of a "hardcore test" for it. https://blog.steelph0enix.dev/posts/llama-cpp-guide/ The content of the website contains `<|endoftext|>` which is a problem for the AI feature here as it is a special token that seems not to be allowed here. Is there a way to "ignore" these so that the rest of the website will be summarized and tagged?
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/karakeep#1254
No description provided.