[GH-ISSUE #963] unable to parse openrouter auto tag response that enclosed by markdown code block tag #636

Closed
opened 2026-03-02 11:51:31 +03:00 by kerem · 2 comments
Owner

Originally created by @kurokuma-lab on GitHub (Feb 2, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/963

Describe the Bug

I have tried a few models using openrouter api by setting the OPENAI_BASE_URL=https://openrouter.ai/api/v1/

Autotagging is not working due to the response from the models are enclosed by the markdown code block tag ```json { ... } ```

Here is the error message that obtained from the log:

2025-02-02T07:09:46.064Z error: [inference][37] inference job failed: Error: [inference][37] The model ignored our prompt and didn't respond with the expected JSON: {}. Here's a sneak peak from the response: ```json
{
"tags": ["
Error: [inference][37] The model ignored our prompt and didn't respond with the expected JSON: {}. Here's a sneak peak from the response: ```json
{
"tags": ["
    at inferTags (/app/apps/workers/openaiWorker.ts:6:4164)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async Object.runOpenAI [as run] (/app/apps/workers/openaiWorker.ts:6:6686)
    at async Runner.runOnce (/app/apps/workers/node_modules/.pnpm/liteque@0.3.0_better-sqlite3@11.3.0/node_modules/liteque/dist/runner.js:2:2578)

Steps to Reproduce

OPENAI_BASE_URL=https://openrouter.ai/api/v1/
OPENAI_API_KEY=my token
INFERENCE_TEXT_MODEL=google/gemma-2-9b-it:free

Expected Behaviour

AI autotagging

Screenshots or Additional Context

The issue can be resolved by using regex to clean up the reponse.reponse in opneaiWorker.ts.

Instead of directly parsing the model response

let tags = openAIResponseSchema.parse(JSON.parse(response.response)).tags;

We can clean up the response using regex

const cleanedResponse = cleanJsonString(response.response);
let tags = openAIResponseSchema.parse(JSON.parse(cleanedResponse)).tags;
function cleanJsonString(jsonString: string): string {
  const pattern = /^```json\s*(.*?)\s*```$/s;
  const cleanedString = jsonString.replace(pattern, '$1');
  return cleanedString.trim();
}

Device Details

No response

Exact Hoarder Version

0.21.0

Have you checked the troubleshooting guide?

  • I have checked the troubleshooting guide and I haven't found a solution to my problem
Originally created by @kurokuma-lab on GitHub (Feb 2, 2025). Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/963 ### Describe the Bug I have tried a few models using openrouter api by setting the OPENAI_BASE_URL=https://openrouter.ai/api/v1/ Autotagging is not working due to the response from the models are enclosed by the markdown code block tag \`\`\`json { ... } \`\`\` Here is the error message that obtained from the log: ``` 2025-02-02T07:09:46.064Z error: [inference][37] inference job failed: Error: [inference][37] The model ignored our prompt and didn't respond with the expected JSON: {}. Here's a sneak peak from the response: ```json { "tags": [" Error: [inference][37] The model ignored our prompt and didn't respond with the expected JSON: {}. Here's a sneak peak from the response: ```json { "tags": [" at inferTags (/app/apps/workers/openaiWorker.ts:6:4164) at process.processTicksAndRejections (node:internal/process/task_queues:105:5) at async Object.runOpenAI [as run] (/app/apps/workers/openaiWorker.ts:6:6686) at async Runner.runOnce (/app/apps/workers/node_modules/.pnpm/liteque@0.3.0_better-sqlite3@11.3.0/node_modules/liteque/dist/runner.js:2:2578) ``` ### Steps to Reproduce OPENAI_BASE_URL=https://openrouter.ai/api/v1/ OPENAI_API_KEY=my token INFERENCE_TEXT_MODEL=google/gemma-2-9b-it:free ### Expected Behaviour AI autotagging ### Screenshots or Additional Context The issue can be resolved by using regex to clean up the `reponse.reponse` in `opneaiWorker.ts`. Instead of directly parsing the model response ```typescript let tags = openAIResponseSchema.parse(JSON.parse(response.response)).tags; ``` We can clean up the response using regex ```typescript const cleanedResponse = cleanJsonString(response.response); let tags = openAIResponseSchema.parse(JSON.parse(cleanedResponse)).tags; ``` ```typescript function cleanJsonString(jsonString: string): string { const pattern = /^```json\s*(.*?)\s*```$/s; const cleanedString = jsonString.replace(pattern, '$1'); return cleanedString.trim(); } ``` ### Device Details _No response_ ### Exact Hoarder Version 0.21.0 ### Have you checked the troubleshooting guide? - [x] I have checked the troubleshooting guide and I haven't found a solution to my problem
kerem closed this issue 2026-03-02 11:51:31 +03:00
Author
Owner

@MohamedBassem commented on GitHub (Feb 2, 2025):

Can you maybe try using the custom prompt feature to instruct the model to not emit a code block tag?

<!-- gh-comment-id:2629460015 --> @MohamedBassem commented on GitHub (Feb 2, 2025): Can you maybe try using the custom prompt feature to instruct the model to not emit a code block tag?
Author
Owner

@kurokuma-lab commented on GitHub (Feb 4, 2025):

Can you maybe try using the custom prompt feature to instruct the model to not emit a code block tag?

It works now after I added a custom prompt
No markdown codeblock tag in the respond json. Respond in pure json without formatting

<!-- gh-comment-id:2633681884 --> @kurokuma-lab commented on GitHub (Feb 4, 2025): > Can you maybe try using the custom prompt feature to instruct the model to not emit a code block tag? It works now after I added a custom prompt `No markdown codeblock tag in the respond json. Respond in pure json without formatting`
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/karakeep#636
No description provided.