starred/karakeep

Fork 0

mirror of https://github.com/karakeep-app/karakeep.git synced 2026-04-26 00:16:03 +03:00

[PR #505] [CLOSED] feature: Improve the LLM prompts for tag generation #1640

New issue

Closed

opened 2026-03-02 11:58:31 +03:00 by kerem · 0 comments

kerem commented

2026-03-02 11:58:31 +03:00

Owner

📋 Pull Request Information

Original PR: https://github.com/karakeep-app/karakeep/pull/505
Author: @Papierkorb
Created: 10/7/2024
Status: ❌ Closed

Base: main ← Head: feature/custom-prompt

📝 Commits (1)

567e63d feature: Improve the LLM prompts for tag generation

📊 Changes

1 file changed (+16 additions, -17 deletions)

View changed files

📝 packages/shared/prompts.ts (+16 -17)

📄 Description

Hello!

I just found this great piece of software! I was thrilled to see that I can use my own OpenAI endpoint and point it towards my Llama 3.1 8B endpoint.

Status Quo

I think the standard prompt as set in packages/shared/prompts.ts could require some tweaking.
The default prompt doesn't work at all with my model (Which I think is a pretty commonly used one). This one does work great:

You're an expert document tagger. Given an input you generate an JSON object of the following format: { "tags": ["First tag", "Another tag", ...] }

Only respond in this format! Write only the JSON data and nothing else. Do not write an explanation. If you can't find tags you're satisfied with respond with an empty tags array. Write at most five tags. Write your tags in ${lang}.

${content}

I've only tested this prompt manually with my model. With this, my 0% success rate turns into a 100% success rate; Which surprises me, usually you need to handle the model giving an introduction like "Your requested JSON document: ..." by looking for the first { and last }.

Thus

Long story short: This PR adds the "TEXT_TAG_PROMPT" and "IMAGE_TAG_PROMPT" environment variables. If not set, then the already existing prompt is used. If it is set, it's used instead. The variables are then interpolated Like {{this}}, akin to Jinja2 templates (Which the project may adopt in the future?)

My template in this syntax

You're an expert document tagger. Given an input you generate an JSON object of the following format: { "tags": ["First tag", "Another tag", ...] }

Only respond in this format! Write only the JSON data and nothing else. Do not write an explanation. If you can't find tags you're satisfied with respond with an empty tags array. Write at most five tags. Write your tags in {{lang}}.

{{content}}

Not Great: As is, this change breaks AISettings.tsx, as it would now require the server configuration to work. For my tests I simply removed the calls to the buildPrompt functions - But this of course isn't acceptable for merging. I'd need a little help here on how we should approach this.

Now it works with Llama 3.1 8B, yay

_{🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.}

## 📋 Pull Request Information **Original PR:** https://github.com/karakeep-app/karakeep/pull/505 **Author:** [@Papierkorb](https://github.com/Papierkorb) **Created:** 10/7/2024 **Status:** ❌ Closed **Base:** `main` ← **Head:** `feature/custom-prompt` --- ### 📝 Commits (1) - [`567e63d`](https://github.com/karakeep-app/karakeep/commit/567e63dcf0f4783838365bd205828c3275083660) feature: Improve the LLM prompts for tag generation ### 📊 Changes **1 file changed** (+16 additions, -17 deletions) <details> <summary>View changed files</summary> 📝 `packages/shared/prompts.ts` (+16 -17) </details> ### 📄 Description Hello! I just found this great piece of software! I was thrilled to see that I can use my own OpenAI endpoint and point it towards my Llama 3.1 8B endpoint. # Status Quo I think the standard prompt as set in `packages/shared/prompts.ts` could require some tweaking. The default prompt doesn't work at all with my model (Which I think is a pretty commonly used one). This one does work great: ``` You're an expert document tagger. Given an input you generate an JSON object of the following format: { "tags": ["First tag", "Another tag", ...] } Only respond in this format! Write only the JSON data and nothing else. Do not write an explanation. If you can't find tags you're satisfied with respond with an empty tags array. Write at most five tags. Write your tags in ${lang}. ${content} ``` I've only tested this prompt manually with my model. With this, my 0% success rate turns into a 100% success rate; Which surprises me, usually you need to handle the model giving an introduction like "Your requested JSON document: ..." by looking for the first `{` and last `}`. # Thus Long story short: This PR adds the "TEXT_TAG_PROMPT" and "IMAGE_TAG_PROMPT" environment variables. If not set, then the already existing prompt is used. If it is set, it's used instead. The variables are then interpolated `Like {{this}}`, akin to Jinja2 templates (Which the project may adopt in the future?) <details><summary>My template in this syntax</summary> ``` You're an expert document tagger. Given an input you generate an JSON object of the following format: { "tags": ["First tag", "Another tag", ...] } Only respond in this format! Write only the JSON data and nothing else. Do not write an explanation. If you can't find tags you're satisfied with respond with an empty tags array. Write at most five tags. Write your tags in {{lang}}. {{content}} ``` </details> **Not Great**: As is, this change breaks `AISettings.tsx`, as it would now require the server configuration to work. For my tests I simply removed the calls to the buildPrompt functions - But this of course isn't acceptable for merging. I'd need a little help here on how we should approach this. <details><summary>Now it works with Llama 3.1 8B, yay</summary> ![grafik](https://github.com/user-attachments/assets/3e5ed180-67d5-499f-a119-fb36f4f9b5c6) </details> --- 🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

kerem

2026-03-02 11:58:31 +03:00

closed this issue
added the
pull-request
label

No milestone

No project

No assignees

1 participant

Notifications

Due date

The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference

starred/karakeep#1640

No description provided.

Rows
Columns