[GH-ISSUE #891] INFERENCE_LANG to be dynamic #584

Open
opened 2026-03-02 11:51:03 +03:00 by kerem · 10 comments
Owner

Originally created by @tareefdev on GitHub (Jan 16, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/891

Describe the feature you'd like

I would like AI-generated tags to use the same language as the article itself.

Describe the benefits this would bring to existing Hoarder users

This provides a better multi-language workflow, where the tags match the article's language.

Can the goal of this request already be achieved via other means?

No

Have you searched for an existing open/closed issue?

  • I have searched for existing issues and none cover my fundamental request

Additional context

No response


@MohamedBassem commented on GitHub (Jan 16, 2025):

People have had success by defining this as a custom rule in the AI settings page.


@tareefdev commented on GitHub (Jan 16, 2025):

Just adding another rule did not work, here is my current config:

You are a bot in a read-it-later app and your responsibility is to help with automatic tagging.
Please analyze the text between the sentences "CONTENT START HERE" and "CONTENT END HERE" and suggest relevant tags that describe its key themes, topics, and main ideas. The rules are:
- Aim for a variety of tags, including broad categories, specific keywords, and potential sub-genres.
- The tags language must be in english.
- If it's a famous website you may also include a tag for the website. If the tag is not generic enough, don't include it.
- The content can include text for cookie consent and privacy policy, ignore those while tagging.
- Aim for 3-5 tags.
- If there are no good tags, leave the array empty.
- The tags language be in the language of the article itself.
CONTENT START HERE

<CONTENT_HERE>

CONTENT END HERE
You must respond in JSON with the key "tags" and the value is an array of string tags.
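
One plausible reading of why the extra rule had no effect: the prompt above now carries two language rules that contradict each other, and the model is free to follow either. A minimal TypeScript sketch of that situation, assuming the custom rule is simply appended to the built-in list (`builtInRules`, `withCustomRules`, and `languageRules` are hypothetical names for illustration, not Karakeep's API):

```typescript
// Hypothetical helpers for illustration; not taken from the Karakeep codebase.

// Built-in rules, abbreviated from the prompt quoted above.
function builtInRules(lang: string): string[] {
  return [
    "Aim for a variety of tags, including broad categories, specific keywords, and potential sub-genres.",
    `The tags language must be in ${lang}.`,
    "Aim for 3-5 tags.",
  ];
}

// Assumption: custom rules are appended after the built-in ones.
function withCustomRules(lang: string, customRules: string[]): string[] {
  return [...builtInRules(lang), ...customRules];
}

// Return every rule that constrains the tag language.
function languageRules(rules: string[]): string[] {
  return rules.filter((rule) => /\blanguage\b/i.test(rule));
}

const rules = withCustomRules("english", [
  "The tags language be in the language of the article itself.",
]);

// Both the built-in and the custom rule survive, so the prompt contradicts itself.
console.log(languageRules(rules));
```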

@MohamedBassem commented on GitHub (Jan 18, 2025):

@tareefdev This might sound stupid, but try to be more strict with the model. Try:

Generated tags MUST be in the same language of the bookmark.

What model are you using btw?


@tareefdev commented on GitHub (Jan 23, 2025):

No, it did not work. I am using the OpenAI API, most probably with the default model.

I was thinking of something like this change to be introduced:

export function buildImagePrompt(lang: string, customPrompts: string[]) {
  // "dynamic" is the proposed new value for INFERENCE_LANG
  const isDynamic = lang === "dynamic";
  return `
....
The tags language must be in ${isDynamic ? "the article's language" : lang}.
...
`;
}
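
A self-contained sketch of the same idea for the text prompt, assuming a `dynamic` value for `INFERENCE_LANG` (a proposal in this thread, not an existing option). `languageRule` and `buildTagPrompt` are hypothetical names, and the prompt wording is abbreviated from the config quoted earlier:

```typescript
// Hypothetical sketch; the function names and the "dynamic" value are assumptions.

// Pick the single language rule that goes into the prompt.
function languageRule(lang: string): string {
  return lang === "dynamic"
    ? "The tags must be in the same language as the article itself."
    : `The tags language must be in ${lang}.`;
}

// Assemble the full tagging prompt around the article content.
function buildTagPrompt(lang: string, customPrompts: string[], content: string): string {
  const rules = [
    "Aim for 3-5 tags.",
    "If there are no good tags, leave the array empty.",
    languageRule(lang),
    ...customPrompts,
  ];
  return [
    "You are a bot in a read-it-later app and your responsibility is to help with automatic tagging.",
    "The rules are:",
    ...rules.map((rule) => `- ${rule}`),
    "CONTENT START HERE",
    content,
    "CONTENT END HERE",
    'You must respond in JSON with the key "tags" and the value is an array of string tags.',
  ].join("\n");
}

console.log(buildTagPrompt("dynamic", [], "Ein kurzer Artikel über Brot."));
```

Because only one language rule is emitted, a custom rule has nothing to contradict.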

@cadmi commented on GitHub (Mar 8, 2025):

Just my two cents for those who might be reading this issue.

It helped me to add another rule with this wording:

- Cancel the rule about tags language. Generated tags MUST be in the same language of the bookmark.

If it makes a difference, I use mistral.ai:

OPENAI_BASE_URL=https://api.mistral.ai/v1
INFERENCE_TEXT_MODEL=mistral-small-latest

@tareefdev commented on GitHub (Mar 8, 2025):

> - Cancel the rule about tags language. Generated tags MUST be in the same language of the bookmark.

This exact phrasing worked for me too, thank you!


@huyz commented on GitHub (Mar 14, 2025):

So the ticket can be closed?


@tareefdev commented on GitHub (Mar 14, 2025):

Thanks for checking on this. Unfortunately, the above-mentioned rule works only some of the time. From a quick look at my links, the success rate is about 50%.


@zantag commented on GitHub (Apr 28, 2025):

> Thanks for checking on this. Unfortunately, the abovementioned rule sometimes works and other times does not. Taking a quick look at my links, the success rate is about 50%

We would have to figure out how this is implemented in Paperless-ai... There, the tags are in the same language as the document.


@jabsammy commented on GitHub (May 7, 2025):

Can you please explain where exactly to put these rules? I have no idea how to use them with the OpenAI API.
