[GH-ISSUE #462] Add support for more AI providers in a generic way #297

Closed
opened 2026-03-02 11:48:33 +03:00 by kerem · 30 comments
Owner

Originally created by @kamtschatka on GitHub (Oct 3, 2024).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/462

Different people have already asked for "their" AI provider to be supported.
It is unlikely that we add support for all of them, but we could switch to a library that allows us to connect to at least the more well known ones.
In the discussion https://github.com/hoarder-app/hoarder/discussions/453, there was a list of possible solutions already:

* https://github.com/themaximalist/llm.js
* https://github.com/samestrin/llm-interface
* https://github.com/Portkey-AI/gateway/
kerem 2026-03-02 11:48:33 +03:00

@kamtschatka commented on GitHub (Oct 3, 2024):

Requested providers so far:

* OpenRouter (seems to work with OpenAI support already)
* OpenAI Azure (seems to have some issues: #137)
* 1min.ai (#451)
* Gemini (https://github.com/hoarder-app/hoarder/discussions/453)
* Perplexity (#262)

@jkaberg commented on GitHub (Oct 4, 2024):

I didn't see https://claude.ai mentioned anywhere yet so here's my +1 - documentation [here](https://support.anthropic.com/en/collections/5370014-anthropic-api-api-console)

@bhupesh-sf commented on GitHub (Nov 7, 2024):

What about local LLMs using Ollama? All we need is an option to configure the OpenAI base URL; there are many providers that
offer an OpenAI-compatible endpoint for use with local LLMs.

@MohamedBassem commented on GitHub (Nov 7, 2024):

@bhupesh-sf hoarder already supports local LLMs natively. Check the inference section in the configuration documentation.
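
For anyone landing here from search, a minimal sketch of what a native Ollama setup looks like, based on the variable names in the configuration docs (the model names below are just examples; check the inference section for the authoritative list):

```ini
# Sketch: point hoarder/karakeep at a local Ollama instance.
# No OPENAI_API_KEY is needed in this mode.
OLLAMA_BASE_URL=http://ollama:11434
# Example models — substitute whatever you have pulled locally.
INFERENCE_TEXT_MODEL=llama3.1
INFERENCE_IMAGE_MODEL=llava
```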

@bhupesh-sf commented on GitHub (Nov 7, 2024):

Oh, thanks. Being excited about the app, I missed it in the documentation. Sorry for my ignorance.

@MohamedBassem commented on GitHub (Nov 9, 2024):

Gemini now has an OpenAI compatible API as well: https://developers.googleblog.com/en/gemini-is-now-accessible-from-the-openai-library/

@bebound commented on GitHub (Nov 12, 2024):

I've used https://github.com/stulzq/azure-openai-proxy to simulate Azure OpenAI as OpenAI. It works well for [chenzhaoyu94/chatgpt-web](https://github.com/Chanzhaoyu/chatgpt-web).

When using the same config in hoarder, it shows "something went wrong", and I can't find any useful message in the log.

@dinnouti commented on GitHub (Nov 27, 2024):

+1 for Amazon Bedrock LLMs like Claude, Meta, Cohere, and so on

@jbohnslav commented on GitHub (Dec 1, 2024):

If you use a LiteLLM proxy, you can already connect to all of these LLMs via the OpenAI API.
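
A rough sketch of that setup, assuming LiteLLM's default proxy port (4000) and a Bedrock model as the example backend — the model alias and env values are illustrative, not taken from this thread:

```
# Start a LiteLLM proxy exposing a Bedrock model over an
# OpenAI-compatible API (listens on http://0.0.0.0:4000 by default):
#   litellm --model bedrock/anthropic.claude-3-haiku-20240307-v1:0
#
# Then point hoarder at the proxy instead of at OpenAI:
OPENAI_BASE_URL=http://localhost:4000
OPENAI_API_KEY=anything-non-empty   # the proxy handles AWS auth itself
INFERENCE_TEXT_MODEL=bedrock/anthropic.claude-3-haiku-20240307-v1:0
```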

@xiaoduo commented on GitHub (Dec 3, 2024):

Supporting an OpenAI-compatible API would be enough.

@dinnouti commented on GitHub (Dec 4, 2024):

Just to close the loop on Bedrock: AWS has sample code for OpenAI-compatible RESTful APIs in front of Amazon Bedrock.

https://github.com/aws-samples/bedrock-access-gateway

@bradhawkins85 commented on GitHub (Jan 16, 2025):

For those wanting to use Gemini here is the section from my Docker .env that works perfectly.

OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/
OPENAI_API_KEY=Your API Key From Google AI Studio
INFERENCE_TEXT_MODEL=gemini-1.5-flash

@stancubed commented on GitHub (Jan 25, 2025):

> For those wanting to use Gemini here is the section from my Docker .env that works perfectly.
>
> OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/
> OPENAI_API_KEY=Your API Key From Google AI Studio
> INFERENCE_TEXT_MODEL=gemini-1.5-flash

Fantastic! Would love to see this as an example in the docs, if appropriate!

@yeathn commented on GitHub (Feb 2, 2025):

Here is what got mine working for Perplexity.

OPENAI_BASE_URL: https://api.perplexity.ai
OPENAI_API_KEY: Your Perplexity API Key
INFERENCE_TEXT_MODEL: sonar-pro

@sparkyfen commented on GitHub (Feb 3, 2025):

@yeathn what version of hoarder are you using? My docker container still complains with:

```
Error: 400 ["At body -> response_format -> ResponseFormatText -> type: Input should be 'text'", "At body -> response_format -> ResponseFormatJSONSchema -> type: Input should be 'json_schema'", "At body -> response_format -> ResponseFormatJSONSchema -> json_schema: Field required", "At body -> response_format -> ResponseFormatRegex -> type: Input should be 'regex'", "At body -> response_format -> ResponseFormatRegex -> regex: Field required"]
```
@yeathn commented on GitHub (Feb 3, 2025):

@sparkyfen Just checked the logs; mine does too. It apparently only worked for the AI summary feature, but not for tagging.

@Corb3t commented on GitHub (Feb 5, 2025):

Is it possible to add a new setting within the web GUI to choose the AI provider & API key from User Settings > AI Settings, instead of having to adjust the Docker env?

@hz-xiaxz commented on GitHub (Feb 18, 2025):

> For those wanting to use Gemini here is the section from my Docker .env that works perfectly.
>
> OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/
> OPENAI_API_KEY=Your API Key From Google AI Studio
> INFERENCE_TEXT_MODEL=gemini-1.5-flash

I tried this and it works well for AI summary, but automatic tagging is not working. Does anyone have any idea about that? Thanks!

@bradhawkins85 commented on GitHub (Feb 18, 2025):

> > For those wanting to use Gemini here is the section from my Docker .env that works perfectly.
> > OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/ OPENAI_API_KEY=Your API Key From Google AI Studio INFERENCE_TEXT_MODEL=gemini-1.5-flash
>
> I tried this and it works well for AI summary, but automatic tagging is not working. Does anyone have any idea about that? Thanks!

try:
OPENAI_BASE_URL: https://generativelanguage.googleapis.com/v1beta/
OPENAI_API_KEY: Your API Key From Google AI Studio
INFERENCE_TEXT_MODEL: gemini-1.5-flash
INFERENCE_IMAGE_MODEL: gemini-1.5-flash
EMBEDDING_TEXT_MODEL: text-embedding-004
INFERENCE_JOB_TIMEOUT_SEC: 3600

Don't know if it will make any difference, but I rebuilt my hoarder server recently and added the extra lines; it worked fine previously, but the above is my most up-to-date version.

Here is a link to my complete docker compose in case that helps, API and Secret keys have been removed.
https://pastebin.com/QAHrgFFc

@hz-xiaxz commented on GitHub (Feb 18, 2025):

> > > For those wanting to use Gemini here is the section from my Docker .env that works perfectly.
> > > OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/ OPENAI_API_KEY=Your API Key From Google AI Studio INFERENCE_TEXT_MODEL=gemini-1.5-flash
> >
> > I tried this and it works well for AI summary, but automatic tagging is not working. Does anyone have any idea about that? Thanks!
>
> try: OPENAI_BASE_URL: https://generativelanguage.googleapis.com/v1beta/ OPENAI_API_KEY: Your API Key From Google AI Studio INFERENCE_TEXT_MODEL: gemini-1.5-flash INFERENCE_IMAGE_MODEL: gemini-1.5-flash EMBEDDING_TEXT_MODEL: text-embedding-004 INFERENCE_JOB_TIMEOUT_SEC: 3600
>
> Don't know if it will make any difference but I rebuilt my hoarder server recently and added the extra lines, worked fine previously but that's my most up to date version.
>
> Here is a link to my complete docker compose in case that helps, API and Secret keys have been removed. https://pastebin.com/QAHrgFFc

Thanks for your fast and kind reply! Tagging is still not working for me, though; maybe I should open another issue. Your settings have at least ruled out a problem with the AI configuration. Thank you again!

@JC1738 commented on GitHub (Mar 2, 2025):

I am using LLM Studio for a local LLM. The summary works, but for tagging I get the following error:

**Hoarder Logs**

```
2025-03-02T01:45:09.613Z info: [inference][70] Starting an inference job for bookmark with id "izehx9mvbn7dl41so5dx9maj"
2025-03-02T01:45:09.623Z error: [inference][70] inference job failed: Error: 400 "'response_format.type' must be 'json_schema'"
Error: 400 "'response_format.type' must be 'json_schema'"
    at APIError.generate (/app/apps/workers/node_modules/.pnpm/openai@4.67.1_zod@3.22.4/node_modules/openai/error.js:45:20)
    at OpenAI.makeStatusError (/app/apps/workers/node_modules/.pnpm/openai@4.67.1_zod@3.22.4/node_modules/openai/core.js:291:33)
    at OpenAI.makeRequest (/app/apps/workers/node_modules/.pnpm/openai@4.67.1_zod@3.22.4/node_modules/openai/core.js:335:30)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async OpenAIInferenceClient.inferFromText (/app/apps/workers/node_modules/.pnpm/@hoarder+shared@file+packages+shared_better-sqlite3@11.3.0/node_modules/@hoarder/shared/inference.ts:2:2002)
    at async inferTagsFromText (/app/apps/workers/openaiWorker.ts:6:3097)
    at async inferTags (/app/apps/workers/openaiWorker.ts:6:3356)
    at async Object.runOpenAI [as run] (/app/apps/workers/openaiWorker.ts:6:6814)
    at async Runner.runOnce (/app/apps/workers/node_modules/.pnpm/liteque@0.3.2_better-sqlite3@11.3.0/node_modules/liteque/dist/runner.js:2:2656)
```

**LLM Studio**

```
2025-03-01 17:51:51 [DEBUG] Received request: POST to /v1/chat/completions with body { "messages": [ { "role": "user", "content": "\nYou are a bot in a read-it-later app and your res... <Truncated in logs> ...y \"tags\" and the value is an array of string tags." } ], "model": "qwen2.5-14b-instruct", "response_format": { "type": "json_object" } }
```
@MohamedBassem commented on GitHub (Mar 2, 2025):

Hey folks, I found the problem with this response format thing and merged a fix in [69d81aa](https://github.com/hoarder-app/hoarder/commit/69d81aafe113a2b4769ecb936b9a5a02e31a0fd8). The nightly build will be ready in 15 minutes and will have a fix for this issue. I tried it with Gemini and it works well (both for tagging and summaries). And as an escape hatch, if the provider you're using doesn't support structured outputs, you will be able to set `INFERENCE_SUPPORTS_STRUCTURED_OUTPUT=false` and hope that the model will be able to respond in the correct format.
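
For readers debugging similar 400s: the failures reported above come down to the shape of the `response_format` field in the chat-completions request. A sketch of the two shapes (field names per the OpenAI chat-completions API; the schema itself is illustrative, not karakeep's actual one):

```python
import json

# Older, schema-less shape ("give me some JSON"). Providers that require
# structured outputs reject this, e.g. with
# "'response_format.type' must be 'json_schema'" as in the LLM Studio log.
json_object_format = {"type": "json_object"}

# Structured-output shape: type "json_schema" plus an explicit schema.
# This schema is purely illustrative.
json_schema_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "tags",
        "schema": {
            "type": "object",
            "properties": {
                "tags": {"type": "array", "items": {"type": "string"}}
            },
            "required": ["tags"],
        },
    },
}

# A chat-completions body using the structured shape; the model name is
# just the one that appears in the logs above.
body = {
    "model": "qwen2.5-14b-instruct",
    "messages": [{"role": "user", "content": "Respond with JSON tags."}],
    "response_format": json_schema_format,
}
print(json.dumps(body["response_format"]["type"]))
```

Some providers accept only one of the two shapes, which is why the `INFERENCE_SUPPORTS_STRUCTURED_OUTPUT` escape hatch mentioned above is useful.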

@JC1738 commented on GitHub (Mar 2, 2025):

Great. Ideally we could have different endpoints and different models for each use. My local LLMs don't do well on vision but are perfectly fine for summaries, and hopefully work for tagging; it would be nice to specify the URL and model separately for text embedding, summary, tagging, and images.

@Rising-Galaxy commented on GitHub (Mar 21, 2025):

DeepSeek, please.

@BenGeba commented on GitHub (Apr 10, 2025):

Has anyone found a solution for Azure? I always get a "404 - Resource not found".
I have tried the following ways of writing the URL:
`OPENAI_BASE_URL: https://my-resource.openai.azure.com/openai/deployments/gpt-4o-mini`
`OPENAI_BASE_URL: https://my-resource.openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview`
`OPENAI_BASE_URL: https://my-resource.openai.azure.com`

I get the same response for all of them.

@nooz commented on GitHub (Apr 18, 2025):

+1 for OpenRouter support

@MohamedBassem commented on GitHub (Apr 18, 2025):

Folks, at this point there are no plans to support any more providers besides OpenAI-compatible ones (which most providers are) and Ollama. The industry is converging on OpenAI-compatible APIs anyway. I've added a [guide](https://docs.karakeep.app/next/guides/different-ai-providers) about how to configure some of the most popular providers (e.g. Gemini, OpenRouter and Perplexity). If you try other popular providers and they work, please send a PR to add them to this guide.

@snotrauk commented on GitHub (Jul 21, 2025):

Has anyone got a workaround for Azure AI?

@snotrauk commented on GitHub (Jul 21, 2025):

Managed to get it working with https://github.com/stulzq/azure-openai-proxy

@cloudchristoph commented on GitHub (Oct 26, 2025):

> Has anyone found a solution for Azure? I always get a "404 - Resource not found". I have used the following ways of writing the URL: `OPENAI_BASE_URL: https://my-resource.openai.azure.com/openai/deployments/gpt-4o-mini` `OPENAI_BASE_URL: https://my-resource.openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview` `OPENAI_BASE_URL: https://my-resource.openai.azure.com`
>
> I get the same response for all of them

Yes @BenGeba. For gpt-4.1-mini you should use the following parameters. (There is no need for a proxy @snotrauk - at least not for karakeep.)

```
OPENAI_BASE_URL=https://my-resource.cognitiveservices.azure.com/openai/v1/
OPENAI_API_KEY=<your-key>
INFERENCE_TEXT_MODEL=gpt-4.1-mini
```

You'll find all the info in the AI Foundry Portal (My assets -> Models + endpoints -> your model), but ignore the "Target URI".
Switch to the `Open AI SDK` tab in the example section on the right.

[Image: screenshot of the AI Foundry Portal model endpoint page]

Your base URL could be ".openai.azure.com" or ".cognitiveservices.azure.com" - pay close attention. Microsoft is slowly migrating to a new endpoint.

GPT-5-mini will work the same way - but we have to wait for the release of the "max_tokens" to "max_completion_tokens" fix (#1969), because Azure only accepts the new parameter for GPT-5 etc.

Hope this helps.
