[GH-ISSUE #462] Add support for more AI providers in a generic way #297

Closed
opened 2026-03-02 11:48:33 +03:00 by kerem · 30 comments
Owner

Originally created by @kamtschatka on GitHub (Oct 3, 2024).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/462

Different people have already asked for "their" AI provider to be supported.
It is unlikely that we add support for all of them, but we could switch to a library that allows us to connect to at least the more well known ones.
In the discussion https://github.com/hoarder-app/hoarder/discussions/453, there was a list of possible solutions already:

* https://github.com/themaximalist/llm.js
* https://github.com/samestrin/llm-interface
* https://github.com/Portkey-AI/gateway/
kerem 2026-03-02 11:48:33 +03:00

@kamtschatka commented on GitHub (Oct 3, 2024):

Requested providers so far:

* OpenRouter (seems to work with OpenAI support already)
* OpenAI Azure (seems to have some issues: #137)
* 1min.ai (#451)
* Gemini (https://github.com/hoarder-app/hoarder/discussions/453)
* Perplexity (#262)

@jkaberg commented on GitHub (Oct 4, 2024):

I didn't see https://claude.ai mentioned anywhere yet so here's my +1 - documentation [here](https://support.anthropic.com/en/collections/5370014-anthropic-api-api-console)

@bhupesh-sf commented on GitHub (Nov 7, 2024):

What about local LLMs using Ollama? All we need is an option to configure the OpenAI base URL; there are many providers that
offer an OpenAI-compatible endpoint for use with local LLMs.

@MohamedBassem commented on GitHub (Nov 7, 2024):

@bhupesh-sf hoarder already supports local LLMs natively. Check the inference section in the configuration documentation.
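
For anyone landing here from search, a minimal sketch of what a native Ollama setup looks like, based on the variable names in the configuration docs (the model names below are just examples; check the inference section for the authoritative list):

```ini
# Sketch: point hoarder/karakeep at a local Ollama instance.
# No OPENAI_API_KEY is needed in this mode.
OLLAMA_BASE_URL=http://ollama:11434
# Example models — substitute whatever you have pulled locally.
INFERENCE_TEXT_MODEL=llama3.1
INFERENCE_IMAGE_MODEL=llava
```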

@bhupesh-sf commented on GitHub (Nov 7, 2024):

Oh, thanks. Being excited about the app, I missed it in the documentation. Sorry for my ignorance.

@MohamedBassem commented on GitHub (Nov 9, 2024):

Gemini now has an OpenAI compatible API as well: https://developers.googleblog.com/en/gemini-is-now-accessible-from-the-openai-library/

@bebound commented on GitHub (Nov 12, 2024):

I've used https://github.com/stulzq/azure-openai-proxy to simulate Azure OpenAI as OpenAI. It works well for [chenzhaoyu94/chatgpt-web](https://github.com/Chanzhaoyu/chatgpt-web).

When using the same config in hoarder, it shows "something went wrong", and I can't find any useful message in the log.

@dinnouti commented on GitHub (Nov 27, 2024):

+1 for Amazon Bedrock LLMs like Claude, Meta, Cohere, and so on

@jbohnslav commented on GitHub (Dec 1, 2024):

If you use a LiteLLM proxy, you can already connect to all of these LLMs via the OpenAI API.
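
A rough sketch of that setup, assuming LiteLLM's default proxy port (4000) and a Bedrock model as the example backend — the model alias and env values are illustrative, not taken from this thread:

```
# Start a LiteLLM proxy exposing a Bedrock model over an
# OpenAI-compatible API (listens on http://0.0.0.0:4000 by default):
#   litellm --model bedrock/anthropic.claude-3-haiku-20240307-v1:0
#
# Then point hoarder at the proxy instead of at OpenAI:
OPENAI_BASE_URL=http://localhost:4000
OPENAI_API_KEY=anything-non-empty   # the proxy handles AWS auth itself
INFERENCE_TEXT_MODEL=bedrock/anthropic.claude-3-haiku-20240307-v1:0
```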

@xiaoduo commented on GitHub (Dec 3, 2024):

Supporting an OpenAI-compatible API would be enough.

@dinnouti commented on GitHub (Dec 4, 2024):

Just to close the loop on Bedrock: AWS has sample code for OpenAI-compatible RESTful APIs in front of Amazon Bedrock.

https://github.com/aws-samples/bedrock-access-gateway

@bradhawkins85 commented on GitHub (Jan 16, 2025):

For those wanting to use Gemini here is the section from my Docker .env that works perfectly.

OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/
OPENAI_API_KEY=Your API Key From Google AI Studio
INFERENCE_TEXT_MODEL=gemini-1.5-flash

@stancubed commented on GitHub (Jan 25, 2025):

> For those wanting to use Gemini here is the section from my Docker .env that works perfectly.
>
> OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/
> OPENAI_API_KEY=Your API Key From Google AI Studio
> INFERENCE_TEXT_MODEL=gemini-1.5-flash

Fantastic! Would love to see this as an example in the docs, if appropriate!

@yeathn commented on GitHub (Feb 2, 2025):

Here is what got mine working for Perplexity.

OPENAI_BASE_URL: https://api.perplexity.ai
OPENAI_API_KEY: Your Perplexity API Key
INFERENCE_TEXT_MODEL: sonar-pro

@sparkyfen commented on GitHub (Feb 3, 2025):

@yeathn what version of hoarder are you using? My docker container still complains with:

```
Error: 400 ["At body -> response_format -> ResponseFormatText -> type: Input should be 'text'", "At body -> response_format -> ResponseFormatJSONSchema -> type: Input should be 'json_schema'", "At body -> response_format -> ResponseFormatJSONSchema -> json_schema: Field required", "At body -> response_format -> ResponseFormatRegex -> type: Input should be 'regex'", "At body -> response_format -> ResponseFormatRegex -> regex: Field required"]
```
@yeathn commented on GitHub (Feb 3, 2025):

@sparkyfen Just checked the logs; mine does too. It apparently only worked for the AI summary feature, but not for tagging.

@Corb3t commented on GitHub (Feb 5, 2025):

Is it possible to add a new setting within the web GUI to choose the AI provider & API key from User Settings > AI Settings, instead of having to adjust the Docker env?

@hz-xiaxz commented on GitHub (Feb 18, 2025):

> For those wanting to use Gemini here is the section from my Docker .env that works perfectly.
>
> OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/
> OPENAI_API_KEY=Your API Key From Google AI Studio
> INFERENCE_TEXT_MODEL=gemini-1.5-flash

I tried this and it works well for AI summary, but automatic tagging is not working. Does anyone have any idea about that? Thanks!

@bradhawkins85 commented on GitHub (Feb 18, 2025):

> > For those wanting to use Gemini here is the section from my Docker .env that works perfectly.
> > OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/ OPENAI_API_KEY=Your API Key From Google AI Studio INFERENCE_TEXT_MODEL=gemini-1.5-flash
>
> I tried this and it works well for AI summary, but automatic tagging is not working. Does anyone have any idea about that? Thanks!

try:
OPENAI_BASE_URL: https://generativelanguage.googleapis.com/v1beta/
OPENAI_API_KEY: Your API Key From Google AI Studio
INFERENCE_TEXT_MODEL: gemini-1.5-flash
INFERENCE_IMAGE_MODEL: gemini-1.5-flash
EMBEDDING_TEXT_MODEL: text-embedding-004
INFERENCE_JOB_TIMEOUT_SEC: 3600

Don't know if it will make any difference, but I rebuilt my hoarder server recently and added the extra lines; it worked fine previously, but the above is my most up-to-date version.

Here is a link to my complete docker compose in case that helps, API and Secret keys have been removed.
https://pastebin.com/QAHrgFFc

@hz-xiaxz commented on GitHub (Feb 18, 2025):

> > > For those wanting to use Gemini here is the section from my Docker .env that works perfectly.
> > > OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/ OPENAI_API_KEY=Your API Key From Google AI Studio INFERENCE_TEXT_MODEL=gemini-1.5-flash
> >
> > I tried this and it works well for AI summary, but automatic tagging is not working. Does anyone have any idea about that? Thanks!
>
> try: OPENAI_BASE_URL: https://generativelanguage.googleapis.com/v1beta/ OPENAI_API_KEY: Your API Key From Google AI Studio INFERENCE_TEXT_MODEL: gemini-1.5-flash INFERENCE_IMAGE_MODEL: gemini-1.5-flash EMBEDDING_TEXT_MODEL: text-embedding-004 INFERENCE_JOB_TIMEOUT_SEC: 3600
>
> Don't know if it will make any difference but I rebuilt my hoarder server recently and added the extra lines, worked fine previously but that's my most up to date version.
>
> Here is a link to my complete docker compose in case that helps, API and Secret keys have been removed. https://pastebin.com/QAHrgFFc

Thanks for your fast and kind reply! Tagging is still not working for me, though; maybe I should open another issue. Your settings have at least ruled out a problem with the AI configuration. Thank you again!

@JC1738 commented on GitHub (Mar 2, 2025):

I am using LLM Studio for a local LLM. The summary works, but for tagging I get the following error:

**Hoarder Logs**

```
2025-03-02T01:45:09.613Z info: [inference][70] Starting an inference job for bookmark with id "izehx9mvbn7dl41so5dx9maj"
2025-03-02T01:45:09.623Z error: [inference][70] inference job failed: Error: 400 "'response_format.type' must be 'json_schema'"
Error: 400 "'response_format.type' must be 'json_schema'"
    at APIError.generate (/app/apps/workers/node_modules/.pnpm/openai@4.67.1_zod@3.22.4/node_modules/openai/error.js:45:20)
    at OpenAI.makeStatusError (/app/apps/workers/node_modules/.pnpm/openai@4.67.1_zod@3.22.4/node_modules/openai/core.js:291:33)
    at OpenAI.makeRequest (/app/apps/workers/node_modules/.pnpm/openai@4.67.1_zod@3.22.4/node_modules/openai/core.js:335:30)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async OpenAIInferenceClient.inferFromText (/app/apps/workers/node_modules/.pnpm/@hoarder+shared@file+packages+shared_better-sqlite3@11.3.0/node_modules/@hoarder/shared/inference.ts:2:2002)
    at async inferTagsFromText (/app/apps/workers/openaiWorker.ts:6:3097)
    at async inferTags (/app/apps/workers/openaiWorker.ts:6:3356)
    at async Object.runOpenAI [as run] (/app/apps/workers/openaiWorker.ts:6:6814)
    at async Runner.runOnce (/app/apps/workers/node_modules/.pnpm/liteque@0.3.2_better-sqlite3@11.3.0/node_modules/liteque/dist/runner.js:2:2656)
```

**LLM Studio**

```
2025-03-01 17:51:51 [DEBUG] Received request: POST to /v1/chat/completions with body { "messages": [ { "role": "user", "content": "\nYou are a bot in a read-it-later app and your res... <Truncated in logs> ...y \"tags\" and the value is an array of string tags." } ], "model": "qwen2.5-14b-instruct", "response_format": { "type": "json_object" } }
```
@MohamedBassem commented on GitHub (Mar 2, 2025):

Hey folks, I found the problem with this response format thing and merged a fix in [69d81aa](https://github.com/hoarder-app/hoarder/commit/69d81aafe113a2b4769ecb936b9a5a02e31a0fd8). The nightly build will be ready in 15 minutes and will have a fix for this issue. I tried it with Gemini and it works well (both for tagging and summaries). And as an escape hatch, if the provider you're using doesn't support structured outputs, you will be able to set `INFERENCE_SUPPORTS_STRUCTURED_OUTPUT=false` and hope that the model will be able to respond in the correct format.
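
For readers debugging similar 400s: the failures reported above come down to the shape of the `response_format` field in the chat-completions request. A sketch of the two shapes (field names per the OpenAI chat-completions API; the schema itself is illustrative, not karakeep's actual one):

```python
import json

# Older, schema-less shape ("give me some JSON"). Providers that require
# structured outputs reject this, e.g. with
# "'response_format.type' must be 'json_schema'" as in the LLM Studio log.
json_object_format = {"type": "json_object"}

# Structured-output shape: type "json_schema" plus an explicit schema.
# This schema is purely illustrative.
json_schema_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "tags",
        "schema": {
            "type": "object",
            "properties": {
                "tags": {"type": "array", "items": {"type": "string"}}
            },
            "required": ["tags"],
        },
    },
}

# A chat-completions body using the structured shape; the model name is
# just the one that appears in the logs above.
body = {
    "model": "qwen2.5-14b-instruct",
    "messages": [{"role": "user", "content": "Respond with JSON tags."}],
    "response_format": json_schema_format,
}
print(json.dumps(body["response_format"]["type"]))
```

Some providers accept only one of the two shapes, which is why the `INFERENCE_SUPPORTS_STRUCTURED_OUTPUT` escape hatch mentioned above is useful.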

@JC1738 commented on GitHub (Mar 2, 2025):

Great. Ideally we could have different endpoints and different models for each use. My local LLMs don't do well on vision but are perfectly fine for summaries, and hopefully work for tagging; it would be nice to specify the URL and model separately for text embedding, summary, tagging, and images.

@Rising-Galaxy commented on GitHub (Mar 21, 2025):

DeepSeek, please.

@BenGeba commented on GitHub (Apr 10, 2025):

Has anyone found a solution for Azure? I always get a "404 - Resource not found".
I have tried the following ways of writing the URL:
`OPENAI_BASE_URL: https://my-resource.openai.azure.com/openai/deployments/gpt-4o-mini`
`OPENAI_BASE_URL: https://my-resource.openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview`
`OPENAI_BASE_URL: https://my-resource.openai.azure.com`

I get the same response for all of them.

@nooz commented on GitHub (Apr 18, 2025):

+1 for OpenRouter support

@MohamedBassem commented on GitHub (Apr 18, 2025):

Folks, at this point there are no plans to support any more providers besides OpenAI-compatible ones (which most providers are) and Ollama. The industry is converging on OpenAI-compatible APIs anyway. I've added a [guide](https://docs.karakeep.app/next/guides/different-ai-providers) about how to configure some of the most popular providers (e.g. Gemini, OpenRouter and Perplexity). If you try other popular providers and they work, please send a PR to add them to this guide.

@snotrauk commented on GitHub (Jul 21, 2025):

Has anyone got a workaround for Azure AI?

@snotrauk commented on GitHub (Jul 21, 2025):

Managed to get it working with https://github.com/stulzq/azure-openai-proxy

@cloudchristoph commented on GitHub (Oct 26, 2025):

> Has anyone found a solution for Azure? I always get a "404 - Resource not found". I have used the following ways of writing the URL: `OPENAI_BASE_URL: https://my-resource.openai.azure.com/openai/deployments/gpt-4o-mini` `OPENAI_BASE_URL: https://my-resource.openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview` `OPENAI_BASE_URL: https://my-resource.openai.azure.com`
>
> I get the same response for all of them

Yes @BenGeba. For gpt-4.1-mini you should use the following parameters. (There is no need for a proxy @snotrauk - at least not for karakeep.)

```
OPENAI_BASE_URL=https://my-resource.cognitiveservices.azure.com/openai/v1/
OPENAI_API_KEY=<your-key>
INFERENCE_TEXT_MODEL=gpt-4.1-mini
```

You'll find all the info in the AI Foundry Portal (My assets -> Models + endpoints -> your model), but ignore the "Target URI".
Switch to the `Open AI SDK` tab in the example section on the right.

[Image: screenshot of the AI Foundry Portal model endpoint page]

Your base URL could be ".openai.azure.com" or ".cognitiveservices.azure.com" - pay close attention. Microsoft is slowly migrating to a new endpoint.

GPT-5-mini will work the same way - but we have to wait for the release of the "max_tokens" to "max_completion_tokens" fix (#1969), because Azure only accepts the new parameter for GPT-5 etc.

Hope this helps.
