[GH-ISSUE #15] feature request: provide anthropic compatible API #14

Closed
opened 2026-02-27 07:17:25 +03:00 by kerem · 9 comments

Originally created by @mzazon on GitHub (Jan 6, 2026).
Original GitHub issue: https://github.com/jwadow/kiro-gateway/issues/15

I know the project is an OpenAI gateway, but since Kiro has all the Anthropic models, it would be nice to have an Anthropic-compatible API exposed. I currently get around this by running your code in a container behind LiteLLM, which provides the Anthropic API.

I may take a stab at it when I have some time, but I thought I'd raise the issue in case you have already thought about it.

kerem closed this issue 2026-02-27 07:17:25 +03:00

@jwadow commented on GitHub (Jan 7, 2026):

@mzazon, good idea.

Working now on v2.0.0 with native Anthropic support. Both APIs will work on the same server:

  • `/v1/chat/completions` for OpenAI
  • `/v1/messages` for Anthropic

No config needed, just use whichever endpoint you want.
Also renaming to `kiro-gateway` since it's multi-API now.
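As a minimal sketch of what targeting the two endpoints on the same server might look like (assumptions: the gateway listens on `http://localhost:8888`, and the model names shown are illustrative):

```python
# Two request shapes for the same gateway, assuming it listens on
# http://localhost:8888. Field names follow the public OpenAI and
# Anthropic request schemas; model names are examples.

BASE = "http://localhost:8888"

# OpenAI-style request -> POST {BASE}/v1/chat/completions
openai_req = {
    "model": "claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "Hello"}],
}

# Anthropic-style request -> POST {BASE}/v1/messages
anthropic_req = {
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,  # required field in the Anthropic Messages API
    "messages": [{"role": "user", "content": "Hello"}],
}
```

The payloads differ only in schema (e.g. Anthropic requires `max_tokens`); the gateway routes each by path.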

It will be ready when it is ready.


@jwadow commented on GitHub (Jan 9, 2026):

I carried out a large-scale refactoring; the current version is `2.0-rc.1`. Everything works under my conditions, but I don't want to release it yet. I'll wait for user feedback and then make a proper release.


@cristianadrielbraun commented on GitHub (Jan 9, 2026):

Hi @jwadow, I don't want to open a new issue just for this, so a question: clients like OpenCode seem to work, but using Claude Code directly I get "API Error: 422", and the app output shows: `HTTP 400 - POST /v1/messages - Invalid model. Please select a different model to continue. (reason: INVALID_MODEL_ID)`.
I tried with claude-opus-4-5 and claude-sonnet-4-5, no luck.
Is this expected?

EDIT: At first glance I realized CC actually calls a versioned Haiku model that the Kiro gateway doesn't expose, so it fails. I added `"claude-haiku-4-5-20251001": "claude-haiku-4.5"` to the config file and it started working... for a second. It still crashes with the same 422 as soon as it starts to do real work.


@mzazon commented on GitHub (Jan 9, 2026):

> Hi @jwadow, I don't want to open a new issue just for this, so a question: clients like OpenCode seem to work, but using Claude Code directly I get "API Error: 422", and the app output shows: `HTTP 400 - POST /v1/messages - Invalid model. Please select a different model to continue. (reason: INVALID_MODEL_ID)`. I tried with claude-opus-4-5 and claude-sonnet-4-5, no luck. Is this expected?
>
> EDIT: At first glance I realized CC actually calls a versioned Haiku model that the Kiro gateway doesn't expose, so it fails. I added `"claude-haiku-4-5-20251001": "claude-haiku-4.5"` to the config file and it started working... for a second. It still crashes with the same 422 as soon as it starts to do real work.

FYI, not sure if this helps, but I had to do a lot of experimenting to get something working. I haven't tested the native Anthropic support in Claude Code yet, but I did come up with this for `.claude/settings.json`; maybe it helps:

```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:8888",
    "ANTHROPIC_MODEL": "claude-sonnet-4-5",
    "ANTHROPIC_SMALL_FAST_MODEL": "claude-haiku-4-5",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "claude-opus-4-5",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "claude-haiku-4-5",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "claude-sonnet-4-5",
    "CLAUDE_CODE_MAX_OUTPUT_TOKENS": "8000",
    "MAX_THINKING_TOKENS": "2000",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "true",
    "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS": "1"
  }
}
```


@cristianadrielbraun commented on GitHub (Jan 9, 2026):

Well I was just taking a look at the Claude Code Docs, trying to find something like this! Thanks, I'll give it a try


@bhaskoro-muthohar commented on GitHub (Jan 10, 2026):

Tested the native Anthropic API endpoint (`/v1/messages`) with Claude Code - works great!

**Minor issue encountered:** Claude Code requests dated model variants like `claude-haiku-4-5-20251001` which weren't in the default model mapping. Had to add:

```python
# In MODEL_MAPPING
"claude-haiku-4-5-20251001": "claude-haiku-4.5",

# In AVAILABLE_MODELS
"claude-haiku-4-5-20251001",
```

**Suggestion:** Consider enabling extended thinking (`FAKE_REASONING=true`) by default in the Anthropic endpoint, since Claude Code benefits significantly from it. Also, the default `FAKE_REASONING_MAX_TOKENS=4000` might be a bit low for development tasks; 8000 seems like a better balance.

Overall, this is a huge improvement over the LiteLLM workaround. Thanks for the great work!


@jwadow commented on GitHub (Jan 10, 2026):

> Tested the native Anthropic API endpoint (`/v1/messages`) with Claude Code - works great!
>
> **Minor issue encountered:** Claude Code requests dated model variants like `claude-haiku-4-5-20251001` which weren't in the default model mapping. Had to add:
>
> "claude-haiku-4-5-20251001": "claude-haiku-4.5",
>
> "claude-haiku-4-5-20251001",
>
> **Suggestion:** Consider enabling extended thinking (`FAKE_REASONING=true`) by default in the Anthropic endpoint, since Claude Code benefits significantly from it. Also, the default `FAKE_REASONING_MAX_TOKENS=4000` might be a bit low for development tasks; 8000 seems like a better balance.
>
> Overall, this is a huge improvement over the LiteLLM workaround. Thanks for the great work!

Hi, thanks for the feedback.

As far as I remember, reasoning should be enabled by default, regardless of the API endpoint.

Also, regarding 4000, it's just a convention; it's not enforced at all. In reality, a Kiro API request only lasts 120 seconds before it's interrupted, during which time the model manages to print 8000+ tokens. I don't know how to fix this, perhaps there's no way. This 120-second limit per request is built into their internal API and is the cornerstone of our entire project.

So, considering that 8000+ is a constant, asking the model to reason for 4000 tokens is a reasonable limit, which the model won't respect anyway, as far as I understand. Just being on the safe side.


@jwadow commented on GitHub (Jan 11, 2026):

> Hi @jwadow, I don't want to open a new issue just for this, so a question: clients like OpenCode seem to work, but using Claude Code directly I get "API Error: 422", and the app output shows: `HTTP 400 - POST /v1/messages - Invalid model. Please select a different model to continue. (reason: INVALID_MODEL_ID)`. I tried with claude-opus-4-5 and claude-sonnet-4-5, no luck. Is this expected?

This should have been a separate issue; it's quite a tricky thing. Thanks for reporting it.


Hey guys @cristianadrielbraun @mzazon, here is the commit: https://github.com/jwadow/kiro-gateway/commit/6ce52d9ed9584b0c442a973876830df97d240a1c

So I got dynamic model resolution working. No more hardcoded model lists in the project; hardcoding had no value other than the hidden 3.7. Everything loads from the Kiro API now.

What changed:

  • Models load at startup from the API
  • Model names get normalized automatically (with `.`, with `-`, etc.)
  • Handles versioned models with dates
  • Works with any date you want: 2033, 2050, doesn't matter

The only hardcoded thing left is the legendary Claude 3.7 Sonnet, because we need it for compatibility with older stuff.

When new models come out they will just work automatically. No code changes needed; just restart Kiro Gateway to show new models like Sonnet 4.7 in `/models`.
Pretty clean setup now.

P.S. If a model is released while the gateway is running with 100.00% uptime, the new model will still work in manual requests without a restart, but it won't show up in `/models`. This only happens once every few months anyway, so restarting isn't a big deal, haha.


@jwadow commented on GitHub (Jan 11, 2026):

On the topic at hand, as far as I understand, no one's having any problems with Anthropic? So, I'll be releasing version 2.0 soon.
I'll also close this issue.
