[GH-ISSUE #32] [BUG] Vision capability fails via Gateway despite verified support in native Kiro IDE #24

Closed

opened 2026-02-27 07:17:29 +03:00 by kerem · 1 comment

kerem commented

2026-02-27 07:17:29 +03:00

Owner

Originally created by @smileheart0708 on GitHub (Jan 11, 2026).
Original GitHub issue: https://github.com/jwadow/kiro-gateway/issues/32

Kiro Gateway Version

v2.0

What happened?

Description

Following the fix for the 422 validation error in #30, the gateway now correctly accepts requests with image blocks. However, the model (Claude 3.5/4.5) remains unable to "see" or process the images when requested through the gateway.

Evidence of Backend Support

I have verified that the native Kiro IDE supports vision recognition perfectly using the same backend. This confirms that the Kiro/AWS backend API is capable of processing multi-modal inputs.

The issue seems to be that the image data is being lost or incorrectly formatted during the conversion process within the gateway (likely in converter/anthropic.go), causing the backend to ignore the image blocks while only processing the text.

Comparison (Screenshots attached)

1. Via Kiro-API Gateway: The model fails to recognize the image and asks for a local file path or tool usage.
2. Via Native Kiro IDE: The model successfully identifies and describes the image content.

Steps to Reproduce

1. Upload an image through the gateway.
2. Ask "What is in this image?" in English.
3. Observe the model's failure to recognize the input.

Suggested Investigation

Please check the BuildKiroPayload logic to ensure that image content blocks are correctly mapped to the specific format expected by the Kiro/AWS Event Stream.

Debug Logs

No log file available.

<img width="1202" height="1584" alt="Image" src="https://github.com/user-attachments/assets/befc5c8a-834b-4621-96d9-cf164bb71683" />
<img width="2313" height="1202" alt="Image" src="https://github.com/user-attachments/assets/7329218f-1e63-4a3d-a7a7-d3cbb131ebc4" />

kerem

2026-02-27 07:17:29 +03:00

closed this issue
added the bug and fixed labels

kerem commented

2026-02-27 07:17:31 +03:00

Author

Owner

@jwadow commented on GitHub (Jan 11, 2026):

Fixed! The images were going to the wrong place in the request: they were in userInputMessageContext.images but should be in userInputMessage.images. I checked how the native Kiro IDE sends requests and matched that format. I also added stripping of the data URL prefix, since some clients send the full data URL instead of just the base64 payload. It should work now; let me know if you still have issues!

P.S. I should re-release 2.0, lol
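The two changes described in this comment can be sketched roughly as follows. This is not the gateway's actual code: the type names, the `Format` and `Bytes` fields, and the helpers `stripDataURLPrefix` and `buildUserInputMessage` are assumptions for illustration. Only the placement of `images` directly on `userInputMessage` (rather than inside `userInputMessageContext`) and the data URL stripping come from the comment.

```go
package main

import "strings"

// Image is a hypothetical shape for one image entry; the real payload
// fields expected by the Kiro backend may differ.
type Image struct {
	Format string `json:"format"` // assumed, e.g. "png"
	Bytes  string `json:"bytes"`  // assumed, raw base64 payload
}

// UserInputMessage carries images at the top level of the message,
// not inside userInputMessageContext.
type UserInputMessage struct {
	Content string  `json:"content"`
	Images  []Image `json:"images,omitempty"`
}

// stripDataURLPrefix returns the base64 payload of a data URL such as
// "data:image/png;base64,AAAA". Input without that prefix is returned
// unchanged.
func stripDataURLPrefix(s string) string {
	const marker = ";base64,"
	if !strings.HasPrefix(s, "data:") {
		return s
	}
	if i := strings.Index(s, marker); i >= 0 {
		return s[i+len(marker):]
	}
	return s
}

// buildUserInputMessage attaches the images to the message itself and
// normalizes each payload to plain base64.
func buildUserInputMessage(text string, images []Image) UserInputMessage {
	msg := UserInputMessage{Content: text}
	for _, img := range images {
		img.Bytes = stripDataURLPrefix(img.Bytes)
		msg.Images = append(msg.Images, img)
	}
	return msg
}
```

With this sketch, a client that sends `data:image/png;base64,AAAA` and one that sends plain `AAAA` both end up with the same `images` entry on the message.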
