[GH-ISSUE #73] Feature: Auto-truncate conversation history when payload exceeds Kiro API body size limit #51

Open
opened 2026-02-27 07:17:42 +03:00 by kerem · 6 comments

Originally created by @kilhyeonjun on GitHub (Feb 9, 2026).
Original GitHub issue: https://github.com/jwadow/kiro-gateway/issues/73

Summary

When the conversation context grows large (e.g., 137k+ tokens with system prompt + tool definitions), the total HTTP payload sent to Kiro API can exceed ~1.2-1.3MB, causing a 400 Improperly formed request error.

This is an upstream Kiro API limitation (no max_tokens or inferenceConfig support), but the gateway could mitigate it by detecting oversized payloads before sending.

Environment

  • Kiro Gateway: Docker (latest)
  • Client: OpenClaw (OpenAI-compatible API consumer)
  • Model: claude-opus-4.6

Reproduction

  1. Build a conversation with 100+ messages + system prompt + 30+ tool definitions
  2. Total payload exceeds ~1.3MB
  3. Kiro API returns 400 Improperly formed request (no reason field)

Verified payload size limit

| Payload size | Result |
|---|---|
| 1.0 MB (156k tokens) | OK |
| 1.25 MB | OK |
| 1.3 MB | 400 |

Proposed Solution

Add a pre-flight check before sending to Kiro API:

  1. Serialize the Kiro payload and check byte size
  2. If exceeding a configurable threshold (e.g., MAX_PAYLOAD_SIZE_MB=1.2), automatically truncate oldest conversation history messages until under the limit
  3. Log a warning when truncation occurs

This would prevent cryptic 400 errors for long conversations and complement the existing CONTENT_LENGTH_EXCEEDS_THRESHOLD handling in kiro_errors.py.
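
A minimal sketch of the pre-flight check described above, not the gateway's actual code. It assumes the Kiro payload is a dict with a top-level `history` list ordered oldest first (a hypothetical shape), and `MAX_PAYLOAD_SIZE_MB` is the setting name proposed in this issue, not an existing option. The size is measured on Python's default `json.dumps` output, which may differ by a few bytes from what the HTTP client really sends.

```python
import json
import logging
import os

logger = logging.getLogger(__name__)

# Hypothetical setting from this proposal; not an existing gateway option.
MAX_PAYLOAD_BYTES = int(float(os.environ.get("MAX_PAYLOAD_SIZE_MB", "1.2")) * 1024 * 1024)


def payload_size(payload: dict) -> int:
    """Return the size in bytes of the UTF-8 encoded JSON body."""
    return len(json.dumps(payload, ensure_ascii=False).encode("utf-8"))


def truncate_history(payload: dict, max_bytes: int = MAX_PAYLOAD_BYTES) -> dict:
    """Drop the oldest history entries until the serialized payload fits.

    Assumes a top-level "history" list, oldest entry first (hypothetical shape).
    The input payload is left unmodified.
    """
    history = list(payload.get("history", []))
    trimmed = dict(payload, history=history)  # shallow copy sharing the new list
    dropped = 0
    while history and payload_size(trimmed) > max_bytes:
        history.pop(0)  # remove the oldest entry
        dropped += 1
    if dropped:
        logger.warning(
            "Payload exceeded %d bytes; dropped %d oldest history entries",
            max_bytes,
            dropped,
        )
    return trimmed
```

A real implementation would probably need to drop entries in user/assistant pairs so the remaining history still starts with a user message and alternates roles, and it would re-serialize less often than once per dropped entry.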

Workaround

Currently mitigated client-side by setting contextWindow: 128000 (instead of 200000) in the model config, so the client triggers compaction earlier.


@bhaskoro-muthohar commented on GitHub (Feb 9, 2026):

Additional finding: exact payload size limit is ~615KB, not 1.2-1.3MB

I independently hit this same issue and binary-searched the exact boundary:

629,504 bytes (614.75 KB) → HTTP 200 OK
629,760 bytes (615.00 KB) → HTTP 400 "Improperly formed request."

Methodology: Took a known-good Kiro payload (~302KB) and padded a history entry's content with dummy text in 256-byte increments. The limit is consistent and deterministic — fails 5/5 above threshold, succeeds 5/5 below.
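
The search described here can be sketched roughly as follows. This is an illustration, not the script that was actually used: `accepts` is a hypothetical callback standing in for "send a payload of this size to Kiro and check for HTTP 200", and the simulated callback in the usage lines simply encodes the boundary reported in this comment.

```python
from typing import Callable


def find_size_limit(
    accepts: Callable[[int], bool],
    base_bytes: int,
    max_steps: int,
    step: int = 256,
) -> int:
    """Binary-search the largest accepted size of the form base_bytes + k * step.

    Requires accepts(base_bytes) to be True and
    accepts(base_bytes + max_steps * step) to be False.
    """
    if not accepts(base_bytes):
        raise ValueError("the base payload must be accepted")
    if accepts(base_bytes + max_steps * step):
        raise ValueError("the largest probe must be rejected")

    lo, hi = 0, max_steps  # invariant: step count lo is accepted, hi is rejected
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if accepts(base_bytes + mid * step):
            lo = mid
        else:
            hi = mid
    return base_bytes + lo * step


# Simulated endpoint using the boundary reported in this comment.
def simulated(size: int) -> bool:
    return size < 629_760


print(find_size_limit(simulated, base_bytes=302 * 1024, max_steps=2000))
```

With a 256-byte step the search narrows the boundary in about a dozen requests, which is why repeating each probe a few times to confirm determinism stays cheap.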

| Payload Size | Result |
|---|---|
| 600 KB | OK |
| 610 KB | OK |
| 614 KB | OK |
| 615 KB | FAIL |
| 625 KB | FAIL |
| 700 KB | FAIL |

The error response is {"message":"Improperly formed request.","reason":null} — very misleading for a size limit.

Context: Claude Code with 41 tools and 168 messages. The Kiro JSON payload was 627KB, just over the limit. The error appears "intermittent" because conversations grow over time and cross the threshold.

A safe truncation target would be ~580-590KB to leave headroom.
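
For a sense of how much headroom that target leaves, here is a quick calculation. It assumes 1 KB = 1024 bytes (consistent with 629,504 bytes = 614.75 KB above) and treats the largest size observed to succeed as the limit, which is a measurement from this thread and not a documented Kiro value.

```python
# Largest payload size reported to return HTTP 200 in this thread (observed, not documented).
OBSERVED_MAX_OK_BYTES = 629_504


def headroom_bytes(target_kb: float, limit_bytes: int = OBSERVED_MAX_OK_BYTES) -> int:
    """Return how many bytes remain between a truncation target and the observed limit."""
    return limit_bytes - int(target_kb * 1024)


for target_kb in (580, 590):
    spare = headroom_bytes(target_kb)
    print(f"{target_kb} KB target leaves {spare} bytes ({spare / 1024:.2f} KB) of headroom")
```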


@bhaskoro-muthohar commented on GitHub (Feb 9, 2026):

@jwadow I'd like to make a case for why this should be handled by the gateway, not the client.

In issue #60, the design philosophy was stated as:

"Our gateway is transparent, we make minimal changes to the user's original request — with the exception of Kiro's bugs and shortcomings."

This ~615KB payload size limit is a Kiro shortcoming, not a client issue. The same payload works fine against the real Anthropic API, which accepts payloads well over 1MB. A client sending a valid Anthropic API request should not get a 400 Improperly formed request from a gateway that claims Anthropic API compatibility.

The gateway already handles several Kiro-specific quirks:

  • ensure_first_message_is_user() — Kiro requires user-first, Anthropic doesn't
  • ensure_alternating_roles() — Kiro requires strict alternation, Anthropic is more flexible
  • sanitize_json_schema() — strips fields Kiro rejects but Anthropic accepts
  • strip_all_tool_content() — handles Kiro's toolResults-without-tools rejection

Payload size truncation fits the same pattern: compensating for a Kiro limitation that doesn't exist in the Anthropic API.

Additionally, the error message "Improperly formed request." with reason: null gives the client zero information that this is a size issue. Without the gateway handling it, every client has to independently discover and work around this undocumented limit.
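
Even without truncation, the gateway could at least attach a hint to this error. A rough sketch under stated assumptions: `explain_kiro_400` is a hypothetical helper, not an existing gateway function, and the threshold is the boundary measured in this thread, not a documented Kiro limit.

```python
import json
from typing import Optional

# Boundary observed in this thread; an assumption, not a documented Kiro limit.
OBSERVED_LIMIT_BYTES = 629_504


def explain_kiro_400(status: int, body: str, request_bytes: int) -> Optional[str]:
    """Return a hint when a generic 400 probably stems from payload size, else None."""
    if status != 400:
        return None
    try:
        data = json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    if not isinstance(data, dict):
        return None
    generic = (
        data.get("message") == "Improperly formed request."
        and data.get("reason") is None
    )
    if not generic or request_bytes <= OBSERVED_LIMIT_BYTES:
        return None
    return (
        f"Request body was {request_bytes} bytes, above the observed "
        f"~{OBSERVED_LIMIT_BYTES}-byte Kiro limit; the conversation is likely too long."
    )


body = '{"message":"Improperly formed request.","reason":null}'
print(explain_kiro_400(400, body, request_bytes=642_048))
```

The hint is only a guess based on size, so it should be added alongside the original error, not in place of it.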


@Hitesh-Sisara commented on GitHub (Feb 12, 2026):

Hey @kilhyeonjun, how can I set a custom contextWindow for Claude Code?


@sametakofficial commented on GitHub (Feb 12, 2026):

> Additional finding: exact payload size limit is ~615KB, not 1.2-1.3MB
>
> I independently hit this same issue and binary-searched the exact boundary:
>
> 629,504 bytes (614.75 KB) → HTTP 200 OK
> 629,760 bytes (615.00 KB) → HTTP 400 "Improperly formed request."
>
> Methodology: Took a known-good Kiro payload (~302KB) and padded a history entry's content with dummy text in 256-byte increments. The limit is consistent and deterministic: fails 5/5 above threshold, succeeds 5/5 below.
>
> | Payload Size | Result |
> |---|---|
> | 600 KB | OK |
> | 610 KB | OK |
> | 614 KB | OK |
> | 615 KB | FAIL |
> | 625 KB | FAIL |
> | 700 KB | FAIL |
>
> The error response is {"message":"Improperly formed request.","reason":null}, which is very misleading for a size limit.
>
> Context: Claude Code with 41 tools and 168 messages. The Kiro JSON payload was 627KB, just over the limit. The error appears "intermittent" because conversations grow over time and cross the threshold.
>
> A safe truncation target would be ~580-590KB to leave headroom.

Hey, I used your payload size research and implemented fixes in this fork: github.com/sametakofficial/kiro-gateway
I vibe coded it, but it works very well.


@kilhyeonjun commented on GitHub (Feb 15, 2026):

@Hitesh-Sisara

Claude Code doesn't have a direct contextWindow setting as far as I know.

One possible workaround is the CLAUDE_AUTOCOMPACT_PCT_OVERRIDE environment variable, which may control when auto-compaction triggers:

export CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=64

This would set compaction at ~64% context usage (128K / 200K), helping avoid the ~615KB payload limit. But I'm not 100% sure this is officially supported — you may need to verify on your end.

Alternatively, you can set it in ~/.claude/settings.json:

{
  "env": {
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "64"
  }
}
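
The value 64 above is just the ratio of the desired compaction point to the full context window. A small helper for recomputing it with other targets; note that whether `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` actually interprets its value this way is the unverified assumption stated above.

```python
def autocompact_percent(target_tokens: int, context_window: int) -> int:
    """Return target_tokens as a whole-number percentage of context_window, rounded down."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return target_tokens * 100 // context_window


# 128K target on a 200K window, as in the comment above.
print(autocompact_percent(128_000, 200_000))
```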

If anyone has confirmed this works with Claude Code + kiro-gateway, please share!


@Chumbayoumba commented on GitHub (Feb 18, 2026):

dummy
