[GH-ISSUE #38] [BUG] Streaming response silently hangs on large tool-call conversations, causing client-side timeout and truncated output #34


Closed

opened 2026-02-27 15:38:02 +03:00 by kerem · 3 comments

kerem commented

2026-02-27 15:38:02 +03:00

Owner

Originally created by @terryops on GitHub (Feb 20, 2026).
Original GitHub issue: https://github.com/NikkeTryHard/zerogravity/issues/38

Description

When using ZeroGravity (Docker, v1.107.0) as a proxy for opus-4.6 with a large conversation history containing many tool calls, the streaming response can silently hang mid-transfer. The upstream Google API returns 200 OK and begins streaming SSE events, but stops sending data without completing the response or closing the connection. ZeroGravity does not detect or log this stall, so the client eventually times out after ~10 minutes.

Symptoms

Client receives a partial/truncated text response (the response is cut off mid-sentence)
No error is logged on the ZeroGravity side — the last log entry shows MITM: streaming response with status=200 but never reaches response complete
Client-side timeout fires after 600s: FailoverError: LLM request timed out.
The session transcript is corrupted (malformed JSON line from the incomplete streamed response)

Reproduction

Environment

ZeroGravity: v1.107.0 (Docker ghcr.io/nikketryhard/zerogravity:latest)
Model: opus-4.6 (Claude Opus 4.6 via Antigravity LS)
Client: OpenClaw agent with stream=true

Request characteristics

Conversation with 70+ tool call rounds (exec, read, web_fetch, session_status, etc.)
MITM-modified body size: ~360KB–522KB (after injecting tool rounds as functionCall/Response pairs)
thinkingBudget=2048, maxOutputTokens=32000

Timeline (from logs)

10:02:23 POST /v1/chat/completions model=opus-4.6 stream=true
10:02:23 MITM: request modified [...append 22 tool round(s)...] modified=360029
10:02:26 MITM: streaming response status=200
10:02:36 MITM: response complete — response_text_len=1063   ← cascade multi-turn continues
10:02:36 MITM: forwarding LLM request (cascade continuation, body_len=9787)
10:02:40 MITM: streaming response status=200                ← streaming starts
         ... (no further logs for this cascade) ...
10:05:49 MITM: connecting upstream (BoringSSL)               ← LS heartbeat, unrelated
         ... (still no response complete) ...
10:19:33 [client] embedded run timeout (600000ms)            ← client gives up
10:19:49 [client] Profile zerogravity:default timed out
10:21:48 [client] session file repaired: dropped 1 malformed line

The streaming started at 10:02:40 but never completed. No response complete was logged. The cascade d1987dae just disappeared.

Expected Behavior

ZeroGravity should detect when the upstream stream stalls (no SSE events for N seconds) and close the connection with an error
Log a warning/error when a streaming response does not complete within a reasonable timeout
The response should not silently hang — either complete normally or report an error

Suggested Fix

Add a stream idle timeout (e.g., 120s with no SSE data received) that aborts the response and returns a proper error to the client
Log a WARN when a streaming response is interrupted or stalls
Consider forwarding a finish_reason: "error" or similar to the client so it can retry
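
To illustrate the suggested idle timeout, here is a minimal Python sketch. It is not ZeroGravity's code: `with_idle_timeout`, `StreamIdleTimeout`, and `stalling_upstream` are hypothetical names, and the upstream is modeled as a plain async iterator of byte chunks. The timer restarts on every chunk, so a slow but live stream is not aborted; only silence longer than the idle window is.

```python
import asyncio
from typing import AsyncIterator, List, Optional, Tuple


class StreamIdleTimeout(Exception):
    """Hypothetical error: no chunk arrived within the idle window."""


async def with_idle_timeout(
    stream: AsyncIterator[bytes], idle_seconds: float
) -> AsyncIterator[bytes]:
    """Re-yield chunks from `stream`, failing if it goes silent for too long."""
    iterator = stream.__aiter__()
    while True:
        try:
            # The timeout applies to each chunk separately, not to the whole response.
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout=idle_seconds)
        except StopAsyncIteration:
            return  # upstream finished normally
        except asyncio.TimeoutError:
            raise StreamIdleTimeout(f"no upstream data for {idle_seconds}s") from None
        yield chunk


async def stalling_upstream() -> AsyncIterator[bytes]:
    """Simulated upstream: sends one event, then goes silent without closing."""
    yield b"data: partial\n\n"
    await asyncio.sleep(3600)
    yield b"data: never sent\n\n"


async def demo() -> Tuple[List[bytes], Optional[str]]:
    received: List[bytes] = []
    try:
        async for chunk in with_idle_timeout(stalling_upstream(), idle_seconds=0.05):
            received.append(chunk)
    except StreamIdleTimeout as exc:
        # This is where a proxy could log a WARN and report an error to the client.
        return received, str(exc)
    return received, None
```

Running `asyncio.run(demo())` returns the one chunk that arrived plus the timeout message, rather than blocking until the client gives up. In a real proxy the window would be closer to the 120s suggested above.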

Notes

This issue particularly affects agentic clients that send large conversation histories with many tool calls. The MITM layer successfully processes the request (rewriting tools, injecting parameters), but the upstream response just silently stops mid-stream. This may be related to the upstream Google API dropping connections for very large requests.


kerem closed this issue

2026-02-27 15:38:02 +03:00

kerem commented

2026-02-27 15:38:03 +03:00

Author

Owner

@NikkeTryHard commented on GitHub (Feb 20, 2026):

give me your zg report for that specific trace


kerem commented

2026-02-27 15:38:03 +03:00

Author

Owner

@NikkeTryHard commented on GitHub (Feb 20, 2026):

[Image attached to the comment: https://github.com/user-attachments/assets/0509f18d-eaff-4ac4-8865-2e8a3b43e8cb]

kerem commented

2026-02-27 15:38:03 +03:00

Author

Owner

@NikkeTryHard commented on GitHub (Feb 21, 2026):

Fixed in d3cff42. When the upstream stream stalls (120s idle timeout), the MITM proxy now sends a 504 STREAM_IDLE_TIMEOUT error through the event channel instead of silently dropping it. API handlers will surface this as a proper error response to the client, instead of emitting finish_reason=stop with truncated content.

Available in v1.2.1.
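
A rough Python sketch of the behavior described in this comment follows. It is not the code from d3cff42. The event types (`TextDelta`, `Done`, `UpstreamError`), the `to_sse` function, and the JSON payload shapes are all assumptions, loosely modeled on OpenAI-style chat completion chunks. The point is only that a stall reaches the handler as an explicit error event, so a truncated answer is never closed with `finish_reason=stop`.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator, Union


@dataclass
class TextDelta:
    text: str


@dataclass
class Done:
    pass


@dataclass
class UpstreamError:
    # Hypothetical error event, e.g. status=504, code="STREAM_IDLE_TIMEOUT".
    status: int
    code: str
    message: str


Event = Union[TextDelta, Done, UpstreamError]


def to_sse(events: Iterable[Event]) -> Iterator[str]:
    """Turn channel events into SSE lines; an error ends the stream as an error."""
    for event in events:
        if isinstance(event, TextDelta):
            payload = {"choices": [{"delta": {"content": event.text}, "finish_reason": None}]}
        elif isinstance(event, Done):
            payload = {"choices": [{"delta": {}, "finish_reason": "stop"}]}
        else:
            # Assumed error shape; the real response format may differ.
            payload = {
                "error": {
                    "status": event.status,
                    "code": event.code,
                    "message": event.message,
                }
            }
        yield f"data: {json.dumps(payload)}\n\n"
        if not isinstance(event, TextDelta):
            return  # both Done and UpstreamError terminate the stream
```

Feeding it `[TextDelta("Hel"), UpstreamError(504, "STREAM_IDLE_TIMEOUT", "upstream sent no data for 120s")]` yields one content chunk followed by an error payload, which a client can tell apart from a normal completion and retry.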
