mirror of
https://github.com/NikkeTryHard/zerogravity.git
synced 2026-04-25 15:15:59 +03:00
[GH-ISSUE #42] [BUG] "blocking spontaneous background LLM call" causes OpenAI-compatible API clients to hang indefinitely #35
Originally created by @terryops on GitHub (Feb 21, 2026).
Original GitHub issue: https://github.com/NikkeTryHard/zerogravity/issues/42
Description
When using ZeroGravity v1.2.1 (Docker) via the `/v1/chat/completions` OpenAI-compatible API, the response hangs indefinitely. The MITM proxy returns a partial response via the cascade mechanism, but then blocks the LS cascade continuation as a "spontaneous background LLM call", so the polling never completes.
Symptoms
- Request to `/v1/chat/completions` with `stream=false`
- Upstream returns successfully (`response_text_len=181`)
- Log shows `MITM: blocking spontaneous background LLM call to save quota`
- `Polling for response on cascade xxx (timeout=3600s)` never resolves
Reproduction
Environment
- Image: `ghcr.io/nikketryhard/zerogravity:latest` (`system_fingerprint=fp_121`)
- Model: `opus-4.6`
- `stream=false`
Request
Simple request via `/v1/chat/completions`. This works for short conversations but hangs for conversations with large system prompts + tool definitions (~65KB+ request body after MITM modification).
Timeline (from docker logs)
Key observations
- Upstream returned successfully (`response_text_len=181`)
- The `/v1/chat/completions` polling waits up to 3600s for a result that will never come
Additional issue: thinking stripped
The request was made with thinking enabled (`thinking=medium`), but ZG set `includeThoughts=false`. This causes `thinking_text_len=0` in the response, which breaks clients that expect thinking content.
Expected Behavior
- A `/v1/chat/completions` API call should NOT be blocked by quota protection
Suggested Fix
- Honor the `thinking` parameter from the API request when configuring `includeThoughts`
Workaround
Currently using Anthropic API directly as a fallback. ZG works fine for short/simple requests but fails for agentic clients with large conversation histories.
@NikkeTryHard commented on GitHub (Feb 21, 2026):
can you at least give me the trace report with zg trace or zg report? thx
@terryops commented on GitHub (Feb 21, 2026):
Hey, thanks for the quick response!
Unfortunately the hang never completed, so no trace was written for the incident — traces only get saved when a request finishes. Today's trace directory is empty.
Here's the diagnostic report from `zg report`:
Additional context from our logs (OpenClaw gateway):
The hang happened with a large conversation payload (~65KB). The ZG MITM trimmed it down, upstream returned successfully (`response_text_len=181`), but then the cascade continuation (`body_len=4028`) was blocked as a "spontaneous background LLM call". The polling never resolved.
We've since switched back to Anthropic direct, so I can't easily reproduce it right now. But if you need a live trace, I can temporarily switch back to ZG and trigger it with a large conversation. It was fairly reproducible with big system prompts + tool definitions. Let me know!
@NikkeTryHard commented on GitHub (Feb 21, 2026):
fixed in latest release can you test again? wait for release or build yourself with src if you have access
@NikkeTryHard commented on GitHub (Feb 21, 2026):
released please test
@terryops commented on GitHub (Feb 21, 2026):
Hey! Just tested with the latest Docker release (`ghcr.io/nikketryhard/zerogravity:latest`, `system_fingerprint=fp_130`).
The hanging issue is fixed! ✅ Both `opus-4.6` and `gemini-3-flash` return responses promptly now, no more infinite polling. Great work!
However, I noticed a new issue: system prompts are ignored when using the OpenAI-compatible API. The MITM proxy replaces the user-provided system prompt with a dummy:
Test case:
Expected: Response follows the system prompt (identifies as PastaBot)
Actual: Response says "I am Antigravity, a powerful agentic AI coding assistant designed by Google DeepMind..." — system prompt completely ignored.
The `prompt_tokens` count is ~5950 even though the input is only ~50 tokens, confirming the MITM injects the full Antigravity system prompt and discards the user's.
This is likely related to #35. For OpenAI-compatible API consumers (like OpenClaw), system prompt passthrough is critical: it carries persona, tools, workspace context, etc. Without it, the model loses all context and identity.
Thanks for the quick fix on the hang! 🎉
@NikkeTryHard commented on GitHub (Feb 22, 2026):
System prompt passthrough is implemented — the MITM now injects your system prompt and strips the competing identity from the backend prompt. Give it a try once the next release is out.
@LibertX commented on GitHub (Feb 22, 2026):
It seems to work, but it's weaker than rules (#35):
When using rules on Antigravity, it works fine (the model identifies itself as PastaBot).
@terryops commented on GitHub (Feb 22, 2026):
Mine could have gone to fallback and it's actually Anthropic?
> On Feb 22, 2026 at 6:48 PM, LibertX wrote:
> I tested on 1.3.2 and it doesn't seem to work.
@NikkeTryHard commented on GitHub (Feb 22, 2026):
Root cause found. The model's own thinking output showed:
The `<SYSTEM_INSTRUCTION_OVERRIDE>` tag was too transparent: models are trained to resist prompt injection, and the tag name gives it away.
Fix (v1.3.3): switched to injecting system prompts as `<user_rules>`, the same format GEMINI.md rules use. Models treat this as legitimate configuration, not a jailbreak attempt. This is exactly what #35 requested.
Build from source or wait for the next release to test.
@NikkeTryHard commented on GitHub (Feb 22, 2026):
alr released im sleeping gn
@terryops commented on GitHub (Feb 22, 2026):
Terrific, it's working now. Thank you very much.