mirror of
https://github.com/NikkeTryHard/zerogravity.git
synced 2026-04-25 15:15:59 +03:00
[GH-ISSUE #42] [BUG] "blocking spontaneous background LLM call" causes OpenAI-compatible API clients to hang indefinitely #35
Originally created by @terryops on GitHub (Feb 21, 2026).
Original GitHub issue: https://github.com/NikkeTryHard/zerogravity/issues/42
Description
When using ZeroGravity v1.2.1 (Docker) via the `/v1/chat/completions` OpenAI-compatible API, the response hangs indefinitely. The MITM proxy returns a partial response via the cascade mechanism, but then blocks the LS cascade continuation as a "spontaneous background LLM call", so the polling never completes.
Symptoms
- Request to `/v1/chat/completions` with `stream=false`
- Upstream returns successfully (`response_text_len=181`)
- Log shows `MITM: blocking spontaneous background LLM call to save quota`
- `Polling for response on cascade xxx (timeout=3600s)` never resolves
Reproduction
Environment
- Image: `ghcr.io/nikketryhard/zerogravity:latest` (`system_fingerprint=fp_121`)
- Model: `opus-4.6`
- `stream=false`
Request
Simple request via `/v1/chat/completions`. This works for short conversations but hangs for conversations with large system prompts + tool definitions (~65KB+ request body after MITM modification).
Timeline (from docker logs)
Key observations
- Upstream returned successfully (`response_text_len=181`)
- The `/v1/chat/completions` polling waits up to 3600s for a result that will never come
Additional issue: thinking stripped
The request was made with thinking enabled (`thinking=medium`), but ZG set `includeThoughts=false`. This causes `thinking_text_len=0` in the response, which breaks clients that expect thinking content.
Expected Behavior
- A `/v1/chat/completions` API call should NOT be blocked by quota protection
Suggested Fix
- Honor the `thinking` parameter from the API request when configuring `includeThoughts`
Workaround
Currently using Anthropic API directly as a fallback. ZG works fine for short/simple requests but fails for agentic clients with large conversation histories.
@NikkeTryHard commented on GitHub (Feb 21, 2026):
can you at least give me the trace report with zg trace or zg report? thx
@terryops commented on GitHub (Feb 21, 2026):
Hey, thanks for the quick response!
Unfortunately the hang never completed, so no trace was written for the incident — traces only get saved when a request finishes. Today's trace directory is empty.
Here's the diagnostic report from `zg report`:
Additional context from our logs (OpenClaw gateway):
The hang happened with a large conversation payload (~65KB). The ZG MITM trimmed it down, upstream returned successfully (`response_text_len=181`), but then the cascade continuation (`body_len=4028`) was blocked as a "spontaneous background LLM call". The polling never resolved.
We've since switched back to Anthropic direct, so I can't easily reproduce it right now. But if you need a live trace, I can temporarily switch back to ZG and trigger it with a large conversation. It was fairly reproducible with big system prompts + tool definitions. Let me know!
@NikkeTryHard commented on GitHub (Feb 21, 2026):
fixed in latest release can you test again? wait for release or build yourself with src if you have access
@NikkeTryHard commented on GitHub (Feb 21, 2026):
released please test
@terryops commented on GitHub (Feb 21, 2026):
Hey! Just tested with the latest Docker release (`ghcr.io/nikketryhard/zerogravity:latest`, `system_fingerprint=fp_130`).
The hanging issue is fixed! ✅ Both `opus-4.6` and `gemini-3-flash` return responses promptly now, no more infinite polling. Great work!
However, I noticed a new issue: system prompts are ignored when using the OpenAI-compatible API. The MITM proxy replaces the user-provided system prompt with a dummy:
Test case:
Expected: Response follows the system prompt (identifies as PastaBot)
Actual: Response says "I am Antigravity, a powerful agentic AI coding assistant designed by Google DeepMind..." — system prompt completely ignored.
The `prompt_tokens` count is ~5950 even though the input is only ~50 tokens, confirming the MITM injects the full Antigravity system prompt and discards the user's.
This is likely related to #35. For OpenAI-compatible API consumers (like OpenClaw), system prompt passthrough is critical: it carries persona, tools, workspace context, etc. Without it, the model loses all context and identity.
Thanks for the quick fix on the hang! 🎉
@NikkeTryHard commented on GitHub (Feb 22, 2026):
System prompt passthrough is implemented — the MITM now injects your system prompt and strips the competing identity from the backend prompt. Give it a try once the next release is out.
@LibertX commented on GitHub (Feb 22, 2026):
It seems to work, but it's weaker than rules (#35):
When using rules on Antigravity, it works fine (the model identifies itself as PastaBot).
@terryops commented on GitHub (Feb 22, 2026):
Mine could have gone to fallback and it's actually Anthropic?
> On Feb 22, 2026 at 6:48 PM, LibertX wrote:
> I tested on 1.3.2 and it doesn't seem to work.
@NikkeTryHard commented on GitHub (Feb 22, 2026):
Root cause found. The model's own thinking output showed:
The `<SYSTEM_INSTRUCTION_OVERRIDE>` tag was too transparent: models are trained to resist prompt injection, and the tag name gives it away.
Fix (v1.3.3): switched to injecting system prompts as `<user_rules>`, the same format GEMINI.md rules use. Models treat this as legitimate configuration, not a jailbreak attempt. This is exactly what #35 requested.
Build from source or wait for the next release to test.
@NikkeTryHard commented on GitHub (Feb 22, 2026):
alr released im sleeping gn
@terryops commented on GitHub (Feb 22, 2026):
Terrific, it's working now. Thank you very much.