[GH-ISSUE #182] feat: Add context window utilization tracking to iteration summary #70

Open

opened 2026-02-27 10:22:04 +03:00 by kerem · 0 comments

Originally created by @arjhun-personal on GitHub (Feb 21, 2026).
Original GitHub issue: https://github.com/mikeyobrien/ralph-orchestrator/issues/182

Summary

Add context window utilization (%) as a metric displayed after each orchestration iteration, alongside the existing duration, cost, and turns.

Target output:

Duration: 12345ms | Est. cost: $0.0526 | Turns: 3 | Context: 45% (90K/200K)
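As a rough illustration of the target line, the sketch below formats the summary from raw numbers. The function names (`format_tokens`, `format_context`, `format_summary`) are hypothetical and not taken from the ralph-orchestrator codebase. The rounding rules (integer percent rounded down, token counts rounded to the nearest thousand) are assumptions, since the issue only gives a single example.

```rust
/// Formats a token count compactly, e.g. 90_000 -> "90K".
/// Assumption: counts of 1000 or more are rounded to the nearest thousand.
fn format_tokens(tokens: u64) -> String {
    if tokens >= 1_000 {
        format!("{}K", (tokens + 500) / 1_000)
    } else {
        tokens.to_string()
    }
}

/// Builds the "Context: ..." segment. Returns None when the window size is
/// zero, so the caller can omit the segment instead of dividing by zero.
fn format_context(used_tokens: u64, context_window: u64) -> Option<String> {
    if context_window == 0 {
        return None;
    }
    // Integer percentage, rounded down (an assumption, see lead-in).
    let percent = used_tokens * 100 / context_window;
    Some(format!(
        "Context: {}% ({}/{})",
        percent,
        format_tokens(used_tokens),
        format_tokens(context_window)
    ))
}

/// Hypothetical summary formatter: the existing three metrics, plus the
/// context segment when token data is available.
fn format_summary(
    duration_ms: u64,
    cost_usd: f64,
    turns: u32,
    used_tokens: Option<u64>,
    context_window: u64,
) -> String {
    let mut line = format!(
        "Duration: {}ms | Est. cost: ${:.4} | Turns: {}",
        duration_ms, cost_usd, turns
    );
    if let Some(ctx) = used_tokens.and_then(|u| format_context(u, context_window)) {
        line.push_str(" | ");
        line.push_str(&ctx);
    }
    line
}
```

Calling `format_summary(12345, 0.0526, 3, Some(90_000), 200_000)` yields the target line above. Passing `None` for the token count leaves the summary in its current three-metric form, which would suit backends that report no usage.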

Problem

Token data (input_tokens, output_tokens) arrives from Claude and Pi stream events but is currently dropped before reaching the display layer. Operators have no visibility into how close an agent is to hitting the context window limit.
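One way to stop dropping the data is to keep the most recent usage report as events stream past. The sketch below is a simplified stand-in, not the project's actual types: `StreamEvent`, `TokenUsage`, and `UsageTracker` are hypothetical, and the real `Usage` and `PiUsage` structs may carry different fields. It also assumes that the latest report's `input_tokens + output_tokens` is a reasonable approximation of current context usage, which would need checking against each backend.

```rust
/// Hypothetical, simplified usage record (stand-in for `Usage` / `PiUsage`).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
struct TokenUsage {
    input_tokens: u64,
    output_tokens: u64,
}

/// Hypothetical stream event; only the variants relevant to token capture.
enum StreamEvent {
    /// Claude-style assistant message, optionally carrying usage.
    Assistant { usage: Option<TokenUsage> },
    /// Pi-style end-of-turn event, optionally carrying usage.
    TurnEnd { usage: Option<TokenUsage> },
    /// Any other event; carries no token data.
    Other,
}

/// Remembers the latest usage report instead of discarding it.
#[derive(Debug, Default)]
struct UsageTracker {
    latest: Option<TokenUsage>,
}

impl UsageTracker {
    /// Records usage from events that carry it; ignores everything else.
    fn observe(&mut self, event: &StreamEvent) {
        match event {
            StreamEvent::Assistant { usage: Some(u) }
            | StreamEvent::TurnEnd { usage: Some(u) } => {
                self.latest = Some(*u);
            }
            _ => {}
        }
    }

    /// Tokens assumed to occupy the context window, if any usage was seen.
    fn context_tokens(&self) -> Option<u64> {
        self.latest.map(|u| u.input_tokens + u.output_tokens)
    }
}
```

Returning `Option<u64>` keeps "no usage reported" distinct from "zero tokens used", so the display layer can omit the metric when nothing was captured.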

Scope

- Capture token data from Claude (`Usage` on `Assistant` events) and Pi (`PiUsage` on `TurnEnd` events); both are currently dropped
- Extend `SessionResult` with `input_tokens`, `output_tokens`, `context_window`
- Display context utilization % in iteration summary (Pretty, Console, TUI handlers)
- Add `context_window_tokens` config with sensible defaults (200K for Claude/Pi)
- Track per-hat peak token usage in `LoopState`
- Surface token data in `events.jsonl` and `ralph events`
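The data-model items in the scope could look roughly like the following. This is a sketch under stated assumptions, not the existing definitions: the real `SessionResult` and `LoopState` have other fields that are omitted here. The `Option<u64>` field types, the `peak_tokens_by_hat` map, and the helper methods are all hypothetical. The 200K default comes from the issue text.

```rust
use std::collections::HashMap;

/// Default for the proposed `context_window_tokens` setting (200K per the issue).
const DEFAULT_CONTEXT_WINDOW_TOKENS: u64 = 200_000;

/// Only the proposed new fields are shown; existing fields are omitted.
/// Assumption: fields are optional because not every backend reports usage.
#[derive(Debug, Clone, Default)]
struct SessionResult {
    input_tokens: Option<u64>,
    output_tokens: Option<u64>,
    context_window: Option<u64>,
}

impl SessionResult {
    /// Integer utilization percentage, or None if token data is missing.
    /// Falls back to the default window when none was configured.
    fn utilization_percent(&self) -> Option<u64> {
        let window = self.context_window.unwrap_or(DEFAULT_CONTEXT_WINDOW_TOKENS);
        if window == 0 {
            return None;
        }
        let used = self.input_tokens? + self.output_tokens?;
        Some(used * 100 / window)
    }
}

/// Only the proposed per-hat tracking is shown (hypothetical field name).
#[derive(Debug, Default)]
struct LoopState {
    peak_tokens_by_hat: HashMap<String, u64>,
}

impl LoopState {
    /// Records a token count for a hat, keeping only the highest value seen.
    fn record_tokens(&mut self, hat: &str, tokens: u64) {
        let peak = self.peak_tokens_by_hat.entry(hat.to_string()).or_insert(0);
        *peak = (*peak).max(tokens);
    }

    /// Peak token usage for a hat, if that hat has run at least once.
    fn peak_tokens(&self, hat: &str) -> Option<u64> {
        self.peak_tokens_by_hat.get(hat).copied()
    }
}
```

Storing only the peak per hat keeps `LoopState` small while still answering the operational question of which hat comes closest to the limit.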

Key Files

- `crates/ralph-adapters/src/stream_handler.rs`: `SessionResult` + display
- `crates/ralph-adapters/src/pty_executor.rs`: Claude token capture
- `crates/ralph-adapters/src/pi_stream.rs`: Pi token capture
- `crates/ralph-core/src/event_loop/loop_state.rs`: per-hat tracking
- `crates/ralph-core/src/event_logger.rs`: `events.jsonl` persistence
- `crates/ralph-core/src/config.rs`: context window config
- `crates/ralph-cli/src/loop_runner.rs`: wiring
