[PR #198] feat(core): add disallowed_tools, stale loop detection, and file-modification audit #193

Open
opened 2026-02-27 10:22:40 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/mikeyobrien/ralph-orchestrator/pull/198
Author: @arjhun-personal
Created: 2/25/2026
Status: 🔄 Open

Base: main ← Head: feat/loop-cycling-phase2


📝 Commits (10+)

  • 159821d feat(core): add hat scope enforcement, event chain validation, and human timeout routing
  • 63111d9 docs: add upstream PR draft for hat scope enforcement
  • 479e6cb feat(core): make hat enforcement opt-in via config flags
  • dd46589 style: fix rustfmt formatting in hat enforcement code
  • 31c55ef spec: add context window utilization tracking design
  • adce9bc fix(core): record default_publishes topics in seen_topics for chain validation
  • 4da8468 docs: add upstream PR draft for default_publishes seen_topics fix
  • ed40abe fix(core): default_publishes of completion_promise must set completion_requested
  • 3c25a06 docs: add upstream PR draft for default_publishes completion_requested fix
  • 3998962 feat(core): add disallowed_tools, stale loop detection, and file-modification audit

📊 Changes

18 files changed (+1663 additions, -4 deletions)

View changed files

📝 crates/ralph-bench/src/main.rs (+2 -0)
📝 crates/ralph-cli/src/display.rs (+2 -0)
📝 crates/ralph-cli/src/doctor.rs (+1 -0)
📝 crates/ralph-cli/src/loop_runner.rs (+29 -0)
📝 crates/ralph-core/src/config.rs (+29 -0)
📝 crates/ralph-core/src/event_loop/loop_state.rs (+38 -0)
📝 crates/ralph-core/src/event_loop/mod.rs (+219 -4)
📝 crates/ralph-core/src/event_loop/tests.rs (+595 -0)
📝 crates/ralph-core/src/hat_registry.rs (+79 -0)
📝 crates/ralph-core/src/hatless_ralph.rs (+23 -0)
📝 crates/ralph-core/src/summary_writer.rs (+6 -0)
➕ docs/specs/context-window-utilization.md (+151 -0)
➕ upstream-PRs/default-publishes-completion-requested-body.md (+120 -0)
➕ upstream-PRs/default-publishes-completion-requested.md (+46 -0)
➕ upstream-PRs/default-publishes-seen-topics-body.md (+91 -0)
➕ upstream-PRs/default-publishes-seen-topics.md (+47 -0)
➕ upstream-PRs/hat-scope-enforcement-body.md (+58 -0)
➕ upstream-PRs/hat-scope-enforcement.md (+127 -0)

📄 Description

Closes #194

Summary

Adds three engine-level mechanisms to prevent hat role violations and infinite loop cycling — Phase 2 of the loop cycling fix plan.

  • 2A: disallowed_tools field on HatConfig with prompt-level enforcement
  • 2B: Stale topic detection that terminates loops when the same event is emitted 3+ times consecutively
  • 2C: Post-iteration file-modification audit that emits scope_violation events

Problem

Two bugs were observed during a ralph loop run:

Bug 1: Dispatcher implemented code (role violation)
The dispatcher hat read plan files and edited 12 source files despite instructions saying "Don't build anything yourself." Existing enforce_hat_scope only validates event publishing, not tool usage. There's no mechanism to prevent a hat from using Edit/Write/Bash tools.

Bug 2: 5 wasted iterations cycling after work done (~$1.7 burned)
After all work completed, the loop cycled between dispatcher and builder emitting the same events repeatedly:

iter 7  (builder):    nothing to do → default_publishes → build.complete
iter 8  (dispatcher): got build.complete → all.built
iter 9  (builder):    nothing to do → build.complete
iter 10 (dispatcher): got build.complete → all.built (again)
iter 11 (builder):    finally emits LOOP_COMPLETE

The same all.built → build.complete → all.built pattern was also observed in a separate run.

Changes

2A: disallowed_tools prompt-level enforcement

| File | Change |
|------|--------|
| crates/ralph-core/src/config.rs | New disallowed_tools: Vec<String> field on HatConfig |
| crates/ralph-core/src/hatless_ralph.rs | New disallowed_tools on HatInfo; TOOL RESTRICTIONS section injected in active hat prompts |

When a hat has disallowed_tools configured, the prompt includes a prominent section:

### TOOL RESTRICTIONS

You MUST NOT use these tools in this hat:
- **Edit** — blocked for this hat
- **Write** — blocked for this hat

Using a restricted tool is a scope violation.
File modifications are audited after each iteration.

Preset usage:

dispatcher:
  disallowed_tools: ['Edit', 'Write', 'NotebookEdit']
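
As a rough illustration of the prompt injection described above, here is a minimal sketch. The `HatConfig` struct is a trimmed-down stand-in with only the new field, and `render_tool_restrictions` is a hypothetical helper name; neither is taken from the actual diff. The section text follows the example prompt shown above.

```rust
/// Hypothetical, trimmed-down stand-in for the real `HatConfig`;
/// only the new field is modelled here.
#[derive(Debug, Default, Clone)]
pub struct HatConfig {
    /// Tool names this hat must not use; empty means no restrictions.
    pub disallowed_tools: Vec<String>,
}

/// Builds the TOOL RESTRICTIONS prompt section, or returns `None` when the
/// hat has no restricted tools, so unrestricted hats get no extra prompt text.
/// (Assumed helper name, not the function used in the PR.)
pub fn render_tool_restrictions(hat: &HatConfig) -> Option<String> {
    if hat.disallowed_tools.is_empty() {
        return None;
    }
    let mut section = String::from("### TOOL RESTRICTIONS\n\n");
    section.push_str("You MUST NOT use these tools in this hat:\n");
    for tool in &hat.disallowed_tools {
        // \u{2014} is the dash used in the example prompt above.
        section.push_str(&format!("- **{}** \u{2014} blocked for this hat\n", tool));
    }
    section.push_str("\nUsing a restricted tool is a scope violation.\n");
    section.push_str("File modifications are audited after each iteration.\n");
    Some(section)
}
```

Returning `None` for an empty list keeps existing presets unchanged, which is consistent with the new field defaulting to empty.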

2B: Stale topic detection (cycle breaker)

| File | Change |
|------|--------|
| crates/ralph-core/src/event_loop/loop_state.rs | New last_emitted_topic and consecutive_same_topic fields; record_topic() tracks consecutive emissions |
| crates/ralph-core/src/event_loop/mod.rs | New TerminationReason::LoopStale variant; check_termination() returns LoopStale when same topic emitted 3+ times |

When the same topic is emitted 3 or more times consecutively, the loop terminates with exit code 1 (LoopStale). This catches the all.built → build.complete → all.built cycle pattern.

The tracking is done in record_topic(), which is called from both process_events_from_jsonl() (agent-written events) and check_default_publishes() (auto-injected events), so both event paths are covered.
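
The consecutive-emission counter could look roughly like the sketch below. The field names `last_emitted_topic` and `consecutive_same_topic` and the method name `record_topic` come from the description; the `LoopState` struct shape, the `STALE_THRESHOLD` constant, and the `is_stale` helper are assumptions made for illustration.

```rust
/// Assumed threshold, taken from the "3 or more times" wording above.
const STALE_THRESHOLD: u32 = 3;

/// Hypothetical subset of the loop state; only the two new fields are shown.
#[derive(Debug, Default)]
pub struct LoopState {
    pub last_emitted_topic: Option<String>,
    pub consecutive_same_topic: u32,
}

impl LoopState {
    /// Records one emitted topic: extends the current run if the topic
    /// repeats, otherwise starts a new run of length 1.
    pub fn record_topic(&mut self, topic: &str) {
        if self.last_emitted_topic.as_deref() == Some(topic) {
            self.consecutive_same_topic += 1;
        } else {
            self.last_emitted_topic = Some(topic.to_string());
            self.consecutive_same_topic = 1;
        }
    }

    /// True once the same topic has been emitted `STALE_THRESHOLD` times
    /// in a row. (Assumed helper; the PR performs this check inside
    /// `check_termination()`.)
    pub fn is_stale(&self) -> bool {
        self.consecutive_same_topic >= STALE_THRESHOLD
    }
}
```

In this sketch any different topic resets the run to 1. How the real engine scopes the counter is not stated in the description, so treat the reset behaviour here as one possible reading.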

2C: File-modification audit (hard enforcement)

| File | Change |
|------|--------|
| crates/ralph-core/src/event_loop/mod.rs | New audit_file_modifications() method called from process_output() |

After each iteration, if the active hat has Edit or Write in disallowed_tools, the audit runs git diff --stat HEAD to detect unauthorized file modifications. If modifications are found, it emits a <hat_id>.scope_violation event on the bus.

Presets can route this event to trigger corrective actions:

final_committer:
  triggers: ['all.built', 'dispatcher.scope_violation']
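
A minimal sketch of the audit decision, assuming it can be split into a pure check and a thin wrapper around git. The function names `audit_applies`, `scope_violation_topic`, and `git_diff_stat` are hypothetical; the PR's actual method is `audit_file_modifications()`, whose body is not shown in the description.

```rust
use std::process::Command;

/// True when the hat's `disallowed_tools` includes Edit or Write, which is
/// the condition the description gives for running the audit.
pub fn audit_applies(disallowed_tools: &[String]) -> bool {
    disallowed_tools.iter().any(|t| t == "Edit" || t == "Write")
}

/// Pure decision step (assumed split): given the output of
/// `git diff --stat HEAD`, returns the topic to emit, if any.
pub fn scope_violation_topic(
    hat_id: &str,
    disallowed_tools: &[String],
    diff_stat: &str,
) -> Option<String> {
    // Empty or whitespace-only output means the working tree matches HEAD.
    if audit_applies(disallowed_tools) && !diff_stat.trim().is_empty() {
        Some(format!("{}.scope_violation", hat_id))
    } else {
        None
    }
}

/// Runs `git diff --stat HEAD` in the current directory and returns its
/// standard output as text.
pub fn git_diff_stat() -> std::io::Result<String> {
    let output = Command::new("git")
        .args(["diff", "--stat", "HEAD"])
        .output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Keeping the decision separate from the process call makes the rule testable without a repository. Note that `git diff HEAD` reports changes to tracked files only, so a sketch like this would not notice newly created untracked files.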

Exhaustive match updates

| File | Change |
|------|--------|
| crates/ralph-cli/src/display.rs | Added LoopStale to termination display |
| crates/ralph-cli/src/loop_runner.rs | Added LoopStale to history recording and merge queue state |
| crates/ralph-core/src/summary_writer.rs | Added LoopStale to summary status text |
| crates/ralph-bench/src/main.rs | Added LoopStale to benchmark result formatting |
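
These updates are needed because Rust rejects a `match` without a wildcard arm when it does not cover every variant. The sketch below shows the pattern with a hypothetical two-variant enum: only `LoopStale` and its exit code 1 come from the description, while `Completed`, its exit code, and the status strings are placeholders.

```rust
/// Hypothetical subset of the termination reasons. Only `LoopStale` is
/// named in the description; `Completed` is a placeholder variant.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TerminationReason {
    Completed,
    LoopStale,
}

/// No wildcard arm: adding a new variant is a compile error here until
/// the new case is handled explicitly.
pub fn status_text(reason: TerminationReason) -> &'static str {
    match reason {
        TerminationReason::Completed => "completed", // placeholder text
        TerminationReason::LoopStale => "stale loop", // placeholder text
    }
}

/// Exit code per reason. `LoopStale => 1` follows the description;
/// `Completed => 0` is an assumption.
pub fn exit_code(reason: TerminationReason) -> i32 {
    match reason {
        TerminationReason::Completed => 0,
        TerminationReason::LoopStale => 1,
    }
}
```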

Tests

  • cargo test — full workspace passes (all existing tests + new field defaults)
  • cargo build — clean compilation

Test Plan

  • Configure a preset with disallowed_tools: ['Edit', 'Write'] on dispatcher → verify TOOL RESTRICTIONS section appears in prompt
  • Run a loop where the same event cycles 3+ times → verify LoopStale termination
  • Run a loop where a restricted hat modifies files → verify scope_violation event emitted

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
