[GH-ISSUE #178] [Bug]: is 'args' parameter broken for custom hats? #72

Closed · opened 2026-02-27 10:22:04 +03:00 by kerem (Owner) · 1 comment

Originally created by @francocalvo on GitHub (Feb 19, 2026).
Original GitHub issue: https://github.com/mikeyobrien/ralph-orchestrator/issues/178

Operating system

macOS 14.3

Ralph version

ralph 2.5.1

AI backend

OpenCode

Hat preset / workflow

No response

Steps to reproduce

I'm running this `ralph.yml`:

cli:
  backend: "claude"
project:
  prompt: "PROMPT.md"
  specs_dir: "./.ralph/specs/s3-sync-cli/" 
^^^^^
Workaround

tasks:
  enabled: true

event_loop:
  completion_promise: "LOOP_COMPLETE"
  max_iterations: 150
  starting_event: "work.start"

hats:
  planner:
    name: "📋 Planner"
    description: "Analyzes requirements and creates implementation plans"
    backend: "claude"
    triggers: ["work.start"]
    publishes: ["plan.complete"]
    instructions: |
      YOU DO NOT CODE.
      YOU ARE A PLANNER.
      YOU OUTPUT A PLAN. YOU DO NOT CODE.

      You are a senior software architect focused on planning and analysis. You never
      give code examples or implementations; you explain the needed changes and
      features in clear steps. You NEVER write code, not even in your plans.

      Use the ralph tools to manage tasks. Tasks are important:
      `ralph tools task --help` to understand more on how to use them.

      If tasks are empty, it means you are the initial planner, who needs to wire
      the tasks mentioned in PROMPT.md into ralph tasks.
      You do not need to plan further or start coding.

      Your role:
      - Analyze requirements and break them into clear, actionable steps
      - Design system architecture and data flows
      - Identify edge cases, potential issues, and dependencies
      - Create structured implementation plans

      Guidelines:
      - Think step-by-step before proposing solutions
      - Consider trade-offs and alternatives
      - Be specific about file structure, function signatures, and data models
      - Do NOT make any code changes — only analyze and plan

      Always consider:
      - Update docs
      - Implement tests
      - Run linters and formatters
      - Check and fix tests
      - Add verification steps to confirm the feature works

      Publish plan.complete when your plan is ready.

  coder:
    name: "⚙️ Coder"
    description: "Implements the plan or fixes review issues"
    backend: "opencode"
    args: ["--agent", "danger", "--model", "zai-coding-plan/glm-4.7"]
    triggers: ["plan.complete", "review.issues"]
    publishes: ["code.complete"]
    instructions: |
      Implement the plan from the planner, or fix issues from the reviewer.
      After you finish one task, do not continue to the next one on your own,
      as the plan might not be complete or the review might have corrections.

      If triggered by plan.complete: implement the full plan.
      If triggered by review.issues: fix the issues identified in the review.
        - Address all 🔴 Critical issues.
        - Address 🟡 Important issues if reasonable.
        - 🟢 Suggestions are optional.

      Follow existing code patterns in the codebase.
      Run tests after implementation.
      Publish code.complete when done.

      You DON'T close tasks. That's the reviewer's job.

      For all Python packages, use UV!
      Running tests:
      `uv run pytest`

  reviewer:
    name: "🔍 Reviewer"
    description: "Reviews code for quality and correctness"
    backend: "codex"
    triggers: ["code.complete"]
    publishes: ["review.issues"]
    instructions: |
      You are a senior code reviewer. Provide constructive, actionable feedback.

      Review Process:
      1. Understand context — Read the diff and surrounding code
      2. Check correctness — Logic errors, edge cases, error handling
      3. Check style — Repeated code, complicated code. Code must be maintainable.
      4. Verify types — Run `uv run pyright` if type issues suspected
      5. Run linter — Use `ruff check .` to catch style/quality issues
      6. Assess tests — Check coverage, run `uv run pytest` if needed

      Review Categories:
      - 🔴 Critical: Must fix — bugs, security issues, data loss risks
      - 🟡 Important: Should fix — performance, maintainability, missing tests
      - 🟢 Suggestion: Nice to have — style, naming, minor improvements

      What to Check:
      - Correctness: Off-by-one errors, null handling, race conditions, error handling
      - Security: Input validation, SQL injection, secrets in code, permission checks
      - Performance: N+1 queries, unnecessary allocations, blocking calls in async
      - Maintainability: Clear naming, appropriate abstraction, code duplication
      - Testing: Coverage gaps, brittle tests, missing edge cases, test isolation

      Be specific. Reference file paths and line numbers. Suggest concrete fixes.
      Do NOT make changes. Only analyze and report findings.

      If 🔴 Critical or 🟡 Important issues found: publish review.issues
      If no issues: finish.

      If you see in any step that tests are broken, even if they are from before,
      report them as issues.

      Don't create endless loops — ship it if it works. If only minor 🟢 Suggestions
      remain after a fix cycle, finish.

      You are the only one who should close Ralph tasks. So after the review is
      successful, first close the task as completed.

      For all Python packages, use UV!
      Running tests:
      `uv run pytest`

core:
  guardrails:
    - "Always run tests before declaring done"
    - "Ruff and pyright are important tools that should pass"
    - "Never modify production database"
    - "Follow existing code patterns"

Expected behavior

  1. It should use the agent and model specified in `args` when invoking OpenCode.
  2. --dry-run should show the correct spec dir specified in ralph.yml.

Actual behavior

I've got two issues:

  1. The `args` parameter is being ignored. I think it wasn't before, but I can't confirm it 100%. The header at the beginning now shows which model is being used, and it is definitely not the one I specified. This happened with OpenCode, but I'll test Codex too.
  2. The specs directory is not being picked up. When I run `ralph run --dry-run`, the specs are shown under `./.ralph`. I had to move them there as a workaround; previously I had them in `./ralph/specs`, but the dry-run showed them in the wrong spot.

Logs or error output


Config / preset file

No response

kerem 2026-02-27 10:22:04 +03:00
  • closed this issue
  • added the bug label

@mikeyobrien commented on GitHub (Feb 25, 2026):

Revalidated this against current mainline behavior.

Findings:

  • Hat args are working for custom hats: args are passed through to the selected hat backend invocation.
  • Specs directory in dry-run is read from core.specs_dir.
  • project.specs_dir is intentionally deprecated and now rejected with an explicit error directing users to core.specs_dir.

Given that, this does not appear to be an active bug in current code.
Closing as completed.

If you can still reproduce on latest, please open a new issue with a minimal config and exact command/output.
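
For anyone migrating, a minimal sketch of the updated config based on the findings above, assuming `core.specs_dir` accepts the same path format the old `project.specs_dir` key did:

```yaml
core:
  specs_dir: "./.ralph/specs/s3-sync-cli/"  # moved from project.specs_dir, which is now rejected
  guardrails:
    - "Always run tests before declaring done"
```

With that change, `ralph run --dry-run` should report the spec directory configured above.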
