mirror of https://github.com/ownpilot/OwnPilot.git synced 2026-04-25 15:25:52 +03:00

[GH-ISSUE #6] [Feature Request] PTY-based interactive Claude Code CLI control for autonomous multi-agent orchestration #2

Closed

opened 2026-02-27 20:20:05 +03:00 by kerem · 2 comments

kerem commented

2026-02-27 20:20:05 +03:00

Owner

Originally created by @CyPack on GitHub (Feb 25, 2026).
Original GitHub issue: https://github.com/ownpilot/OwnPilot/issues/6

Summary

Currently, when agent frameworks spawn Claude Code as a subprocess (via exec/spawn with stdio: pipe), Claude Code detects it is not connected to a real terminal (isatty() == false) and silently switches out of interactive mode. This makes it impossible to:

1. Receive mid-task questions from Claude Code
2. Feed back autonomous answers within the 60-second AskUserQuestion window
3. Stream real-time output without block-buffering delays
4. Build multi-agent orchestration where agents consult each other

The fix is straightforward: spawn Claude Code via a PTY (Pseudo-Terminal) so it believes it is attached to a real terminal. This is the exact mechanism OpenClaw uses in its exec tool with pty: true.

Problem Statement

The `isatty()` Trap

When Claude Code is started via a pipe (subprocess.Popen(stdin=PIPE, stdout=PIPE) in Python, or Bun.spawn / child_process.spawn without a PTY in Node), isatty() returns false for its standard streams. Claude Code uses this check to decide its behavior:

| Behavior | With PTY (`isatty = true`) | Without PTY (`isatty = false`) |
|---|---|---|
| Mid-task questions (`AskUserQuestion`) | ✅ Asked, waits for answer | ❌ Suppressed, silently skips |
| ANSI colors / progress indicators | ✅ Enabled | ❌ Disabled |
| GNU Readline (history, editing) | ✅ Active | ❌ Falls back to raw buffered input |
| Real-time output streaming | ✅ Line-buffered | ❌ Block-buffered (stalls up to 4 KB) |
| Ctrl+C / SIGINT forwarding | ✅ Delivered | ❌ Not delivered (no terminal to turn the keystroke into a signal) |
| Password / confirmation prompts | ✅ Work | ❌ Hang or fail silently |

The most critical consequence: AskUserQuestion has a hard 60-second timeout. If the controlling agent cannot detect the question and inject an answer via PTY stdin within that window, the task either aborts or takes a destructive default path.
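The pipe-versus-terminal difference is easy to observe from Node, which exposes the same check as `process.stdout.isTTY`. The sketch below is only an illustration of the mechanism, not Claude Code's actual detection logic; `childSeesTty` is a name made up for this example.

```javascript
const { spawnSync } = require('node:child_process');

// Start a tiny Node child with piped stdout and return what it reports
// about its own stdout. With a pipe, isTTY is undefined, so it prints "false".
function childSeesTty() {
    const probe = 'process.stdout.write(String(Boolean(process.stdout.isTTY)))';
    const result = spawnSync(process.execPath, ['-e', probe], {
        stdio: ['pipe', 'pipe', 'inherit'],
        encoding: 'utf8',
    });
    return result.stdout;
}

console.log(`child stdout is a TTY: ${childSeesTty()}`);
```

Running the same probe under a PTY library would be expected to report `true`, which is the whole point of the request.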

Why This Blocks Multi-Agent Orchestration

The ideal autonomous workflow looks like this:

User (WhatsApp/Telegram)
  → Orchestrator agent (cheap model, e.g. Haiku)
  → tmux pane 1: PTY → claude → /gsd:execute-phase 34
                              ↓ "Which database should I use?"
  → Orchestrator detects question in PTY output stream
  → tmux pane 2 (consultation): PTY → claude →
      "Given ROADMAP.md and previous decisions: {context} — which DB?"
                              ↓ "Use PostgreSQL"
  → Orchestrator feeds answer into pane 1's PTY stdin
  → Pane 1 continues without human intervention

This pattern is completely impossible with pipe-based subprocess control. The consultation agent in pane 2 never sees the question because pane 1 never emits it.

Requested Feature

Minimum Viable Implementation

Add a pty: true flag (or equivalent) to the shell/exec tool so Claude Code is spawned inside a PTY instead of a pipe.

Node.js reference implementation using node-pty:

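const pty = require('node-pty');
const stripAnsi = require('strip-ansi'); // require() needs strip-ansi@6; v7+ is ESM-only

// isWaitingForInput() and extractQuestion() are placeholders the integrator
// must supply; node-pty does not provide them.
function spawnClaudeCodeWithPTY(prompt, workdir, answerFn) {
    const proc = pty.spawn('claude', [prompt], {
        name: 'xterm-color',
        cols: 120,
        rows: 40,
        cwd: workdir,
        env: process.env,
    });

    let buffer = '';
    let answering = false; // prevents re-entry while answerFn is still pending
    const QUESTION_TIMEOUT_MS = 55_000; // 5s margin before Claude's 60s timeout

    proc.onData(async chunk => {
        buffer += stripAnsi(chunk);

        // Detect AskUserQuestion prompt pattern
        if (answering || !buffer.includes('?') || !isWaitingForInput(proc)) return;
        answering = true;
        const question = extractQuestion(buffer);
        buffer = '';
        let timer;
        try {
            // answerFn may be sync or async (LLM call, context lookup, etc.);
            // stop waiting once the safety margin is used up.
            const answer = await Promise.race([
                answerFn(question),
                new Promise((_, reject) => {
                    timer = setTimeout(() => reject(new Error('answer timed out')), QUESTION_TIMEOUT_MS);
                }),
            ]);
            proc.write(answer + '\r');
        } catch (err) {
            console.error('No answer sent:', err.message);
        } finally {
            clearTimeout(timer);
            answering = false;
        }
    });

    return proc;
}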
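The reference code above leaves question detection to two helpers that it does not define. One possible shape for them is sketched here under loud assumptions: the real AskUserQuestion output format is not described in this issue, so "last line ending in a question mark" and "no output for a quiet period" are heuristics chosen for illustration, and `extractQuestion`, `IdleDetector`, and `remainingAnswerMs` are hypothetical names.

```javascript
// Hypothetical helper: return the last non-empty line ending in '?', or null.
function extractQuestion(buffer) {
    const lines = buffer.split(/\r?\n/).map(line => line.trim()).filter(Boolean);
    for (let i = lines.length - 1; i >= 0; i--) {
        if (lines[i].endsWith('?')) return lines[i];
    }
    return null;
}

// Hypothetical idle detector: the PTY is treated as "waiting for input"
// once no output has arrived for quietMs milliseconds.
class IdleDetector {
    constructor(quietMs = 1500) {
        this.quietMs = quietMs;
        this.lastOutputAt = 0;
    }
    noteOutput(now = Date.now()) {
        this.lastOutputAt = now;
    }
    isWaiting(now = Date.now()) {
        return now - this.lastOutputAt >= this.quietMs;
    }
}

// Time left to answer, keeping a safety margin inside the 60-second window.
function remainingAnswerMs(askedAt, now = Date.now(), windowMs = 60_000, marginMs = 5_000) {
    return Math.max(0, windowMs - marginMs - (now - askedAt));
}
```

An orchestrator would call `noteOutput()` from its data handler and poll `isWaiting()` on a short timer. A pattern match on the actual prompt text, if it turns out to be stable, would be more reliable than the idle heuristic.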

Rust reference (for Moltis — using portable-pty crate):

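use std::path::Path;

use portable_pty::{native_pty_system, Child, CommandBuilder, MasterPty, PtySize};

// Returns the master side together with the child: dropping the master closes
// the PTY, which would hang up the child as soon as this function returned.
fn spawn_claude_with_pty(
    prompt: &str,
    workdir: &Path,
) -> anyhow::Result<(Box<dyn MasterPty + Send>, Box<dyn Child + Send + Sync>)> {
    let pty_system = native_pty_system();
    let pair = pty_system.openpty(PtySize { rows: 40, cols: 120, ..Default::default() })?;

    let mut cmd = CommandBuilder::new("claude");
    cmd.arg(prompt);
    cmd.cwd(workdir);

    let child = pair.slave.spawn_command(cmd)?;

    // Output: master.try_clone_reader(); input: master.take_writer() (recent versions)
    Ok((pair.master, child))
}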

Additional Advantages (Beyond Prompt Answering)

Beyond the core interactive-prompt use case, PTY unlocks several production-grade orchestration capabilities:

1. Process Group / Subtree Cleanup

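When Claude Code is spawned as the leader of its own process group (a PTY spawn starts the child in a new session; with child_process.spawn the equivalent is detached: true), the child processes it creates (subagents, compilers, test runners) stay in the same POSIX process group unless they deliberately leave it. A single kill(-pgid, SIGKILL) then terminates the whole group in one call: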

# PTY model: kills Claude Code + every subprocess it spawned
kill -9 -$(ps -o pgid= -p $claude_pid | tr -d ' ')

# Pipe model: kills only the top process; children become orphans
kill $claude_pid

This is critical when an agent run goes wrong and must be terminated cleanly.
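In Node the same group kill can be expressed with `process.kill` and a negative pid. The helper below is a sketch, not the project's `process_group_kill()`: the name `killProcessGroup` and the injectable `kill` parameter are assumptions made so the logic can be exercised without real processes. It only works on POSIX systems.

```javascript
// Send a signal to every process in the group led by `pid`.
// `kill` is injectable for testing; by default it calls process.kill.
function killProcessGroup(pid, signal = 'SIGKILL', kill = (p, s) => process.kill(p, s)) {
    if (!Number.isInteger(pid) || pid <= 1) {
        // pid 0 or 1 would target our own group or init; refuse.
        throw new RangeError(`refusing to signal process group ${pid}`);
    }
    try {
        kill(-pid, signal); // negative pid addresses the whole group
        return true;
    } catch (err) {
        if (err.code === 'ESRCH') return false; // group already gone
        throw err;
    }
}

// Usage sketch: the child must lead its own group for this to be safe, e.g.
//   const child = require('node:child_process').spawn('claude', args, { detached: true });
//   killProcessGroup(child.pid);
```

Children that move themselves into a new session or group are not reached by this call, so a container or cgroup boundary remains the stronger guarantee.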

2. Session Persistence via Terminal Multiplexer

PTY sessions can be managed by tmux or screen, providing true session persistence:

# Spawn agent in detached tmux session
tmux new-session -d -s agent-phase34 'claude /gsd:execute-phase 34'

# Reconnect after SSH disconnect / laptop sleep
tmux attach-session -t agent-phase34

# Snapshot current output without interrupting
tmux capture-pane -t agent-phase34 -p

# Send orchestrator answer to running session
tmux send-keys -t agent-phase34 'PostgreSQL' Enter

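A long-running multi-hour agent task survives network interruptions, SSH disconnects, and client laptop sleep/wake cycles, because the tmux server keeps the PTY open on the host. A pipe-backed subprocess tied to the orchestrator's own connection survives none of these. (A reboot of the host running tmux still ends the session.)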
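When an orchestrator drives `tmux send-keys` programmatically, the answer text should be sent literally: without the `-l` flag, an answer such as `Enter` or `C-c` is interpreted as a key name. A minimal sketch follows, with `tmuxAnswerCommands` and `sendAnswer` as hypothetical helper names and the session-name check as an assumed policy.

```javascript
const { execFileSync } = require('node:child_process');

// Build the argv lists for typing an answer into a tmux session.
// -l sends the text literally; `--` keeps answers that begin with '-' from
// being parsed as options. Enter is sent separately as a real key press.
function tmuxAnswerCommands(session, answer) {
    if (!/^[\w-]+$/.test(session)) {
        throw new Error(`unexpected session name: ${session}`);
    }
    return [
        ['send-keys', '-t', session, '-l', '--', answer],
        ['send-keys', '-t', session, 'Enter'],
    ];
}

// `run` is injectable; by default it invokes the tmux binary without a shell.
function sendAnswer(session, answer, run = args => execFileSync('tmux', args)) {
    for (const args of tmuxAnswerCommands(session, answer)) run(args);
}
```

Using `execFileSync` with an argument array avoids shell quoting problems, which matters because the answer text may originate from another model.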

3. Pause / Resume Capability

PTY enables POSIX job control signals:

# Suspend an expensive agent run (e.g., to free API quota)
kill -SIGTSTP $claude_pty_pid    # Equivalent to Ctrl+Z

# Resume when ready
kill -SIGCONT $claude_pty_pid

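With pipe-based processes there is no terminal to turn Ctrl+Z into a signal, so the orchestrator has to send the stop and continue signals itself (SIGSTOP cannot be caught or ignored). Designs that instead "pause" by killing and restarting lose all in-progress state.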
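A small wrapper can keep track of whether a run is currently suspended so that signals are not sent twice. This is a sketch only: `PausableProcess` is a hypothetical name, and the choice to default to SIGTSTP (which the target may handle) with SIGSTOP as a forced fallback is an assumption about what an orchestrator would want.

```javascript
class PausableProcess {
    // `kill` is injectable for testing; by default it calls process.kill.
    constructor(pid, kill = (p, s) => process.kill(p, s)) {
        this.pid = pid;
        this.kill = kill;
        this.paused = false;
    }

    // SIGTSTP can be handled by the target; SIGSTOP cannot be caught or ignored.
    pause({ force = false } = {}) {
        if (this.paused) return false;
        this.kill(this.pid, force ? 'SIGSTOP' : 'SIGTSTP');
        this.paused = true;
        return true;
    }

    resume() {
        if (!this.paused) return false;
        this.kill(this.pid, 'SIGCONT');
        this.paused = false;
        return true;
    }
}
```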

4. SIGWINCH — Proper Output Formatting

Claude Code and rich CLI tools use SIGWINCH (window resize signal) to reflow their output. Without it, progress bars and formatted tables overflow or truncate incorrectly. PTY automatically delivers SIGWINCH on pty.resize(cols, rows):

process.on('SIGWINCH', () => {
    // columns/rows are undefined when the orchestrator's own stdout is not a TTY
    proc.resize(process.stdout.columns || 120, process.stdout.rows || 40);
});

5. Real-Time Streaming Without 4 KB Buffer Stall

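Many CLI programs block-buffer their output (typically 4 KB) when stdout is a pipe rather than a terminal, unless they flush explicitly. The buffering happens inside the child process, so the reader cannot switch it off. This means:

- A Claude Code task generating 3 KB of progress output → nothing visible until the buffer fills or the process exits
- Orchestrator decisions are delayed → 60-second question window shrinks

With a PTY the child sees a terminal and falls back to line-buffering, so every newline flushes immediately to the master. Orchestrators react in real time.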
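Even with a PTY, data arrives at the master in arbitrary chunks that may split a line in two, so an orchestrator that reasons about lines needs to reassemble them. One way to do that is sketched below; `LineSplitter` is a hypothetical helper, not part of node-pty.

```javascript
// Reassemble complete lines from arbitrarily split output chunks.
class LineSplitter {
    constructor(onLine) {
        this.onLine = onLine;
        this.partial = ''; // text after the last newline seen so far
    }

    push(chunk) {
        const parts = (this.partial + chunk).split(/\r?\n/);
        this.partial = parts.pop(); // last element is an incomplete line (or '')
        for (const line of parts) this.onLine(line);
    }

    // Emit whatever is left, e.g. a prompt that never ends with a newline.
    flush() {
        if (this.partial) {
            this.onLine(this.partial);
            this.partial = '';
        }
    }
}
```

Prompts often end without a newline, which is why `flush()` exists: an idle timer can call it to surface a pending question.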

Security Considerations

PTY-based control is powerful and should be gated appropriately:

1. Allowlist tool calls: A PTY-capable shell should still enforce allowCommands / denyCommands
2. Never expose PTY stdin to untrusted input: Prompt injection via messaging apps is the primary attack vector
3. Run in container when possible: PTY inside Docker/Podman limits blast radius of any compromise
4. Auth on MCP bridge: If PTY-controlled Claude Code is exposed over MCP, require API key authentication
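The first point can be illustrated with a conservative gate in front of the PTY spawn. The semantics of allowCommands / denyCommands are not specified in this issue, so the sketch assumes both are arrays of executable names, that deny wins over allow, and that anything containing shell metacharacters is rejected; `isCommandAllowed` is a hypothetical name.

```javascript
function isCommandAllowed(commandLine, { allowCommands = [], denyCommands = [] } = {}) {
    // Reject shell metacharacters outright: a first-token check cannot see
    // what follows `;`, `&&`, pipes, redirections, or command substitution.
    if (/[;&|`$<>\n]/.test(commandLine)) return false;

    const executable = commandLine.trim().split(/\s+/)[0];
    if (!executable) return false;
    if (denyCommands.includes(executable)) return false; // deny takes precedence
    return allowCommands.includes(executable); // default deny
}
```

A first-token check like this is only a first line of defense; it says nothing about what an allowed program does with its arguments, which is why the container recommendation still applies.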

Prior Art

- OpenClaw (exec tool, pty: true flag): The primary reference implementation. PTY spawning with automatic fallback on EBADF (macOS edge case).
- node-pty (Microsoft): Mature, cross-platform PTY library for Node.js. Used by VS Code's integrated terminal.
- portable-pty (Rust crate): Cross-platform PTY for Rust. Used by WezTerm.
- pexpect (Python): Classic PTY automation library. Pattern-based prompt detection.
- Claude Squad: tmux-based multi-session manager for Claude Code (uses PTY indirectly via tmux).

Implementation Checklist

- [ ] Add node-pty (Node) or portable-pty (Rust) as dependency
- [ ] Add pty: boolean option to exec / shell tool
- [ ] Implement PTY spawn path alongside existing subprocess path
- [ ] Add ANSI strip utility for clean output parsing
- [ ] Add resize(cols, rows) API so orchestrators can set terminal dimensions
- [ ] Add EBADF fallback for macOS compatibility
- [ ] Document 60-second AskUserQuestion timeout constraint
- [ ] Add process_group_kill() helper for clean shutdown
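The "PTY spawn path alongside existing subprocess path" and "EBADF fallback" items could be combined in one small wrapper. How OpenClaw implements its fallback is not shown in this issue, so the following is a guess at the shape rather than a port: `spawnWithPtyFallback` is a hypothetical name, and it assumes the PTY library reports the failure as a thrown error with `code === 'EBADF'`.

```javascript
// Try the PTY path first; fall back to the pipe path only on EBADF.
// Both spawn functions are supplied by the caller, which keeps this wrapper
// independent of node-pty and child_process.
function spawnWithPtyFallback(spawnPty, spawnPipe, onFallback = () => {}) {
    try {
        return { mode: 'pty', proc: spawnPty() };
    } catch (err) {
        if (err && err.code === 'EBADF') {
            onFallback(err); // e.g. log that interactive features are unavailable
            return { mode: 'pipe', proc: spawnPipe() };
        }
        throw err; // any other failure is a real error, not a fallback case
    }
}
```

Returning the mode alongside the process lets the orchestrator know that mid-task questions will not appear in the fallback case.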

Impact

| Use Case | Without PTY | With PTY |
|---|---|---|
| Autonomous multi-agent orchestration | Blocked | Fully possible |
| Mid-task question answering | Impossible | 60s window available |
| Real-time progress streaming | Delayed (4 KB buffer) | Immediate |
| Long-running task persistence | Lost on disconnect | Survives via tmux |
| Emergency task abort (full tree) | Orphan processes remain | Atomic via SIGKILL pgid |
| Pause/resume expensive tasks | Restart required (state lost) | SIGTSTP/SIGCONT |

Feature request prepared based on analysis of OpenClaw's PTY implementation, node-pty documentation, and production multi-agent orchestration patterns.


kerem referenced this issue

2026-02-27 20:20:05 +03:00

[GH-ISSUE #2] GUI-Add Local AI Provider problem #3

kerem closed this issue

2026-02-27 20:20:05 +03:00

kerem commented

2026-02-27 20:20:06 +03:00

Author

Owner

@ersinkoc commented on GitHub (Feb 26, 2026):

Thanks for the detailed writeup! PTY-based interactive CLI control is partially implemented in the codebase already via coding-agent-pty.ts and the Coding Agents feature (auto mode with child_process.spawn). The key limitation noted here — that node-pty is required for true interactive mode — is documented in our memory notes.

For now, the auto mode path (spawnStreamingProcess) works without native deps and handles most orchestration use cases. Full PTY support (interactive mode with node-pty) is on the roadmap but not blocking current workflows.

Closing as the core functionality exists. We'll reopen or create a follow-up when interactive PTY mode becomes a priority.


kerem commented

2026-02-27 20:20:06 +03:00

Author

Owner

@ersinkoc commented on GitHub (Feb 26, 2026):

Closing as the core auto-mode functionality exists. Will revisit when full interactive PTY mode becomes a priority.

