[GH-ISSUE #36] resume later when reaching limits with Claude OAuth #8

Closed
opened 2026-02-27 07:19:57 +03:00 by kerem · 1 comment
Owner

Originally created by @resident-ngo on GitHub (Dec 22, 2025).
Original GitHub issue: https://github.com/KeygraphHQ/shannon/issues/36

Claude Pro has aggressive usage limits that are easily hit. Some modules, such as SSRF, seem to detect when a limit has been hit:

⠏ Running analysis...    ✅ Empty checkpoint created (no workspace changes)
    🎭 Assigned authz-vuln → playwright-agent5
[SSRF] Spending cap reached resets 6pm

    🏁 COMPLETED:
    ⏱️  Duration: 0.7s, Cost: $0.0000
✓ Analysis complete! (1 turns, 1.9s)

But then it tries to move forward anyway: all the validations fail and the final report is empty.
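For illustration, here is a minimal sketch of how the reset time in a message like `[SSRF] Spending cap reached resets 6pm` might be turned into a concrete deadline to wait for. This is not Shannon's code: `RESET_RE` and `parse_reset_time` are hypothetical names, the pattern is modeled only on the single log line quoted above, and it assumes the reset time is in the same local time zone as the `now` value passed in.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical pattern, modeled only on the one log line quoted in this issue.
RESET_RE = re.compile(
    r"spending cap reached.*?resets\s+(\d{1,2})(?::(\d{2}))?\s*(am|pm)",
    re.IGNORECASE,
)


def parse_reset_time(message: str, now: datetime) -> Optional[datetime]:
    """Return the next moment the limit resets, or None if this is not a cap message."""
    match = RESET_RE.search(message)
    if match is None:
        return None
    hour = int(match.group(1)) % 12  # 12am -> 0, 12pm -> 12 after the shift below
    if match.group(3).lower() == "pm":
        hour += 12
    minute = int(match.group(2) or 0)
    if minute > 59:
        return None
    reset = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if reset <= now:
        # The stated time has already passed today, so the reset is tomorrow.
        reset += timedelta(days=1)
    return reset
```

With a deadline like this, a runner could stop scheduling further agents instead of moving on to validations that cannot succeed.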

In other cases no deliverables are produced, yet it still tries to execute steps that are doomed to fail:

    🤖 Turn 348 (Pre-recon agent):
    Now I'll synthesize all findings into the comprehensive security report. Based on the agent outputs, I have all the information needed. Let me create the complete CODE_ANALYSIS deliverable:

    🤖 Turn 349 (Pre-recon agent):
    API Error: Claude's response exceeded the 32000 output token maximum. To configure this behavior, set the CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable.
    ⚠️  API Error detected in assistant response: API Error: Claude's response exceeded the 32000 output token maximum. To configure this behavior, set the CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable.

    🏁 COMPLETED:
    ⏱️  Duration: 1714.3s, Cost: $6.3090
  ⚠️ API Error detected in Pre-recon agent - will validate deliverables before failing
✓ Pre-recon analysis complete! (349 turns, 29m 5s)
    🔍 Validating pre-recon agent output
    📋 Using validator for agent: pre-recon
    📂 Source directory: /app/repos/reponame
    ❌ Validation failed: Missing required deliverable files
⚠️ Pre-recon agent completed but output validation failed
⚠️ API Error detected with validation failure - treating as retryable
    🔄 Rolling back workspace for validation failure
    ✅ Rollback completed - removed 1 contaminated changes:
       ?? deliverables/
    📍 Creating checkpoint for Pre-recon agent (attempt 2)
    🧹 Cleaning workspace for Pre-recon agent (retry cleanup)
    ✅ Workspace already clean (no changes to remove)
    ✅ Empty checkpoint created (no workspace changes)
    🎭 Assigned pre-recon → playwright-agent1

    🤖 Turn 1 (Pre-recon agent):
    I'll begin the comprehensive security-focused code analysis of this application. Let me start by creating a task list and then launching the discovery agents.

To reproduce the issue, use a Claude OAuth token with any huge repo, such as an existing WordPress installation with some plugins.

The desired behavior is to pause the work, wait until the limits reset, and then continue, ideally retrying the failed jobs.
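The requested flow could look roughly like the sketch below. It is an assumption-laden illustration, not Shannon's implementation: `JobResult`, `run_with_pauses`, and `max_pauses` are hypothetical names, and each job is assumed to report a `reset_at` time when it stopped because of a usage limit. The clock and sleep functions are injectable so the loop can be exercised without actually waiting.

```python
import time
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Optional


@dataclass
class JobResult:
    ok: bool
    # Set only when the job stopped because a usage limit was hit (hypothetical field).
    reset_at: Optional[datetime] = None


def run_with_pauses(
    jobs: List[Callable[[], JobResult]],
    now: Callable[[], datetime] = datetime.now,
    sleep: Callable[[float], None] = time.sleep,
    max_pauses: int = 5,
) -> List[bool]:
    """Run jobs in order; on a usage limit, wait for the reset and retry the same job."""
    outcomes: List[bool] = []
    pauses = 0
    for index, job in enumerate(jobs):
        while True:
            result = job()
            if result.reset_at is None:
                # Finished (successfully or not) for a reason other than a usage limit.
                outcomes.append(result.ok)
                break
            if pauses >= max_pauses:
                # Give up: mark this job and all remaining ones failed without running them.
                return outcomes + [False] * (len(jobs) - index)
            pauses += 1
            sleep(max(0.0, (result.reset_at - now()).total_seconds()))
    return outcomes
```

Capping the number of pauses keeps a misreported reset time from turning into an endless wait, and returning early avoids launching later steps that would only fail.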

P.S. Awesome product, thank you so much for making it available for free to small security initiatives and individuals! Many organizations have a Claude Pro subscription but can't afford to pay $50 per run. On the other hand, they are very patient, so it is OK for them to wait 12-24 hours for the analysis and tests to finish.

kerem closed this issue 2026-02-27 07:19:57 +03:00

@keygraphVarun commented on GitHub (Jan 18, 2026):

Sorry for the late reply! And thanks for the kind words, that's what makes OSS work worthwhile.
The "detect rate limit → wait for reset → resume" flow you're describing makes a lot of sense, especially for Claude Pro users who are happy to wait for limits to reset rather than burn through failed retries.
Quick update: we recently landed a big refactor using Temporal for durable execution, which gives us the foundation to support something like this. Can't promise if or when we'll build it, but I've added it to the backlog.
(Cc: @ajmallesh)
Thanks again for the detailed write-up!
