[PR #203] [CLOSED] feat(kiro): Add extended thinking support and enhanced token estimation #301

Closed
opened 2026-02-27 07:18:52 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/justlovemaki/AIClient-2-API/pull/203
Author: @tickernelz
Created: 1/9/2026
Status: Closed

Base: mainHead: feat/kiro-thinking-support-v2


📝 Commits (7)

  • fe38820 Merge branch 'justlovemaki:main' into main
  • 1aa0f37 Merge remote-tracking branch 'aiclient/main'
  • fc01608 Revert "refactor(kiro): 优化流式响应立即发送message_start,移除contextUsagePercentage等待逻辑"
  • 2d4690f Merge branch 'fix-400-kiro'
  • c53e38f feat(kiro): add extended thinking support with streaming content block handling
  • b6fb472 refactor(kiro): improve input token estimation with detailed breakdown and dynamic updates
  • 692fa49 Merge remote-tracking branch 'aiclient/main' into feat/kiro-thinking-support-v2

📊 Changes

1 file changed (+441 additions, -78 deletions)

View changed files

📝 src/claude/claude-kiro.js (+441 -78)

📄 Description

Summary

This PR implements extended thinking support for Claude's Kiro API adapter and significantly improves input token estimation accuracy for better context management in AI agent tools.

Key Features

1. Extended Thinking Support

  • Implements Claude's extended thinking capability (PR #197 feature request)
  • Real-time thinking tag parsing with state machine for streaming responses
  • Proper content block ordering: thinking (index 0), text (index 1), tool_use (index 2+)
  • Dynamic block index management to maintain Claude API compatibility
  • Smart buffer management to prevent tag truncation during streaming
  • Support for thinking budget token configuration with validation and clamping

2. Enhanced Token Estimation

  • Adaptive overhead calculation based on tools definition size
  • Intelligent estimation strategy:
    • No tools: 25% overhead + 400 base tokens
    • Small tools definition (<21k tokens): 18% overhead + 400 base tokens
    • Large tools definition (≥21k tokens): 8% overhead + 400 base tokens
  • Comprehensive content type coverage:
    • System prompts
    • Text content
    • Tool results (critical for large context scenarios)
    • Tool use inputs
    • Images (1500 tokens each)
    • Thinking content in message history
  • Minimal gap with actual context usage (0.1-7% for large requests)
  • Real context values still retrieved from API for accuracy

3. Performance Characteristics

  • No performance degradation
  • Message start event sent immediately before content streaming
  • Maintains backward compatibility with existing implementations
  • Proper handling of conversation history in token calculations

Testing

Extensively tested across multiple AI agent tools:

  • Claude Code
  • OpenCode
  • Kilo Code
  • Forge

All tools demonstrate improved context management and accurate token estimation across various workload patterns including:

  • Small requests without tools
  • Large requests with extensive tool definitions
  • Multi-turn conversations with thinking content
  • Requests with large tool results (30k+ tokens)

Technical Implementation

Thinking Support

  • Added KIRO_THINKING constants for tag management
  • Implemented helper functions for tag detection outside quoted strings
  • Created thinking-related methods for budget normalization and prefix generation
  • Updated buildCodewhispererRequest() to inject thinking prefixes and handle thinking blocks
  • Refactored generateContentStream() with proper state machine for real-time parsing
  • Updated buildClaudeResponse() for non-streaming thinking support

Token Estimation

  • Refactored estimateInputTokens() to iterate through all content types
  • Implemented adaptive overhead based on tools definition size
  • Added proper handling for conversation history overhead
  • Removed debug logging for production readiness

Files Changed

  • src/claude/claude-kiro.js: +441 lines, -78 lines

Breaking Changes

None. All changes are backward compatible.

  • Addresses PR #197 feature request for extended thinking support
  • Improves token estimation accuracy for auto-compact functionality

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

## 📋 Pull Request Information **Original PR:** https://github.com/justlovemaki/AIClient-2-API/pull/203 **Author:** [@tickernelz](https://github.com/tickernelz) **Created:** 1/9/2026 **Status:** ❌ Closed **Base:** `main` ← **Head:** `feat/kiro-thinking-support-v2` --- ### 📝 Commits (7) - [`fe38820`](https://github.com/justlovemaki/AIClient-2-API/commit/fe38820f72b5f36f964131cc92b460c2b2545bf2) Merge branch 'justlovemaki:main' into main - [`1aa0f37`](https://github.com/justlovemaki/AIClient-2-API/commit/1aa0f371b27b0ed3ae1497afb92520073bcbd7df) Merge remote-tracking branch 'aiclient/main' - [`fc01608`](https://github.com/justlovemaki/AIClient-2-API/commit/fc01608a2e98fd2983c526f7101617cd433e6256) Revert "refactor(kiro): 优化流式响应立即发送message_start,移除contextUsagePercentage等待逻辑" - [`2d4690f`](https://github.com/justlovemaki/AIClient-2-API/commit/2d4690fbe19cf36383afce812ef341ddec867a2b) Merge branch 'fix-400-kiro' - [`c53e38f`](https://github.com/justlovemaki/AIClient-2-API/commit/c53e38ff06308debdd4e7608c7e6bffd116f8891) feat(kiro): add extended thinking support with streaming content block handling - [`b6fb472`](https://github.com/justlovemaki/AIClient-2-API/commit/b6fb472971ad56c86412be1cb80e9bf6b890246d) refactor(kiro): improve input token estimation with detailed breakdown and dynamic updates - [`692fa49`](https://github.com/justlovemaki/AIClient-2-API/commit/692fa491a58db3197473dfaf50769aa46477036c) Merge remote-tracking branch 'aiclient/main' into feat/kiro-thinking-support-v2 ### 📊 Changes **1 file changed** (+441 additions, -78 deletions) <details> <summary>View changed files</summary> 📝 `src/claude/claude-kiro.js` (+441 -78) </details> ### 📄 Description ## Summary This PR implements extended thinking support for Claude's Kiro API adapter and significantly improves input token estimation accuracy for better context management in AI agent tools. ## Key Features ### 1. Extended Thinking Support - Implements Claude's extended thinking capability (PR #197 feature request) - Real-time thinking tag parsing with state machine for streaming responses - Proper content block ordering: thinking (index 0), text (index 1), tool_use (index 2+) - Dynamic block index management to maintain Claude API compatibility - Smart buffer management to prevent tag truncation during streaming - Support for thinking budget token configuration with validation and clamping ### 2. Enhanced Token Estimation - Adaptive overhead calculation based on tools definition size - Intelligent estimation strategy: - No tools: 25% overhead + 400 base tokens - Small tools definition (<21k tokens): 18% overhead + 400 base tokens - Large tools definition (≥21k tokens): 8% overhead + 400 base tokens - Comprehensive content type coverage: - System prompts - Text content - Tool results (critical for large context scenarios) - Tool use inputs - Images (1500 tokens each) - Thinking content in message history - Minimal gap with actual context usage (0.1-7% for large requests) - Real context values still retrieved from API for accuracy ### 3. Performance Characteristics - No performance degradation - Message start event sent immediately before content streaming - Maintains backward compatibility with existing implementations - Proper handling of conversation history in token calculations ## Testing Extensively tested across multiple AI agent tools: - Claude Code - OpenCode - Kilo Code - Forge All tools demonstrate improved context management and accurate token estimation across various workload patterns including: - Small requests without tools - Large requests with extensive tool definitions - Multi-turn conversations with thinking content - Requests with large tool results (30k+ tokens) ## Technical Implementation ### Thinking Support - Added `KIRO_THINKING` constants for tag management - Implemented helper functions for tag detection outside quoted strings - Created thinking-related methods for budget normalization and prefix generation - Updated `buildCodewhispererRequest()` to inject thinking prefixes and handle thinking blocks - Refactored `generateContentStream()` with proper state machine for real-time parsing - Updated `buildClaudeResponse()` for non-streaming thinking support ### Token Estimation - Refactored `estimateInputTokens()` to iterate through all content types - Implemented adaptive overhead based on tools definition size - Added proper handling for conversation history overhead - Removed debug logging for production readiness ## Files Changed - `src/claude/claude-kiro.js`: +441 lines, -78 lines ## Breaking Changes None. All changes are backward compatible. ## Related Issues - Addresses PR #197 feature request for extended thinking support - Improves token estimation accuracy for auto-compact functionality --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
kerem 2026-02-27 07:18:52 +03:00
Sign in to join this conversation.
No labels
pull-request
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/AIClient-2-API-justlovemaki#301
No description provided.