[PR #210] [MERGED] feat(kiro): extended thinking support dan fix token counting #303

Closed
opened 2026-02-27 07:18:53 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/justlovemaki/AIClient-2-API/pull/210
Author: @tickernelz
Created: 1/11/2026
Status: Merged
Merged: 1/12/2026
Merged by: @justlovemaki

Base: mainHead: feat/kiro-think-token-fix


📝 Commits (3)

  • bdfb27d feat(kiro): implement extended thinking support with streaming and token estimation
  • 6ff2a9b docs(kiro): restore deleted comments
  • 10e4a48 Merge remote-tracking branch 'aiclient/main' into feat/kiro-think-token-fix

📊 Changes

1 file changed (+452 additions, -83 deletions)

View changed files

📝 src/providers/claude/claude-kiro.js (+452 -83)

📄 Description

Summary

This PR implements extended thinking support for Claude's Kiro API adapter and significantly improves input token estimation accuracy for better context management in AI agent tools.

Key Features

1. Extended Thinking Support

  • Implements Claude's extended thinking capability (PR #197 feature request)
  • Real-time thinking tag parsing with state machine for streaming responses
  • Proper content block ordering: thinking (index 0), text (index 1), tool_use (index 2+)
  • Dynamic block index management to maintain Claude API compatibility
  • Smart buffer management to prevent tag truncation during streaming
  • Support for thinking budget token configuration with validation and clamping

2. Enhanced Token Estimation

  • Adaptive overhead calculation based on tools definition size
  • Intelligent estimation strategy:
    • No tools: 25% overhead + 400 base tokens
    • Small tools definition (<21k tokens): 18% overhead + 400 base tokens
    • Large tools definition (≥21k tokens): 8% overhead + 400 base tokens
  • Comprehensive content type coverage:
    • System prompts
    • Text content
    • Tool results (critical for large context scenarios)
    • Tool use inputs
    • Images (1500 tokens each)
    • Thinking content in message history
  • Minimal gap with actual context usage (0.1-7% for large requests)
  • Real context values still retrieved from API for accuracy

3. Performance Characteristics

  • No performance degradation
  • Message start event sent immediately before content streaming
  • Maintains backward compatibility with existing implementations
  • Proper handling of conversation history in token calculations

Testing

Extensively tested across multiple AI agent tools:

  • Claude Code
  • OpenCode
  • Kilo Code
  • Forge

All tools demonstrate improved context management and accurate token estimation across various workload patterns including:

  • Small requests without tools
  • Large requests with extensive tool definitions
  • Multi-turn conversations with thinking content
  • Requests with large tool results (30k+ tokens)

Technical Implementation

Thinking Support

  • Added KIRO_THINKING constants for tag management
  • Implemented helper functions for tag detection outside quoted strings
  • Created thinking-related methods for budget normalization and prefix generation
  • Updated buildCodewhispererRequest() to inject thinking prefixes and handle thinking blocks
  • Refactored generateContentStream() with proper state machine for real-time parsing
  • Updated buildClaudeResponse() for non-streaming thinking support

Token Estimation

  • Refactored estimateInputTokens() to iterate through all content types
  • Implemented adaptive overhead based on tools definition size
  • Added proper handling for conversation history overhead
  • Removed debug logging for production readiness

Breaking Changes

None. All changes are backward compatible.

  • Addresses PR #197 feature request for extended thinking support
  • Improves token estimation accuracy for auto-compact functionality

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

## 📋 Pull Request Information **Original PR:** https://github.com/justlovemaki/AIClient-2-API/pull/210 **Author:** [@tickernelz](https://github.com/tickernelz) **Created:** 1/11/2026 **Status:** ✅ Merged **Merged:** 1/12/2026 **Merged by:** [@justlovemaki](https://github.com/justlovemaki) **Base:** `main` ← **Head:** `feat/kiro-think-token-fix` --- ### 📝 Commits (3) - [`bdfb27d`](https://github.com/justlovemaki/AIClient-2-API/commit/bdfb27d6d4a3fed8b392999d038aaf2b29f0678f) feat(kiro): implement extended thinking support with streaming and token estimation - [`6ff2a9b`](https://github.com/justlovemaki/AIClient-2-API/commit/6ff2a9b7bdbd54af775c49db59057c40681e2272) docs(kiro): restore deleted comments - [`10e4a48`](https://github.com/justlovemaki/AIClient-2-API/commit/10e4a48f7993d0bd90a536c031215073751b6197) Merge remote-tracking branch 'aiclient/main' into feat/kiro-think-token-fix ### 📊 Changes **1 file changed** (+452 additions, -83 deletions) <details> <summary>View changed files</summary> 📝 `src/providers/claude/claude-kiro.js` (+452 -83) </details> ### 📄 Description ## Summary This PR implements extended thinking support for Claude's Kiro API adapter and significantly improves input token estimation accuracy for better context management in AI agent tools. ## Key Features ### 1. Extended Thinking Support - Implements Claude's extended thinking capability (PR #197 feature request) - Real-time thinking tag parsing with state machine for streaming responses - Proper content block ordering: thinking (index 0), text (index 1), tool_use (index 2+) - Dynamic block index management to maintain Claude API compatibility - Smart buffer management to prevent tag truncation during streaming - Support for thinking budget token configuration with validation and clamping ### 2. Enhanced Token Estimation - Adaptive overhead calculation based on tools definition size - Intelligent estimation strategy: - No tools: 25% overhead + 400 base tokens - Small tools definition (<21k tokens): 18% overhead + 400 base tokens - Large tools definition (≥21k tokens): 8% overhead + 400 base tokens - Comprehensive content type coverage: - System prompts - Text content - Tool results (critical for large context scenarios) - Tool use inputs - Images (1500 tokens each) - Thinking content in message history - Minimal gap with actual context usage (0.1-7% for large requests) - Real context values still retrieved from API for accuracy ### 3. Performance Characteristics - No performance degradation - Message start event sent immediately before content streaming - Maintains backward compatibility with existing implementations - Proper handling of conversation history in token calculations ## Testing Extensively tested across multiple AI agent tools: - Claude Code - OpenCode - Kilo Code - Forge All tools demonstrate improved context management and accurate token estimation across various workload patterns including: - Small requests without tools - Large requests with extensive tool definitions - Multi-turn conversations with thinking content - Requests with large tool results (30k+ tokens) ## Technical Implementation ### Thinking Support - Added `KIRO_THINKING` constants for tag management - Implemented helper functions for tag detection outside quoted strings - Created thinking-related methods for budget normalization and prefix generation - Updated `buildCodewhispererRequest()` to inject thinking prefixes and handle thinking blocks - Refactored `generateContentStream()` with proper state machine for real-time parsing - Updated `buildClaudeResponse()` for non-streaming thinking support ### Token Estimation - Refactored `estimateInputTokens()` to iterate through all content types - Implemented adaptive overhead based on tools definition size - Added proper handling for conversation history overhead - Removed debug logging for production readiness ## Breaking Changes None. All changes are backward compatible. ## Related Issues - Addresses PR #197 feature request for extended thinking support - Improves token estimation accuracy for auto-compact functionality --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
kerem 2026-02-27 07:18:53 +03:00
Sign in to join this conversation.
No labels
pull-request
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/AIClient-2-API-justlovemaki#303
No description provided.