starred/AIClient-2-API-justlovemaki

Fork 0

mirror of https://github.com/justlovemaki/AIClient-2-API.git synced 2026-04-26 09:55:54 +03:00

[PR #203] [CLOSED] feat(kiro): Add extended thinking support and enhanced token estimation #301

New issue

Closed

opened 2026-02-27 07:18:52 +03:00 by kerem · 0 comments

kerem commented

2026-02-27 07:18:52 +03:00

Owner

📋 Pull Request Information

Original PR: https://github.com/justlovemaki/AIClient-2-API/pull/203
Author: @tickernelz
Created: 1/9/2026
Status: ❌ Closed

Base: main ← Head: feat/kiro-thinking-support-v2

📝 Commits (7)

fe38820 Merge branch 'justlovemaki:main' into main
1aa0f37 Merge remote-tracking branch 'aiclient/main'
fc01608 Revert "refactor(kiro): 优化流式响应立即发送message_start，移除contextUsagePercentage等待逻辑"
2d4690f Merge branch 'fix-400-kiro'
c53e38f feat(kiro): add extended thinking support with streaming content block handling
b6fb472 refactor(kiro): improve input token estimation with detailed breakdown and dynamic updates
692fa49 Merge remote-tracking branch 'aiclient/main' into feat/kiro-thinking-support-v2

📊 Changes

1 file changed (+441 additions, -78 deletions)

View changed files

📝 src/claude/claude-kiro.js (+441 -78)

📄 Description

Summary

This PR implements extended thinking support for Claude's Kiro API adapter and significantly improves input token estimation accuracy for better context management in AI agent tools.

Key Features

1. Extended Thinking Support

Implements Claude's extended thinking capability (PR #197 feature request)
Real-time thinking tag parsing with state machine for streaming responses
Proper content block ordering: thinking (index 0), text (index 1), tool_use (index 2+)
Dynamic block index management to maintain Claude API compatibility
Smart buffer management to prevent tag truncation during streaming
Support for thinking budget token configuration with validation and clamping

2. Enhanced Token Estimation

Adaptive overhead calculation based on tools definition size
Intelligent estimation strategy:
- No tools: 25% overhead + 400 base tokens
- Small tools definition (<21k tokens): 18% overhead + 400 base tokens
- Large tools definition (≥21k tokens): 8% overhead + 400 base tokens
Comprehensive content type coverage:
- System prompts
- Text content
- Tool results (critical for large context scenarios)
- Tool use inputs
- Images (1500 tokens each)
- Thinking content in message history
Minimal gap with actual context usage (0.1-7% for large requests)
Real context values still retrieved from API for accuracy

3. Performance Characteristics

No performance degradation
Message start event sent immediately before content streaming
Maintains backward compatibility with existing implementations
Proper handling of conversation history in token calculations

Testing

Extensively tested across multiple AI agent tools:

Claude Code
OpenCode
Kilo Code
Forge

All tools demonstrate improved context management and accurate token estimation across various workload patterns including:

Small requests without tools
Large requests with extensive tool definitions
Multi-turn conversations with thinking content
Requests with large tool results (30k+ tokens)

Technical Implementation

Thinking Support

Added KIRO_THINKING constants for tag management
Implemented helper functions for tag detection outside quoted strings
Created thinking-related methods for budget normalization and prefix generation
Updated buildCodewhispererRequest() to inject thinking prefixes and handle thinking blocks
Refactored generateContentStream() with proper state machine for real-time parsing
Updated buildClaudeResponse() for non-streaming thinking support

Token Estimation

Refactored estimateInputTokens() to iterate through all content types
Implemented adaptive overhead based on tools definition size
Added proper handling for conversation history overhead
Removed debug logging for production readiness

Files Changed

src/claude/claude-kiro.js: +441 lines, -78 lines

Breaking Changes

None. All changes are backward compatible.

Addresses PR #197 feature request for extended thinking support
Improves token estimation accuracy for auto-compact functionality

_{🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.}

## 📋 Pull Request Information **Original PR:** https://github.com/justlovemaki/AIClient-2-API/pull/203 **Author:** [@tickernelz](https://github.com/tickernelz) **Created:** 1/9/2026 **Status:** ❌ Closed **Base:** `main` ← **Head:** `feat/kiro-thinking-support-v2` --- ### 📝 Commits (7) - [`fe38820`](https://github.com/justlovemaki/AIClient-2-API/commit/fe38820f72b5f36f964131cc92b460c2b2545bf2) Merge branch 'justlovemaki:main' into main - [`1aa0f37`](https://github.com/justlovemaki/AIClient-2-API/commit/1aa0f371b27b0ed3ae1497afb92520073bcbd7df) Merge remote-tracking branch 'aiclient/main' - [`fc01608`](https://github.com/justlovemaki/AIClient-2-API/commit/fc01608a2e98fd2983c526f7101617cd433e6256) Revert "refactor(kiro): 优化流式响应立即发送message_start，移除contextUsagePercentage等待逻辑" - [`2d4690f`](https://github.com/justlovemaki/AIClient-2-API/commit/2d4690fbe19cf36383afce812ef341ddec867a2b) Merge branch 'fix-400-kiro' - [`c53e38f`](https://github.com/justlovemaki/AIClient-2-API/commit/c53e38ff06308debdd4e7608c7e6bffd116f8891) feat(kiro): add extended thinking support with streaming content block handling - [`b6fb472`](https://github.com/justlovemaki/AIClient-2-API/commit/b6fb472971ad56c86412be1cb80e9bf6b890246d) refactor(kiro): improve input token estimation with detailed breakdown and dynamic updates - [`692fa49`](https://github.com/justlovemaki/AIClient-2-API/commit/692fa491a58db3197473dfaf50769aa46477036c) Merge remote-tracking branch 'aiclient/main' into feat/kiro-thinking-support-v2 ### 📊 Changes **1 file changed** (+441 additions, -78 deletions) <details> <summary>View changed files</summary> 📝 `src/claude/claude-kiro.js` (+441 -78) </details> ### 📄 Description ## Summary This PR implements extended thinking support for Claude's Kiro API adapter and significantly improves input token estimation accuracy for better context management in AI agent tools. ## Key Features ### 1. Extended Thinking Support - Implements Claude's extended thinking capability (PR #197 feature request) - Real-time thinking tag parsing with state machine for streaming responses - Proper content block ordering: thinking (index 0), text (index 1), tool_use (index 2+) - Dynamic block index management to maintain Claude API compatibility - Smart buffer management to prevent tag truncation during streaming - Support for thinking budget token configuration with validation and clamping ### 2. Enhanced Token Estimation - Adaptive overhead calculation based on tools definition size - Intelligent estimation strategy: - No tools: 25% overhead + 400 base tokens - Small tools definition (<21k tokens): 18% overhead + 400 base tokens - Large tools definition (≥21k tokens): 8% overhead + 400 base tokens - Comprehensive content type coverage: - System prompts - Text content - Tool results (critical for large context scenarios) - Tool use inputs - Images (1500 tokens each) - Thinking content in message history - Minimal gap with actual context usage (0.1-7% for large requests) - Real context values still retrieved from API for accuracy ### 3. Performance Characteristics - No performance degradation - Message start event sent immediately before content streaming - Maintains backward compatibility with existing implementations - Proper handling of conversation history in token calculations ## Testing Extensively tested across multiple AI agent tools: - Claude Code - OpenCode - Kilo Code - Forge All tools demonstrate improved context management and accurate token estimation across various workload patterns including: - Small requests without tools - Large requests with extensive tool definitions - Multi-turn conversations with thinking content - Requests with large tool results (30k+ tokens) ## Technical Implementation ### Thinking Support - Added `KIRO_THINKING` constants for tag management - Implemented helper functions for tag detection outside quoted strings - Created thinking-related methods for budget normalization and prefix generation - Updated `buildCodewhispererRequest()` to inject thinking prefixes and handle thinking blocks - Refactored `generateContentStream()` with proper state machine for real-time parsing - Updated `buildClaudeResponse()` for non-streaming thinking support ### Token Estimation - Refactored `estimateInputTokens()` to iterate through all content types - Implemented adaptive overhead based on tools definition size - Added proper handling for conversation history overhead - Removed debug logging for production readiness ## Files Changed - `src/claude/claude-kiro.js`: +441 lines, -78 lines ## Breaking Changes None. All changes are backward compatible. ## Related Issues - Addresses PR #197 feature request for extended thinking support - Improves token estimation accuracy for auto-compact functionality --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>

kerem

2026-02-27 07:18:52 +03:00

closed this issue
added the
pull-request
label

No labels

pull-request

No milestone

No project

No assignees

1 participant

Notifications

Due date

The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference

starred/AIClient-2-API-justlovemaki#301

No description provided.

Rows
Columns