[PR #483] [CLOSED] feat(core): improve stdin-buffer emoji/IME handling with Kitty protocol #1349

Closed
opened 2026-03-14 09:31:59 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/anomalyco/opentui/pull/483
Author: @GreyElaina
Created: 1/7/2026
Status: Closed

Base: main ← Head: core-stdin-emoji-ime


📝 Commits (7)

  • 6dbb9fa feat(core): add grapheme segmentation for proper emoji/CJK handling in StdinBuffer
  • 1608669 feat(core): add Kitty emoji reassembly in KeyHandler
  • b68c742 docs(core): add Unicode spec references (UAX #29, UTS #51) to emoji helpers
  • b0a0d66 fix(core): handle surrogate edge cases and preserve polyfill errors
  • 4f580bb feat(core): use isSingleGrapheme in parse.keypress for proper emoji/CJK detection
  • 51a8920 Merge branch 'main' into core-stdin-emoji-ime
  • 09d008a bun exclusive

📊 Changes

8 files changed (+1011 additions, -124 deletions)


📝 packages/core/src/lib/KeyHandler.test.ts (+157 -0)
📝 packages/core/src/lib/KeyHandler.ts (+165 -11)
➕ packages/core/src/lib/grapheme-segmenter.ts (+53 -0)
📝 packages/core/src/lib/parse.keypress.ts (+3 -2)
📝 packages/core/src/lib/stdin-buffer.test.ts (+95 -2)
📝 packages/core/src/lib/stdin-buffer.ts (+70 -109)
➕ packages/vue/tests/__snapshots__/layout.test.ts.snap (+183 -0)
➕ packages/vue/tests/__snapshots__/textarea.test.ts.snap (+285 -0)

📄 Description

Summary

Add proper grapheme segmentation and Kitty emoji reassembly for correct handling of emoji, CJK characters, and other multi-codepoint sequences.

Changes

StdinBuffer (grapheme segmentation)

  • Add grapheme-segmenter.ts using Intl.Segmenter for Unicode-correct text segmentation
  • Update StdinBuffer to emit complete grapheme clusters for non-escape input
  • Preserve Kitty keyboard protocol sequences unchanged for downstream parsing
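
For reviewers unfamiliar with `Intl.Segmenter`, here is a minimal sketch of grapheme segmentation for non-escape input. `splitGraphemes` is a hypothetical name, and the body of `isSingleGrapheme` is an assumption (the commit list names the function, but its implementation is not shown here); the actual `grapheme-segmenter.ts` may differ.

```typescript
// Sketch only: assumes a runtime with Intl.Segmenter (e.g. Bun, Node 16+).
const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });

/** Split text into extended grapheme clusters (UAX #29). Hypothetical helper. */
function splitGraphemes(text: string): string[] {
  return Array.from(segmenter.segment(text), (s) => s.segment);
}

/** True when text is exactly one grapheme cluster. Assumed implementation. */
function isSingleGrapheme(text: string): boolean {
  if (text.length === 0) return false;
  const first = segmenter.segment(text)[Symbol.iterator]().next();
  // The first cluster must span the entire string.
  return !first.done && first.value.segment.length === text.length;
}

// "a", waving hand + skin tone modifier, and a CJK character: three clusters.
console.log(splitGraphemes("a\u{1F44B}\u{1F3FB}\u4E2D").length);
```

Splitting by UTF-16 code unit or by code point would break the skin-tone emoji into two pieces; segmenting by grapheme keeps it as one unit for downstream consumers.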

KeyHandler (Kitty emoji reassembly)

  • Add emoji codepoint detection helpers (isGraphemeExtender, canStartGraphemeCluster)
  • Buffer Kitty sequences that form multi-codepoint emoji (ZWJ families, flags, skin tones, keycaps)
  • Flush buffer on timeout or when non-emoji input arrives
  • Preserve all raw sequences in KeyEvent.raw field
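
The buffering rule above can be sketched roughly as follows. The helper name `isGraphemeExtender` comes from this PR, but its body here is an assumption covering only the emoji types listed in this description; `isRegionalIndicator`, `continuesCluster`, and `reassemble` are hypothetical names. The timeout flush is left out to keep the sketch synchronous, and the "anything after ZWJ joins" rule is a simplification of UAX #29.

```typescript
const ZWJ = 0x200d;

// Assumed classification; the PR's helper may cover more ranges.
function isGraphemeExtender(cp: number): boolean {
  return (
    cp === ZWJ ||
    cp === 0xfe0f || // variation selector-16
    cp === 0x20e3 || // combining enclosing keycap
    (cp >= 0x1f3fb && cp <= 0x1f3ff) || // skin tone modifiers
    (cp >= 0xe0020 && cp <= 0xe007f) // tag characters (subdivision flags)
  );
}

function isRegionalIndicator(cp: number): boolean {
  return cp >= 0x1f1e6 && cp <= 0x1f1ff;
}

/** Does `next` continue the cluster formed by the pending code points? */
function continuesCluster(pending: number[], next: number): boolean {
  const last = pending[pending.length - 1];
  if (last === undefined) return false; // nothing buffered yet
  if (isGraphemeExtender(next)) return true;
  if (last === ZWJ) return true; // simplification: whatever follows ZWJ joins
  // Flags: a second regional indicator pairs with a single pending one.
  return isRegionalIndicator(next) && pending.length === 1 && isRegionalIndicator(last);
}

/** Group a stream of code points into clusters, flushing on non-continuing input. */
function reassemble(codepoints: number[]): string[] {
  const out: string[] = [];
  let pending: number[] = [];
  for (const cp of codepoints) {
    if (continuesCluster(pending, cp)) {
      pending.push(cp);
      continue;
    }
    if (pending.length > 0) out.push(String.fromCodePoint(...pending));
    pending = [cp];
  }
  if (pending.length > 0) out.push(String.fromCodePoint(...pending));
  return out;
}
```

In the real handler the final flush would presumably be triggered by the timeout rather than by the end of an array, since the next key may never arrive.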

Architecture

stdin → StdinBuffer → KeyHandler.processInput() → parseKeypress() → emit
              ↓                    ↓
     grapheme clusters      emoji reassembly
     (for raw input)        (for Kitty protocol)

Key insight: StdinBuffer handles grapheme segmentation for raw UTF-8 input, while KeyHandler handles emoji reassembly for Kitty protocol sequences. This separation preserves the raw field correctly.
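
To illustrate why the split matters for `raw`, here is a rough sketch of pulling the key code out of Kitty `CSI ... u` sequences and joining several of them into one event. `KittyKey`, `parseKittyCodepoint`, and `combine` are hypothetical names, not this PR's API, and the regex only covers the basic `CSI key-code[:alternates][;modifiers[:event]][;text] u` form.

```typescript
interface KittyKey {
  codepoint: number;
  raw: string;
}

// Basic CSI-u form: ESC [ <key-code>[:alt...][;modifiers[:event]][;text] u
const KITTY_CSI_U = /^\x1b\[(\d+)(?::\d*)*(?:;[\d:]*)*u$/;

/** Extract the primary key code from one Kitty sequence, or null if it is not CSI-u. */
function parseKittyCodepoint(seq: string): KittyKey | null {
  const m = KITTY_CSI_U.exec(seq);
  if (!m) return null;
  const cp = Number(m[1]);
  // Reject values outside the Unicode range so String.fromCodePoint cannot throw.
  if (!Number.isInteger(cp) || cp > 0x10ffff) return null;
  return { codepoint: cp, raw: seq };
}

/** Merge buffered keys into one sequence while keeping every raw escape sequence. */
function combine(keys: KittyKey[]): { sequence: string; raw: string } {
  return {
    sequence: String.fromCodePoint(...keys.map((k) => k.codepoint)),
    raw: keys.map((k) => k.raw).join(""),
  };
}
```

Because Kitty delivers each code point as its own escape sequence, grapheme segmentation on the byte stream cannot see the cluster; reassembly has to happen after parsing, and concatenating the `raw` strings keeps the original input recoverable.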

Supported Emoji Types

  • Basic emoji: 😀
  • ZWJ sequences: 👨‍👩‍👧 (family)
  • Flag emoji: 🇺🇸 🇯🇵
  • Skin tone modifiers: 👋🏻
  • Keycap sequences: #️⃣
  • Subdivision flags: 🏴󠁧󠁢󠁥󠁮󠁧󠁿
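
For reference, a small sketch of the code points behind one example of each type above, checked against `Intl.Segmenter`. The `examples` table and `graphemeCount` are illustrative names, not part of this PR.

```typescript
// Sketch only: assumes a runtime with Intl.Segmenter (e.g. Bun, Node 16+).
const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });

// Code points for one example of each type listed above.
const examples: Record<string, number[]> = {
  basic: [0x1f600],
  zwjFamily: [0x1f468, 0x200d, 0x1f469, 0x200d, 0x1f467],
  flag: [0x1f1fa, 0x1f1f8],
  skinTone: [0x1f44b, 0x1f3fb],
  keycap: [0x23, 0xfe0f, 0x20e3],
  subdivisionFlag: [0x1f3f4, 0xe0067, 0xe0062, 0xe0065, 0xe006e, 0xe0067, 0xe007f],
};

/** Number of grapheme clusters the given code points form when joined. */
function graphemeCount(codepoints: number[]): number {
  const text = String.fromCodePoint(...codepoints);
  return Array.from(segmenter.segment(text)).length;
}

for (const [name, cps] of Object.entries(examples)) {
  console.log(name, cps.length, "code point(s),", graphemeCount(cps), "grapheme(s)");
}
```

Every entry except the basic emoji spans multiple code points, which is why per-code-point key events need reassembly before they reach the application.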

Testing

  • All 113 stdin-buffer tests pass
  • All 60 renderer.input tests pass
  • All 40 KeyHandler tests pass (10 new emoji tests)

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Reference
starred/opentui#1349