mirror of
https://github.com/anomalyco/opentui.git
synced 2026-04-25 13:06:00 +03:00
[PR #506] [MERGED] perf(utf8): ASCII wrapping via strict printable-only invariant #1362
📋 Pull Request Information
Original PR: https://github.com/anomalyco/opentui/pull/506
Author: @simonklee
Created: 1/9/2026
Status: ✅ Merged
Merged: 1/13/2026
Merged by: @kommander
Base: main ← Head: perf-utf8-ascii-invariant
📝 Commits (2)
715d195 perf(utf8): ASCII wrapping via strict printable-only invariant
b187042 Merge branch 'main' into perf-utf8-ascii-invariant
📊 Changes
5 files changed (+46 additions, -134 deletions)
📝 packages/core/src/zig/bench/utf8_bench.zig (+1 -1)
📝 packages/core/src/zig/tests/utf8_test.zig (+25 -25)
📝 packages/core/src/zig/tests/utf8_wcwidth_test.zig (+1 -1)
📝 packages/core/src/zig/text-buffer-segment.zig (+1 -1)
📝 packages/core/src/zig/utf8.zig (+18 -106)
📄 Description
The `isAsciiOnly` check isn't exactly what the name implies. It strictly enforces printable ASCII (32-126), explicitly excluding control characters like tabs (`\t`) and newlines.

This provides a stronger guarantee than typical 7-bit ASCII checks: if `isAsciiOnly` is true, every byte is exactly 1 column wide.

However, we ignored this and were still running O(N) width loops even on the fast path. This patch deletes those loops entirely:
- If it's guaranteed printable ASCII, the display width is identical to `text.len`. We don't need to iterate N bytes just to add 1 N times.
- Since width maps 1:1 to byte index, the wrap position is simply `min(text.len, max_width)`. We don't need to scan the string to find where it overflows.

The obvious risk here is tabs (byte 9), which are strictly ASCII but variable width.
But `isAsciiOnly` checks `val >= 32`, so it returns false for `\t`. This forces tabbed content into the slow Unicode path where `tab_width` is handled properly.

I also considered whether FFI consumers could pass `isAsciiOnly=true` for strings with tabs or newlines. But the public API doesn't expose `isAsciiOnly` directly; it derives it via `utf8.isAsciiOnly()`, which returns false for empty strings and control characters, so this optimization is safe and transparent to external users.

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
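
The invariant in the description can be illustrated with a minimal sketch. It is written in Rust rather than the project's Zig, and every name in it (`is_printable_ascii_only`, `fast_width`, `fast_wrap_pos`, `looped_width`) is hypothetical and not taken from `utf8.zig`. It assumes only what the description states: the check accepts non-empty input whose bytes all fall in 32-126, and anything else must go to the slow Unicode path (represented here by `None`).

```rust
/// Hypothetical stand-in for the strict check described above: true only when
/// the input is non-empty and every byte is printable ASCII (32..=126).
/// Tabs (9), newlines (10), DEL (127) and all non-ASCII bytes fail the check.
fn is_printable_ascii_only(text: &[u8]) -> bool {
    !text.is_empty() && text.iter().all(|&b| (32..=126).contains(&b))
}

/// Fast-path display width: one column per byte, so the width is the length.
/// Returns None when the input must be measured by the slow Unicode path.
fn fast_width(text: &[u8]) -> Option<usize> {
    if is_printable_ascii_only(text) {
        Some(text.len())
    } else {
        None
    }
}

/// Fast-path wrap position: width maps 1:1 to byte index, so the wrap
/// offset is min(text.len, max_width) with no scan of the bytes.
/// Returns None when the input must be wrapped by the slow Unicode path.
fn fast_wrap_pos(text: &[u8], max_width: usize) -> Option<usize> {
    if is_printable_ascii_only(text) {
        Some(text.len().min(max_width))
    } else {
        None
    }
}

/// The kind of O(N) loop the patch removes: adding 1 for each byte.
/// For printable ASCII it always equals text.len().
fn looped_width(text: &[u8]) -> usize {
    let mut width = 0;
    for _ in text {
        width += 1;
    }
    width
}
```

Under these assumptions, `fast_wrap_pos(b"hello world", 5)` gives `Some(5)` and `fast_wrap_pos(b"hi", 5)` gives `Some(2)`, while `fast_wrap_pos(b"a\tb", 5)` gives `None` because the tab fails the strict check and the text would be handed to the path that knows about `tab_width`.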