mirror of
https://github.com/anomalyco/opentui.git
synced 2026-04-25 04:55:58 +03:00
[PR #471] [CLOSED] perf(text-buffer-view): stream word wrap for large chunks #559
📋 Pull Request Information
Original PR: https://github.com/anomalyco/opentui/pull/471
Author: @simonklee
Created: 1/4/2026
Status: ❌ Closed
Base: main ← Head: perf/word-wrap
📝 Commits (1)
3341e8f perf(text-buffer-view): stream word wrap for large chunks
📊 Changes
3 files changed (+622 additions, -136 deletions)
📝 packages/core/src/zig/tests/utf8_test.zig (+160 -0)
📝 packages/core/src/zig/text-buffer-view.zig (+311 -115)
📝 packages/core/src/zig/utf8.zig (+151 -21)
📄 Description
Note: I wasn't sure whether to add this as an issue or a PR, so I went with a PR,
but feel free to close it if you want to discuss the approach first.
The gist of this change is to improve word wrapping performance on large
single-line chunks by switching to a streaming approach instead of
precomputing all word boundaries.
before
https://github.com/user-attachments/assets/6e24a3fe-96ab-4d03-b2e6-ecb2c766cbe2
after
https://github.com/user-attachments/assets/6345cc3a-7af5-4ff7-a040-b9809e49c37c
Word wrapping large single-line files (minified JavaScript, continuous
logs) was slow. The old getWrapOffsets() precomputed all word boundary
positions for the entire chunk before wrapping began. Multi-megabyte
files produced arrays with tens of thousands of entries, most of which
went unused, since wrapping only needs boundaries within the current
wrap width.
Added a hybrid strategy based on chunk size. Chunks larger than
64KB now use findWordWrapPosition(), which scans only up to wrap_width
columns per line and returns the last word boundary within that
window. This stops early instead of walking the full chunk. Smaller
chunks keep the cached approach where the upfront cost pays off
through cache locality.
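The streaming approach described above can be sketched as follows. This is a hypothetical Python illustration of the idea, not the actual Zig implementation: `find_word_wrap_position` mirrors the role of `findWordWrapPosition` (scan at most `wrap_width` columns, return the last word boundary inside that window), but the boundary detection here is simplified to ASCII spaces with one column per character.

```python
def find_word_wrap_position(text: str, start: int, wrap_width: int) -> int:
    """Scan at most wrap_width columns from `start` and return the index
    to break at: the last word boundary inside the window, or a hard
    break at the window edge if no boundary was seen."""
    cols = 0
    last_boundary = -1
    i = start
    while i < len(text):
        if cols >= wrap_width:
            # Stop early: break at the last boundary seen, else hard-break here.
            return last_boundary if last_boundary > start else i
        if text[i] == " ":
            last_boundary = i + 1  # break just after the space
        cols += 1  # simplification: one column per character
        i += 1
    return i  # the whole remainder fits on this line


def wrap_stream(text: str, wrap_width: int) -> list[str]:
    """Wrap `text` line by line, never scanning past the current window."""
    lines = []
    pos = 0
    while pos < len(text):
        end = find_word_wrap_position(text, pos, wrap_width)
        lines.append(text[pos:end].rstrip(" "))
        pos = end
    return lines
```

The key property is that each call touches at most `wrap_width` columns, so the per-line cost is bounded by the viewport width rather than the chunk length.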
Note: wrap-break detection now honors width_method in the cached path.
This changes semantics for .wcwidth and .no_zwj (per-codepoint breaks;
ZWJ forces a break), while .unicode behavior is unchanged. This aligns
cached offsets with the streaming path and cursor movement.
The streaming path uses per-codepoint widths without full grapheme
state, so complex emoji or Indic sequences may wrap differently than
cached, but only in chunks over 64KB containing such sequences at wrap
boundaries.
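The size-based dispatch can be summarized with a small sketch. The 64KB threshold comes from the PR text; the function and parameter names here are illustrative, not from the codebase:

```python
STREAM_THRESHOLD = 64 * 1024  # 64KB cutoff described in the PR


def wrap_chunk(chunk: str, wrap_width: int, cached_wrap, streaming_wrap):
    """Pick a wrap strategy by chunk size (hypothetical dispatcher).

    Large chunks stream, scanning only wrap_width columns per line;
    small chunks precompute all boundaries, where the upfront cost is
    repaid by cache locality."""
    if len(chunk) > STREAM_THRESHOLD:
        return streaming_wrap(chunk, wrap_width)
    return cached_wrap(chunk, wrap_width)
```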
Benchmarks:
Baseline 31a5cc2 (main) → Current c688b4a (perf/word-wrap)
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.