[GH-ISSUE #82] Copilot hooks display garbled characters. #46

Open
opened 2026-03-03 18:50:15 +03:00 by kerem · 7 comments
Owner

Originally created by @Hexiaopi on GitHub (Feb 25, 2026).
Original GitHub issue: https://github.com/OthmanAdi/planning-with-files/issues/82

<img width="524" height="433" alt="Image" src="https://github.com/user-attachments/assets/370da1c4-ec2a-42d7-b896-a77845714c8c" />

@OthmanAdi commented on GitHub (Feb 25, 2026):

im on it


@OthmanAdi commented on GitHub (Feb 25, 2026):

Found it — and it was actually two separate issues, not one.

On Windows (your case): PowerShell 5.x defaults to UTF-16LE for stdout. Copilot reads that output as UTF-8, and the byte mismatch is exactly what produces those diamond characters (◆). Fixed in all four .ps1 scripts by setting $OutputEncoding and [Console]::OutputEncoding to UTF-8 before any output.
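The mismatch is easy to reproduce outside PowerShell. A minimal Python sketch (illustrative only, not the hook code) shows how UTF-16LE bytes decoded as UTF-8 turn non-ASCII characters into U+FFFD replacement characters, which most terminals render as ◆ or �:

```python
# Encode a string the way PowerShell 5.x writes stdout (UTF-16LE),
# then decode it the way Copilot reads it (UTF-8).
text = "任务 ✅"  # hypothetical non-ASCII content from task_plan.md
utf16_bytes = text.encode("utf-16-le")
garbled = utf16_bytes.decode("utf-8", errors="replace")

# Invalid byte sequences become U+FFFD replacement characters,
# the "diamond" characters visible in the screenshot.
print(garbled)
assert "\ufffd" in garbled
```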

On Linux/macOS: session-start.sh and pre-tool-use.sh were using Python's json.dumps() with ensure_ascii=True (the default), which converts every non-ASCII character in task_plan.md — emojis, CJK characters, accented letters — into raw \uXXXX sequences. Fixed with ensure_ascii=False.
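The `ensure_ascii` behavior is directly observable in a Python REPL; this sketch (with made-up plan content, not the hook code itself) demonstrates the default versus the fix:

```python
import json

plan_line = {"status": "✅ 完成"}  # hypothetical task_plan.md content

# Default (ensure_ascii=True): every non-ASCII character
# becomes a \uXXXX escape sequence.
print(json.dumps(plan_line))                      # {"status": "\u2705 \u5b8c\u6210"}

# With ensure_ascii=False the characters pass through untouched.
print(json.dumps(plan_line, ensure_ascii=False))  # {"status": "✅ 完成"}
```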

Both fixes are in v2.16.1 — just released.

@Hexiaopi whenever you get a chance, let me know if this resolves it on your end. Keeping this open until you confirm. And thank you for the clear screenshot — it made the diagnosis much faster. You're credited as a contributor in the release.


@OthmanAdi commented on GitHub (Feb 25, 2026):


Found it: it's actually two separate issues, not one.

Windows (your case): PowerShell 5.x defaults to UTF-16LE for output. Copilot reads that output as UTF-8, and the byte mismatch is exactly what produces those garbled diamond characters (◆). Fixed in all four .ps1 scripts by setting $OutputEncoding and [Console]::OutputEncoding to UTF-8 before any output.

Linux/macOS: session-start.sh and pre-tool-use.sh called Python's json.dumps() with the default ensure_ascii=True, which converts every non-ASCII character in task_plan.md (emoji, Chinese characters, accented letters) into raw \uXXXX escape sequences. Fixed with ensure_ascii=False.

Both fixes are in v2.16.1, just released.

@Hexiaopi when you get a chance, please confirm whether this resolves it on your end. I'll keep this issue open until you confirm. Thanks for the clear screenshot: it made the diagnosis much faster. You're credited as a contributor in this release.


@Hexiaopi commented on GitHub (Feb 26, 2026):

It seems this problem still exists.


@OthmanAdi commented on GitHub (Feb 26, 2026):

Thank you for confirming @Hexiaopi. Found the remaining issue.

Root cause of what's still broken:

The v2.16.1 fix addressed the output pipe encoding — setting $OutputEncoding and [Console]::OutputEncoding to UTF-8. That part is correct. But it missed the read side entirely.

Every Get-Content call in the four PS1 scripts has no -Encoding parameter:

```powershell
$Context = (Get-Content $PlanFile -TotalCount 30 -ErrorAction SilentlyContinue) -join "`n"
```

In PowerShell 5.x, Get-Content with no encoding parameter reads using the system default ANSI code page (Windows-1252 in most Western locales). If task_plan.md is UTF-8 without BOM (which it is — every modern tool writes UTF-8), any non-ASCII character in the file — emoji, Chinese characters, accented letters, any Unicode outside the ASCII range — gets read with the wrong encoding and the corruption happens before it ever reaches the output pipe. Fixing the pipe encoding does nothing if the string is already garbage when it enters ConvertTo-Json.
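This read-side corruption is classic UTF-8-as-ANSI mojibake. A small Python sketch (assuming a Windows-1252 locale; not the hook code) shows what `Get-Content` without `-Encoding` effectively does to a UTF-8 file:

```python
# task_plan.md on disk: UTF-8 bytes, no BOM.
utf8_bytes = "✅".encode("utf-8")  # b'\xe2\x9c\x85'

# Get-Content without -Encoding in PowerShell 5.x reads those bytes
# through the ANSI code page (Windows-1252 in most Western locales),
# mapping each byte to an unrelated character.
misread = utf8_bytes.decode("cp1252")
print(misread)  # âœ…
```

Once that string reaches ConvertTo-Json, the damage is already done; fixing the stdout encoding cannot undo it.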

Three scripts affected: pre-tool-use.ps1, session-start.ps1, agent-stop.ps1. (post-tool-use has no Get-Content.)

There is also a secondary issue: [System.Text.Encoding]::UTF8 in .NET returns UTF-8 with BOM. Setting $OutputEncoding to this means PowerShell may prepend 0xEF 0xBB 0xBF to stdout, which breaks JSON parsers that do not handle the UTF-8 BOM preamble. The fix is [System.Text.UTF8Encoding]::new($false) — UTF-8 without BOM, explicitly.
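The BOM failure is equally reproducible. In this Python sketch (illustrative; not the hook code), a strict JSON parser rejects input that still carries the U+FEFF preamble as text:

```python
import json

payload = '{"ok": true}'
json.loads(payload)  # parses fine

# Simulate stdout written through a BOM-emitting UTF-8 encoder:
# the parser sees U+FEFF before the first token and rejects it.
try:
    json.loads("\ufeff" + payload)
    bom_accepted = True
except json.JSONDecodeError:
    bom_accepted = False

print(bom_accepted)  # False
```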

Fix being applied now:

All four PS1 scripts:

  • [System.Text.Encoding]::UTF8 → [System.Text.UTF8Encoding]::new($false) (removes BOM risk)
  • All Get-Content calls get -Encoding UTF8 (reads files correctly)

Shipping in v2.18.1 shortly. Will confirm here when live.


@OthmanAdi commented on GitHub (Feb 26, 2026):

Fixed. v2.18.1 is live.

The root cause was on the read side, not the write side.

Every Get-Content call in the PS1 scripts had no -Encoding parameter. In PowerShell 5.x, Get-Content without an explicit encoding reads files using the system ANSI code page — Windows-1252 in most Western locales. Any non-ASCII character in task_plan.md or SKILL.md was being corrupted at the moment it was read, before it ever reached the output pipe. The v2.16.1 fix was correct — fixing the pipe encoding was right — but the string was already garbled by the time ConvertTo-Json got hold of it. Fixing the output side of a corrupted string does nothing.

Three scripts affected: pre-tool-use.ps1, session-start.ps1, agent-stop.ps1. All Get-Content calls now have -Encoding UTF8.

I also replaced [System.Text.Encoding]::UTF8 with [System.Text.UTF8Encoding]::new($false) across all four PS1 scripts. [System.Text.Encoding]::UTF8 in .NET is UTF-8 with BOM. The $false constructor parameter sets encoderShouldEmitUTF8Identifier = false — UTF-8 without BOM — so JSON parsers receive clean output without a stray 0xEF 0xBB 0xBF preamble.

Bash scripts were already correct from v2.16.1.

To update: pull the latest hook scripts from the repo and replace your existing .github/hooks/scripts/ folder. The planning-with-files.json config is unchanged.

Release: https://github.com/OthmanAdi/planning-with-files/releases/tag/v2.18.1

Thank you for confirming it was still broken. Without that confirmation this would have stayed unfixed.

Ahmad


@OthmanAdi commented on GitHub (Feb 26, 2026):

Please let me know if it's still happening.
