fix(shell): use PowerShell EncodedCommand for reliable UTF-8 output #31925
senguangd wants to merge 1 commit into
fix(shell): use PowerShell EncodedCommand for reliable UTF-8 output

On Windows, PowerShell commands were passed via the `-Command` flag, causing encoding corruption when the active console code page is not UTF-8 (e.g. GBK on zh-CN systems, Shift-JIS on ja-JP systems). The UTF-8 preamble added by previous attempts runs too late because `-Command` parses the string in the current code page *before* execution.

This change switches to `-EncodedCommand` with Base64(UTF-16LE) encoding, which guarantees the command string survives transport intact regardless of the console code page. The UTF-8 preamble (`[Console]::InputEncoding`, `[Console]::OutputEncoding`, `$OutputEncoding`) is applied before the user command, ensuring both input and output use UTF-8.

Applies to both the core bash tool and the opencode shell tool.

Closes anomalyco#23636, anomalyco#31187, anomalyco#30205, anomalyco#31830, anomalyco#26882
Supersedes anomalyco#31346, anomalyco#31297, anomalyco#31658 (which all use `-Command`)

Co-Authored-By: Claude <noreply@anthropic.com>
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds PowerShell-specific handling on Windows to improve Unicode/UTF-8 behavior and reduce quoting/escaping issues when running shell commands.
Changes:
- Run PowerShell commands via `-EncodedCommand` (UTF-16LE base64) instead of `-Command`.
- Prefix PowerShell scripts with a UTF-8 encoding preamble to normalize input/output encoding.
- Centralize shell command construction in `packages/core` via `makeShellCommand`.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| packages/opencode/src/tool/shell.ts | Switches Windows PowerShell execution to -EncodedCommand and adds UTF-8 preamble helpers. |
| packages/core/src/tool/bash.ts | Adds PowerShell detection + -EncodedCommand path and routes execution through makeShellCommand. |
    const isPowerShell = (shell: string) => {
      const name = path.basename(shell.trim().replace(/^["']|["']$/g, "")).toLowerCase()
      return POWERSHELL_SHELLS.has(name)
    }
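As a rough usage sketch of the detection logic above: the contents of `POWERSHELL_SHELLS` are not shown in this excerpt, so the set below is an assumption, and `path.win32.basename` is swapped in for `path.basename` only so that the example splits on backslashes when run on a non-Windows host.

```typescript
import * as path from "node:path"

// Assumed contents: the PR excerpt does not show how POWERSHELL_SHELLS is defined.
const POWERSHELL_SHELLS = new Set(["powershell", "powershell.exe", "pwsh", "pwsh.exe"])

// Same shape as the snippet above, but using path.win32.basename so that
// Windows-style paths are handled regardless of the host platform.
const isPowerShell = (shell: string) => {
  // Strip one leading and one trailing quote, then compare the file name case-insensitively.
  const name = path.win32.basename(shell.trim().replace(/^["']|["']$/g, "")).toLowerCase()
  return POWERSHELL_SHELLS.has(name)
}

console.log(isPowerShell('"C:\\Program Files\\PowerShell\\7\\pwsh.exe"')) // true
console.log(isPowerShell("C:\\Windows\\System32\\cmd.exe")) // false
console.log(isPowerShell("/usr/bin/bash")) // false
```

Quoted paths are handled because the surrounding quotes are removed before the base name is taken.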
    function encodePowerShellCommand(command: string) {
      return Buffer.from(command, "utf16le").toString("base64")
    }
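To make the Base64(UTF-16LE) encoding concrete, here is a small worked example. The `decodePowerShellCommand` helper is hypothetical and exists only to check the round trip; it is not part of the PR.

```typescript
function encodePowerShellCommand(command: string) {
  // Each UTF-16 code unit becomes two little-endian bytes, then the bytes are Base64-encoded.
  return Buffer.from(command, "utf16le").toString("base64")
}

// Hypothetical inverse, used here only to verify the round trip.
function decodePowerShellCommand(payload: string) {
  return Buffer.from(payload, "base64").toString("utf16le")
}

// "dir" is the bytes 64 00 69 00 72 00 in UTF-16LE.
console.log(encodePowerShellCommand("dir")) // ZABpAHIA

const nonAscii = 'Write-Output "测试中文输出"'
const payload = encodePowerShellCommand(nonAscii)
console.log(/^[A-Za-z0-9+/=]+$/.test(payload)) // true: the payload contains only ASCII characters
console.log(decodePowerShellCommand(payload) === nonAscii) // true
```

Because the payload is plain ASCII, it should pass through the process command line unchanged under any ASCII-compatible code page, which is the property the PR relies on.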
    function withPowerShellUtf8Preamble(command: string) {
      return `
    [Console]::InputEncoding = [System.Text.UTF8Encoding]::new($false);
    [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new($false);
    $OutputEncoding = [Console]::OutputEncoding;
    ${command}
    `
    }
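A minimal sketch of how the two helpers might be combined into the arguments passed to PowerShell. The `powerShellArgs` function is a hypothetical stand-in: the PR's actual `makeShellCommand()` and `cmd()` changes are not shown in this excerpt and may pass additional flags.

```typescript
function encodePowerShellCommand(command: string) {
  return Buffer.from(command, "utf16le").toString("base64")
}

function withPowerShellUtf8Preamble(command: string) {
  return `
[Console]::InputEncoding = [System.Text.UTF8Encoding]::new($false);
[Console]::OutputEncoding = [System.Text.UTF8Encoding]::new($false);
$OutputEncoding = [Console]::OutputEncoding;
${command}
`
}

// Hypothetical: wraps the user command in the preamble first, then encodes the
// whole script, so the preamble travels inside the encoded payload.
function powerShellArgs(command: string): [string, string] {
  return ["-EncodedCommand", encodePowerShellCommand(withPowerShellUtf8Preamble(command))]
}

const args = powerShellArgs('Write-Output "测试中文输出"')
const decoded = Buffer.from(args[1], "base64").toString("utf16le")
console.log(args[0]) // -EncodedCommand
console.log(decoded.trim().split("\n").length) // 4: three preamble lines plus the user command
```

The order matters: encoding the user command alone and prepending a plain-text preamble would reintroduce the problem described in the PR, since the preamble would again travel outside the encoded payload.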
The following comment was made by an LLM; it may be inaccurate:
Potential Duplicate PRs Found
This PR (31925) is related to and supersedes the following earlier attempts to fix the same PowerShell encoding issue:
Why they're related: All four PRs (including the current one) address Windows PowerShell encoding corruption when the console code page is not UTF-8. However, PR #31925 uses a superior approach ( The current PR closes issues #23636, #31187, #30205, #31830, and #26882, indicating it's the comprehensive fix for this problem.
Thanks for updating your PR! It now meets our contributing guidelines. 👍
The developer is working to resolve and close the PR.
Issue for this PR
Closes #23636
Closes #31187
Closes #30205
Closes #31830
Closes #26882
Type of change
What does this PR do?
On Windows, PowerShell commands were passed via the `-Command` flag, causing encoding corruption when the active console code page is not UTF-8 (e.g. GBK/CP936 on zh-CN systems, Shift-JIS/CP932 on ja-JP systems). Previous attempts to fix this by prepending a UTF-8 preamble run too late: `-Command` parses the string in the current code page before execution, so the preamble itself is already corrupted.

This PR switches to `-EncodedCommand` with Base64(UTF-16LE) encoding, which guarantees the command string survives transport intact regardless of the console code page. The UTF-8 preamble (`[Console]::InputEncoding`, `[Console]::OutputEncoding`, `$OutputEncoding`) is injected before the user command inside the encoded payload.

Files changed:
- `packages/core/src/tool/bash.ts`: `isPowerShell()`, `encodePowerShellCommand()`, `withPowerShellUtf8Preamble()`, `makeShellCommand()`. Replace direct `ChildProcess.make` with `makeShellCommand()`
- `packages/opencode/src/tool/shell.ts`: `encodePowerShellCommand()`, `withPowerShellUtf8Preamble()`. Modify `cmd()` to use `-EncodedCommand` for PowerShell

Why `-EncodedCommand` over `-Command`: All existing PRs (#31346, #31297, #31658) use `-Command`, which means the UTF-8 preamble is decoded in the current code page before PowerShell parses it, so on a GBK system the preamble itself gets corrupted. `-EncodedCommand` bypasses this by encoding as Base64(UTF-16LE), which PowerShell decodes directly. The preamble then runs correctly and sets up UTF-8 before the user command.
How did you verify your code works?
- `Write-Output "测试中文输出"` renders correctly in opencode shell output on Windows with GBK code page
- `powershell` (5.1) and `pwsh` (7+)
Screenshots / recordings
N/A — this is a shell encoding fix, not a UI change.
Checklist