Skip to content

fix(shell): use PowerShell EncodedCommand for reliable UTF-8 output#31925

Closed
senguangd wants to merge 1 commit into
anomalyco:devfrom
senguangd:fix/powershell-encodedcommand-utf8
Closed

fix(shell): use PowerShell EncodedCommand for reliable UTF-8 output#31925
senguangd wants to merge 1 commit into
anomalyco:devfrom
senguangd:fix/powershell-encodedcommand-utf8

Conversation

@senguangd

@senguangd senguangd commented Jun 11, 2026

Copy link
Copy Markdown

Issue for this PR

Closes #23636
Closes #31187
Closes #30205
Closes #31830
Closes #26882

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

On Windows, PowerShell commands were passed via the -Command flag, causing encoding corruption when the active console code page is not UTF-8 (e.g. GBK/CP936 on zh-CN systems, Shift-JIS/CP932 on ja-JP systems). Previous attempts to fix this by prepending a UTF-8 preamble run too late — -Command parses the string in the current code page before execution, so the preamble itself is already corrupted.

This PR switches to -EncodedCommand with Base64(UTF-16LE) encoding, which guarantees the command string survives transport intact regardless of the console code page. The UTF-8 preamble ([Console]::InputEncoding, [Console]::OutputEncoding, $OutputEncoding) is injected before the user command inside the encoded payload.

Files changed:

File Change
packages/core/src/tool/bash.ts Add isPowerShell(), encodePowerShellCommand(), withPowerShellUtf8Preamble(), makeShellCommand(). Replace direct ChildProcess.make with makeShellCommand()
packages/opencode/src/tool/shell.ts Add encodePowerShellCommand(), withPowerShellUtf8Preamble(). Modify cmd() to use -EncodedCommand for PowerShell

Why -EncodedCommand over -Command: All existing PRs (#31346, #31297, #31658) use -Command, which means the UTF-8 preamble is decoded in the current code page before PowerShell parses it — so on a GBK system the preamble itself gets corrupted. -EncodedCommand bypasses this by encoding as Base64(UTF-16LE), which PowerShell decodes directly. The preamble then runs correctly and sets up UTF-8 before the user command.

How did you verify your code works?

  • Applied the patch locally and verified that Write-Output "测试中文输出" renders correctly in opencode shell output on Windows with GBK code page
  • Verified the same commands work with both powershell (5.1) and pwsh (7+)

Screenshots / recordings

N/A — this is a shell encoding fix, not a UI change.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

fix(shell): use PowerShell EncodedCommand for reliable UTF-8 output

On Windows, PowerShell commands were passed via `-Command` flag, causing
encoding corruption when the active console code page is not UTF-8 (e.g.
GBK on zh-CN systems, Shift-JIS on ja-JP systems). The UTF-8 preamble
added by previous attempts runs too late because `-Command` parses the
string in the current code page *before* execution.

This change switches to `-EncodedCommand` with Base64(UTF-16LE) encoding,
which guarantees the command string survives transport intact regardless
of the console code page. The UTF-8 preamble ([Console]::InputEncoding,
[Console]::OutputEncoding, $OutputEncoding) is applied before the user
command, ensuring both input and output use UTF-8.

Applies to both the core bash tool and the opencode shell tool.

Closes anomalyco#23636, anomalyco#31187, anomalyco#30205, anomalyco#31830, anomalyco#26882

Supersedes anomalyco#31346, anomalyco#31297, anomalyco#31658 (which all use `-Command`)

Co-Authored-By: Claude <noreply@anthropic.com>
@
Copilot AI review requested due to automatic review settings June 11, 2026 16:13
@github-actions github-actions Bot added the needs:compliance This means the issue will auto-close after 2 hours. label Jun 11, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds PowerShell-specific handling on Windows to improve Unicode/UTF-8 behavior and reduce quoting/escaping issues when running shell commands.

Changes:

  • Run PowerShell commands via -EncodedCommand (UTF-16LE base64) instead of -Command.
  • Prefix PowerShell scripts with a UTF-8 encoding preamble to normalize input/output encoding.
  • Centralize shell command construction in packages/core via makeShellCommand.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
packages/opencode/src/tool/shell.ts Switches Windows PowerShell execution to -EncodedCommand and adds UTF-8 preamble helpers.
packages/core/src/tool/bash.ts Adds PowerShell detection + -EncodedCommand path and routes execution through makeShellCommand.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +55 to +58
const isPowerShell = (shell: string) => {
const name = path.basename(shell.trim().replace(/^["']|["']$/g, "")).toLowerCase()
return POWERSHELL_SHELLS.has(name)
}
Comment on lines +60 to +71
function encodePowerShellCommand(command: string) {
return Buffer.from(command, "utf16le").toString("base64")
}

function withPowerShellUtf8Preamble(command: string) {
return `
[Console]::InputEncoding = [System.Text.UTF8Encoding]::new($false);
[Console]::OutputEncoding = [System.Text.UTF8Encoding]::new($false);
$OutputEncoding = [Console]::OutputEncoding;
${command}
`
}
@github-actions

Copy link
Copy Markdown
Contributor

The following comment was made by an LLM, it may be inaccurate:

Potential Duplicate PRs Found

This PR (31925) is related to and supersedes the following earlier attempts to fix the same PowerShell encoding issue:

  1. #31346 - fix(shell): set utf8 output encoding for powershell
  2. #31297 - fix(shell): force UTF-8 encoding for PowerShell output
  3. #31658 - fix: set default UTF-8 encoding for spawned subprocess on Windows

Why they're related: All four PRs (including the current one) address Windows PowerShell encoding corruption when the console code page is not UTF-8. However, PR #31925 uses a superior approach (-EncodedCommand with Base64 UTF-16LE) compared to the earlier attempts which only set output encoding flags via -Command. The PR description explicitly compares itself to these predecessors and explains why the new approach fixes the root cause that the others could not.

The current PR closes issues #23636, #31187, #30205, #31830, and #26882, indicating it's the comprehensive fix for this problem.

@github-actions github-actions Bot removed the needs:compliance This means the issue will auto-close after 2 hours. label Jun 11, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Thanks for updating your PR! It now meets our contributing guidelines. 👍

@senguangd

Copy link
Copy Markdown
Author

The developer is working to resolve and close the PR.

@senguangd senguangd closed this Jun 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment