Commit 623e72d

chore: add no-emdash/endash rule to agent instructions and CI lint (#24375)
Add a lint check that prevents introduction of Unicode emdash (U+2014) and endash (U+2013) characters. These are almost exclusively introduced by AI agents and conflict with the project writing style.

The lint script (scripts/check_emdash.sh) checks only added lines in the current diff by default, so existing violations do not block CI. Pass --all to scan the entire repo for auditing.

Agent instructions in AGENTS.md, site/AGENTS.md, and the docs style guide now explicitly ban emdash, endash, and " -- " as punctuation, with guidance to use commas, semicolons, or periods instead.
1 parent 9d0469f commit 623e72d
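
The check described in the commit message comes down to matching two specific UTF-8 byte sequences. The following is a minimal sketch of that detection in isolation, assuming bash and a POSIX-style `grep`. The helper name `has_dash` is hypothetical and does not appear in the commit.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build the characters from raw UTF-8 bytes so this snippet itself
# contains no literal emdash/endash characters.
emdash=$'\xE2\x80\x94' # U+2014
endash=$'\xE2\x80\x93' # U+2013
pattern="${emdash}|${endash}"

# has_dash (hypothetical helper): succeeds when the argument contains
# an emdash or an endash, fails otherwise.
has_dash() {
	grep -Eq "$pattern" <<<"$1"
}

if has_dash "This is slow${emdash}we should cache it."; then
	echo "violation"
fi
if ! has_dash "Valid range: 0-100"; then
	echo "clean"
fi
```

A plain hyphen such as `0-100` does not match, and neither does ` -- `. This is consistent with the AGENTS.md text in this commit, which says only the Unicode emdash and endash are caught by the lint.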

5 files changed

Lines changed: 146 additions & 3 deletions

.claude/docs/DOCS_STYLE_GUIDE.md

Lines changed: 7 additions & 0 deletions
@@ -150,6 +150,13 @@ Then ask: "Could you provide a screenshot of the Template Insights page? I've ad
 - Inline: `` `coder server` ``
 - Blocks: Use triple backticks with language identifier
 
+### Punctuation
+
+- Do not use emdash (U+2014), endash (U+2013), or ` -- ` as punctuation
+  in code, comments, string literals, or documentation. Use commas,
+  semicolons, or periods instead. Restructure the sentence if needed.
+  For numeric ranges, use a plain hyphen (e.g., `0-100`).
+
 ### Instructions
 
 - **Numbered lists** for sequential steps
AGENTS.md

Lines changed: 16 additions & 0 deletions
@@ -286,6 +286,22 @@ ctx, cancel := context.WithTimeout(ctx, 5*time.Minute)
 ctx, cancel := context.WithTimeout(ctx, 5*time.Minute)
 ```
 
+### No Emdash or Endash
+
+Do not use emdash (U+2014), endash (U+2013), or ` -- ` as punctuation
+in code, comments, string literals, or documentation. Use commas,
+semicolons, or periods instead. Restructure the sentence if needed.
+Do not replace an emdash with ` -- `. Unicode emdash and endash are
+caught by `make lint/emdash`.
+
+```go
+// Good: uses a period to separate the clauses.
+// This is slow. We should cache it.
+
+// Good: uses a comma to join related clauses.
+// This is slow, so we should cache it.
+```
+
 ### Avoid Unnecessary Changes
 
 When fixing a bug or adding a feature, don't modify code unrelated to your

Makefile

Lines changed: 6 additions & 2 deletions
@@ -699,11 +699,11 @@ endif
 # GitHub Actions linters are run in a separate CI job (lint-actions) that only
 # triggers when workflow files change, so we skip them here when CI=true.
 LINT_ACTIONS_TARGETS := $(if $(CI),,lint/actions/actionlint)
-lint: lint/shellcheck lint/go lint/ts lint/examples lint/helm lint/site-icons lint/markdown lint/check-scopes lint/migrations lint/bootstrap $(LINT_ACTIONS_TARGETS)
+lint: lint/shellcheck lint/go lint/ts lint/examples lint/helm lint/site-icons lint/markdown lint/check-scopes lint/migrations lint/bootstrap lint/emdash $(LINT_ACTIONS_TARGETS)
 .PHONY: lint
 
 # Subset of lint that does not require Go or Node toolchains.
-lint-light: lint/shellcheck lint/markdown lint/helm lint/bootstrap lint/migrations lint/actions/actionlint lint/typos
+lint-light: lint/shellcheck lint/markdown lint/helm lint/bootstrap lint/migrations lint/actions/actionlint lint/typos lint/emdash
 .PHONY: lint-light
 
 lint/site-icons:
@@ -738,6 +738,10 @@ lint/bootstrap:
 	bash scripts/check_bootstrap_quotes.sh
 .PHONY: lint/bootstrap
 
+lint/emdash:
+	bash scripts/check_emdash.sh
+.PHONY: lint/emdash
+
 
 lint/helm:
 	cd helm/

scripts/check_emdash.sh

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
+#!/usr/bin/env bash
+set -euo pipefail
+# shellcheck source=scripts/lib.sh
+source "$(dirname "${BASH_SOURCE[0]}")/lib.sh"
+cdroot
+
+echo "--- check for emdash/endash characters"
+
+mode="changed"
+for arg in "$@"; do
+	if [[ "$arg" == "--all" ]]; then
+		mode="all"
+	fi
+done
+
+# Build the pattern from raw bytes so the script itself does not
+# contain literal emdash/endash characters (which would trigger
+# the check when the script is in the diff).
+emdash=$'\xE2\x80\x94'
+endash=$'\xE2\x80\x93'
+pattern="${emdash}|${endash}"
+
+scan_all_files() {
+	local output
+	output=$(git ls-files -z | xargs -0 grep -IEn "$pattern" 2>/dev/null || true)
+	if [[ -n "$output" ]]; then
+		echo "$output"
+		found=1
+	else
+		found=0
+	fi
+}
+
+if [[ "$mode" == "all" ]]; then
+	scan_all_files
+else
+	base=""
+	if [[ -n "${GITHUB_BASE_REF:-}" ]]; then
+		base="origin/${GITHUB_BASE_REF}"
+	elif git rev-parse --verify origin/main >/dev/null 2>&1; then
+		base=$(git merge-base HEAD origin/main 2>/dev/null || echo "origin/main")
+	fi
+
+	if [[ -z "$base" ]]; then
+		echo "WARNING: no base ref found, scanning all tracked files."
+		scan_all_files
+	else
+		# Ensure the base ref is fetchable. CI shallow clones
+		# (fetch-depth: 1) may not have the base branch available.
+		if ! git rev-parse --verify "$base" >/dev/null 2>&1; then
+			ref="${base#origin/}"
+			echo "Base ref $base not found locally, fetching $ref..."
+			git fetch origin "$ref" --depth=1 2>/dev/null || true
+			if ! git rev-parse --verify "$base" >/dev/null 2>&1; then
+				echo "ERROR: could not fetch base ref $base."
+				exit 1
+			fi
+		fi
+
+		found=0
+		if ! diff_output=$(git diff "$base" -U0 -- . 2>&1); then
+			echo "ERROR: git diff against $base failed:"
+			echo "$diff_output"
+			exit 1
+		fi
+
+		if [[ -z "$diff_output" ]]; then
+			echo "OK: no changes to check."
+			exit 0
+		fi
+
+		# Parse the diff to check only added lines for emdash/endash.
+		current_file=""
+		current_line=0
+		while IFS= read -r diff_line; do
+			if [[ "$diff_line" =~ ^\+\+\+\ b/(.*) ]]; then
+				current_file="${BASH_REMATCH[1]}"
+			fi
+			# Anchored to hunk header structure to avoid matching
+			# digits from trailing function context.
+			if [[ "$diff_line" =~ ^@@\ -[0-9,]+\ \+([0-9]+) ]]; then
+				current_line=${BASH_REMATCH[1]}
+				continue
+			fi
+			if [[ "$diff_line" =~ ^\+ ]] && [[ ! "$diff_line" =~ ^\+\+\+\ [ab/] ]]; then
+				if echo "$diff_line" | grep -Eq "$pattern"; then
+					echo "${current_file}:${current_line}:${diff_line:1}"
+					found=1
+				fi
+				((current_line++)) || true
+			fi
+		done <<<"$diff_output"
+	fi
+fi
+
+if [[ "$found" -ne 0 ]]; then
+	echo ""
+	echo "ERROR: Found emdash (U+2014) or endash (U+2013) characters."
+	echo ""
+	echo " Do not use emdash or endash in code, comments, string literals,"
+	echo " or documentation. Use commas, semicolons, or periods instead."
+	echo " Restructure the sentence if needed. Do not replace them with"
+	echo " ' -- ' either."
+	echo ""
+	echo " Example:"
+	echo " Bad: This is slow [emdash] we should cache it."
+	echo " Good: This is slow. We should cache it."
+	echo " Good: This is slow, so we should cache it."
+	echo ""
+	exit 1
+fi
+
+echo "OK: no emdash or endash characters found."
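
The changed-lines mode in `scripts/check_emdash.sh` walks a `-U0` diff, takes the new-file start line from each hunk header, and advances a counter for every added line. The sketch below isolates that bookkeeping so it can be run without a repository. The function name `added_lines` and the sample diff are hypothetical, for illustration only, and the regexes are copied from the script above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# added_lines (hypothetical helper): reads a unified diff generated with
# -U0 on stdin and prints "file:line:text" for every added line.
added_lines() {
	local file="" line=0 diff_line
	while IFS= read -r diff_line; do
		if [[ "$diff_line" =~ ^\+\+\+\ b/(.*) ]]; then
			# "+++ b/<path>" names the new side of the file.
			file="${BASH_REMATCH[1]}"
		elif [[ "$diff_line" =~ ^@@\ -[0-9,]+\ \+([0-9]+) ]]; then
			# The "+N" in the hunk header is the first new-file line number.
			line=${BASH_REMATCH[1]}
		elif [[ "$diff_line" =~ ^\+ ]] && [[ ! "$diff_line" =~ ^\+\+\+\ [ab/] ]]; then
			# Added line: report it without the leading "+", then advance.
			echo "${file}:${line}:${diff_line:1}"
			line=$((line + 1))
		fi
	done
}

# Made-up sample diff; not taken from the commit.
sample_diff='diff --git a/docs/a.md b/docs/a.md
--- a/docs/a.md
+++ b/docs/a.md
@@ -3,0 +4,2 @@ Some heading
+first added line
+second added line
@@ -10 +12 @@
-old text
+new text'

added_lines <<<"$sample_diff"
```

Because the diff is produced with `-U0`, there are no context lines to account for, so counting only `+` lines after each hunk header is enough to recover new-file line numbers. With context lines present, the counter would also need to advance on lines that start with a space.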

site/AGENTS.md

Lines changed: 4 additions & 1 deletion
@@ -71,8 +71,11 @@ When investigating or editing TypeScript/React code, always use the TypeScript l
   If sibling components initialize state with `useMemo`, don't switch to
   `useState(initialFn)` in the same file without reason.
 - Match errors by error code or HTTP status, never by comparing error
-  message strings. String matching is brittle messages change, get
+  message strings. String matching is brittle; messages change, get
   localized, or get reformatted.
+- Do not use emdash (U+2014), endash (U+2013), or ` -- ` as punctuation
+  in code, comments, string literals, or documentation. Use commas,
+  semicolons, or periods instead. Restructure the sentence if needed.
 
 ## TypeScript Type Safety
 