feat: rank chat workspace templates by jaaydenh · Pull Request #25037 · coder/coder

jaaydenh · 2026-05-07T12:07:33Z

closes CODAGT-203

Summary

list_templates now returns a ranked shortlist with a recommendation, so the chat agent can pick the right template the way a colleague would: prefer what matches the request, what the user already uses, and what the rest of the organization uses. Instead of teaching the model an enum protocol in prompts, every result carries a fixed next_step instruction telling the agent what to do.

How list_templates works

Fetch: active, non-deprecated templates in the chat's organization, filtered by the admin template allowlist, authorized as the chat owner (no system escalation).

Query relevance (optional query argument): each template receives the highest tier any of its fields matches, and a higher tier always outranks a lower one regardless of usage:

Tier	Match
4	name or display name equals the query
3	name or display name starts with the query
2	name or display name contains the query
1	description contains the query (checked only when no name field matched)
0	no match; the template is excluded

Matching is case-insensitive and ignores spaces/hyphens/underscores (python gpu matches python-gpu).
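
A minimal sketch of the tier rules above, not the PR's actual implementation. The `template` struct and the `compact`, `fieldTier`, and `queryTier` names are hypothetical; only the tier values and the normalization rule (case-insensitive, ignoring spaces, hyphens, and underscores) come from the description. The sketch also assumes the description is normalized the same way as the name fields.

```go
package main

import (
	"fmt"
	"strings"
)

// Tier values mirror the table in the PR description.
const (
	tierNone        = 0
	tierDescription = 1
	tierContains    = 2
	tierPrefix      = 3
	tierExact       = 4
)

// template is a hypothetical stand-in for the real template type.
type template struct {
	Name        string
	DisplayName string
	Description string
}

var compactReplacer = strings.NewReplacer(" ", "", "-", "", "_", "")

// compact lowercases s and drops spaces, hyphens, and underscores.
func compact(s string) string {
	return compactReplacer.Replace(strings.ToLower(s))
}

// fieldTier scores one name field against an already-compacted query.
func fieldTier(field, q string) int {
	f := compact(field)
	switch {
	case f == q:
		return tierExact
	case strings.HasPrefix(f, q):
		return tierPrefix
	case strings.Contains(f, q):
		return tierContains
	}
	return tierNone
}

// queryTier returns the highest tier any field of t matches.
// The description is checked only when no name field matched.
func queryTier(t template, query string) int {
	q := compact(query)
	if q == "" {
		return tierNone // assumption: callers skip scoring when no query is given
	}
	best := fieldTier(t.Name, q)
	if d := fieldTier(t.DisplayName, q); d > best {
		best = d
	}
	if best == tierNone && strings.Contains(compact(t.Description), q) {
		best = tierDescription
	}
	return best
}

func main() {
	t := template{
		Name:        "python-gpu",
		DisplayName: "Python GPU",
		Description: "CUDA-enabled data science image",
	}
	// Prints tiers 4, 3, 2, 1, 0 in that order.
	for _, q := range []string{"python gpu", "py", "gpu", "data science", "rust"} {
		fmt.Printf("%q -> tier %d\n", q, queryTier(t, q))
	}
}
```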

Usage signals: a new GetTemplateRankingSignalsByOwnerID query returns, per template, the owner's active and recently-deleted workspace counts within a 60-day window, the last in-window usage, and the count of distinct developers with an active workspace (unclaimed prebuilds excluded).
Affinity score (computed in Go, per template, from that template's signals only):
```
affinity = 10 x (active + 0.5 x deleted) x 0.5^(days_since_last_use / 14)
         + ln(1 + active_developers)
```
active/deleted are the owner's in-window workspace counts, days_since_last_use is measured from the most recent in-window usage (the personal term is zero without in-window usage), and active_developers is the org-wide count. Personal usage carries 10x the weight of org popularity; the confidence floor is the score of two active developers (ln 3) and the required lead over the runner-up is ln 3 - ln 2.
Rank: query tier first (when a query is present), then affinity score, then name/ID for determinism. Results paginate 10 per page with next_page present only when more exist.
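
The affinity formula can be sketched in Go as follows. This is an illustration under stated assumptions rather than the code in the PR: the `signals` struct, its field names, and the `affinity` function are hypothetical, and clamping a negative age to zero is an assumption. The weights (10, 0.5), the 14-day half-life, and the ln(1 + active_developers) term are taken from the formula above.

```go
package main

import (
	"fmt"
	"math"
)

const (
	personalWeight = 10.0 // personal usage weight from the formula
	deletedWeight  = 0.5  // a recently-deleted workspace counts half
	halfLifeDays   = 14.0 // personal term halves every 14 days
)

// signals is a hypothetical per-template view of the ranking inputs.
type signals struct {
	Active           int     // owner's active workspaces in the window
	Deleted          int     // owner's recently-deleted workspaces in the window
	HasUsage         bool    // false when there is no in-window usage
	DaysSinceLastUse float64 // measured from the most recent in-window usage
	ActiveDevelopers int     // org-wide distinct developers with an active workspace
}

// affinity computes the score for one template from its own signals only.
func affinity(s signals) float64 {
	org := math.Log1p(float64(s.ActiveDevelopers)) // ln(1 + active_developers)
	if !s.HasUsage {
		return org // the personal term is zero without in-window usage
	}
	days := s.DaysSinceLastUse
	if days < 0 {
		days = 0 // assumption: clamp timestamps that appear to be in the future
	}
	personal := personalWeight *
		(float64(s.Active) + deletedWeight*float64(s.Deleted)) *
		math.Pow(0.5, days/halfLifeDays)
	return personal + org
}

func main() {
	// One active workspace last used 14 days ago, two active developers:
	// 10 * 1 * 0.5 + ln 3.
	fmt.Printf("%.3f\n", affinity(signals{Active: 1, HasUsage: true, DaysSinceLastUse: 14, ActiveDevelopers: 2})) // 6.099

	// Deleted-only usage today, no org usage: 10 * 0.5 * 1 + ln 1.
	fmt.Printf("%.3f\n", affinity(signals{Deleted: 1, HasUsage: true})) // 5.000

	// No personal usage, two active developers: ln 3, the confidence floor.
	fmt.Printf("%.3f\n", affinity(signals{ActiveDevelopers: 2})) // 1.099
}
```

Because each template is scored from its own signals, the function needs no knowledge of the other candidates; comparison against the runner-up happens only when deciding whether to recommend.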

Recommendation contract

The result tells the agent what to do next instead of describing confidence levels:

recommended_template_id is present only when the top template is a clear winner: the only available template, a decisive query match, or an affinity score that clears a floor and leads the runner-up by a derived margin.
next_step is always present and is one of four fixed sentences: use the recommendation, ask the user to choose, retry a query that matched nothing, or report that no templates are available.

Per-template items carry raw evidence (active_developers, your_workspace_count, last_used_by_you) rather than derived labels. When signals fail to load, the tool logs and degrades to asking the user unless the query alone is decisive.
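
The decision order described above could look roughly like this. It is a sketch under several assumptions, not the PR's code: the `ranked` type, the `recommend` function, and the `step` labels are hypothetical (the real next_step values are fixed sentences that are not reproduced here), and "decisive query match" is assumed to mean an exact-tier match that the runner-up does not share. The floor (ln 3) and the required lead (ln 3 - ln 2) come from the description.

```go
package main

import (
	"fmt"
	"math"
)

var (
	confidenceFloor = math.Log(3)               // score of two active developers
	requiredLead    = math.Log(3) - math.Log(2) // required lead over the runner-up
)

const exactTier = 4 // highest query tier in the description's table

// ranked is a hypothetical, already-sorted candidate.
type ranked struct {
	ID       string
	Tier     int
	Affinity float64
}

// step is a hypothetical label standing in for the fixed next_step sentences.
type step string

const (
	stepUse   step = "use_recommendation"
	stepAsk   step = "ask_user"
	stepRetry step = "retry_query"
	stepNone  step = "no_templates"
)

// recommend returns the next step and, when confident, the recommended ID.
// items must be sorted best first; available is the count before query filtering.
func recommend(items []ranked, available int, hasQuery, signalsOK bool) (step, string) {
	switch {
	case available == 0:
		return stepNone, ""
	case len(items) == 0:
		return stepRetry, "" // the query matched nothing
	case available == 1:
		// Checked before signalsOK: with no competition the signals are irrelevant.
		return stepUse, items[0].ID
	}
	top := items[0]
	runnerTier, runnerAffinity := 0, 0.0
	if len(items) > 1 {
		runnerTier, runnerAffinity = items[1].Tier, items[1].Affinity
	}
	// Assumption: "decisive" means an exact match the runner-up lacks.
	if hasQuery && top.Tier == exactTier && runnerTier < exactTier {
		return stepUse, top.ID
	}
	if !signalsOK {
		return stepAsk, "" // degrade to asking when signals failed to load
	}
	if top.Affinity >= confidenceFloor && top.Affinity-runnerAffinity >= requiredLead {
		return stepUse, top.ID
	}
	return stepAsk, ""
}

func main() {
	strong := []ranked{{ID: "a", Affinity: 6.1}, {ID: "b", Affinity: 1.1}}
	fmt.Println(recommend(strong, 2, false, true)) // use_recommendation a

	weak := []ranked{{ID: "a", Affinity: math.Log(2)}, {ID: "b"}}
	fmt.Println(recommend(weak, 2, false, true)) // ask_user

	fmt.Println(recommend([]ranked{{ID: "only"}}, 1, false, false)) // use_recommendation only
}
```

The `weak` case corresponds to one active developer versus zero: the lead is large enough, but the score stays below the floor, so the agent is told to ask.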

Prompts and the create_workspace/read_template descriptions reference the field through the chattool.NextStepField constant, so the instruction lives in one place and cannot drift. create_workspace remains idempotent and allowlist-enforced.

Authorization

The signals query runs with the chat owner's permissions: reading the owner's own workspaces plus a template-metadata read for the cross-user popularity count. dbauthz rejects the call if any requested template is not readable by the owner (covered by allow and deny method tests).

Docs

Adds docs/ai-coder/agents/tools/ explaining how agent tool calls work, with list_templates ranking and the next_step contract as the first documented tools.

jaaydenh · 2026-05-07T12:42:58Z

/coder-agents-review
@codex review

chatgpt-codex-connector · 2026-05-07T12:58:23Z

Codex Review: Didn't find any major issues. Keep it up!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

coder-agents-review

Clean, well-decomposed PR. The ranking pipeline (fetch, score, enrich, rank, recommend, paginate, format) is easy to follow, and each stage is a pure or near-pure function. The dbauthz.AsSystemRestricted fix for the developer-count query corrects a latent bug where non-admin users always got zero active developer counts due to a silent RBAC failure. Six new tests cover the major ranking scenarios, and the runListTemplates/listTemplateItems helpers keep test noise down.

The main concern is the silent error-swallowing pattern in the two enrichment helpers. The old code had this for developer counts (and nobody noticed the broken auth because the error was hidden). The new code extends it to usage data and adds a confidence signal that triggers auto-selection, which amplifies the consequence of silent failures.

Severity breakdown: 1 P2, 8 P3, 3 Nit.

"One active developer versus zero is enough. The prompt tells the agent to treat recommended_template_id as the default and skip asking. For a first-time user with no personal history, the agent confidently creates a workspace from a template that one person happened to use." (Luffy)

🤖 This review was automatically generated with Coder Agents.

jaaydenh · 2026-05-15T11:08:54Z

/coder-agents-review

coder-agents-review

All 13 R1 findings addressed cleanly. The fixes are structural, not point patches: DEREM-6 became a shared templateRankSignals struct, DEREM-7 became a confidence threshold system with named constants and a 90-day recency window, DEREM-4 got proper error logging and graceful degradation. Test additions are proportional (210 new test lines for 108 net production lines).

The DEREM-4 fix introduced a new ordering issue: the rankingSignalsErr check now preempts the only_available_template path, which means a transient DB error in a single-template org causes the agent to ask the user to choose from a list of one. The fix is a one-line reorder.

The DEREM-5 fix (compact search normalization) was applied to name/displayName but not to description matching, leaving the same class of gap in the weakest match tier.

Severity breakdown: 1 P2, 4 P3.

"The ranking signals are irrelevant when there is no competition; the recommendation is trivially correct." (Hisoka)

🤖 This review was automatically generated with Coder Agents.

jaaydenh · 2026-05-18T09:21:07Z

/coder-agents-review
@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b8ae6390e8

coder-agents-review

All 18 prior findings addressed across two fix commits. The R2 fixes are solid: quartz clock injection, reordered recommendation checks, description compact search, internal unit tests for the error degradation path, and workspace-count threshold coverage. Netero found zero mechanical issues on this revision. Three reviewers (Mafuuu, Pariston, Meruem) verified the R2 fixes and found no new issues.

Two remaining P3s and a Nit, none blocking. The PR is in good shape.

Severity breakdown: 2 P3, 1 Nit.

"Delete the t.DisplayName entries from all three loops in templateQueryScore and every test in this file still passes. The display name code path is unreachable under the current test fixtures." (Bisky)

🤖 This review was automatically generated with Coder Agents.

jaaydenh · 2026-06-01T11:17:21Z

@codex review

chatgpt-codex-connector · 2026-06-01T11:20:05Z

Codex Review: Didn't find any major issues. 🎉

mafredri

The ranking/recommendation approach looks solid. One concern about stale guidance in workspaceDetachedAwareness that now contradicts the new template selection rules.

workspaceDetachedAwareness (prompt.go line 12, not in this diff) still says:

If a workspace is needed, use list_templates and read_template as needed before create_workspace.

This presents read_template as a standard step in the workflow. But the new <workspace-template-selection> block and the updated tool descriptions make it conditional (only when you need parameters or presets). The LLM sees both since workspaceDetachedAwareness is injected as a separate message in chatd.go:1617.

I think that line should be updated to match, something like:

If a workspace is needed, use list_templates before create_workspace. Call read_template only when you need parameter or preset details.

🤖 This review was generated with the help of Coder Agents.

Replace the lexicographic template-ranking comparator in list_templates with a frecency score (frequency discounted by recency), per reviewer feedback.

- Add GetTemplateRankingSignalsByOwnerID, returning the user's recent active and recently-deleted workspace counts, last usage, and the count of distinct active developers in the org. Recently-deleted workspaces now contribute (recovering history the deleted=false filter discarded), scoped to a lookback window, and the prebuilds system user is excluded from the org popularity count. Replaces GetWorkspaceUsageGroupedByTemplateIDByOwnerID.
- Compute the affinity score in Go (Wp*(active + Wd*deleted)*0.5^(age/half_life) + Wo*ln(1+org_devs)) because sqlc cannot reliably compile the parameterized decay expression; the query returns the raw signals. Weights, half-life, and lookback are explicit constants.
- Recommendation confidence is now a single score comparison: a decisive query match recommends on its own, otherwise the top score must clear a floor derived from the active-developer minimum and lead the runner-up by a derived margin. Stale-but-frequent usage no longer recommends.
- Replace the AsSystemRestricted call for the cross-user org count with a narrow dbauthz wrapper checking workspace-owner read plus a template-metadata read.
- Clarify list_templates/read_template guidance in the detached prompt.
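
For illustration, a score-based comparator of the kind this commit describes, combined with the ordering (query tier, then affinity, then name/ID) and the 10-per-page pagination stated in the PR description, might be sketched as below. The `ranked` type and the `compareRanked`, `rank`, and `page` names are hypothetical and are not taken from the diff.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// ranked is a hypothetical candidate with its precomputed scores.
type ranked struct {
	ID, Name string
	Tier     int     // query relevance tier, 0 when no query is present
	Affinity float64 // frecency-style affinity score
}

// compareRanked orders by tier (descending), then affinity (descending),
// then name and ID (ascending) so the result is deterministic.
func compareRanked(a, b ranked) int {
	if c := cmp.Compare(b.Tier, a.Tier); c != 0 {
		return c
	}
	if c := cmp.Compare(b.Affinity, a.Affinity); c != 0 {
		return c
	}
	if c := cmp.Compare(a.Name, b.Name); c != 0 {
		return c
	}
	return cmp.Compare(a.ID, b.ID)
}

// rank sorts items in place, best first.
func rank(items []ranked) {
	slices.SortFunc(items, compareRanked)
}

const pageSize = 10

// page returns the 1-indexed page n and whether a further page exists.
func page(items []ranked, n int) ([]ranked, bool) {
	start := (n - 1) * pageSize
	if n < 1 || start >= len(items) {
		return nil, false
	}
	end := start + pageSize
	if end > len(items) {
		end = len(items)
	}
	return items[start:end], end < len(items)
}

func main() {
	// Tiers as they might look for the query "py".
	items := []ranked{
		{ID: "1", Name: "numpy-lab", Tier: 2, Affinity: 12.0},
		{ID: "2", Name: "py", Tier: 4, Affinity: 0.7},
		{ID: "3", Name: "python-gpu", Tier: 3, Affinity: 5.0},
	}
	rank(items)
	first, more := page(items, 1)
	for _, it := range first {
		fmt.Println(it.Name, it.Tier, it.Affinity)
	}
	fmt.Println("next page:", more)
}
```

Note that the heavily used `numpy-lab` still sorts last here, because a higher query tier outranks a lower one regardless of usage.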

jaaydenh · 2026-06-04T12:51:35Z

/coder-agents-review

coder-agents-review · 2026-06-04T12:51:39Z

Chat: Review in progress | View chat
Requested: 2026-06-11 12:40 UTC by @jaaydenh
Spend: $101.72 / $200.00

deep-review v0.7.1 | Round 7 | 01ec5e4..36a6be6

Last posted: Round 7, 29 findings (2 P2, 21 P3, 6 Nit), APPROVE. Review

Finding inventory

Findings

#	Sev	Status	Location	Summary	Round	Reviewer	Posted
DEREM-1	P3	Author fixed (`34c7fac`)	dbauthz_test.go:4767	dbauthz test placed in wrong suite section (TestSystemFunctions instead of TestWorkspace)	R1	Netero	Yes
DEREM-2	P3	Author fixed (`34c7fac`)	listtemplates.go:261	Two selectTemplateRecommendation code paths (no_matching_templates, no_ranking_signal) lack test assertions	R1	Netero	Yes
DEREM-3	Nit	Author fixed (`34c7fac`)	workspaces.sql:500	Query name uses ForOwner instead of ByOwnerID per naming convention	R1	Netero	Yes
DEREM-4	P2	Author fixed (`34c7fac`)	listtemplates.go:192	Silent error swallowing in loadTemplateActiveDeveloperCounts and loadTemplateUsage produces misleading confidence signals	R1	Hisoka P2, Chopper P2, Knov P2, Mafuuu P3, Ryosuke P3, Leorio P3	Yes
DEREM-5	P3	Author fixed (`34c7fac`)	listtemplates.go:375	compactTemplateSearch strips spaces but not hyphens/underscores; multi-word queries miss hyphenated template names	R1	Mafuuu P3, Kite P3, Razor P3	Yes
DEREM-6	P3	Author fixed (`34c7fac`)	listtemplates.go:230	Sort, ambiguity predicate, and reason classifier encode ranking model three times with no structural coupling	R1	Meruem	Yes
DEREM-7	P3	Author fixed (`34c7fac`)	listtemplates.go:275	high_confidence_recommendation fires on weak signal (one old workspace or one active developer)	R1	Pariston P3, Luffy P3	Yes
DEREM-8	P3	Author fixed (`34c7fac`)	listtemplates_test.go:223	TestListTemplates_QueryRelevanceOutranksPersonalUsage tests different query scores, not usage-vs-query tiebreaking	R1	Bisky	Yes
DEREM-9	P3	Author fixed (`34c7fac`)	listtemplates.go:316	rank_reason labels create misleading signal inversion between recommended and non-recommended templates	R1	Hisoka P3, Razor P3	Yes
DEREM-10	P3	Author fixed (`34c7fac`)	listtemplates.go:151	total_count is post-filter; agent cannot distinguish empty-org from no-query-match	R1	Chopper P3	Yes
DEREM-11	Nit	Author fixed (`34c7fac`)	listtemplates.go:163	candidateRankedTemplates returns unranked templates; name misleads about postcondition	R1	Gon	Yes
DEREM-12	P3	Author fixed (`34c7fac`)	listtemplates.go:344	Query score tiers 1-4 are undocumented magic numbers driving ranking	R1	Gon	Yes
DEREM-13	Nit	Author fixed (`34c7fac`)	listtemplates.go:246	strings.Compare used where cmp.Compare would be uniform with rest of comparator	R1	Ging-Go	Yes
DEREM-14	P3	Author fixed (`b8ae639`)	listtemplates.go:158	time.Now() bypasses quartz clock; 90-day recency boundary untestable	R2	Netero	Yes
DEREM-15	P2	Author fixed (`b8ae639`)	listtemplates.go:333	rankingSignalsErr check preempts only_available_template; single-template path gated on irrelevant signals	R2	Hisoka P2, Mafuuu P3	Yes
DEREM-16	P3	Author fixed (`b8ae639`)	listtemplates.go:453	Description matching omits compactTemplateSearch; incomplete sibling of DEREM-5 fix	R2	Hisoka P3, Meruem P3, Chopper P3, Zoro Nit	Yes
DEREM-17	P3	Author fixed (`b8ae639`)	listtemplates.go:334	ranking_signals_unavailable branch (DEREM-4 fix) has no test coverage	R2	Bisky P3, Mafuuu P3, Mafu-san P3, Chopper Note	Yes
DEREM-18	P3	Author fixed (`b8ae639`)	listtemplates.go:365	WorkspaceCount >= 2 threshold in templateHasConfidentRankingSignal has no test	R2	Bisky	Yes
DEREM-19	P3	Author fixed (`4d357de`)	listtemplates_test.go:131	Display name matching in templateQueryScore has no isolated test; compact search resolves name first	R3	Bisky	Yes
DEREM-20	P3	Author fixed (`4d357de`)	listtemplates.go:164	errors.Join fuses independent enrichment failures; surviving signal cannot contribute to confidence	R3	Hisoka	Yes
DEREM-21	Nit	Author fixed (`4d357de`)	listtemplates_test.go:255	No test verifies page 2 retrieval; pagination offset arithmetic unverified	R3	Bisky	Yes
CRF-22	P3	Author fixed (`dfc293d`)	dbauthz_test.go:3376	Dbauthz test for GetTemplateRankingSignalsByOwnerID only exercises no-org path; org-scoped case silently overwritten	R4	Netero	Yes
CRF-23	P3	Author fixed (`dfc293d`)	prompt.go:11	workspaceDetachedAwareness contradicts new template selection guidance; deferred without ticket	R4	Mafu-san	Yes
CRF-24	P3	Author fixed (`dfc293d`)	listtemplates.go:460	relevance_signals claims used_by_you for deleted-only usage but item metadata absent (ActiveCount gate)	R4	Mafuuu P3, Razor Note, Hisoka Note	Yes
CRF-25	P3	Author fixed (`dfc293d`)	listtemplates_test.go:943	SQL test asserts LastUsedAt.Valid but never checks the actual value; MAX-to-MIN regression invisible	R4	Bisky	Yes
CRF-26	Nit	Author fixed (`5a7641f`)	listtemplates.go:159	Error from asOwner has no operation context	R4	Leorio	Yes
CRF-27	Nit	Author fixed (`dfc293d`)	listtemplates.go:91	ListTemplatesOptions doc says nothing about required fields or defaults	R4	Leorio	Yes
CRF-28	P3	Open	listtemplates_test.go:628	DeletedRecentPersonalUsageShowsEvidence omits next_step and recommended_template_id assertions	R7	Bisky	Yes
CRF-29	P3	Open	listtemplates_internal_test.go:158	No test exercises both-above-confidence-floor-with-large-gap branch of selectTemplateRecommendation	R7	Bisky	Yes

Law analysis

Effective LOC: 1635 (553 production, 1082 test, 151 generated). Head SHA: 4c20d3f. Verdict: Don't split. Enforcement: Advisory.

Round log

Round 1

Panel. 1 P2, 8 P3, 3 Nit. Reviewed against 3d03c39..f8882fc.

Round 2

All 13 R1 findings addressed in 34c7fac. Churn guard: PROCEED. Panel found 1 P2, 3 P3 new plus 1 P3 from Netero. Reviewed against 3d03c39..34c7fac.

Round 3

All 5 R2 findings addressed in b8ae639. Churn guard: PROCEED. Netero clean. Panel found 2 P3, 1 Nit. Reviewed against 3d03c39..b8ae639.

Round 4

All 3 R3 findings addressed in 4d357de. Major restructuring: new frecency SQL query, affinity scoring model. Churn guard: PROCEED. Netero 1 P3, Law don't-split. Reviewed against 6ecf804..4c20d3f.

Round 5

5 of 6 R4 findings addressed in dfc293d. CRF-26 (Nit, error context) silent. Churn guard: BLOCKED. Reviewed against 6ecf804..dfc293d.

Round 6

CRF-26 addressed in 5a7641f. Churn guard: PROCEED. Netero clean. No new findings. All 27 findings resolved. Reviewed against 6ecf804..5a7641f.

Round 7

No open findings from R6. Rebase + new commits (docs, next_step constants). Netero clean. Panel found 2 P3 (test gaps). Reviewed against 01ec5e4..36a6be6.

About deep-review

CRF = Coder Review Finding (P0-P4, Nit, Note)

Reviewer	Focus
Bisky	tests
Chopper	ops/errors
Churn-guard	change verification
Ging	language modernization
Gon	naming
Hisoka	edge cases
Killua	perf
Kite	change integrity
Knov	contracts
Knuckle	SQL
Kurapika	security
Law	decomposition
Leorio	docs
Luffy	product
Mafu-san	process
Mafuuu	contracts
Melody	dispatch/pairing
Meruem	structural
Nami	frontend
Netero	mechanical checks
Pariston	premise testing
Pen-botter	product gaps
Razor	verification
Robin	duplication
Ryosuke	Go arch
Takumi	concurrency
Zoro	shape

🤖 Managed by Coder Agents.

jaaydenh · 2026-06-05T12:19:07Z

@codex review

coder-agents-review · 2026-06-05T12:19:13Z

@jaaydenh ⛔ This review has reached its per-chat spend limit ($141.20 / $128.45). Further review rounds are paused.

To raise the limit and continue, comment:

/coder-agents-review set-spend-limit:179

This is a per-chat budget, separate from any account-level usage limit.

🤖 Managed by Coder Agents.

chatgpt-codex-connector · 2026-06-05T12:22:47Z

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

mafredri · 2026-06-05T13:06:02Z

+	// because sqlc cannot reliably compile the parameterized decay expression;
+	// see GetTemplateRankingSignalsByOwnerID. Keeping the score and the
+	// confidence thresholds in the same place also avoids Postgres-versus-Go
+	// floating-point differences at confidence boundaries.


I feel this introduces a lot of complexity vs doing it in SQL. And floating-point differences seem like a contrived reason? It's used for sorting and PostgreSQL can do that, no need to re-sort in Go. It also seems very unlikely to matter in practice.

I think the real issue is that this comment is not very useful and can be improved. Below are the main reasons why it makes sense to keep the current split between Go and SQL.

The algorithm is much easier to unit test in Go without database setup, mock clocks, SQL generation, and type/cast issues.

The ranking combines Go-only query scoring with DB-derived usage signals, so SQL sorting would only own part of the ordering unless query scoring also moved into SQL.

The candidate list is already materialized in Go before pagination, so the extra in-memory sort is not a meaningful performance cost here.

SQL remains a stable, explainable aggregation query; Go owns the tunable recommendation policy.

Changing weights, half-life, threshold behavior, or future-age clamping does not require touching generated SQL.

The algorithm is much easier to unit test in Go without database setup, mock clocks, SQL generation, and type/cast issues.

I think this is kind of a wash, you could use dbgen to generate the data and have an end-to-end test of the algorithm. That said, I won't argue with this reasoning, splitting the logic into two parts makes testing one part simpler.

The ranking combines Go-only query scoring with DB-derived usage signals, so SQL sorting would only own part of the ordering unless query scoring also moved into SQL.

How do you mean?

The candidate list is already materialized in Go before pagination, so the extra in-memory sort is not a meaningful performance cost here.

This is the kind of reasoning I'd expect an LLM to give me. 😂

Changing weights, half-life, threshold behavior, or future-age clamping does not require touching generated SQL.

This is immaterial, changing Go requires rebuilding the binary, changing SQL requires make gen, this is not an argument worth making.

Either way, as long as we keep the Go implementation simple and not over-engineered, I won't push back on the split.

On "how do you mean": the final order sorts by query-relevance tier first, and that tier is computed in Go from the query against name/display name/description with compact matching, so an ORDER BY could only own the affinity part; Go would still re-sort by (query tier, affinity) anyway unless query scoring also moved into SQL.

Also tidied the contested justification in 19ce6aa: the comments no longer claim sqlc fragility, just that keeping the score in Go keeps the policy and thresholds in one place. The Go side also got simpler in 36a6be6 (selection enums and derived fields removed; ranking math unchanged).

I wish LLMs would identify themselves when they respond to comments; the last comment was not me.

…truction list_templates previously taught the model a four-value selection_hint enum, an eleven-value recommendation_reason, per-item relevance_signals, and a derived user_selection_required flag, re-explained in two tool descriptions and two prompt sections. The ranked output now carries a single fixed next_step instruction at the point of use plus an optional recommended_template_id, and the per-item payload keeps only raw usage evidence. Ranking, confidence thresholds, and authorization are unchanged. Prompts reference the field through chattool.NextStepField so they cannot drift. Documents how agent tool calls work, including template selection and workspace creation, in docs/ai-coder/agents/tools.

github-actions · 2026-06-11T11:55:25Z

Docs preview

📖 View docs preview for docs/ai-coder/agents/tools/index.md

jaaydenh · 2026-06-11T12:40:10Z

/coder-agents-review set-spend-limit:200

coder-agents-review

All 27 prior findings remain resolved. CI green. The latest commits add next_step constants, docs, and a rebase. Netero found zero mechanical issues. Two reviewers (Mafuuu, Pariston) found no issues. Bisky found two test gaps in the new code, both P3.

Severity breakdown: 2 P3.

"Every scenario test asserts next_step and recommended_template_id. Except one." (Bisky)

🤖 This review was automatically generated with Coder Agents.

Asserts the recommendation fires for recent deleted-only personal usage (CRF-28), adds the both-above-floor-with-large-gap recommendation case (CRF-29), and rewords the affinity-score comments to drop the inaccurate sqlc fragility claim; the score lives in Go so the ranking policy and its confidence thresholds stay in one place.

Shows a concrete query and ranked two-template response in the agent tools doc, and clarifies that templates are scored independently with only raw evidence fields returned.

linear-code · 2026-06-12T12:24:39Z

CODAGT-203

johnstcn

Instead of computing an opaque score, could we instead try just surfacing the raw numbers in the list_templates response and letting the model decide for itself? This would also let us drop the recommended_template_id.

The NextStep field is definitely a good addition though.

johnstcn · 2026-06-15T09:53:51Z

+		s := templateRankingSignals{
+			ActiveCount:        row.ActiveCount,
+			DeletedRecentCount: row.DeletedRecentCount,
+			OrgDevs:            row.OrgDevs,


This is already provided by GetWorkspaceUniqueOwnerCountByTemplateID.

jaaydenh · 2026-06-16T10:20:25Z

Instead of computing an opaque score, could we instead try just surfacing the raw numbers in the list_templates response and letting the model decide for itself? This would also let us drop the recommended_template_id.

The NextStep field is definitely a good addition though.
@johnstcn

The way the ranking response is designed, the score would not be that useful. The ranking is broken up into query tiers (see the PR description), and the affinity score is only used to rank templates within each tier. If there is no query, then there are no tiers. It would be necessary to also return the tier along with the affinity score to avoid confusing the LLM, or to create a new normalized score when a query is involved. I think this potentially adds unnecessary complexity.
The response as currently designed attempts to simplify the decision-making process: it is an ordered list of templates sorted by relevance, with next_step providing a clear answer so that the LLM doesn't need to do as much thinking/processing of the list_templates response.

If you have a strong argument that a score is necessary for each template in the response, I would have to rethink and rework the logic a bit so it makes sense. I also think it may then make sense to remove next_step so that the LLM decides completely based on the information in the response containing the list of templates.

johnstcn · 2026-06-16T10:21:36Z

If you have a strong argument that a score is necessary for each template in the response.

I have no strong arguments here; it's hard to make the determination without data and test results.

johnstcn

Approving to unblock.

mafredri · 2026-06-18T07:49:31Z

Call read_template only when you need parameter or preset details before create_workspace.

This rule seems a bit suspect to me: the agent won't know if something matters for the task without checking, so the only signal is when the user explicitly asks for it? I don't know what the solution should be, though, if the goal is to avoid the tool call.

Other than that, LGTM 👍🏻

feat(coderd): rank chat workspace templates

35f980a

github-actions Bot assigned jaaydenh May 7, 2026

test(coderd/database): cover workspace usage authz

f8882fc

coder-agents-review Bot reviewed May 7, 2026

fix(coderd): address template ranking feedback

34c7fac

coder-agents-review Bot reviewed May 15, 2026

fix(coderd): follow up on template ranking feedback

b8ae639

chatgpt-codex-connector Bot reviewed May 18, 2026

Comment thread coderd/x/chatd/chattool/listtemplates.go Outdated

coder-agents-review Bot approved these changes May 18, 2026

Comment thread coderd/x/chatd/chattool/listtemplates_test.go

Comment thread coderd/x/chatd/chattool/listtemplates.go Outdated

Comment thread coderd/x/chatd/chattool/listtemplates_test.go

jaaydenh and others added 4 commits May 18, 2026 16:46

Merge branch 'main' into workspace-bdje

ee89c16

Merge remote-tracking branch 'origin/main' into merge-workspace-bdje-…

15acaf5

…main # Conflicts: # coderd/x/chatd/chattool/createworkspace.go

Merge branch 'main' into workspace-bdje

0fcc777

fix(coderd): address template ranking feedback

4d357de

jaaydenh changed the title ~~feat(coderd): rank chat workspace templates~~ feat: rank chat workspace templates Jun 1, 2026

Merge branch 'main' into workspace-bdje

f8e7819

fix(coderd/x/chatd): require template selection on weak matches

9d1ee50

jaaydenh marked this pull request as ready for review June 1, 2026 14:13

jaaydenh requested a review from mafredri June 1, 2026 14:14

mafredri reviewed Jun 1, 2026

Comment thread coderd/x/chatd/prompt.go Outdated

Comment thread coderd/x/chatd/chattool/createworkspace.go Outdated

jaaydenh requested a review from mafredri June 2, 2026 08:53

mafredri reviewed Jun 2, 2026

Comment thread coderd/database/queries/workspaces.sql Outdated

mafredri reviewed Jun 5, 2026

Comment thread coderd/database/queries/workspaces.sql Outdated

mafredri reviewed Jun 5, 2026

jaaydenh and others added 5 commits June 9, 2026 18:02

Merge branch 'main' into workspace-bdje

58f73db

chore: update comments

d7fe770

test(coderd/database/dbauthz): cover template ranking signals deny path

95f7beb

refactor(coderd/x/chatd/chattool): simplify templateQueryScore tier m…

71eca18

…atching

coder-agents-review Bot approved these changes Jun 11, 2026

Comment thread coderd/x/chatd/chattool/listtemplates_test.go

Comment thread coderd/x/chatd/chattool/listtemplates_internal_test.go

jaaydenh requested a review from johnstcn June 11, 2026 13:17

jaaydenh added 3 commits June 12, 2026 09:59

docs: add list_templates request and response example

fcec96b

docs: state the exact deleted-workspace weight in template ranking

9e38592

docs: detail query relevance tiers and the affinity formula

4f6ac74

jaaydenh requested a review from mafredri June 12, 2026 10:24

johnstcn reviewed Jun 15, 2026

Merge remote-tracking branch 'origin/main' into merge-conflicts-annk

9a91632

# Conflicts: # coderd/x/chatd/chatd_test.go

jaaydenh mentioned this pull request Jun 17, 2026

feat: add list_templates ranking telemetry #26458

Open

johnstcn approved these changes Jun 17, 2026

jaaydenh merged commit bc44cdd into main Jun 18, 2026
32 of 33 checks passed

jaaydenh deleted the workspace-bdje branch June 18, 2026 05:41

github-actions Bot locked and limited conversation to collaborators Jun 18, 2026
