feat(coderd/database): add chat_message_search_text function#26968
feat(coderd/database): add chat_message_search_text function#26968johnstcn wants to merge 3 commits into
Conversation
8ab2d7e to
3b783dc
Compare
|
/coder-agents-review |
|
Chat: Review posted | View chat Review historydeep-review v0.9.0 | Round 2 | Last posted: Round 2, 6 findings (4 Nit, 2 Note), COMMENT. Review Finding inventoryFindings
Contested and acknowledgedCRF-5 (Note, 000537_chat_message_search_text.up.sql:9) - Four extraction implementations with different separators
Round logRound 1Panel (17 reviewers). Netero: no findings. 3 Nit, 2 Note. 0 dropped. Reviewed against 1989db0..3b783dc. Round 2Panel (9 reviewers). Netero: no findings. All R1 findings addressed. 2 new Nit. 0 dropped. Reviewed against 1989db0..e198c2b. About deep-reviewCRF = Coder Review Finding (P0-P4, Nit, Note)
|
There was a problem hiding this comment.
Clean, well-scoped PR: a single SQL extraction function with comprehensive Postgres-backed tests, correctly decomposed from the larger chat search work (CODAGT-591 -> CODAGT-722 -> this). 12 of 17 reviewers found no issues. Netero's mechanical sweep was clean.
3 Nits, 2 Notes. No blocking issues.
The PR description references "migration 000536" but the actual file is 000537 (renumbered in the second commit). Worth updating so future readers can find the migration file.
"No mocks, no fakes, no costume jewelry. Every row proves real behavior." -Bisky, on the test suite
🤖 This review was automatically generated with Coder Agents.
|
/coder-agents-review |
There was a problem hiding this comment.
All five R1 findings addressed. Comment trimmed, IF EXISTS added, OnlyNonTextParts test added, PR description updated, extraction landscape acknowledged. All fixes verified by 9 reviewers against e198c2b.
2 new Nits (comment style). No blocking issues.
"Everything holds." -Hisoka ♠
🤖 This review was automatically generated with Coder Agents.
|
|
||
| cases := []struct { | ||
| name string | ||
| content sql.NullString // JSONB input; invalid means SQL NULL. |
There was a problem hiding this comment.
Nit [CRF-6] Comment // JSONB input; invalid means SQL NULL. restates what sql.NullString already encodes. No other test in coderd/database/ annotates sql.Null* fields this way. The $1::jsonb cast in the query two lines below provides the JSONB context. (Gon)
🤖
| cases := []struct { | ||
| name string | ||
| content sql.NullString // JSONB input; invalid means SQL NULL. | ||
| want sql.NullString // expected text; invalid means SQL NULL. |
There was a problem hiding this comment.
Nit [CRF-7] Same: want sql.NullString and the field name say everything. (Gon)
🤖
Adds an
IMMUTABLE PARALLEL SAFESQL functionchat_message_search_text(content jsonb) RETURNS text(migration000537) that extracts the space-joined text of alltype='text'parts from a chat message's JSONB content, in order. Non-array content (legacy content_version=0 scalar strings) returns NULL, as does an array with no text parts. Includes a Postgres-backed test exercising the function directly.This is groundwork for chat history search (CODAGT-591): a later change (CODAGT-722) builds a partial expression GIN index over
to_tsvector('simple', chat_message_search_text(content)). No index, trigger, or callers are added here.Assumptions: single-space separator for tokenization; NULL (not empty string) for empty/scalar/NULL input; legacy content is intentionally excluded from search per the RFC.
GetChatUserPromptsByChatIDhas a similar but not identical extraction (empty separator, extra filtering), so it is not rewritten here.Closes https://linear.app/codercom/issue/CODAGT-721
Implementation plan
CODAGT-721: Add
chat_message_search_textextraction functionIssue: https://linear.app/codercom/issue/CODAGT-721
Parent: CODAGT-591 (Implement Chat History Search)
Branch:
cian/codagt-721-add-chat_message_search_text-extraction-functionDeliverable: one draft PR containing a single SQL migration pair plus tests.
Context
The chat search RFC (Notion: "Chat Messages Search", post-PRD follow-up) settled on
a partial expression GIN index over
to_tsvector('simple', chat_message_search_text(content))onchat_messages,built at runtime with
CREATE INDEX CONCURRENTLYby a dbpurge-style task(CODAGT-722). This issue lands only the
IMMUTABLEextraction function the indexexpression needs. No index, no trigger, no stored column, no callers.
Requirements (from issue + RFC)
coderd/database/migrations(next number after000535; usecreate_migration.sh).chat_message_search_text(content jsonb) RETURNS text,IMMUTABLE,single statement, no
CONCURRENTLY..down.sqldrops the function.GetChatUserPromptsByChatIDextraction incoderd/database/queries/chats.sql:422):contentis NULL orjsonb_typeof(content) <> 'array'(legacycontent_version=0 scalar strings): return NULL. This intentionally excludes
legacy message content per the RFC.
string_agg(part->>'text', ' ' ORDER BY ordinality)overjsonb_array_elements(content)wherepart->>'type' = 'text'.type='text'parts. Reasoning, tool-call/result, file, source,context-file, and skill parts are excluded (role/visibility/deleted
filtering belongs in the index predicate, CODAGT-722, not here).
string_aggover zero rows returns NULL. Fine.LANGUAGE sql(not plpgsql): a single-expression SQL function is enough,inlinable, and trivially
IMMUTABLE.IMMUTABLE+PARALLEL SAFE.IMMUTABLEis a promise, not enforced; the function reads only itsargument, so the promise holds.
Red: tests first
Migration-level DB tests. Where: a new test in
coderd/database/migrations/migrate_test.gostyle is overkill; instead add aPostgres-backed test (e.g.
coderd/database/chat_message_search_text_test.goor extend an existing querier test file per
.claude/docs/DATABASE.mdandTESTING.md conventions; confirm placement against how other SQL functions are
tested during implementation). Test executes
SELECT chat_message_search_text($1::jsonb)directly. Cases:reasoning,tool-call) → text parts joined by a single space, in order."hello") → NULL.[]) → NULL.typeor missingtext→ ignored / NULL textskipped (
string_aggskips NULLs).Also:
TestMigrateUpDownUpstyle idempotence is already covered by theexisting migration framework; no extra work unless
make testsays otherwise.Green: implementation
./coderd/database/migrations/create_migration.sh chat_message_search_textUp migration:
Note:
jsonb_array_elementson a scalar raises an error, so the scalarguard cannot live only in the WHERE clause of a FROM over it. Use
CASE WHEN jsonb_typeof(content) = 'array' THEN (subquery) ENDor plpgsqlwith an IF, as in the RFC sketch. Decide during implementation; the tests
pin the behavior either way.
Down migration:
DROP FUNCTION chat_message_search_text(jsonb);make gen(dump.sql picks up the function; no sqlc changes expected).Run the new tests +
make test RUN=<new test>.Refactor
GetChatUserPromptsByChatID; do NOTrewrite that query to use the new function in this PR (out of scope), but
note the follow-up if the duplication is exact.
only
type='text'parts are indexed).Verification
make genproduces no diff outside the migration + dump.sql.make lint,make test RUN=...pass.Assumptions (flag in PR)
' '; the existinguser-prompts query uses
''). Space is correct for tokenization."null/empty"; NULL is what
string_agggives and is fine forto_tsvector(the index expression handles NULL).legacy message content from search.
This PR was generated by Coder Agents on behalf of @johnstcn.