Skip to content

Fix has token serialize plan#102130

Open
Ergus wants to merge 8 commits intoClickHouse:masterfrom
Ergus:Fix_hasToken_serialize_plan
Open

Fix has token serialize plan#102130
Ergus wants to merge 8 commits intoClickHouse:masterfrom
Ergus:Fix_hasToken_serialize_plan

Conversation

@Ergus
Copy link
Copy Markdown
Member

@Ergus Ergus commented Apr 8, 2026

When serialize_query_plan = 1 is active (e.g. in "distributed plan" CI mode), ActionsDAG::deserializeConstant recreates ColumnConst with size_ = 0 as a placeholder (broadcast size is unknown at deserialization time). The data column inside the ColumnConst still holds the actual value.

initializeSearchTokens in hasAnyAllTokens.cpp called column_needles->empty() which delegates to ColumnConst::size() and returns true when size_ = 0.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=0e2f4c6c1fef69f5c0ba46252d04b1cd4687b661&name_0=MasterCI&name_1=Stateless%20tests%20%28amd_asan_ubsan%2C%20distributed%20plan%2C%20parallel%2C%202%2F2%29

Followup for #102128 fixes new issue exposed in.
This could also avoid: #102108
And solves the root cause of exposed problem with new test 02346_text_index_bug101913 from #101939 fixing #101913

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixes hasAllTokens and hasAnyTokens returning no matches when queries are executed via a Distributed table with serialize_query_plan = 1.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Apr 8, 2026

Workflow [PR], commit [7d6cd2b]

Summary:

job_name test_name status info comment
Upgrade check (amd_release) failure
Check failed failure cidb, issue
Unit tests (tsan) error
Unit tests (msan) error
Unit tests (amd_llvm_coverage) error
Unit tests (asan_ubsan) dropped
LLVM Coverage dropped

AI Review

Summary

This PR fixes hasAllTokens/hasAnyTokens behavior when serialize_query_plan = 1 recreates a ColumnConst needles argument with size = 0 during plan deserialization. The change correctly avoids treating such constants as empty by checking the nested data column for const needles, and adds a regression test covering both Array(String) and String needle paths on distributed execution. High-level verdict: the fix is focused, correct, and adequately tested for the reported failure mode.

ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
No large/binary files
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Apr 8, 2026
@Ergus Ergus force-pushed the Fix_hasToken_serialize_plan branch from a0f377b to 37e30c9 Compare April 8, 2026 21:02
@Ergus Ergus mentioned this pull request Apr 8, 2026
1 task
if (!column_needles)
return {};

/// When the query plan is serialized and deserialized (e.g. with serialize_query_plan=1),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: column_const --> column_needles_const

More importantly, my brain is broken. What exact type of needle column is a problem?

Copy link
Copy Markdown
Member Author

@Ergus Ergus Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

| What exact type of needle column is a problem?

Any serialized constColumn.

The deserialization code looks like:

auto column = type.createColumn();
type.getDefaultSerialization()->deserializeBinary(*column, in, FormatSettings{});
return ColumnConst::create(std::move(column), 0);
}

@ClickHouse ClickHouse deleted a comment from alexey-milovidov Apr 10, 2026
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Apr 12, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.00% 84.10% +0.10%
Functions 90.80% 90.90% +0.10%
Branches 76.50% 76.60% +0.10%

Changed lines: 62.50% (5/8) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants