Add more regexp_replace test coverage #21485
Merged
alamb merged 3 commits into apache:main on Apr 9, 2026
Dandandan approved these changes on Apr 8, 2026
Author (Contributor): Thanks @Dandandan
github-merge-queue bot pushed a commit that referenced this pull request on Apr 14, 2026
- Draft as it builds on #21379

## Which issue does this PR close?

- Follow on to #21379 from @Dandandan

## Rationale for this change

#21379 adds a specific optimization, but it had some non-trivial code duplication. I wanted to reduce the duplication and also make the code easier to read (in my opinion).

## What changes are included in this PR?

Move the special RegExp logic into its own struct, and add copious comments.

## Are these changes tested?

By existing tests, and the additional tests added in
- #21485

## Are there any user-facing changes?

No, these are internal changes only.

I ran benchmarks and see no change in performance (as expected).
Which issue does this PR close?
regexp_replace by stripping trailing .* from anchored patterns. 2.4x improvement (ClickBench Q28) #21379
Rationale for this change
While reviewing #21379 I noticed there was minimal Utf8View coverage of the related code.
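For context, the idea named in the #21379 title can be sketched roughly as follows. This is an illustration in Python using the standard `re` module, not DataFusion's code: the helper names, the URL pattern, and the inputs are all made up for the example, and it assumes the replacement is exactly `\1`, the pattern starts with `^` and ends with `.*$`, and the inputs contain no newlines (where `.` would stop matching).

```python
import re


def replace_full(pattern: str, text: str) -> str:
    """Reference behaviour: replace the first match with capture group 1."""
    return re.sub(pattern, r"\1", text, count=1)


def replace_stripped(pattern: str, text: str) -> str:
    """Shortcut: drop the trailing `.*$`, match the prefix, return group 1.

    Hypothetical helper for illustration only. It does not handle an
    escaped dot (a pattern ending in a literal `\\.` followed by `*$`).
    """
    assert pattern.startswith("^") and pattern.endswith(".*$")
    prefix = pattern[: -len(".*$")]
    m = re.match(prefix, text)
    # On no match, regexp_replace-style semantics leave the input unchanged.
    return m.group(1) if m else text


# Illustrative pattern and inputs (assumptions, not taken from the PR).
PATTERN = r"^https?://(?:www\.)?([^/]+)/.*$"
INPUTS = ["https://www.example.com/a/b", "http://example.org/", "not a url", ""]

for s in INPUTS:
    # Both strategies should agree on every input.
    assert replace_full(PATTERN, s) == replace_stripped(PATTERN, s)
```

Because a shortcut like this changes the code path without changing the expected result, it makes sense to run the same cases over every string representation the function accepts.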
What changes are included in this PR?
Update the regexp_replace tests to cover Utf8, LargeUtf8, Utf8View, and Dictionary inputs.
Are these changes tested?
Yes, this PR only adds tests.
I verified these tests also pass when run on regexp_replace by stripping trailing .* from anchored patterns. 2.4x improvement (ClickBench Q28) #21379
Are there any user-facing changes?
No