Refactor post-aggregate clause rebasing and validation with shared helpers#22817
Open
kosiew wants to merge 8 commits into
Open
Refactor post-aggregate clause rebasing and validation with shared helpers#22817kosiew wants to merge 8 commits into
kosiew wants to merge 8 commits into
Conversation
- Added focused test cases for diagnostics in aggregate queries: - SELECT invalid diagnostic - HAVING invalid diagnostic - QUALIFY invalid diagnostic - ORDER BY invalid diagnostic - DISTINCT ON invalid diagnostic - ORDER BY aggregate alias valid case - DISTINCT ON aggregate alias valid case
…gregate expressions - Introduced `rebase_and_validate_post_aggregate_exprs` to streamline the SELECT projection rebase/validation process. - Updated SELECT projection to utilize the new helper, while leaving HAVING, QUALIFY, ORDER BY, and DISTINCT ON unchanged.
…ype in select.rs - Renamed `valid_exprs` to `valid_column_exprs` in `datafusion/sql/src/select.rs` - Shortened collect type from `Vec<Expr>` to `Vec<_>` - In `datafusion/sqllogictest/test_files/aggregate.slt`: - Replaced repeated CTEs with one test table: `post_aggregate_clause_validation` - Maintained the same diagnostics and outputs - Added cleanup with `DROP TABLE`
…ggregate expressions - Added rebase_and_validate_optional_post_aggregate_expr helper - Migrated HAVING clause to use the new helper - Migrated QUALIFY clause to use the new helper - Kept ORDER BY / DISTINCT ON implementation unchanged
- Implemented rebase_and_validate_post_aggregate_exprs for DISTINCT ON on_exprs - Maintained validation using all_valid_exprs - Kept ORDER BY logic unchanged
…r and reorder ORDER BY logic - Introduced a private helper function `rebase_and_validate_post_aggregate_sort_exprs` - Reorganized the ORDER BY handling: - Moved rebase logic - Updated SELECT-alias remapping - Rebuilt SortExpr - Ensured validation with OrderBy - Maintained existing logic for all_valid_exprs - Kept DISTINCT ON/HAVING/QUALIFY handling unchanged from previous migrations
- Updated optional helper to return explicit Ok(Some(rebased_expr)) - Introduced private order_by_select_alias_expr - Modified ORDER BY helper to utilize named alias-remap helper - Enhanced all_valid_exprs alias extraction with match usage - Condensed ORDER BY call-site comment for clarity
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Closes #22684
Rationale for this change
SqlToRel::aggregatecontained several implementations of the same post-aggregate workflow across SELECT, HAVING, QUALIFY, ORDER BY, and DISTINCT ON:CheckColumnsMustReferenceAggregatePurpose.Maintaining this logic in multiple places increases the risk of behavioral drift during future planner changes. This change consolidates the shared mechanics into private helper functions while preserving existing SQL semantics, alias handling, sort behavior, and clause-specific error messages.
What changes are included in this PR?
Added private helper functions for post-aggregate expression processing:
rebase_and_validate_post_aggregate_exprsrebase_and_validate_optional_post_aggregate_exprrebase_and_validate_post_aggregate_sort_exprsorder_by_select_alias_exprMigrated SELECT projection post-aggregate rebasing and validation to shared helper logic.
Migrated HAVING and QUALIFY post-aggregate rebasing and validation to shared helper logic.
Migrated DISTINCT ON post-aggregate rebasing and validation to shared helper logic while continuing to validate against aggregate outputs plus valid SELECT aliases.
Migrated ORDER BY post-aggregate rebasing and validation to a dedicated helper that:
SortExpr::with_expr.Simplified aggregate planning code by removing duplicated rebasing and validation logic.
Are these changes tested?
Yes.
Added SQLLogicTest coverage in
aggregate.sltfor:Are there any user-facing changes?
No intended user-facing behavior changes.
This PR is a refactor that preserves existing aggregate planning semantics while adding regression coverage for clause-specific validation diagnostics and alias-handling behavior.
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.