Allow DataFrame.aggregate to accept None for no grouping #1581

Merged
timsaucer merged 1 commit into apache:main from kosiew:aggregation-enhancement-1502 on Jun 7, 2026

@kosiew kosiew commented Jun 6, 2026

Which issue does this PR close?

Rationale for this change

This makes DataFrame.aggregate accept None as a more Pythonic way to express aggregation over the whole DataFrame without grouping, equivalent to passing an empty list.

What changes are included in this PR?

  • Updates DataFrame.aggregate to allow group_by=None.
  • Treats None the same as an empty group_by list.
  • Updates the aggregate docstring examples and user guide documentation to mention None.
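
As a rough illustration of the second bullet, the normalization could look like the sketch below. `normalize_group_by` is a hypothetical helper written for this description, not the code in the PR, and it assumes `group_by` may also be passed as a single column name or expression.

```python
from __future__ import annotations

from typing import Any


def normalize_group_by(group_by: Any) -> list[Any]:
    """Hypothetical helper: coerce a group_by argument into a list.

    None and an empty list both mean "no grouping", so they map to [].
    """
    if group_by is None:
        # None is treated exactly like an empty group_by list.
        return []
    if isinstance(group_by, (list, tuple)):
        # Copy so the caller's sequence is never mutated downstream.
        return list(group_by)
    # Assumption: a single column name or expression is wrapped in a list.
    return [group_by]
```

With a helper like this, `None` and `[]` reach the rest of `aggregate` as the same empty list, so both take the no-grouping path.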

Are these changes tested?

Yes. This PR adds:

test_aggregate_none_group_by_equivalent_to_empty_list

Are there any user-facing changes?

Yes. Users can now call:

df.aggregate(None, [f.count()])

as an alternative to:

df.aggregate([], [f.count()])

This is not a breaking API change.

LLM-generated code disclosure

This PR includes code and comments generated with assistance from an LLM. All LLM-generated content has been manually reviewed and tested.

…y list

- Updated the `group_by` parameter of `DataFrame.aggregate` to accept `None` and normalize it to an empty list.
- Improved docstring for clarity.
- Added regression test in `test_dataframe.py` to verify that `None` equals an empty list.
- Updated documentation to mention that `group_by=None` is now supported.
@kosiew kosiew requested a review from timsaucer June 6, 2026 09:57

@ntjohnson1 ntjohnson1 left a comment

Thanks!

When the :code:`group_by` list is empty the aggregation is done over the whole :class:`.DataFrame`.
For grouping the :code:`group_by` list must contain at least one column.
When :code:`group_by` is :code:`None` or an empty list, the aggregation is done over the whole

NIT: It's nicer to have each sentence on its own line for managing diffs. I think Tim has a PR up to convert the rst to markdown anyway, so it's probably not that critical.

@timsaucer

Thank you @kosiew for another excellent addition! Thank you @ntjohnson1 for the review.

@timsaucer timsaucer merged commit 43df9f7 into apache:main Jun 7, 2026
31 checks passed

kosiew commented Jun 8, 2026

@ntjohnson1, @timsaucer
Thanks for the review and feedback.

Development

Successfully merging this pull request may close these issues.

Improve aggregation across entire dataframe

3 participants