Thanks, the problem exist in partial path now actually, using emit::First is usually a bad idea unless the sorted input case.
I think at least we should rename the emit::First and add comments to tell that.
Originally posted by @Rachelint in #23182 (comment)
#23182 fixed final hash aggregate terminal output by materializing once and slicing.
However, the partial hash aggregate path still uses emit_to_for_batch_size(...)
and therefore EmitTo::First(batch_size) when group_count > batch_size.
EmitTo::First is not just "return first n rows"; it also shifts remaining
group indexes and maintains GroupValues lookup state. For terminal drain of a
hash table, this may repeat work that is no longer needed.
Need to verify:
- whether high-cardinality partial aggregate output shows similar emitting_time overhead
- whether materialize-once-then-slice is acceptable for partial output
- whether EmitTo::First should be renamed/commented to make destructive renumbering explicit
Originally posted by @Rachelint in #23182 (comment)
#23182 fixed final hash aggregate terminal output by materializing once and slicing.
However, the partial hash aggregate path still uses emit_to_for_batch_size(...)
and therefore EmitTo::First(batch_size) when group_count > batch_size.
EmitTo::First is not just "return first n rows"; it also shifts remaining
group indexes and maintains GroupValues lookup state. For terminal drain of a
hash table, this may repeat work that is no longer needed.
Need to verify: