Skip to content

feat(bigframes): Support unstable sort_values, sort_index#16665

Draft
TrevorBergeron wants to merge 1 commit intomainfrom
tbergeron_unstable_sort
Draft

feat(bigframes): Support unstable sort_values, sort_index#16665
TrevorBergeron wants to merge 1 commit intomainfrom
tbergeron_unstable_sort

Conversation

@TrevorBergeron
Copy link
Copy Markdown
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements stable sorting support by adding a kind parameter to sort_values and sort_index methods and updating the underlying core logic to handle sorting stability. The review feedback recommends centralizing the list of stable sort algorithms into a constant for better maintainability and suggests several refinements to type hints and default values in method overloads. Furthermore, the reviewer identified an opportunity to propagate the kind parameter when sorting DataFrames along the column axis to ensure consistent behavior.

]
VALID_WRITE_ENGINES = typing.get_args(WriteEngineType)

DEFAULT_SORT_KIND = "stable"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To avoid repeating the list of stable sort algorithms across multiple modules, consider defining a constant here.

Suggested change
DEFAULT_SORT_KIND = "stable"
DEFAULT_SORT_KIND = "stable"
STABLE_SORT_KINDS = ("stable", "mergesort")

for column in index_columns
]
return Index(self._block.order_by(ordering))
is_stable = (kind or constants.DEFAULT_SORT_KIND) in ["stable", "mergesort"]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Use the STABLE_SORT_KINDS constant for consistency and easier maintenance.

Suggested change
is_stable = (kind or constants.DEFAULT_SORT_KIND) in ["stable", "mergesort"]
is_stable = (kind or constants.DEFAULT_SORT_KIND) in constants.STABLE_SORT_KINDS

*,
ascending: bool = ...,
inplace: Literal[False] = ...,
kind: str = ...,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The kind parameter should allow None to match the implementation and other overloads.

Suggested change
kind: str = ...,
kind: str | None = ...,

*,
ascending: bool = ...,
inplace: Literal[True] = ...,
kind: str = ...,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The kind parameter should allow None to match the implementation and other overloads.

Suggested change
kind: str = ...,
kind: str | None = ...,

Comment on lines +2455 to +2456
is_stable = (kind or constants.DEFAULT_SORT_KIND) in ["stable", "mergesort"]
block = self._block.order_by(ordering, stable=is_stable)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The kind parameter is correctly used for axis=0, but it should also be propagated to self.columns.sort_values when axis=1 (the else block) to ensure consistent sorting behavior for columns. Also, consider using the STABLE_SORT_KINDS constant.

Suggested change
is_stable = (kind or constants.DEFAULT_SORT_KIND) in ["stable", "mergesort"]
block = self._block.order_by(ordering, stable=is_stable)
is_stable = (kind or constants.DEFAULT_SORT_KIND) in constants.STABLE_SORT_KINDS
block = self._block.order_by(ordering, stable=is_stable)

inplace: Literal[False] = ...,
ascending: bool | typing.Sequence[bool] = ...,
kind: str = ...,
kind: str | None = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In overloads, it is conventional to use ... as the default value instead of None.

Suggested change
kind: str | None = None,
kind: str | None = ...,

inplace: Literal[True] = ...,
ascending: bool | typing.Sequence[bool] = ...,
kind: str = ...,
kind: str | None = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In overloads, it is conventional to use ... as the default value instead of None.

Suggested change
kind: str | None = None,
kind: str | None = ...,

else order.descending_over(column_id, na_last)
)
block = self._block.order_by(ordering)
is_stable = (kind or constants.DEFAULT_SORT_KIND) in ["stable", "mergesort"]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Use the STABLE_SORT_KINDS constant for consistency.

Suggested change
is_stable = (kind or constants.DEFAULT_SORT_KIND) in ["stable", "mergesort"]
is_stable = (kind or constants.DEFAULT_SORT_KIND) in constants.STABLE_SORT_KINDS

raise ValueError(f"No axis named {axis} for object type Series")
if na_position not in ["first", "last"]:
raise ValueError("Param na_position must be one of 'first' or 'last'")
is_stable = (kind or constants.DEFAULT_SORT_KIND) in ["stable", "mergesort"]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Use the STABLE_SORT_KINDS constant for consistency.

Suggested change
is_stable = (kind or constants.DEFAULT_SORT_KIND) in ["stable", "mergesort"]
is_stable = (kind or constants.DEFAULT_SORT_KIND) in constants.STABLE_SORT_KINDS

for column in block.index_columns
]
block = block.order_by(ordering)
is_stable = (kind or constants.DEFAULT_SORT_KIND) in ["stable", "mergesort"]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Use the STABLE_SORT_KINDS constant for consistency.

Suggested change
is_stable = (kind or constants.DEFAULT_SORT_KIND) in ["stable", "mergesort"]
is_stable = (kind or constants.DEFAULT_SORT_KIND) in constants.STABLE_SORT_KINDS

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant