BUG: Fix np.strings.slice if stop=None or start and stop >= len (#29944) #30059
Merged
charris merged 1 commit into numpy:maintenance/2.3.x on Oct 23, 2025
…y#29944)

Python treats `slice(-1)` differently from `slice(-1, None)`: the first is interpreted as `slice(None, -1, None)`, while the second becomes `slice(-1, None, None)`, according to the logic in `slice_new`.

However, `np.strings.slice` treats these identically, as it cannot distinguish unset arguments from arguments set to `None`. This makes it impossible to get the last characters of each string, for example:

```python
>>> a = np.array(['hello', 'world'])
>>> np.strings.slice(a, -2, None)  # should return last two characters
array(['hel', 'wor'], dtype='<U5')
```

This commit fixes that behavior:

```python
>>> a = np.array(['hello', 'world'])
>>> np.strings.slice(a, -2, None)  # returns last characters as expected
array(['lo', 'ld'], dtype='<U5')
>>> np.strings.slice(a, -2)  # original behavior preserved if no stop
array(['hel', 'wor'], dtype='<U5')
```

It does this by adding a `stop=np._NoValue` default argument to `np.strings.slice`, which can be overridden with `None`.

This commit also adds test conditions to `numpy/_core/tests/test_strings.py::TestMethods::test_slice` to verify that the slicing behavior matches Python's `slice`. Note that 4 newly added test conditions are commented out for now, as they cause errors with the "T" dtype. To reproduce:

```
>>> np.__version__
'2.3.3'
>>> a = np.array(['hello', 'world'], dtype="T")
>>> np.strings.slice(a, 5, 7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "numpy-dev/lib/python3.12/site-packages/numpy/_core/strings.py", line 1823, in slice
    return _slice(a, start, stop, step)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
MemoryError: Failed to allocate string in slice
```

This causes either a MemoryError or kills the process with code 251.

* BUG: Fix np.strings.slice when start and stop >= len

  Allows commented test_slice conditions to be uncommented.
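For readers unfamiliar with the sentinel-default pattern described above, here is a minimal pure-Python sketch of the idea. It is not the NumPy implementation: `_NO_VALUE` and `slice_strings` are hypothetical names standing in for `np._NoValue` and `np.strings.slice`, and the function works on a plain list of strings rather than an array.

```python
# Hypothetical stand-in for np._NoValue: a unique sentinel that an
# explicit None can never be confused with.
_NO_VALUE = object()


def slice_strings(strings, start=None, stop=_NO_VALUE, step=None):
    """Slice each string, reading the arguments the way Python's slice() does."""
    if stop is _NO_VALUE:
        # stop was omitted entirely: the single bound acts as the stop,
        # matching slice(x) == slice(None, x, None).
        start, stop = None, start
    window = slice(start, stop, step)
    return [s[window] for s in strings]


words = ["hello", "world"]
last_two = slice_strings(words, -2, None)  # explicit stop=None -> slice(-2, None)
all_but_two = slice_strings(words, -2)     # stop omitted -> slice(None, -2)
past_end = slice_strings(words, 5, 7)      # start and stop >= len -> empty strings
```

Because the default is a dedicated sentinel object rather than `None`, a caller who passes `None` explicitly is distinguishable from one who passes nothing, which is the distinction the commit message relies on.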