ENH: add NDArrayBackedExtensionArray to public API#45544
Conversation
tswast
commented
Jan 21, 2022
- closes #xxxx
- tests added / passed
- Ensure all linting tests pass, see here for how to run them
- whatsnew entry
|
In the |
|
no objections |
| ExtensionArray, | ||
| ExtensionScalarOpsMixin, | ||
| ) | ||
| from pandas.core.arrays._mixins import NDArrayBackedExtensionArray |
There was a problem hiding this comment.
i am shocked we don't have a tests that asserts this api e.g. test_api.py can you add one ?
There was a problem hiding this comment.
we should probably import this into pandas.core.arrays.__init__ and here do the import from pandas.core.arrays
|
I've been meaning to get around to this for a while. Thanks @tswast for beating me to it! One other thing we should ideally do is add a paragraph in the doc/source/development/extending.rst about this. something to the effect of "if your EA is a thin wrapper around an ndarray, you can save some effort by using ..." |
… into python-db-dtypes-pandas-issue28
|
Added some tests and docs. Should be ready for review now. |
Just pulled in the latest changes. Would love to get this documented, as it has been helpful in building the |
| Convert a value or values for use in setting a value or values in the backing | ||
| NumPy array. | ||
|
|
||
| ``_validate_searchsorted_value`` |
There was a problem hiding this comment.
In 2.0 i think this is going away and we'll re-use _validate_setitem_value for this
There was a problem hiding this comment.
Clarified that most implementations will be identical to _validate_setitem_value.
There was a problem hiding this comment.
_validate_searchsorted_value is gone now
There was a problem hiding this comment.
can you remove _validate_searchsorted_value here
|
|
||
|
|
||
| To support 2D arrays, use the ``_from_backing_data`` helper function when a | ||
| method is called on multi-dimensional data. |
There was a problem hiding this comment.
specify the data should be of the same dtype as self._ndarray?
|
couple of comments, otherwise lgtm. thanks for your patience |
|
@tswast can you merge main. some of the deprecations discussed above have been enforced |
…andas into python-db-dtypes-pandas-issue28
|
@jbrockmendel I've synced to main and addressed a remaining docs build warning. |
| _internal_fill_value = numpy.datetime64("NaT") | ||
|
|
||
| def __init__(self, values): | ||
| backing_array_dtype = "<M8[ns]" |
There was a problem hiding this comment.
can you make this a np.dtype object instead of a string
|
|
||
| def _validate_setitem_value(self, value): | ||
| if pandas.api.types.is_list_like(value): | ||
| return [self._validate_scalar(v) for v in value] |
There was a problem hiding this comment.
this should be an ndarray of the same dtype as self._ndarray
| - :func:`timedelta_range` now supports a ``unit`` keyword ("s", "ms", "us", or "ns") to specify the desired resolution of the output index (:issue:`49824`) | ||
| - :meth:`DataFrame.to_json` now supports a ``mode`` keyword with supported inputs 'w' and 'a'. Defaulting to 'w', 'a' can be used when lines=True and orient='records' to append record oriented json lines to an existing json file. (:issue:`35849`) | ||
| - Added ``name`` parameter to :meth:`IntervalIndex.from_breaks`, :meth:`IntervalIndex.from_arrays` and :meth:`IntervalIndex.from_tuples` (:issue:`48911`) | ||
| - :class:`NDArrayBackedExtensionArray` now exposed in the public API. (:issue:`45544`) |
| def test_api(self): | ||
| checkthese = self.classes + self.funcs + self.misc | ||
|
|
||
| self.check(namespace=extensions, expected=checkthese) |
There was a problem hiding this comment.
im not that familiar with this test file. what is being tested here?
|
Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen. |
|
@tswast now that 2.2 is out the door, and now that Pandas 2.x in general has made huge strides in its ExtensionArray implementation(s), could we get this PR back to active status by merging in main and resubmitting? If you don't have time for this, perhaps I can take a crack at it (with help from other Pint-Pandas people). In that case, what is the git-friendly way for me to pick up where you left off? We have an active use case that's ready to put this to the test. CC: @andrewgsavage |
|
@MichaelTiemannOSC I had a crack at it in my PR #56755. It's still active, just waiting on some help from pandas devs. |
|
Thanks @MichaelTiemannOSC and @andrewgsavage , I don't have time to revive this, so I appreciate your efforts. |