ENH: allow larger than C int sized structured dtypes #31332
Changes from 20 commits
bd31722
de85568
de8440f
d83e9b9
377836b
d9bba0c
ebcfa4a
8812144
9dbee21
b266f4c
997f0c4
208193c
24ad411
3892f1d
3dba943
8fbdd2f
4c51883
ae63495
692a3f1
673b19b
53126e0
b285606
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| Structured dtypes now support larger field sizes | ||
| ------------------------------------------------ | ||
| It is now possible to construct structured data types with | ||
| field sizes and offsets that exceed the size of a standard C | ||
| integer. Arrays using these structured data types are now | ||
| also possible to construct. |
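
To make the limit concrete, here is a small arithmetic sketch (not part of the PR) of why a single field can overflow a C `int`. It assumes a 4-byte `int`, which is what `ctypes` reports on mainstream platforms, and it borrows the `2 ** 28` float64 field from the test added in this PR. The names `INT_MAX`, `field_bytes`, and `exceeds_c_int` are illustrative only.

```python
import ctypes

# Largest value a signed C int can hold on this platform
INT_MAX = 2 ** (8 * ctypes.sizeof(ctypes.c_int) - 1) - 1

# A field of 2**28 float64 values, 8 bytes each
field_bytes = 2 ** 28 * 8

# True when the field size no longer fits in a C int
exceeds_c_int = field_bytes > INT_MAX
print(field_bytes, INT_MAX, exceeds_c_int)
```

With a 4-byte `int`, the field size is exactly one byte count past `INT_MAX`, which is the kind of dtype this change is meant to allow.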
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -9,13 +9,15 @@ | |
|
|
||
| import numpy as np | ||
| from numpy.testing import ( | ||
| IS_64BIT, | ||
| assert_, | ||
| assert_array_almost_equal, | ||
| assert_array_equal, | ||
| assert_equal, | ||
| assert_raises, | ||
| temppath, | ||
| ) | ||
| from numpy.testing._private.utils import requires_memory | ||
|
|
||
|
|
||
| class TestFromrecords: | ||
|
|
@@ -108,6 +110,17 @@ def test_recarray_fromfile(self): | |
| assert_equal(r1, r2) | ||
| assert_equal(r2, r3) | ||
|
|
||
| @pytest.mark.skipif(not IS_64BIT, reason="test requires 64-bit system") | ||
| @requires_memory(free_bytes=2e9) | ||
| def test_recarray_fromfile_massive(self, tmpdir): | ||
| kind = [("x", np.float64, 2 ** 28)] | ||
| kind_dtype = np.dtype(kind) | ||
| rec_arr = np.array((1,), dtype=kind_dtype) | ||
| with tmpdir.as_cwd(): | ||
| rec_arr.tofile("f.data") | ||
| actual = np.fromfile("f.data", dtype=kind_dtype) | ||
| assert actual.itemsize == 2 ** 28 * 8 | ||
|
Member
Ohh, fun. OTOH, the error here would be that we are not reading everything (i.e. the last bit of the result not being -1). But I think there is a fun little thing happening here:
So, no problem until we reach 2**32, at which point the error would be reading nothing at all. Possibly we could employ a funny trick here: just try to read 1 element of a 2**32+1 sized dtype from a short file (not empty with the +1).
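
The wrap-around described in this comment can be modeled in plain Python. This is only a sketch under an assumption: that somewhere the element size is truncated to 32 bits as an unsigned value. The thread does not show the actual NumPy code path, and `truncate_to_uint32` is a hypothetical helper, not a NumPy function.

```python
def truncate_to_uint32(nbytes: int) -> int:
    """Keep only the low 32 bits, as an unsigned 32-bit cast would (assumed model)."""
    return nbytes & 0xFFFFFFFF


# Element sizes around the 32-bit boundary
for size in (2 ** 31, 2 ** 32, 2 ** 32 + 1):
    print(size, "->", truncate_to_uint32(size))
```

Under this model, a 2**32-byte element size collapses to 0, so nothing would be read, while 2**32+1 collapses to 1, which would explain why the "+1" variant is suggested for a short, non-empty file.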
Contributor
Author
Playing with this a bit, something really strange/bad happens somewhere--I can retrieve items from the reconstituted array quickly, but attempting to perform assertions on even those individual elements is incredibly slow. For example,
    item = actual["x"][0][1].item() # fast
    assert item > 0 # prohibitively slow
It is only slow if it is destined to fail (!), with or without the I didn't find any advantage to using For now, I've pushed in a revision to
Member
I think I just skipped over that it seemed to me that we pass
In general, or in
Contributor
Author
Ah yeah, thanks, looks like suppressing the traceback allows the failure scenario to happen in a second or two (when type change is reverted): That's annoying that it doesn't auto truncate/summarize. This also "works", though it is not very nice:
--- a/numpy/_core/tests/test_records.py
+++ b/numpy/_core/tests/test_records.py
@@ -122,7 +122,9 @@ def test_recarray_fromfile_massive(self, tmpdir):
actual = np.fromfile("f.data", dtype=kind_dtype)
assert actual.itemsize == 2 ** 29 * 8
item = actual["x"][0][1]
- assert_allclose(item, 1)
+ if item != 1:
+ # avoid hang from pytest traceback dumping massive array
+ pytest.fail("fromfile elsize error", pytrace=False)
I'll leave this one alone for now--seems to be getting closer to something sensible perhaps... |
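
The workaround in the diff above can be isolated into a small helper. This is a sketch, not code from the PR: `check_scalar` is a hypothetical name, and it relies only on `pytest.fail` with `pytrace=False`, the same call the diff uses.

```python
import pytest


def check_scalar(item, expected):
    """Compare one extracted scalar and fail without a pytest traceback."""
    if item != expected:
        # pytrace=False makes the message the full failure report,
        # so no Python traceback is reported
        pytest.fail(f"expected {expected!r}, got {item!r}", pytrace=False)


check_scalar(1.0, 1)  # equal values: returns without failing
```

The design choice here is to compare a single scalar and keep the failure message short, rather than handing the whole array to an assertion helper.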
||
|
|
||
| def test_recarray_from_obj(self): | ||
| count = 10 | ||
| a = np.zeros(count, dtype='O') | ||
|
|
||