Skip to content

Fix setting of NULL values in logical FITS columns#19694

Open
astrofrog wants to merge 1 commit into
astropy:mainfrom
astrofrog:fix-18755-vla-logical-validation
Open

Fix setting of NULL values in logical FITS columns#19694
astrofrog wants to merge 1 commit into
astropy:mainfrom
astrofrog:fix-18755-vla-logical-validation

Conversation

@astrofrog
Copy link
Copy Markdown
Member

@astrofrog astrofrog commented May 6, 2026

While it is possible to set logical column to byte string arrays (e.g. np.array([b"F", b"\x00", b"T"]), things silently fail or error when trying to pass python None or masked arrays - e.g. using:

col6 = fits.Column(name="f", format="PL", array=[[None, False, True]])

as mentioned in #18755.

For example when trying to pass a list with True/False/None:

In [1]: import numpy as np

In [2]: from astropy.io import fits

In [3]: col = fits.Column(name="flag", format="PL", array=[[None, False, True]])

In [4]: fits.BinTableHDU.from_columns([col]).writeto("/tmp/case1.fits", overwrite=True)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 fits.BinTableHDU.from_columns([col]).writeto("/tmp/case1.fits", overwrite=True)

File ~/Code/Astropy/astropy/astropy/io/fits/hdu/table.py:156, in _TableLikeHDU.from_columns(cls, columns, header, nrows, fill, character_as_bytes, logical_as_bytes, **kwargs)
     98 """
     99 Given either a `ColDefs` object, a sequence of `Column` objects,
    100 or another table HDU or table data (a `FITS_rec` or multi-field
   (...)    153 ``__init__`` may also be passed in as keyword arguments.
    154 """
    155 coldefs = cls._columns_type(columns)
--> 156 data = FITS_rec.from_columns(
    157     coldefs,
    158     nrows=nrows,
    159     fill=fill,
    160     character_as_bytes=character_as_bytes,
    161     logical_as_bytes=logical_as_bytes,
    162 )
    163 hdu = cls(
    164     data=data,
    165     header=header,
   (...)    168     **kwargs,
    169 )
    170 coldefs._add_listener(hdu)

File ~/Code/Astropy/astropy/astropy/io/fits/fitsrec.py:412, in FITS_rec.from_columns(cls, columns, nrows, fill, character_as_bytes, logical_as_bytes)
    410         continue
    411 elif isinstance(recformat, _FormatP):
--> 412     data._cache_field(name, _makep(inarr, field, recformat, nrows=nrows))
    413     continue
    414 # TODO: Find a better way of determining that the column is meant
    415 # to be FITS L formatted

File ~/Code/Astropy/astropy/astropy/io/fits/column.py:2295, in _makep(array, descr_output, format, nrows)
   2288     data_output[idx] = np.asarray(rowval, dtype="S1")
   2289 elif is_logical:
   2290     # Route through int8 first so non-numeric/non-bool inputs
   2291     # (strings, None, ...) raise at write time, matching the
   2292     # behavior of astropy <= 7.2.0; bypassing this and using
   2293     # ``astype(bool)`` directly would silently coerce e.g.
   2294     # ``["T", "F"]`` to ``[True, True]``.
-> 2295     data_output[idx] = np.array(rowval, dtype=np.int8).astype(bool, copy=False)
   2296 else:
   2297     data_output[idx] = np.array(rowval, dtype=format.dtype)

TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'

Or a masked array, which would make sense:

In [5]: col = fits.Column(name="flag", format="PL", array=[np.ma.masked_array([False, False, True], mask=[True, False, False])])

In [6]: fits.BinTableHDU.from_columns([col]).writeto("/tmp/case2.fits", overwrite=True)

In [7]: fits.open("/tmp/case2.fits", logical_as_bytes=True)[1].data["flag"][0].tobytes()
Out[7]: b'FFT'

With this PR, both cases are now handled correctly:

In [1]: import numpy as np

In [2]: from astropy.io import fits
^[[A
In [3]: col = fits.Column(name="flag", format="PL", array=[[None, False, True]])

In [4]: fits.BinTableHDU.from_columns([col]).writeto("/tmp/case1.fits", overwrite=True)

In [5]: fits.open("/tmp/case1.fits", logical_as_bytes=True)[1].data["flag"][0].tobytes()
Out[5]: b'\x00FT'

In [6]: col = fits.Column(name="flag", format="PL", array=[np.ma.masked_array([False, False, True], mask=[True, False, False])])

In [7]: fits.BinTableHDU.from_columns([col]).writeto("/tmp/case2.fits", overwrite=True)

In [8]: fits.open("/tmp/case2.fits", logical_as_bytes=True)[1].data["flag"][0].tobytes()
Out[8]: b'\x00FT'

This fix applies to both the VLAs and fixed L columns.

@saimn - do you agree this is desirable? If so, I'm flexible as to whether we call this a bugfix or a feature - I'd lean towards bugfix to avoid people running into issues in the 8.0.x branch, but I'm happy to defer to your call.

This shouldn't be backported to v7.2.x, only v8.0.x.

  • By checking this box, the PR author has requested that maintainers do NOT use the "Squash and Merge" button. Maintainers should respect this when possible; however, the final decision is at the discretion of the maintainer that merges the PR.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see instructions for rebase and squash.
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that makes a "What's new?" entry worthwhile and if so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

@astrofrog
Copy link
Copy Markdown
Member Author

This is not yet ready for review

@astrofrog astrofrog force-pushed the fix-18755-vla-logical-validation branch 3 times, most recently from 918bb4f to 16455da Compare May 7, 2026 09:12
@astrofrog astrofrog added Bug backport-v8.0.x on-merge: backport to v8.0.x labels May 7, 2026
@astrofrog astrofrog force-pushed the fix-18755-vla-logical-validation branch 2 times, most recently from 81daa26 to 51854c8 Compare May 7, 2026 09:50
@astrofrog astrofrog added this to the v8.0.0 milestone May 7, 2026
@astrofrog astrofrog force-pushed the fix-18755-vla-logical-validation branch from 51854c8 to cc908ea Compare May 7, 2026 09:54
@astrofrog astrofrog added the Extra CI Run cron CI as part of PR label May 7, 2026
@astrofrog astrofrog force-pushed the fix-18755-vla-logical-validation branch from cc908ea to 5bcba81 Compare May 7, 2026 10:05
@astrofrog astrofrog changed the title Remaining VLA/logical fixes Fix setting of NULL values in logical FITS columns May 7, 2026
@astrofrog astrofrog marked this pull request as ready for review May 7, 2026 10:08
@astrofrog astrofrog requested a review from saimn as a code owner May 7, 2026 10:08
@astrofrog astrofrog force-pushed the fix-18755-vla-logical-validation branch from 5bcba81 to a3547a0 Compare May 7, 2026 10:19
@astrofrog
Copy link
Copy Markdown
Member Author

This is now ready for review

@pllim
Copy link
Copy Markdown
Member

pllim commented May 15, 2026

Should we get this in for the beta?

@astrofrog
Copy link
Copy Markdown
Member Author

No because I don't want to rush it and I want @saimn to review.

@kYwzor
Copy link
Copy Markdown
Member

kYwzor commented May 16, 2026

I've played around with this branch a bit. It seems to indeed enable us to write NULL values, which is great. By pure chance, I ended up finding a weird behavior that I'd classify as a bug. Running this:

from astropy.io import fits

col1 = fits.Column(name="a", format="PL", array=[[True, False, True]])
col2 = fits.Column(name="b", format="PL", array=[[None, False, True]])
col3 = fits.Column(name="c", format="L", array=[None, True, False])
col4 = fits.Column(name="d", format="L", array=[None])
col5 = fits.Column(name="e", format="L", array=[True])
col6 = fits.Column(name="f", format="D", array=[1.0,2.0])
col7 = fits.Column(name="g", format="PD", array=[[1.0,2.0]])
tab = fits.BinTableHDU.from_columns(name="test", columns=[col1, col2, col3, col4, col5, col6, col7])

testfile = "example.fits"
tab.writeto(testfile, overwrite=True)
t = fits.getdata(testfile, logical_as_bytes=True)
print(repr(t))

We get :

FITS_rec([([b'T', b'F', b'T'], [b'', b'F', b'T'], b'', b'', b'T', 1., [1.0, 2.0]),
          ([b'F', b'F', b'F'], [b'', b'', b''], b'T', b'', b'', 2., [0.0, 0.0]),
          ([b'F', b'F', b'F'], [b'', b'', b''], b'F', b'', b'', 0., [0.0, 0.0])],
         dtype=(numpy.record, [('a', '>i4', (2,)), ('b', '>i4', (2,)), ('c', 'i1'), ('d', 'i1'), ('e', 'i1'), ('f', '>f8'), ('g', '>i4', (2,))]))

I wasn't aware of this, but apparently io.fits has some sort of mechanism that, when one of the columns has more rows than the others, those other columns are silently expanded to match the number of rows (not related to this PR, I verified the same occurs in main). However, this "expansion" is a bit weird on PL columns: if a column only contains True/False values, new rows are created with all their values set to False, while if a column contains NULL values, new rows are created with all their values set to NULL. Meanwhile, L columns always expand with NULL.

I am unsure which option (filling-in with False or NULL) is the correct one, but clearly it should be consistent and only use one those options. My gut instinct says to always fill in with NULL, as that's the most neutral option (and matches will with the concept of masking), but I've also noticed that float values are getting filled in as 0 rather than NaN, so clearly this is not very well defined.

Side note: I'm doing print(repr(t)) because if I attempt to simply do print(t) it will result in a crash. I think this is also a bug, but it was introduced in #19374

@kYwzor
Copy link
Copy Markdown
Member

kYwzor commented May 16, 2026

Just to note it down, this doesn't address the astropy.table case yet (but that would probably make more sense in a separate PR anyway).

E.g. running:

import numpy as np

from astropy.io import fits
from astropy.table import Table

testfile = "testfile.fits"

t_w = Table(rows=[[np.ma.masked],[False],[True]],names=["a"])
t_w.write(testfile, overwrite=True)
print(repr(fits.getdata(testfile, logical_as_bytes=True)))

prints out

FITS_rec([(b'T',), (b'F',), (b'T',)],
         dtype=(numpy.record, [('a', 'i1')]))

i.e. the masked value got silently converted to True.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-v8.0.x on-merge: backport to v8.0.x Bug Extra CI Run cron CI as part of PR io.fits

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants