fix(codice): normalize epoch_delta metadata and epoch links#3231
Conversation
Normalize CoDICE `epoch_delta_minus` and `epoch_delta_plus` across the five L2 products that currently emit them: - `imap_codice_l2_hi-omni` - `imap_codice_l2_hi-sectored` - `imap_codice_l2_lo-sw-species` - `imap_codice_l2_hi-direct-events` - `imap_codice_l2_lo-direct-events` Before this change, the products were split across three inconsistent states: - `hi-omni` already wrote `epoch_delta_*` as `CDF_INT8`, but the final output `epoch` variable did not retain `DELTA_MINUS_VAR` / `DELTA_PLUS_VAR` - `lo-sw-species` still wrote `epoch_delta_*` as floating-point values with float-style metadata, so SPDF accepted the file even though it did not follow the intended TT2000-style integer-duration convention - direct-events still wrote `epoch_delta_*` as `CDF_DOUBLE` while the L2 attrs were already trying to describe them as integer support variables, which caused SPDF validator failures on datatype, format, and fill semantics Fix the shared CoDICE timing source so delta durations are computed as exact integer nanoseconds. Specifically: - update `get_codice_epoch_time()` to compute `delta_times` with integer arithmetic end-to-end - keep the center-time calculation in floating-point seconds, but feed it integer `delta_times` - avoid float-to-int conversion entirely, so the written delta variables naturally serialize as `CDF_INT8` Also normalize the CoDICE `epoch_delta_*` metadata to one shared integer-duration convention: - `FILLVAL = -9223372036854775808` - `FORMAT = I19` - `VALIDMIN = 0` - `VALIDMAX = 128000000000` The chosen `VALIDMAX` corresponds to the current CoDICE timing model: - `delta = num_spins * spin_period / 2` - `spin_period VALIDMAX = 16 s` - `num_spins max = 16` - worst-case delta = `128 s = 128000000000 ns` Apply that metadata in: - the shared CoDICE L1A attrs - the explicit L2 `epoch_delta_*` overrides for: - hi-omni - hi-sectored - hi-direct-events - lo-direct-events Finally, attach the delta-variable links to the final emitted `epoch` variable in all five target L2 products: - `DELTA_MINUS_VAR = "epoch_delta_minus"` - `DELTA_PLUS_VAR = "epoch_delta_plus"` This is done on the final L2 dataset rather than on intermediate source datasets so the links survive writeout consistently. Extend the written-CDF regression coverage to verify: - representative CoDICE L1A products now write `epoch_delta_*` as `CDF_INT8` - `hi-omni` and `hi-sectored` write `CDF_INT8` deltas with the normalized metadata and carry the `epoch` delta links - `lo-sw-species`, `hi-direct-events`, and `lo-direct-events` do the same Validated locally with: - `pytest -q imap_processing/tests/codice/test_codice_l1a.py` - `pytest -q imap_processing/tests/codice/test_codice_hi_l2.py` - `pytest -q imap_processing/tests/codice/test_codice_l2.py -k 'sw_species or lo_de or hi_de'` - `pre-commit` on the touched CoDICE files
tmplummer
left a comment
There was a problem hiding this comment.
Just two comments to respond to.
| return dataset | ||
|
|
||
|
|
||
| def _attach_epoch_delta_links(dataset: xr.Dataset) -> xr.Dataset: |
There was a problem hiding this comment.
Is there a reason that this isn't done in the variable attrs yaml? I believe that the underlying functionality allows you to specify only what you need and the uses dict.update() so that existing attributes don't get lost. For example, here is an L2 Map epoch definition:
epoch:
BIN_LOCATION: 0
CATDESC: Map time range start, number of nanoseconds since J2000 with leap seconds included
DELTA_MINUS_VAR: epoch_delta_minus
DELTA_PLUS_VAR: epoch_delta
FIELDNAM: J2000 Nanoseconds
There was a problem hiding this comment.
Good point. Didn't quite understand that was how it worked.
I moved the DELTA_MINUS_VAR / DELTA_PLUS_VAR links into the CoDICE L2 variable-attrs YAMLs and removed the in-code epoch attr mutation.
One small follow-up was needed for lo-sw-species: unlike the other affected CoDICE L2 products, that path was carrying forward the loaded epoch coordinate through the intensity calculations without later updating its attrs from the L2 attr manager. As a result, after moving the delta-link definitions into YAML, lo-sw-species was not picking them up automatically in the final written CDF.
I fixed that by adding the same final epoch attr update pattern already used elsewhere, so the YAML-defined DELTA_MINUS_VAR / DELTA_PLUS_VAR links are applied consistently before writeout.
|
|
||
| pytestmark = pytest.mark.external_test_data | ||
|
|
||
| EXPECTED_EPOCH_DELTA_VALIDMAX = 128000000000 |
There was a problem hiding this comment.
Can you add a comment about where this value comes from?
There was a problem hiding this comment.
Added a short comment in the test explaining where 128000000000 comes from.
It is derived from the current CoDICE timing model: epoch_delta = num_spins * spin_period / 2, with spin_period VALIDMAX = 16 s and num_spins max = 16, which gives a worst-case delta of 128 s = 128000000000 ns.
Move the CoDICE `epoch` delta-link metadata out of `codice_l2.py` and into the product variable-attrs YAMLs, as suggested in review. This keeps the `DELTA_MINUS_VAR` / `DELTA_PLUS_VAR` links declared in the same metadata layer as the rest of the emitted CDF attributes, instead of mutating the final `epoch` attrs in code. The affected products are: - `imap_codice_l2_hi-omni` - `imap_codice_l2_hi-sectored` - `imap_codice_l2_lo-sw-species` - `imap_codice_l2_hi-direct-events` - `imap_codice_l2_lo-direct-events` Also remove the now-unneeded in-code helper that attached those attrs to the final dataset. One follow-up change is needed in the `lo-sw-species` processing path: after the intensity calculation, refresh the `epoch` attrs from the L2 attr manager so the YAML-defined delta links are carried into the final written file. The other CoDICE L2 paths were already reapplying `epoch` attrs before writeout. Finally, add a short test comment explaining the provenance of `EXPECTED_EPOCH_DELTA_VALIDMAX = 128000000000`, since that bound is derived from the current CoDICE timing model rather than from the raw `int64` storage limit. Validated locally with: - `pytest -q imap_processing/tests/codice/test_codice_l1a.py` - `pytest -q imap_processing/tests/codice/test_codice_hi_l2.py` - `pytest -q imap_processing/tests/codice/test_codice_l2.py -k 'sw_species or lo_de or hi_de'` - `pre-commit run --files ...`
c7fa4a8
into
IMAP-Science-Operations-Center:dev
Summary
Normalize CoDICE
epoch_delta_minusandepoch_delta_plusacross the reviewed L2 products so they are written as integer nanosecond durations and linked fromepochin the final emitted CDFs.This PR is part of the broader ISTP / metadata cleanup tracked in issue #2577. Within that larger effort, this branch fixes the remaining CoDICE
epoch_delta_*datatype and linkage inconsistencies that were still showing up across the current L2 products.Scope:
imap_codice_l2_hi-omniimap_codice_l2_hi-sectoredimap_codice_l2_lo-sw-speciesimap_codice_l2_hi-direct-eventsimap_codice_l2_lo-direct-eventsWhy
Before this change, the CoDICE products were split across several inconsistent states:
epoch_delta_*asCDF_INT8epoch.DELTA_MINUS_VAR/DELTA_PLUS_VARlinks entirelyThat produced a mix of:
epochmetadata across otherwise similar CoDICE L2 filesSo within the broader cleanup under #2577, this PR makes the CoDICE
epoch_delta_*variables consistent at the file level.What Changed
Compute
epoch_delta_*as integer nanosecondsUpdate the shared CoDICE timing helper so
epoch_delta_minusandepoch_delta_plusare computed using exact integer nanosecond arithmetic and returned asnp.int64.This avoids:
Normalize shared
epoch_delta_*metadataUpdate the shared CoDICE metadata so
epoch_delta_minusandepoch_delta_plususe integer-duration CDF metadata:FILLVAL = -9223372036854775808FORMAT = I19VALIDMIN = 0VALIDMAX = 128000000000The
VALIDMAXhere is based on the current CoDICE timing model rather than the fullint64storage range. In the current logic:epoch_delta = num_spins * spin_period / 2spin_period VALIDMAX = 16 snum_spins max = 16So the worst-case duration is:
16 * 16 s / 2 = 128 swhich gives:
128 s = 128000000000 nsThat is why this PR uses a physically meaningful duration bound rather than
9223372036854775807.Normalize product-specific direct-event / Hi overrides
Update the explicit
epoch_delta_*L2 overrides in the affected CoDICE attr templates so they all use the same integer-duration metadata.This removes the prior split where:
Add
epochdelta-variable links on final emitted outputsEnsure the final emitted
epochvariable carries:DELTA_MINUS_VAR = "epoch_delta_minus"DELTA_PLUS_VAR = "epoch_delta_plus"for all five target L2 products.
This makes the final written files consistent, including the products that previously wrote valid
epoch_delta_*variables but lost theepochlinkage.Testing
Extended the written-CDF CoDICE regression coverage to verify:
epoch_delta_*asCDF_INT8hi-omniandhi-sectoredwriteCDF_INT8deltas with the normalized metadata and carry theepochdelta linkslo-sw-species,hi-direct-events, andlo-direct-eventsdo the sameValidated locally with:
pytest -q imap_processing/tests/codice/test_codice_l1a.py pytest -q imap_processing/tests/codice/test_codice_hi_l2.py pytest -q imap_processing/tests/codice/test_codice_l2.py -k 'sw_species or lo_de or hi_de' pre-commit run --files \ imap_processing/cdf/config/imap_codice_l1a_variable_attrs.yaml \ imap_processing/cdf/config/imap_codice_l2-hi-direct-events_variable_attrs.yaml \ imap_processing/cdf/config/imap_codice_l2-hi-omni_variable_attrs.yaml \ imap_processing/cdf/config/imap_codice_l2-hi-sectored_variable_attrs.yaml \ imap_processing/cdf/config/imap_codice_l2-lo-direct-events_variable_attrs.yaml \ imap_processing/codice/codice_l2.py \ imap_processing/codice/utils.py \ imap_processing/tests/codice/test_codice_hi_l2.py \ imap_processing/tests/codice/test_codice_l1a.py \ imap_processing/tests/codice/test_codice_l2.py