Bug report
Bug description:
The C implementation of datetime.fromisoformat() silently drops the sub-second part of a
UTC offset whenever the offset's whole-second part is zero, returning timezone.utc. This
breaks round-trips: a value the module itself produces via isoformat() is not parsed
back faithfully.
>>> from datetime import datetime, timezone, timedelta
>>> dt = datetime(2020, 6, 15, 12, 34, 56, tzinfo=timezone(timedelta(microseconds=1)))
>>> s = dt.isoformat()
>>> s
'2020-06-15T12:34:56+00:00:00.000001'
>>> datetime.fromisoformat(s).utcoffset() # C accelerator
datetime.timedelta(0) # the 1-microsecond offset is gone
The pure-Python implementation is correct:
>>> import _pydatetime
>>> _pydatetime.datetime.fromisoformat(s).utcoffset()
datetime.timedelta(microseconds=1)
The negative case '...-00:00:00.000001' loses its sub-second offset in the same way.
This is not a non-standard format: isoformat() is documented to emit
+HH:MM[:SS[.ffffff]] and fromisoformat() is documented to accept it ("Time zone
offsets may have fractional seconds"), so this is documented round-trippable data being
silently dropped.
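For anyone wanting to probe the round-trip property on their own build, a minimal sketch follows. The helper name offset_survives_roundtrip and its parse parameter are made up for illustration and are not part of the datetime module; the date is the one from the example above.

```python
from datetime import datetime, timedelta, timezone


def offset_survives_roundtrip(offset, parse=datetime.fromisoformat):
    """Return True if `offset` is unchanged by isoformat() followed by parse()."""
    original = datetime(2020, 6, 15, 12, 34, 56, tzinfo=timezone(offset))
    parsed = parse(original.isoformat())
    return parsed.utcoffset() == offset


# Offsets with a nonzero whole-second part are not affected by this report.
print(offset_survives_roundtrip(timedelta(hours=5, minutes=30)))
print(offset_survives_roundtrip(timedelta(seconds=1, microseconds=1)))
# The case from this report; the result depends on which implementation
# (C accelerator or pure Python) and which build is in use.
print(offset_survives_roundtrip(timedelta(microseconds=1)))
```

Passing parse=_pydatetime.datetime.fromisoformat (where that module is available) runs the same check against the pure-Python implementation.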
Root cause
Modules/_datetimemodule.c, tzinfo_from_isoformat_results() short-circuits to UTC on the
whole-second offset alone, ignoring the parsed sub-second component:
// Create a timezone from offset in seconds (0 returns UTC)
if (tzoffset == 0) {
    return Py_NewRef(CONST_UTC(NO_STATE));
}
When tzoffset == 0 but tz_useconds != 0, the sub-second part is discarded. The
pure-Python implementation checks all offset components before collapsing to UTC.
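The difference between the two checks can be modelled in a few lines of Python. This is a simplified sketch, not the code of either implementation: the function names are hypothetical, and it assumes tzoffset carries the signed whole seconds and tz_useconds the microseconds, as the variable names suggest.

```python
from datetime import timedelta, timezone


def tzinfo_whole_seconds_check(tzoffset, tz_useconds):
    """Model of the C check: collapse to UTC on the whole-second part alone."""
    if tzoffset == 0:
        return timezone.utc  # tz_useconds is never consulted on this path
    return timezone(timedelta(seconds=tzoffset, microseconds=tz_useconds))


def tzinfo_all_components_check(tzoffset, tz_useconds):
    """Model of a check that collapses to UTC only for a truly zero offset."""
    if tzoffset == 0 and tz_useconds == 0:
        return timezone.utc
    return timezone(timedelta(seconds=tzoffset, microseconds=tz_useconds))


# Compare the two models on a zero offset, a sub-second offset, and +01:00.
for args in [(0, 0), (0, 1), (3600, 0)]:
    print(args,
          tzinfo_whole_seconds_check(*args).utcoffset(None),
          tzinfo_all_components_check(*args).utcoffset(None))
```

In this model the two functions disagree only on inputs where tzoffset is 0 and tz_useconds is not.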
Suggested fix
Only short-circuit to UTC when both the whole-second and sub-second parts are zero:
if (tzoffset == 0 && tz_useconds == 0) {
    return Py_NewRef(CONST_UTC(NO_STATE));
}
The fall-through path already builds the correct timezone(timedelta(...)). A plain
+00:00 offset still returns timezone.utc. I ran a full differential comparison of the C
implementation against the pure-Python one and confirmed that the only inputs whose
behaviour changes are offsets with a zero whole-second part and a nonzero sub-second
part; no other divergence is introduced. I have a patch and a round-trip regression
test (running under both implementations) ready.
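As a rough illustration of what such a differential comparison could look like (this is not the harness used for the patch; find_offset_divergences and the sample list are invented for this sketch):

```python
from datetime import datetime


def find_offset_divergences(strings, parse_a, parse_b):
    """Return the inputs for which the two parsers disagree on utcoffset()."""
    divergent = []
    for s in strings:
        if parse_a(s).utcoffset() != parse_b(s).utcoffset():
            divergent.append(s)
    return divergent


try:
    import _pydatetime  # pure-Python implementation, CPython 3.12+
    pure_parse = _pydatetime.datetime.fromisoformat
except ImportError:
    # Fallback so the sketch still runs; the comparison is then trivial.
    pure_parse = datetime.fromisoformat

samples = [
    "2020-06-15T12:34:56+00:00",
    "2020-06-15T12:34:56+05:30",
    "2020-06-15T12:34:56+00:00:00.000001",
    "2020-06-15T12:34:56-00:00:00.000001",
]
# Which strings are reported depends on the build being tested.
print(find_offset_divergences(samples, datetime.fromisoformat, pure_parse))
```

A real differential would generate far more inputs than this hand-written list; the point here is only the shape of the comparison.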
This is a sibling to #152060, a separate fromisoformat() defect in the pure-Python
implementation.
CPython versions tested on:
3.16 (main, built from source)
Operating systems tested on:
macOS
Linked PRs