datetime.fromisoformat() (C) drops the sub-second part of a UTC offset, breaking round-trips #152079

@tonghuaroot

Bug report

Bug description:

The C implementation of datetime.fromisoformat() silently drops the sub-second part of a
UTC offset whenever the offset's whole-second part is zero, returning timezone.utc. This
breaks round-trips: a value the module itself produces via isoformat() is not parsed
back faithfully.

>>> from datetime import datetime, timezone, timedelta
>>> dt = datetime(2020, 6, 15, 12, 34, 56, tzinfo=timezone(timedelta(microseconds=1)))
>>> s = dt.isoformat()
>>> s
'2020-06-15T12:34:56+00:00:00.000001'
>>> datetime.fromisoformat(s).utcoffset()      # C accelerator
datetime.timedelta(0)                          # the 1-microsecond offset is gone

The pure-Python implementation is correct:

>>> import _pydatetime
>>> _pydatetime.datetime.fromisoformat(s).utcoffset()
datetime.timedelta(microseconds=1)

A negative offset such as '...-00:00:00.000001' loses its sub-second part the same way.

This format is documented, not an extension: isoformat() is documented to emit
+HH:MM[:SS[.ffffff]], and fromisoformat() is documented to accept it ("Time zone
offsets may have fractional seconds"). The C parser is therefore silently dropping
data that is documented as round-trippable.
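A round-trip check along these lines can be pointed at either implementation. This is only a minimal sketch: `offset_round_trips` is a hypothetical helper name, not something in the datetime module, and it compares offsets field by field so that the result does not depend on how the C and pure-Python `timedelta` types compare with each other.

```python
from datetime import datetime, timedelta, timezone


def offset_round_trips(offset, datetime_cls=datetime):
    """Return True if isoformat() -> fromisoformat() preserves `offset`.

    `datetime_cls` is the class whose fromisoformat() is under test, e.g.
    datetime (normally the C accelerator) or _pydatetime.datetime.
    """
    original = datetime(2020, 6, 15, 12, 34, 56, tzinfo=timezone(offset))
    parsed = datetime_cls.fromisoformat(original.isoformat())
    got = parsed.utcoffset()
    # Compare field by field so the check works across implementations.
    return (got.days, got.seconds, got.microseconds) == (
        offset.days, offset.seconds, offset.microseconds)


if __name__ == "__main__":
    for offset in (timedelta(hours=5, minutes=30),
                   timedelta(microseconds=1),
                   timedelta(microseconds=-1)):
        print(offset, offset_round_trips(offset))
```

Given the behaviour described above, the two sub-second offsets would be expected to report False under the C accelerator. Passing `_pydatetime.datetime` as `datetime_cls` (where that module is available) exercises the pure-Python parser instead.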

Root cause

In Modules/_datetimemodule.c, tzinfo_from_isoformat_results() short-circuits to UTC
based on the whole-second offset alone, ignoring the parsed sub-second component:

// Create a timezone from offset in seconds (0 returns UTC)
if (tzoffset == 0) {
    return Py_NewRef(CONST_UTC(NO_STATE));
}

When tzoffset == 0 but tz_useconds != 0, the sub-second part is discarded. The
pure-Python implementation checks all offset components before collapsing to UTC.
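The effect of that guard can be modelled in a few lines of Python. This is an illustrative sketch, not a transcription of the C source: `tz_from_parts_current` is a hypothetical name, and the fall-through branch is assumed to build the timezone from both parsed parts.

```python
from datetime import timedelta, timezone


def tz_from_parts_current(tzoffset, tz_useconds):
    """Model of the branch quoted above: only tzoffset guards the shortcut."""
    if tzoffset == 0:
        return timezone.utc  # tz_useconds is never consulted on this path
    # Assumed fall-through: build a fixed-offset timezone from both parts.
    return timezone(timedelta(seconds=tzoffset, microseconds=tz_useconds))


# In this model '+00:00:00.000001' corresponds to tzoffset=0, tz_useconds=1.
lost = tz_from_parts_current(0, 1)
# '+00:00:01.000001' has a non-zero whole-second part, so nothing is dropped.
kept = tz_from_parts_current(1, 1)

print(lost.utcoffset(None))
print(kept.utcoffset(None))
```

In the model, the first call returns `timezone.utc` even though a non-zero microsecond value was passed in, which matches the symptom in the report.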

Suggested fix

Only short-circuit to UTC when both the whole-second and sub-second parts are zero:

if (tzoffset == 0 && tz_useconds == 0) {
    return Py_NewRef(CONST_UTC(NO_STATE));
}

The fall-through path already builds the correct timezone(timedelta(...)), and a plain
+00:00 offset still returns timezone.utc. I ran a full differential test of the C
implementation against the pure-Python one and confirmed that the only inputs whose
behaviour changes are offsets with a zero whole-second part and a non-zero sub-second
part; the fix introduces no other divergence. I have a patch and a round-trip
regression test (running under both implementations) ready.
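The scope of the change can be illustrated by comparing the two guards over a small grid of inputs. This is a toy model, not the differential run described above: the function names and the sample values are made up for illustration, and the grid includes mixed-sign pairs that a real parse may never produce.

```python
from itertools import product


def collapses_current(tzoffset, tz_useconds):
    """Current guard: collapse to UTC on the whole-second part alone."""
    return tzoffset == 0


def collapses_fixed(tzoffset, tz_useconds):
    """Proposed guard: collapse to UTC only when both parts are zero."""
    return tzoffset == 0 and tz_useconds == 0


seconds = (-3600, -1, 0, 1, 3600)
useconds = (-999999, -1, 0, 1, 999999)

# Inputs for which the two guards disagree.
changed = [(s, us) for s, us in product(seconds, useconds)
           if collapses_current(s, us) != collapses_fixed(s, us)]
print(changed)
```

Every pair in `changed` has a zero whole-second part and a non-zero microsecond part, so on this grid the proposed guard alters only the case the report is about.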

This is a sibling to #152060, a separate fromisoformat() defect in the pure-Python
implementation.

CPython versions tested on:

3.16 (main, built from source)

Operating systems tested on:

macOS

Labels: extension-modules (C modules in the Modules dir), type-bug (An unexpected behavior, bug, or error)
