Skip to content

gh-152157: Reject empty fraction before timezone in C fromisoformat#152161

Open
tonghuaroot wants to merge 1 commit into
python:mainfrom
tonghuaroot:fix-c-fromisoformat-empty-fraction-tz
Open

gh-152157: Reject empty fraction before timezone in C fromisoformat#152161
tonghuaroot wants to merge 1 commit into
python:mainfrom
tonghuaroot:fix-c-fromisoformat-empty-fraction-tz

Conversation

@tonghuaroot

Copy link
Copy Markdown
Contributor

Fixes gh-152157

datetime.fromisoformat() and time.fromisoformat() accepted, in the C accelerator, a
decimal separator (. or ,) followed by zero fractional digits when a timezone
designator (+/-/Z) came next, e.g. '2020-01-01T12:34:56.+05:00' or '12:34:56.Z'.
The pure-Python implementation already raised ValueError for these. ISO 8601 §4.2.2.4 and
RFC 3339 §5.6 require at least one digit after the decimal sign, so the C side was the
lenient/incorrect one.

Divergence (before this PR)

Input C accelerator Pure-Python
'2020-01-01T12:34:56.+05:00' parses, microsecond=0 ValueError
'2020-01-01T12:34:56.Z' parses, microsecond=0 ValueError
'2020-01-01T12:34:56,+05:00' parses, microsecond=0 ValueError
'12:34:56.+05:00' parses, microsecond=0 ValueError
'12:34:56,+05:00' parses, microsecond=0 ValueError
'12:34:56.Z' parses, microsecond=0 ValueError

After this PR both implementations raise ValueError for all of the above, and all
previously-accepted valid strings (.5, .123456, comma fractions, no-fraction +offset,
Z) parse identically on both.

Fix

In parse_hh_mm_ss_ff() the decimal-separator case is now handled before the generic
end-of-substring check, so a separator with no following digit is rejected instead of being
treated as trailing content. The standalone trailing separator ('12:34:56.') remains
rejected. No change to the valid-fraction or no-fraction paths.

Note for reviewers

The spec requires ≥1 digit after the decimal sign, and the C implementation already rejects
a standalone trailing separator, so this PR fixes C to reject. The alternative direction
(making pure-Python lenient) is possible but contradicts both the spec and C's own
end-of-string behavior; happy to flip the approach if you'd rather standardize on lenient.

Tests

Added the four empty-fraction-before-tz inputs (offset, negative offset, Z, comma) to the
datetime and time fromisoformat failure tests, which run against both the C and
pure-Python implementations. Positive parity (valid fractions, comma fractions, and
no-fraction-with-offset) is already covered by the existing *_examples tests, which also
run on both implementations.

./python.exe -m test test_datetime passes (both impls), with no refleaks for the touched
tests.

…rmat

The C accelerator for datetime.fromisoformat() and time.fromisoformat()
accepted a decimal separator (. or ,) with no following digit when a
timezone designator came next, e.g. '12:34:56.+05:00', while the
pure-Python implementation correctly raised ValueError. Handle the
decimal-separator case before the generic end-of-substring check so an
empty fraction is rejected.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

datetime/time.fromisoformat: C accelerator accepts an empty fraction before a timezone designator

1 participant