Skip to content

Reject non-ASCII digits in JSON numbers#7982

Open
teddygood wants to merge 1 commit into
RustPython:mainfrom
teddygood:fix-json-nonascii-digits
Open

Reject non-ASCII digits in JSON numbers#7982
teddygood wants to merge 1 commit into
RustPython:mainfrom
teddygood:fix-json-nonascii-digits

Conversation

@teddygood
Copy link
Copy Markdown

@teddygood teddygood commented May 26, 2026

Refs #7611

  • Align the Rust _json number scanner with RFC 8259 / CPython behavior by only consuming valid ASCII JSON number grammar.
  • Enable test_nonascii_digits_rejected for both TestPyDecode and TestCDecode, and remove the now-redundant TestCFail.test_failures expected-failure wrapper.

Summary by CodeRabbit

  • Refactor
    • Improved JSON number parsing implementation to enhance RFC 8259 compliance and robustness.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 26, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: b16bcf36-4d2d-4ffb-8d0d-bcaf60ec10ed

📥 Commits

Reviewing files that changed from the base of the PR and between fb531ec and ab7fdd3.

⛔ Files ignored due to path filters (2)
  • Lib/test/test_json/test_decode.py is excluded by !Lib/**
  • Lib/test/test_json/test_fail.py is excluded by !Lib/**
📒 Files selected for processing (1)
  • crates/stdlib/src/json.rs

📝 Walkthrough

Walkthrough

The JsonScanner::parse_number method is refactored from a flag-based byte loop to an index-driven RFC 8259-compliant scanner. The new parser enforces strict state transitions: optional leading -, required integer part, optional fractional part only when . is followed by a digit, and optional exponent only when e/E is followed by an optional sign and at least one digit. It determines the parsed byte length and constructs a string slice for downstream float or integer parsing.

Changes

JSON Number Parsing Refactor

Layer / File(s) Summary
RFC 8259-compliant number parser
crates/stdlib/src/json.rs
parse_number replaces boolean flag tracking (has_neg, has_decimal, has_exponent, has_e_sign) with index-based scanning that enforces RFC 8259 number structure: optional minus, required integer (0 or leading nonzero), optional fractional part (only . followed by digit), optional exponent (only e/E with optional sign and digits). Sets is_float based on decimal or exponent presence, then slices and delegates to existing float/int/BigInt parsing.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A scanner hops through JSON bytes,
With index dance and RFC lights,
Minus, digits, dots, and e's—
Each state flows smooth and precise with ease!
No flags to flip, just cursor's march,
Through numbers' strict and RFC arch! 🔢✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'Reject non-ASCII digits in JSON numbers' directly and specifically describes the main change: rewriting the JSON number scanner to only accept ASCII characters per RFC 8259, rejecting non-ASCII digits.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[ ] lib: cpython/Lib/json
[ ] test: cpython/Lib/test/test_json (TODO: 8)

dependencies:

  • json (native: _json, decoder, encoder, json.tool, sys)
    • _colorize, argparse, codecs, re

dependent tests: (10 tests)

  • json: test_logging test_plistlib test_subprocess test_sysconfig test_tomllib test_tools test_traceback test_zoneinfo
    • importlib.metadata: test_importlib
    • multiprocessing.resource_tracker: test_concurrent_futures

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

Copy link
Copy Markdown
Contributor

@ShaharNaveh ShaharNaveh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TYSM!
Welcome to the project:)

@ShaharNaveh
Copy link
Copy Markdown
Contributor

@teddygood CI failures are not related to your changes. Github is having issues with github actions.

I'll try to remember to rerun your CI once it's resolved, but if I don't please feel free to ping me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants