Use Sphinx 1.4.9 for now #15

Closed
methane wants to merge 1 commit into python:master from methane:sphinx-1.4

Conversation

methane (Member) commented Feb 11, 2017

Sphinx 1.5 is stricter.
We should fix the errors it reports before using Sphinx 1.5 on Travis.

methane (Member, Author) commented Feb 11, 2017

Tests failed even with Sphinx 1.4.9.
#16 may fix it.

@methane methane closed this Feb 11, 2017
@methane methane deleted the sphinx-1.4 branch February 11, 2017 02:33
paulmon added a commit to paulmon/cpython that referenced this pull request Jan 10, 2019
gnprice added a commit to gnprice/cpython that referenced this pull request Aug 28, 2019
TODO:
 - news etc.?
 - test somehow?  at least make sure semantic tests are adequate
 - that "older version" path... shouldn't it be MAYBE?
 - mention explicitly in commit message that *this* is the actual
   algorithm from UAX #15
 - think if there are counter-cases where this is slower.
   If caller treats MAYBE same as NO... e.g. if caller actually just
   wants to normalize?  May need to parametrize and offer both behaviors.

This lets us return a NO answer instead of MAYBE when that's what a
Quick_Check property tells us, and also when the canonical combining
classes tell us NO after a Quick_Check property has said MAYBE.

In a quick test on my laptop, the existing code takes about 6.7 ms/MB
(so 6.7 ns per byte) when the quick check returns MAYBE and it has to
do the slow comparison:

  $ ./python -m timeit -s 'import unicodedata; s = "\uf900"*500000' -- \
      'unicodedata.is_normalized("NFD", s)'
  50 loops, best of 5: 6.67 msec per loop

With this patch, it gets the answer instantly (78 ns) on the same 1 MB
string:

  $ ./python -m timeit -s 'import unicodedata; s = "\uf900"*500000' -- \
      'unicodedata.is_normalized("NFD", s)'
  5000000 loops, best of 5: 78 nsec per loop
gnprice added a commit to gnprice/cpython that referenced this pull request Aug 28, 2019
gnprice added a commit to gnprice/cpython that referenced this pull request Aug 29, 2019
benjaminp pushed a commit that referenced this pull request Sep 4, 2019
…H-15558)

The purpose of the `unicodedata.is_normalized` function is to answer
the question `str == unicodedata.normalized(form, str)` more
efficiently than writing just that, by using the "quick check"
optimization described in the Unicode standard in UAX #15.

However, it turns out the code doesn't implement the full algorithm
from the standard, and as a result we often miss the optimization and
end up having to compute the whole normalized string after all.

Implement the standard's algorithm.  This greatly speeds up
`unicodedata.is_normalized` in many cases where our partial variant
of quick-check had been returning MAYBE and the standard algorithm
returns NO.

In a quick test on my desktop, the existing code takes about 4.4 ms/MB
(so 4.4 ns per byte) when the partial quick-check returns MAYBE and it
has to do the slow normalize-and-compare:

  $ build.base/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
      -- 'unicodedata.is_normalized("NFD", s)'
  50 loops, best of 5: 4.39 msec per loop

With this patch, it gets the answer instantly (58 ns) on the same 1 MB
string:

  $ build.dev/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
      -- 'unicodedata.is_normalized("NFD", s)'
  5000000 loops, best of 5: 58.2 nsec per loop

This restores a small optimization that the original version of this
code had for the `unicodedata.normalize` use case.

With this, that case is actually faster than in master!

  $ build.base/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
      -- 'unicodedata.normalize("NFD", s)'
  500 loops, best of 5: 561 usec per loop

  $ build.dev/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
      -- 'unicodedata.normalize("NFD", s)'
  500 loops, best of 5: 512 usec per loop
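
For reference, here is a minimal Python sketch of the UAX #15 quick-check algorithm that this patch implements in C. `unicodedata.combining` is the real stdlib call; the stdlib does not expose the Quick_Check property tables, so `quick_check_property` below is a toy stand-in covering just the U+F900 case from the benchmarks.

```python
import unicodedata

# Toy stand-in for the UCD Quick_Check tables (NFC_QC, NFD_QC, ...), which
# the stdlib does not expose. U+F900 decomposes canonically to U+8C48, so
# its NFD Quick_Check value really is "NO"; everything else here defaults
# to "YES" purely for illustration.
TOY_QC = {("NFD", "\uf900"): "NO"}

def quick_check_property(ch, form):
    return TOY_QC.get((form, ch), "YES")

def quick_check(form, s):
    """UAX #15 quick check: return "YES", "NO", or "MAYBE"."""
    result = "YES"
    last_ccc = 0
    for ch in s:
        ccc = unicodedata.combining(ch)
        # A nonzero combining class lower than its predecessor's proves
        # the string is out of canonical order: a definite NO, not MAYBE.
        if ccc != 0 and last_ccc > ccc:
            return "NO"
        qc = quick_check_property(ch, form)
        if qc == "NO":
            return "NO"
        if qc == "MAYBE":
            result = "MAYBE"
        last_ccc = ccc
    return result

print(quick_check("NFD", "\uf900" * 5))  # -> NO, with no slow comparison
```

The speedup in the benchmarks above comes from the two early `return "NO"` paths, which the earlier partial implementation collapsed into MAYBE.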
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Sep 4, 2019
miss-islington added a commit that referenced this pull request Sep 4, 2019
lisroach pushed a commit to lisroach/cpython that referenced this pull request Sep 10, 2019
DinoV pushed a commit to DinoV/cpython that referenced this pull request Jan 14, 2020
emmatyping referenced this pull request in emmatyping/cpython Mar 16, 2020
Now we can also remove `__setstate__`.
nanjekyejoannah added a commit to nanjekyejoannah/cpython that referenced this pull request Dec 1, 2022
16: Warn for specific thread module methods r=ltratt a=nanjekyejoannah

Don't merge until python#13 and python#14 are merged; some helper code cuts across.

This replaces python#15.

Threading module notes:

Python 2:

```
>>> from thread import get_ident
>>> from threading import get_ident
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name get_ident
>>> import threading
>>> from threading import _get_ident
>>>
```

Python 3:

```
>>> from threading import get_ident
>>> from thread import get_ident
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'thread'
>>>
```

**Note:**

There is no neutral way of porting this import across both versions; a common fallback shim is sketched below.
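
A common compatibility shim (a sketch, not part of this PR) tries the Python 3 name first and falls back to the Python 2 private one:

```python
try:
    from threading import get_ident        # Python 3
except ImportError:                        # Python 2
    from threading import _get_ident as get_ident
```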

Co-authored-by: Joannah Nanjekye <jnanjekye@python.org>
Eclips4 pushed a commit to Eclips4/cpython that referenced this pull request Nov 17, 2025
This commit updates the build system to automatically detect cargo and
enable/disable _base64 without needing to pass a flag. If cargo is unavailable, _base64 is disabled.

It also updates cpython-sys to use a hand-written header (which is what
Linux seems to do) and splits off the parser bindings to be handled in
the future (since the files are included differently).
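
The real probe lives in the configure machinery; purely as an illustration of the detect-and-disable idea (the variable names here are invented), the logic boils down to:

```python
import shutil

# Probe PATH for the cargo binary; only enable _base64 when it is present.
have_cargo = shutil.which("cargo") is not None
enabled_modules = ["_base64"] if have_cargo else []
print("cargo found" if have_cargo else "cargo missing: _base64 disabled")
```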
Eclips4 pushed a commit to Eclips4/cpython that referenced this pull request Jan 1, 2026
Co-authored-by: Emma Smith <emma@emmatyping.dev>
SonicField added a commit to SonicField/cpython that referenced this pull request Apr 19, 2026
borrowed_regs stored PhxRegState* pointers into live_regs.values (flat
hash map). When phx_sm_get_or_create triggered sm_grow (at 16+ entries),
all values relocated to new memory, leaving borrowed_regs with dangling
pointers. Next invalidate_bs_impl read freed memory → SIGSEGV.

C++ unordered_map has reference stability (node-based storage); our C
PhxStateMap (open addressing) does not.

Fix: store model register keys (void*) instead of value pointers. Look
up PhxRegState by key via phx_sm_get on each access. Model keys are HIR
Register* pointers that live in the graph, not the hash map.

Bug python#15: triggered by nbody benchmark (~28 simultaneous live registers
from nested loops + float temporaries, exceeding initial capacity of 32).
SonicField added a commit to SonicField/cpython that referenced this pull request Apr 19, 2026
phx_rc_kill_registers cached PhxRegState* pointers from phx_sm_get into
a local RegCopy array, then iterated over it, calling phx_rc_kill_register,
which calls phx_sm_erase. Erasing rehashes subsequent probe-chain entries,
potentially moving them and invalidating the cached pointers.

Fix: store model keys and kind (int) in RegCopy, and look the PhxRegState
up again via phx_sm_get before each kill_register call. Same pattern as
bug python#15.
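
The same discipline, as an illustrative Python analogy (not code from the patch): snapshot the keys up front, then fetch each value fresh rather than holding entries across mutations:

```python
regs = {"r1": "live", "r2": "dead", "r3": "live"}

# Snapshot the keys (the RegCopy analogue), then re-fetch each entry just
# before acting on it, so earlier erasures cannot leave a stale reference.
for key in list(regs):
    state = regs.get(key)   # fresh lookup on every iteration
    if state == "dead":
        del regs[key]       # safe: we hold only keys, never entries
```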
SonicField added a commit to SonicField/cpython that referenced this pull request Apr 22, 2026
Per supervisor 2026-04-22 03:06:55Z + theologian 03:07:12Z + pythia python#58:
push 44 introduces the W3 R4 oracle dispatcher in compiler.cpp behind
#ifdef RC_ORACLE. The push-44 nm production-binary check is one-shot —
need a STANDING gate assertion so future compiler.cpp edits cannot
silently leak RC_ORACLE dispatch into production.

Failure mode caught:
  Any future commit that drops, inverts, or accidentally hard-defines
  the #ifdef RC_ORACLE guard would leak the C++ rc_oracle dispatch path
  (linked from libphoenix_rc_oracle.a) into the production python
  binary. Without this assertion, the leak is invisible until the next
  manual nm audit. Same silent-failure class as the cp-||-true loophole
  (catch python#4, push 38) — accepted bad state silently.

Implementation (5 LOC after BINARY_MATCH (clean) ✓):
  RC_ORACLE_LEAK=$(nm "$PYTHON" | grep -c 'rc_oracle')
  if [ "$RC_ORACLE_LEAK" -ne 0 ]; then
      echo BINARY_RC_ORACLE_LEAK_DETECTED ...
      exit 1
  fi
  echo 'BINARY_RC_ORACLE_OK: production binary clean (0 rc_oracle symbols)'

Verbatim wording per gatekeeper item python#15 (03:07:25Z):
  - PASS: 'BINARY_RC_ORACLE_OK: production binary clean (0 rc_oracle symbols)'
  - FAIL: 'BINARY_RC_ORACLE_LEAK_DETECTED' + FATAL + exit 1
  - Mirrors BINARY_DIRTY discipline (catch silent failure structurally)

Verification (compile-clean pre-commit):
  bash -n scripts/gate_phoenix.sh: SYNTAX OK
  Inserted at line 120 (immediately after BINARY_MATCH block at line 119).

Bundled into push 44 (rather than standalone push 45) because the
dispatcher lands in this push — the leak-check guards it from day 1
instead of leaving a one-push window where item python#15 isn't enforced.

Push 44 batch grows 3 → 4 commits:
  THIS COMMIT  — gate item python#15 (RC_ORACLE leak assertion)
  63568c0   — W3 Step 5 expansion (4 injection classes + invariant python#7)
  4f591a1   — W3 Step 5 v1 (rc_oracle_self_test.sh)
  a99db92   — W3 Steps 1-4 (scratch lib + dispatcher)

ABBA cap usage: 17 → 18 (4 commits this push).
SonicField added a commit to SonicField/cpython that referenced this pull request Apr 22, 2026
Bug surfaced during Step 5 execution at HEAD 9cbf413:

  /data/users/alexturner/phoenix/cpython/Python/jit/compiler.cpp:136:10:
  error: expected unqualified-id
  ...:138:5: error: use of undeclared identifier 'rc_oracle_run'

Root cause: my push 44 W3 dispatcher (a99db92 Step 3.5, also in
6450421c93 amended → a99db92) declared `extern "C" int rc_oracle_run(...)`
inside the function body. C++17 [dcl.link] forbids linkage-specifications
in block scope — they may only appear at namespace scope.

Production builds (RC_ORACLE undefined) MASKED the bug because the
entire #ifdef block is absent; the linkage-specification only enters
the parser when RC_ORACLE is defined. push 44 gate caught nothing
because no python build defines RC_ORACLE.

Fix: move `extern "C" int rc_oracle_run(void *func);` to file scope
(line 32, between #include block and `namespace jit`), guarded by
the same #ifdef RC_ORACLE so production builds remain unaffected.
Inside the function body, just call rc_oracle_run() (now visible
via the file-scope declaration).

Verification:
  - Production cmake --build phoenix_jit + make python: PASS, 0 errors
    (RC_ORACLE undefined → both forward-decl and dispatcher absent)
  - Out-of-band -DRC_ORACLE=1 compile of compiler.cpp:
    PASS (3.2 MB compiler_rc_oracle.cpp.o), 0 errors
  - nm production python | grep rc_oracle: 0 matches (Item python#15
    falsifier still satisfied)

Push 45 (gate-script fix follow-up). Discovery + fix is itself the
push 44 W3 oracle's first real value: the synthetic-injection
infrastructure surfaced a real linkage bug in the dispatcher that
production gates would never have caught (RC_ORACLE not defined →
block absent → bug invisible).