gh-148284: Block inlining of gigantic functions in ceval.c for clang 22#148334
gh-148284: Block inlining of gigantic functions in ceval.c for clang 22#148334Fidget-Spinner merged 6 commits intopython:mainfrom
Conversation
Co-Authored-By: Victor Stinner <vstinner@python.org>
|
Tested with
|
|
So it seems whether the original bug reproduces on main or not is nondeterministic 😨 . However, I still believe this fix is right. |
I built the Python main branch (without the fix) 3 times in a row from scratch ( I built the Python main branch on Fedora Rawhide (clang 22.1.3) with: On the two first builds, Script: import _testcapi
import functools
def py_call():
return _testcapi.stack_pointer()
def call():
return py_call()
start = _testcapi.stack_pointer()
call = functools.partial(call)
top = call()
diff = start - top
print("stack memory per call: %.1f kB" % (diff / 1024))Output on the first two builds: Output on the third build: Note: I installed clang and its dependencies using |
|
To get a more reproducible output, I tried using Using I built the Python main branch 3 times with these options and I got the same result each time: |
|
Ok, now testing the fix. I built Python with the fix 5 times in a row and
Commands: ./configure --enable-optimizations --with-lto --enable-shared CC=clang LD=clang
time make -j14
LD_LIBRARY_PATH=$PWD ./python -m test -v test_callResult: test_call pass successfully and my script outputs Moreover, the C flags are set as expected in Makefile: |
vstinner
left a comment
There was a problem hiding this comment.
LGTM. The fix works as expected according to my manual tests (see above).
In general, I'm not fan of using a specific compiler option depending on the compiler veresion. In this case, I'm fine with it, it's a good trade-off.
|
Thanks @vstinner for the very thorough testing and review :). |
|
Thanks @Fidget-Spinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.14. |
…clang 22 (pythonGH-148334) (cherry picked from commit e007631) Co-authored-by: Ken Jin <kenjin@python.org> Co-authored-by: Victor Stinner <vstinner@python.org>
|
GH-148349 is a backport of this pull request to the 3.14 branch. |
|
Since the bug occurs randomly on clang 22 (coarse estimation 1/3 builds affected), I double checked clang 21. I built the Python main branch (without the fix) multiple times on Fedora Stable (clang 21.1.8): I failed to reproduce the issue. I confirm that the issue is specific to clang 22.
By the way, we do have Fedora Rawhide Clang buildbots (AArch64, AMD64, PPC64LE, s390x) with Clang 22.1.3, but they don't test the LTO+PGO case:
|
I built the commit 429c1d3 wih clang 21 (Fedora Stable): I failed to reproduce the issue, test_call pass successfully and my script gives |
It seems that on clang-22, the inliner is too aggressive on _PyEval_EvalFrameDefault when on computed goto interpreter. Together with some strange interaction with the stackref buffer, the function requires 40kB of stack space (!!!) versus the usual 1-2kB normally used.
This sets the inline limit to functions of max 512B stack space (1/4th of normal) allowed to be inlined in
ceval.c. I checked the dissasembly and the new function uses about 2kB of stack.python -m test -v test_callsegfault with clang 22.1.3 build #148284📚 Documentation preview 📚: https://cpython-previews--148334.org.readthedocs.build/