Tracing JIT: PHP-FPM worker segfaults on first request of a Symfony 8.1 application #22558

Description

@javiereguiluz

A PHP-FPM worker segfaults while serving the first request (cold application cache) of a Symfony 8.1 application when the tracing JIT is enabled (opcache.jit=1235). The worker dies mid-request, FPM logs child exited on signal 11 (SIGSEGV), and the respawned worker serves the same request fine.

Reproduced on the official Docker images (unmodified builds):

| PHP build | Arch | JIT (opcache.jit / opcache.jit_buffer_size) | Result |
|---|---|---|---|
| php:8.5.7-fpm (official) | x86_64 | 1235 / 256M | SIGSEGV on first request |
| php:8.5.7-fpm (official) | aarch64 | 1235 / 128M | SIGSEGV on first request |
| php:8.4-fpm (8.4.22, official) | x86_64 | 1235 / 256M | SIGSEGV on first request |
| php:8.5.7-fpm (official) | x86_64 | JIT disabled | OK |
| Same app, previous release (Symfony 8.0) | x86_64 | 1235 / 256M | OK |

It also fails deterministically in GitHub Actions (shivammathur/setup-php builds, which enable the JIT by default). It has broken symfony/demo's daily E2E workflow every day since the Symfony 8.1 version of the app was released. For example:

https://github.com/symfony/demo/actions/runs/28214827138

dmesg shows:

php-fpm8.5[5018]: segfault at 7f273e598ffc ip 00005581c66e35a6 sp 00007ffc56209420 error 4 in php-fpm8.5[2e35a6,5581c6537000+4da000]
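
For anyone who wants to look up that instruction pointer in a build with debug symbols, the offset into the mapping can be computed from the dmesg line. This is a minimal sketch, assuming the bracketed field `5581c6537000+4da000` is the mapping's base address and size, as in the usual kernel segfault message layout. The variable names are mine and not part of any tool.

```shell
# Values copied from the dmesg line above.
ip=0x00005581c66e35a6     # "ip" field: faulting instruction pointer
base=0x5581c6537000       # assumed: base address of the php-fpm8.5 mapping
size=0x4da000             # assumed: size of that mapping

# Offset of the faulting instruction relative to the start of the mapping.
offset=$(( ip - base ))
printf 'offset into mapping: 0x%x\n' "$offset"

# Sanity check: the ip should fall inside the reported mapping.
if [ "$offset" -ge 0 ] && [ "$offset" -lt $(( size )) ]; then
  echo "ip is inside the php-fpm8.5 mapping"
fi
```

The offset is relative to the start of that mapping, not necessarily to the start of the file on disk, and it is only useful together with symbols for the exact same binary.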

Steps to reproduce

# 1. Create the application (Symfony Demo 3.1, open source)
docker run --rm -v "$PWD":/work -w /work composer:latest create-project symfony/symfony-demo:v3.1.0 demo

# 2. JIT configuration (128M so it also works on aarch64, where >128M disables the JIT)
printf 'opcache.enable=1\nopcache.jit=1235\nopcache.jit_buffer_size=128M\n' > jit.ini

# 3. Start PHP-FPM with the JIT enabled
docker run -d --name jit-crash \
  -v "$PWD/demo:/srv/demo" \
  -v "$PWD/jit.ini:/usr/local/etc/php/conf.d/jit.ini" \
  php:8.5.7-fpm

# 4. Ensure a cold app cache and send one FastCGI request for "GET /"
docker exec jit-crash bash -c '
  apt-get update -qq && apt-get install -y -qq libfcgi-bin > /dev/null
  rm -rf /srv/demo/var/cache && mkdir -p /srv/demo/var/log && chmod -R 777 /srv/demo/var
  SCRIPT_FILENAME=/srv/demo/public/index.php REQUEST_METHOD=GET REQUEST_URI=/ \
  SCRIPT_NAME=/index.php SERVER_NAME=127.0.0.1 SERVER_PORT=8000 SERVER_PROTOCOL=HTTP/1.1 \
  cgi-fcgi -bind -connect 127.0.0.1:9000 > /tmp/resp.txt
  echo "cgi-fcgi exit code: $?"    # 104: worker closed the connection mid-response
'

# 5. Confirm the segfault
docker logs jit-crash 2>&1 | grep signal
# [pool www] child 7 exited on signal 11 (SIGSEGV) after 44.016617 seconds from start

Expected: the request returns the demo homepage (as it does with the JIT
disabled, or on any later request).

Actual: cgi-fcgi exits with code 104 and the FPM log shows the worker
exited on SIGSEGV.
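
To make repeated runs easier to compare (for example JIT enabled vs. disabled), the two signals above can be folded into one verdict. This is a small sketch, not part of the reproduction itself: `classify_run` is a hypothetical helper name, and it only inspects the cgi-fcgi exit code and the FPM log text described in this report.

```shell
# classify_run EXIT_CODE LOG_TEXT
# Prints CRASH if the FPM log reports a child killed by SIGSEGV,
# OK if the request finished with exit code 0, UNKNOWN otherwise.
classify_run() {
  local rc=$1 log=$2
  if printf '%s\n' "$log" | grep -q 'exited on signal 11 (SIGSEGV)'; then
    echo "CRASH"
  elif [ "$rc" -eq 0 ]; then
    echo "OK"
  else
    echo "UNKNOWN"
  fi
}

# Example using the values reported in this issue.
classify_run 104 '[pool www] child 7 exited on signal 11 (SIGSEGV) after 44.016617 seconds from start'
```

Against the container from the steps above, the log argument would be something like `"$(docker logs jit-crash 2>&1)"`.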

Notes

  • The first request with a cold var/cache compiles the Symfony container and Twig templates. That's the workload that gets hot enough to trigger trace compilation. Requests against a warm cache never crash, and after one crash the respawned worker serves everything fine, so the crash happens exactly once per cold start.
  • The previous release of the same application (symfony/symfony-demo:v3.0.2, Symfony 8.0) does not trigger the crash under identical JIT settings, so the trigger is somewhere in the code paths new in Symfony 8.1.
  • I attempted a backtrace: gdb catches the SIGSEGV, but the official images are stripped and the crashing frames are in JIT-emitted code, so every frame shows ??. Happy to re-run against a debug build or bisect with opcache.jit_bisect_limit if that helps.
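
If bisecting turns out to be useful, the search itself could be automated along these lines. This is only a sketch under two assumptions of mine: that a larger opcache.jit_bisect_limit lets the JIT compile more functions/traces, and that the crash therefore appears at some threshold and stays for every larger value. `find_first_bad` and `fake_probe` are hypothetical names. A real probe would write the limit into jit.ini, restart the container, repeat step 4 with a cold cache, and return 0 when the FPM log shows signal 11.

```shell
# find_first_bad LO HI PROBE
# Binary search for the smallest N in [LO, HI] for which `PROBE N` returns 0
# ("crashed"). Assumes PROBE HI crashes and that crashing is monotonic in N.
find_first_bad() {
  local lo=$1 hi=$2 probe=$3 mid
  while [ "$lo" -lt "$hi" ]; do
    mid=$(( (lo + hi) / 2 ))
    if "$probe" "$mid"; then
      hi=$mid              # crash reproduced: the threshold is at or below mid
    else
      lo=$(( mid + 1 ))    # no crash: the threshold is above mid
    fi
  done
  echo "$lo"
}

# Stand-in probe for illustration only: pretends the crash appears
# once the limit reaches 37. Replace with a probe that runs the real request.
fake_probe() { [ "$1" -ge 37 ]; }

find_first_bad 1 1000 fake_probe   # prints 37
```

Each real probe run needs a cold var/cache, since warm-cache requests never crash.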

PHP Version

PHP 8.4.22 and 8.5.7 (official Docker images).

Operating System

No response
