PHP exits with status code 139, only on aarch64 with extension opcache enabled #15957

Closed
korridor opened this issue Sep 19, 2024 · 20 comments

Comments

@korridor

korridor commented Sep 19, 2024

Description

I'm the maintainer of a PHP application called solidtime and we use FrankenPHP (with Laravel Octane) for our production image. That works fine, but we wanted to add support for ARM.
The problem is that when we start the PHP application with the ARM image, it exits with status code 139.

I already reported this in the FrankenPHP GitHub repository (the issue) and we figured out that this problem only happens if opcache is enabled. Since it looks like an issue with opcache and ARM, the maintainer of FrankenPHP (@dunglas) told me to report this issue here.

PHP Version

8.3.7

Operating System

Debian GNU/Linux 12 (bookworm)

@cmb69
Member

cmb69 commented Sep 19, 2024

Is it status code 138 or 139? (The FrankenPHP ticket mentions both.) 138 would be SIGBUS, 139 SIGSEGV.
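
Editor's sketch: shells and container runtimes conventionally report a process killed by a signal as 128 plus the signal number, which is how the exit status in this report can be decoded. The helper name `describe_exit_status` is made up for illustration. Signal numbers are platform-specific (SIGBUS, for example, is 10 on macOS/BSD but 7 on most Linux architectures), so the name reported for a status such as 138 depends on where the code runs.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Map a shell-style exit status to a signal name when it is above 128."""
    if status <= 128:
        return f"exited normally with code {status}"
    signum = status - 128
    try:
        # signal.Signals uses the numbering of the platform running this code.
        name = signal.Signals(signum).name
    except ValueError:
        name = "an unknown signal"
    return f"terminated by {name} (signal {signum})"


print(describe_exit_status(139))
```

Status 139 corresponds to signal 11, which is SIGSEGV on every common platform.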

OPcache configuration copied from the FrankenPHP ticket:
opcache.blacklist_filename => no value => no value
opcache.dups_fix => Off => Off
opcache.enable => On => On
opcache.enable_cli => On => On
opcache.enable_file_override => Off => Off
opcache.error_log => no value => no value
opcache.file_cache => 60 => 60
opcache.file_cache_consistency_checks => On => On
opcache.file_cache_only => Off => Off
opcache.file_update_protection => 0 => 0
opcache.force_restart_timeout => 180 => 180
opcache.huge_code_pages => Off => Off
opcache.interned_strings_buffer => 16 => 16
opcache.jit => function => function
opcache.jit_bisect_limit => 0 => 0
opcache.jit_blacklist_root_trace => 16 => 16
opcache.jit_blacklist_side_trace => 8 => 8
opcache.jit_buffer_size => 128M => 128M
opcache.jit_debug => 0 => 0
opcache.jit_hot_func => 127 => 127
opcache.jit_hot_loop => 64 => 64
opcache.jit_hot_return => 8 => 8
opcache.jit_hot_side_exit => 8 => 8
opcache.jit_max_exit_counters => 8192 => 8192
opcache.jit_max_loop_unrolls => 8 => 8
opcache.jit_max_polymorphic_calls => 2 => 2
opcache.jit_max_recursive_calls => 2 => 2
opcache.jit_max_recursive_returns => 2 => 2
opcache.jit_max_root_traces => 2048 => 2048
opcache.jit_max_side_traces => 256 => 256
opcache.jit_max_trace_length => 1024 => 1024
opcache.jit_prof_threshold => 0.001 => 0.001
opcache.lockfile_path => /tmp => /tmp
opcache.log_verbosity_level => 1 => 1
opcache.max_accelerated_files => 32531 => 32531
opcache.max_file_size => 0 => 0
opcache.max_wasted_percentage => 5 => 5
opcache.memory_consumption => 256M => 256M
opcache.opt_debug_level => 0 => 0
opcache.optimization_level => 0x7FFEBFFF => 0x7FFEBFFF
opcache.preferred_memory_model => no value => no value
opcache.preload => no value => no value
opcache.preload_user => no value => no value
opcache.protect_memory => Off => Off
opcache.record_warnings => Off => Off
opcache.restrict_api => no value => no value
opcache.revalidate_freq => 2 => 2
opcache.revalidate_path => Off => Off
opcache.save_comments => On => On
opcache.use_cwd => Off => Off
opcache.validate_permission => Off => Off
opcache.validate_root => Off => Off
opcache.validate_timestamps => Off => Off

Anyhow, can you please provide a stack backtrace?

If that is not possible, try running with opcache.protect_memory=1.

@korridor changed the title from "PHP exits with status code 138, only on aarch64 with extension opcache enabled" to "PHP exits with status code 139, only on aarch64 with extension opcache enabled" on Sep 19, 2024
@korridor
Author

korridor commented Sep 19, 2024

@cmb69 Thanks for noticing my typo. The real exit code is 139, and I just corrected it in the issue. I'll try opcache.protect_memory=1 as soon as possible.

@korridor
Author

korridor commented Sep 19, 2024

@cmb69 Setting opcache.protect_memory=1 does not seem to change anything. It still fails with exit code 139.

Regarding the stack backtrace, do you have any pointers on how to use that inside of a docker container? I tried to write into /proc/sys/kernel/core_pattern from inside the container as root but I still got a bash: /proc/sys/kernel/core_pattern: Read-only file system error.
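
Editor's sketch: `/proc/sys` is typically mounted read-only inside unprivileged containers, and `kernel.core_pattern` is a host-wide setting, so it is usually changed on the Docker host rather than from inside the container. The snippet below only inspects the two things that decide whether a core file gets written: the pattern and the core size limit of the current process. The function name `core_dump_settings` and the returned keys are made up for illustration.

```python
import resource
from pathlib import Path


def core_dump_settings(pattern_file: str = "/proc/sys/kernel/core_pattern") -> dict:
    """Report the kernel core pattern and this process's core size limits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    try:
        pattern = Path(pattern_file).read_text().strip()
    except OSError:
        # Not on Linux, or /proc is not readable in this environment.
        pattern = None
    return {
        "core_pattern": pattern,
        "soft_limit": soft,  # 0 means no core file is written
        "hard_limit": hard,
        "dumps_enabled": soft != 0,
    }


print(core_dump_settings())
```

Running it inside the container before reproducing the crash shows whether a soft limit of 0 or an unexpected pattern would prevent the dump from appearing.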

@cmb69
Member

cmb69 commented Sep 19, 2024

> Setting opcache.protect_memory=1 does not seem to change anything. It still fails with exit code 139.

Yeah, right, opcache.protect_memory=1 is only useful if you get a core dump.

> Regarding the stack backtrace, do you have any pointers on how to use that inside of a docker container?

Maybe https://ddanilov.me/how-to-configure-core-dump-in-docker-container helps?

@iluuu1994
Member

You may also try building PHP with --enable-address-sanitizer. Segfaults in C are often detached from their actual cause, meaning they occur significantly later, in a different place, due to random memory corruption. ASan helps catch memory violations where they occur. Alternatively, you may try running it with Valgrind.

@nielsdos
Member

Also, you're using the JIT. Does the segfault also happen if you disable the JIT?

@iluuu1994
Member

@nielsdos Good catch. I missed that. JIT is almost certainly to blame then.

@korridor
Author

@nielsdos I just tried it with JIT disabled, and it still exits with exit code 139.

From php -i:
opcache.jit => disable => disable
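
Editor's sketch: a check like the one above can be scripted by parsing `php -i` lines of the form `name => local value => master value`. The helper names `parse_php_info` and `jit_enabled` are made up for illustration, and the rule is simplified: it treats the JIT as off when `opcache.jit` is `disable`, `off`, or `0`, or when `opcache.jit_buffer_size` is zero, and it ignores `opcache.enable`. Keep in mind that `php -i` reports the CLI's configuration, which may not be the same configuration a server process loads.

```python
def parse_php_info(text: str) -> dict:
    """Parse `php -i` lines of the form `name => local => master` into {name: local}."""
    settings = {}
    for line in text.splitlines():
        parts = [part.strip() for part in line.split("=>")]
        if len(parts) == 3:
            settings[parts[0]] = parts[1]
    return settings


def jit_enabled(settings: dict) -> bool:
    """Simplified check: the JIT needs a non-disabled mode and a non-zero buffer."""
    mode = settings.get("opcache.jit", "disable").lower()
    if mode in {"disable", "off", "0"}:
        return False
    buffer_size = settings.get("opcache.jit_buffer_size", "0").strip()
    # Drop a trailing size suffix such as "M" before comparing with zero.
    return buffer_size.rstrip("KMGkmg") not in {"", "0"}


sample = """opcache.jit => function => function
opcache.jit_buffer_size => 128M => 128M"""
print(jit_enabled(parse_php_info(sample)))
```

The sample uses the two values from the configuration listing earlier in this thread.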

@korridor
Author

korridor commented Sep 20, 2024

@cmb69 Thanks for the link, I'll try that. I'm off next week, but I can try it in two weeks!

github-actions bot commented Oct 5, 2024

No feedback was provided. The issue is being suspended because we assume that you are no longer experiencing the problem. If this is not the case and you are able to provide the information that was requested earlier, please do so. Thank you.

@korridor
Author

korridor commented Oct 7, 2024

@nielsdos @iluuu1994 Ok, so I tested this again. You were right, disabling JIT fixed the problem. (I don't know how I came to a different conclusion previously.)

Since it would still be great to have JIT working on ARM, I tried to create a core dump, but I'm not sure if I did it correctly.

I followed the guide from the link for Docker setups and after starting the server again I got two new core dump files:

ls -alh /tmp: (only relevant files)

-rw-------  1 root octane 430M Oct  7 17:48 core.thpool-1.57
-rw-------  1 root octane 430M Oct  7 17:48 core.thpool-4.31

Then I ran gdb thpool-1 /tmp/core.thpool-1.57:

thpool-1: No such file or directory.

warning: Can't open file /dev/zero (deleted) during file-backed mapping note processing
[New LWP 67]
[New LWP 61]
[New LWP 62]
[New LWP 63]
[New LWP 68]
[New LWP 58]
[New LWP 64]
[New LWP 66]
[New LWP 60]
[New LWP 59]
[New LWP 57]
[New LWP 65]
[New LWP 69]
[New LWP 70]
Core was generated by `/usr/local/bin/frankenphp run -c /var/www/html/vendor/laravel/octane/src/Comman'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000047f6c8 in ?? ()
[Current thread is 1 (LWP 67)]

Then I ran bt:

#0  0x000000000047f6c8 in ?? ()
#1  0x000000000045c6ac in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Since that seemed incorrect to me, I tried the same thing with the other file (core.thpool-4.31).

thpool-4: No such file or directory.

warning: Can't open file /dev/zero (deleted) during file-backed mapping note processing
[New LWP 44]
[New LWP 35]
[New LWP 39]
[New LWP 40]
[New LWP 32]
[New LWP 43]
[New LWP 38]
[New LWP 33]
[New LWP 36]
[New LWP 37]
[New LWP 34]
[New LWP 42]
[New LWP 31]
[New LWP 41]
Core was generated by `/usr/local/bin/frankenphp run -c /var/www/html/vendor/laravel/octane/src/Comman'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000047f6c8 in ?? ()
[Current thread is 1 (LWP 44)]
#0  0x000000000047f6c8 in ?? ()
#1  0x000000000045c6ac in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

@korridor
Author

korridor commented Oct 7, 2024

Ok, I think I figured out that the command needs to be gdb php /tmp/core.thpool-1.57 instead of gdb thpool-1 /tmp/core.thpool-1.57.
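
Editor's sketch: the core file names above look like the result of a core pattern such as `core.%e.%p`. That pattern is an assumption, since the thread does not state which one was configured. In a Linux core pattern, `%e` expands to the comm value of the crashing process or thread (at most 15 characters) and `%p` to its PID, so `thpool-1` would be a thread name rather than a file gdb can open. The helper `expand_core_pattern` is made up for illustration and handles only `%e`, `%p`, and `%%`.

```python
def expand_core_pattern(pattern: str, comm: str, pid: int) -> str:
    """Expand a small subset of core_pattern specifiers: %e, %p and %%."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(pattern):
            break  # a lone trailing '%' is dropped
        spec = pattern[i + 1]
        if spec == "e":
            out.append(comm[:15])  # comm values are limited to 15 characters
        elif spec == "p":
            out.append(str(pid))
        elif spec == "%":
            out.append("%")
        # Any other specifier is ignored in this sketch.
        i += 2
    return "".join(out)


print(expand_core_pattern("core.%e.%p", "thpool-1", 57))
```

With the assumed pattern, comm `thpool-1`, and PID 57, this reproduces the first file name from the `ls` listing.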

This is the new output:

Reading symbols from php...
(No debugging symbols found in php)

warning: Can't open file /dev/zero (deleted) during file-backed mapping note processing

warning: core file may not match specified executable file.
[New LWP 67]
[New LWP 61]
[New LWP 62]
[New LWP 63]
[New LWP 68]
[New LWP 58]
[New LWP 64]
[New LWP 66]
[New LWP 60]
[New LWP 59]
[New LWP 57]
[New LWP 65]
[New LWP 69]
[New LWP 70]
Cannot access memory at address 0xf9403fe7f94053ee
Cannot access memory at address 0xf9403fe7f94053e6
Cannot access memory at address 0xf9403fe7f94053e6
Unsupported JIT protocol version 2433843266 in descriptor (expected 1)
Core was generated by `/usr/local/bin/frankenphp run -c /var/www/html/vendor/laravel/octane/src/Comman'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000047f6c8 in add_assoc_string_ex ()
[Current thread is 1 (LWP 67)]
#0  0x000000000047f6c8 in add_assoc_string_ex ()
#1  0x000000400000000b in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
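
Editor's sketch: gdb warns above that the core file may not match the specified executable, and the "Core was generated by" line names a different binary than `php`. A small helper can pull that path out of the gdb output so it can be passed to gdb as the program argument. The function name `executable_from_gdb_output` is made up for illustration, and whether that binary carries usable debug symbols is not something this thread confirms.

```python
import re


def executable_from_gdb_output(output: str):
    """Return the first word of gdb's "Core was generated by" command line, if present."""
    match = re.search(r"Core was generated by `([^']*)'", output)
    if not match:
        return None
    words = match.group(1).split()
    # gdb truncates long command lines, but the executable path comes first.
    return words[0] if words else None


line = "Core was generated by `/usr/local/bin/frankenphp run -c /var/www/html/vendor/laravel/octane/src/Comman'."
print(executable_from_gdb_output(line))
```

Symbol names resolved against a mismatched executable, such as the `add_assoc_string_ex` frame above, may be misleading, so they are best treated with caution until the backtrace is repeated with the binary that produced the core.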

@gazzoy

gazzoy commented Dec 11, 2024

Any updates?

@nielsdos
Member

nielsdos commented Feb 26, 2025

I debugged this for a long time through emulation. After working around a number of bugs in QEMU/Valgrind's emulation of ARM on Intel, I did find one bug in FrankenPHP. (I don't have native ARM hardware, and I'm not aware of anyone familiar with the JIT who does; we all use emulation AFAIK.) Unfortunately, it's not the one causing your issues. In fact, I can't reproduce your issue in emulation. I would need clearer instructions, and it's also possible that I can't realistically hit this issue under emulation (i.e. it might be a race condition; we had one of those that I fixed a few months ago for FrankenPHP).

Anyway, here's the bug I found:
If you use FrankenPHP with a zend-signal enabled PHP, you'll get a random memory corruption causing an infinite loop in the TSRM, causing a resource exhaustion in the max timer code, causing a bailout without a bail address set, causing a segfault.
Basically, FrankenPHP never calls zend_signals_startup which means that the "fast offset" in the TSRM will remain 0, which means that the check bool of the "zend signal globals" and the thread field of the tsrm_ls will overlap, so when check is written the thread pointer is corrupted.
However, this code implies that FrankenPHP can support zend-signals:
https://github.com/dunglas/frankenphp/blob/f64c0f948e44ec1acdc85e1925230b8a92f0d3fe/frankenphp.c#L60-L64
cc @dunglas
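
Editor's sketch: the overlap described above can be pictured with a toy model. This is not the real TSRM layout. The 32-byte block, the position of the thread pointer at offset 0, the placement of `check` at the start of the signal globals, and the example pointer value are all hypothetical. The model only shows what happens when a globals offset that should have been assigned at startup is left at 0.

```python
import struct


def make_block(thread_pointer: int) -> bytearray:
    """Hypothetical per-thread block: an 8-byte thread pointer at offset 0."""
    block = bytearray(32)
    struct.pack_into("<Q", block, 0, thread_pointer)
    return block


def write_check_flag(block: bytearray, globals_offset: int) -> None:
    """Set the one-byte `check` flag located at the start of the signal globals."""
    struct.pack_into("<B", block, globals_offset, 1)


def read_thread_pointer(block: bytearray) -> int:
    return struct.unpack_from("<Q", block, 0)[0]


block = make_block(0x0000FFFF9A3C4000)

write_check_flag(block, 16)  # offset assigned at startup: pointer untouched
print(hex(read_thread_pointer(block)))

write_check_flag(block, 0)   # offset never assigned: flag lands on the pointer
print(hex(read_thread_pointer(block)))
```

In the second write the flag byte overwrites the lowest byte of the stored pointer, which is the kind of silent corruption that only surfaces later, far from its cause.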

Anyway, this is not your bug because the image is compiled with a PHP without zend-signals.
However, the image I built from your image was using PHP 8.3.7, which is outdated, and since that version there have been fixes, including some that are related to FrankenPHP. So I would advise you to try a newer PHP version. (In fact, I built a custom PHP 8.3 version from the latest branch, so that might be why I don't see your bug.)

No feedback was provided. The issue is being suspended because we assume that you are no longer experiencing the problem. If this is not the case and you are able to provide the information that was requested earlier, please do so. Thank you.

@korridor
Author

@nielsdos Thanks for looking into that! :) Let's hope this will be fixed in FrankenPHP. Regarding you not having ARM hardware to reproduce this: I have a small Hetzner ARM vServer that I use to test the ARM images for solidtime. If you need it to test this further, I can give you access to that server, but it's a rather weak one.

@dunglas
Member

dunglas commented Mar 13, 2025

I don't know if it's related, but disabling Zend Signals is currently mandatory to use FrankenPHP.

@korridor
Author

@dunglas Sorry, I misread the message from @nielsdos. Thanks for the information! Since the image that had the reported problem had Zend Signals disabled (as far as I can tell), this was not the cause of the issue.

@nielsdos
Member

@korridor I'd advise you to first retest this on the latest 8.3 release. As I said, there were quite a few JIT/opcache related fixes, including a race condition that affected FrankenPHP that I fixed some time ago. If you then still encounter the issue, we can coordinate debugging on your server. Feel free to reach out to me via dossche[dot]niels[at]gmail[you know the TLD, but I don't spell it out for spambot reasons]

@pierre-cba

Hello,
I don't know if this helps, but on a MacBook Pro M4 (ARM), using the FrankenPHP 1.5 / PHP 8.4 Docker image with opcache enabled, I can reproduce the error "php-1 exited with code 139".
If I disable opcache in /usr/local/etc/php/conf.d/docker-php-ext-opcache.ini, I no longer get the error.

Best, Pierre
