-
Notifications
You must be signed in to change notification settings - Fork 577
lib/ExtUtils/t/Embed.t: failures on OpenBSD when compiling with gcc but not with clang #22125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I tried this on my -current sparc64 with both
|
And those were configured with
Is there some reason why OpenBSD's port of |
Noting the 2 instances of
Further analysis needs more C-foo than I have. |
I should add that, as the above invocation suggests, I configured with each of the two options separately and learned that the test failure is being generated by |
Yes, I copied the incantation from the first message:
Just changing
I believe we have some architectures still that clang doesn't support and we are stuck with gcc, but where possible it is becoming the system compiler and when stable, we stop installing gcc. And yes, it is because the clang licence is more acceptable to OpenBSD. I'm fairly sure we will not bring any GPL3 code into the base system at all. |
On Sat, Apr 06, 2024 at 08:45:25AM -0700, James E Keenan wrote:
$ cd t; ./perl harness -v ../lib/ExtUtils/t/Embed.t; cd -
ld: error: undefined symbol: Perl_more_sv
>>> referenced by embed_test.c
>>> /tmp//ccE9t92Z.o:(S_new_SV)
ld: error: undefined symbol: PL_sv_serial
>>> referenced by embed_test.c
>>> /tmp//ccE9t92Z.o:(S_new_SV)
>>> referenced by embed_test.c
>>> /tmp//ccE9t92Z.o:(S_new_SV)
When perl is built with -DDEBUG_LEAKING_SCALARS, it changes a few things.
In particular:
1) a static function called S_new_SV() is created that the macro new_SV()
calls, rather than new_SV() doing the work directly itself.
2) The variable PL_sv_serial is defined.
In the Embed.t test, it tests perl's ability to embed perl in other code.
So it creates a standalone C program, embed_test.c, which is compiled and
linked to the perl library perl.o.
Something is going wrong at link time: the symbols PL_sv_serial and
Perl_more_sv, apparently referenced from S_new_SV, can't be found by the
look of it.
Perhaps it's something like perl getting compiled with
DEBUG_LEAKING_SCALARS, while embed_test.c isn't?
S_new_SV() is weird: although it's static, it's *not* declared as inline,
even although its defined in sv_inline.h. I don't know whether this is a
mistake. It was in sv.c until 5.36.0. It was moved by this commit:
commit 75acd14
Author: Richard Leach ***@***.***>
AuthorDate: Sun Feb 6 22:52:54 2022 +0000
Make newSV_type an inline function
When a new SV is created and upgraded to a type known at compile time,
uprooting a SV head and then using the general-purpose upgrade function
(sv_upgrade) is clunky. Specifically, while uprooting a SV head is
lightweight (assuming there are unused SVs), sv_upgrade is too big to be
inlined, contains many branches that can logically be resolved at compile
time for known start & end types, and the lookup of the correct
body_details struct may add CPU cycles.
This commit tries to address that by making newSV_type an inline function
and including only the parts of sv_upgrade needed to upgrade a SVt_NULL.
When the destination type is known at compile time, a decent compiler will
inline a call to newSV_type and use the type information to throw away all
the irrelevant parts of the sv_upgrade logic.
Because of the spread of type definitions across header files, it did not
seem possible to make the necessary changed inside sv.h, and so a new
header file (sv_inline.h) was created. For the inlined function to work
outside of sv.c, many definitions from that file were moved to sv_inline.h.
Finally, in order to also benefit from this change, existing code in sv.c
that does things like this:
SV* sv;
new_SV(sv);
sv_upgrade(sv, SVt_PV)
has been modified to read something like:
SV* sv = newSV_type(SVt_PV);
…--
You're only as old as you look.
|
Bisecting on OpenBSD-6.9 with this invocation:
... confirmed the breaking commit:
|
Thanks, I'll try to look into this later this week |
I'm assuming this is indeed an error by omission on my part, but haven't had chance to get an OpenBSD VM running yet. |
I had a look at this and reproduced it locally. libperl.a looks fine:
So what do we end up linking against? I edited
So we're linking against the system libperl, which of course wasn't built with So while 75acd14 revealed this problem, it didn't cause the problem. |
I've tried a few approaches to this
The only way I could get it to link the correct libperl was a direct reference:
I don't see a way to get ld to trace where it looks for libraries, so I don't see a way to trace this any further. |
@tonycoz - Thanks for figuring out what was going on here re: linking. |
I need to look into it further, perl itself seems to link fine. |
The perl executable links the library via direct reference:
So I see two issues directly related to the test here:
I can see a couple of issues not directly related to the failures:
|
I'm working on a PR for these |
@richardleach @tonycoz: |
As far as a PR for the "issues not directly related to the failures", I'm currently stuck on figuring out how to correctly declare - or otherwise make visible - the |
Ok, I cannot see how to make the |
As far as I can tell the build issue itself is very long standing, unrelated to 75acd14 and can't be fixed in the C source.
At base it looks like a bug in the old gcc on that platform, since the default clang works fine. It could be added to known issues for the 5.40 perldelta. |
@tonycoz, now that we're well into the 5.41 dev cycle, is there any way we could move work on this problem forward? Thanks. |
@tonycoz, ping ^^ |
As https://perl5.test-smoke.org has come back to life (thanks to Leo, Olaf, et al.), I've been able to gather more data about this problem -- and am glad that the problem was partially solved by @richardleach's c79fe2b back in June. I noticed that after a string of 5 smoke-test run failures on my OpenBSD-6.9 VM between April 6 and June 11, no failures have been observed since. I bisected with the following invocation:
... and found that the first "bad" (really, good, in this case) commit was 👍
This build used
... then built and tested.
Note that the very first entry in this issue back in April was reporting an unthreaded build. I'm hoping that we can get some smoke-tests on more up-to-date OpenBSDs (7.5) with more modern |
Note that c79fe2b didn't fix the underlying issue: linking with the wrong libperl. I've spent some time on this, mostly trying to work out how to detect the mismatch so the test fails when it should fail. |
Issue Perl#22125 detected that we weren't linking the correct library with the embedded test with gcc on OpenBSD, so add an API to perform a sanity check by comparing the size of the perl interpreter structure (or its size if it was a structure) and expected perl API version between those seen in the binary and those compiled into libperl.
Issue #22125 detected that we weren't linking the correct library with the embedded test with gcc on OpenBSD, so add an API to perform a sanity check by comparing the size of the perl interpreter structure (or its size if it was a structure) and expected perl API version between those seen in the binary and those compiled into libperl.
When building with gcc, lib/ExtUtils/t/Embed.t would link against the system libperl.a rather than the newly built libperl.a. Due to the limited API used by the sample code this typically didn't crash, but some configuration changes could result in a crash or a link error. For OpenBSD, change the link options to more closely match those used when building the perl executable, which results in linking against the correct library. Fixes #22125
Issue #22125 detected that we weren't linking the correct library with the embedded test with gcc on OpenBSD, so add an API to perform a sanity check by comparing the size of the perl interpreter structure (or its size if it was a structure) and expected perl API version between those seen in the binary and those compiled into libperl.
When building with gcc, lib/ExtUtils/t/Embed.t would link against the system libperl.a rather than the newly built libperl.a. Due to the limited API used by the sample code this typically didn't crash, but some configuration changes could result in a crash or a link error. For OpenBSD, change the link options to more closely match those used when building the perl executable, which results in linking against the correct library. Fixes #22125
Some improvements, bc this is a new API/fnc, without requiring CV function ptr compatibility like the XS version has. Some suggestions, Passing a const char */int error to the caller/root binary is more nice, so the caller (assume no human) can tuck the error msg away in some log. This situation is a little unrealistic, since an embedded-perl app, isnt going to be uploaded to CI/published to customers, if it doesnt even start on the author's box. So getting the error code with a C debugger after a SEGV, or getting it in the console, is the same thing, since the end user has (lets hope) to have basic C debugging tools before trying to make a embedded libperl app. The new function needs a better way to communicate to the caller than I can imagine this FN being useful in some kind of current or future git bisect like setup, with 100 libperl.so/dll'es in one directory. |
Perhaps separate The problem with returning a That said, the attempted fixes to the Embed.t linking didn't work, and the new check has revealed a similar problem on Solaris, yay.
or maybe this broke it. |
Perl has gotten CVE s before where malloc() ret NULL doesn't SEGV or doesn't hard process exit, b/c Perl_croak() was used to dispatch the error and there was a eval {} block, in the way. Even p5's do_exit() my_exit() and p5 C api disabling setjmp longjmp, is still a security problem. It's trivial for the last two to start calling back into PP code with sub DESTROY and mgset mgget mgfree and overload.pm. I haven't had the time to write a real poc but I have played around with randomly breaking/forcing thru hook malloc() =null and the temps stack and HV head stack grow reallocators, failing. Yes perl worked perfectly upto 2^32 - exactly 4096 bytes, of OS malloc memory, and the very last VM page, was wanted for a SV head arena, pl stk or tmps stk expand. XS handshake has a die_no_perl("xyz") for the very specific reason that I realized that there will be endless problems/segvs dispatching the error string back to the stderr console, through any existing Perl API. Malloc()? Unavailable cuz we just detected someone commited arson on malloc()s internals. SPrintf()? Format strings? Crazy 🤪 talk we're reporting the local serialization mutex timed out deadlicked, how you think you're going to reenter it? PerlIO? I dare you to find a code path to reach ring-0 syswrite() through that api, after I demapped glibc's NLS mmaped DB backend from the process. Same for fprintf(), I think I used fwrite() in die_no_perl() because it was the most stable and most portable way to write to the console when I single stepped in assembly code all the way to NtExitProcess(exitcode). Reaching Linux or OSX kernel with public api in assembly code is beyond scope for too much work for too few failure modes. Just assume 95% of Libc is broken, not that Libc was removed from virtual memory, and the day job work ticket says to report failure mode to stderr to user of Libc disappearing randomly from virtual memory without a visible segv.
C strings allocated as constant global in the bad child lib perl sound safe enough. The collar cannot be that stupid writing and embedded Pearl program to identify trap resume the ABI mismatch of lipearl.so, demap the bad child libperl.so, continue to search on some list of disc paths or Brute Force the directory tree for another libperl.so then at process exit dump the log to disk and crash bc of earlier demapped libperl.so. If someone knows how to unload a libperl.so from virtual address space they will know how to use string length and mem copy. A string is a horrible format though for error handling since it's not machine parsible only human parsible. I have brainstormed that XS embed ABI handshake really needs 4 critical data fields. U16 or U8 X 3 for 41 , 8, 0x_alpha, NO U16 for 0x5 !!! U32 X 2 for non_bin_compat_options bit vector from Perl -V, but that list of defines is absolute trash by now and I have a half of bug ticket for half the items in there proving they will crash or change my Pearl size if those defines are different Myperl size, there are some things that cannot be detected by Configure and aren't even a perl core option like the internals of stat_t that can change with a "simple monthly security update" to GCC or MSVC silently installed on a user's system, and for who knows what reason why that CC update broke Global ABI state. " I see you used the LTS package manager on a non LTS Enabled Ubuntu that was EOLed, here is the phone number for our sales dept for Enterprise VIP customers" [we break ABI API on purpose for same age same version same build date same build number, OSes, but one ISO is tagged LTS [corporate sales] and the other is tagged "consumer"/mainline] |
Issue #22125 detected that we weren't linking the correct library with the embedded test with gcc on OpenBSD, so add an API to perform a sanity check by comparing the size of the perl interpreter structure (or its size if it was a structure) and expected perl API version between those seen in the binary and those compiled into libperl.
When building with gcc, lib/ExtUtils/t/Embed.t would link against the system libperl.a rather than the newly built libperl.a. Due to the limited API used by the sample code this typically didn't crash, but some configuration changes could result in a crash or a link error. For OpenBSD, change the link options to more closely match those used when building the perl executable, which results in linking against the correct library. Fixes #22125
Issue #22125 detected that we weren't linking the correct library with the embedded test with gcc on OpenBSD, so add an API to perform a sanity check by comparing the size of the perl interpreter structure (or its size if it was a structure) and expected perl API version between those seen in the binary and those compiled into libperl.
When building with gcc, lib/ExtUtils/t/Embed.t would link against the system libperl.a rather than the newly built libperl.a. Due to the limited API used by the sample code this typically didn't crash, but some configuration changes could result in a crash or a link error. For OpenBSD, change the link options to more closely match those used when building the perl executable, which results in linking against the correct library. Fixes #22125
Issue Perl#22125 detected that we weren't linking the correct library with the embedded test with gcc on OpenBSD, so add an API to perform a sanity check by comparing the size of the perl interpreter structure (or its size if it was a structure) and expected perl API version between those seen in the binary and those compiled into libperl.
When building with gcc, lib/ExtUtils/t/Embed.t would link against the system libperl.a rather than the newly built libperl.a. Due to the limited API used by the sample code this typically didn't crash, but some configuration changes could result in a crash or a link error. For OpenBSD, change the link options to more closely match those used when building the perl executable, which results in linking against the correct library. Fixes Perl#22125
Issue Perl#22125 detected that we weren't linking the correct library with the embedded test with gcc on OpenBSD, so add an API to perform a sanity check by comparing the size of the perl interpreter structure (or its size if it was a structure) and expected perl API version between those seen in the binary and those compiled into libperl.
When building with gcc, lib/ExtUtils/t/Embed.t would link against the system libperl.a rather than the newly built libperl.a. Due to the limited API used by the sample code this typically didn't crash, but some configuration changes could result in a crash or a link error. For OpenBSD, change the link options to more closely match those used when building the perl executable, which results in linking against the correct library. Fixes Perl#22125
For several months I have been noticing smoke-test failures under certain configurations on OpenBSD-7.4; see, e.g., report 5052514. The failures occur when configuring a perl built with these command-line switches:
... and then, after running
make test_prep
, calling:I didn't pay these much attention at first because at first the perl was being built with a compiler I had never heard of,
eg++
. But more recently these failures were occuring withgcc
as the compiler.I haven't been able to install an OpenBSD more recent than 6.9, but I decided to explore this problem on that version today regardless. I got the test failure reported above on blead. I then decided to see whether these failures were due to a recent code change or whether @cjg-cguevara's exploration of these config options had merely revealed a problem that was "always" there. I tested at tag
v5.38.0
and got the failure reported above.Since the
gcc
version I was testing with dates back to 2007, I decided to try with the system's default C-compiler,clang-10
.With this more recent
clang
, the failing tests PASSed.It would be good if someone could test these config options on an up-to-date OpenBSD with both
gcc
andclang
. We can then try to evaluate the source of the test failures. cc: @afresh1The text was updated successfully, but these errors were encountered: