Fix the damn deadlock #7553
Managed to repro this. Here is a backtrace of all threads: https://gist.github.com/andrewrk/ee0fc81ebd4d05912a633f0108ca68cb
2 of the threads are waiting on
We were violating the POSIX standard, which resulted in a deadlock on musl v1.1.24 on aarch64 Alpine Linux, uncovered by the new ThreadPool usage in the stage2 compiler. The std.os execv functions that accept an Allocator parameter are removed because they are footguns: the POSIX standard does not allow calls to malloc() between fork() and execv(), and it is common both to (1) call execv() after fork() and (2) use std.heap.c_allocator. Programmers are encouraged to go through the `std.process` API instead, even though this causes some dissonance when combined with `std.os` APIs. I also slapped a big warning message on all the relevant doc comments.
Surely this time all the problems have been fixed
// This means we must do the null-termination of argv and env vars here.
const argv_buf = try arena.alloc(?[*:0]u8, self.argv.len + 1);
for (self.argv) |arg, i| {
    const arg_buf = try arena.alloc(u8, arg.len + 1);
could use allocSentinel here
- const arg_buf = try arena.alloc(u8, arg.len + 1);
+ const arg_buf = try arena.allocSentinel(u8, arg.len, 0);
// can fail between fork() and execve().
// Therefore, we do all the allocation for the execve() before the fork().
// This means we must do the null-termination of argv and env vars here.
const argv_buf = try arena.alloc(?[*:0]u8, self.argv.len + 1);
- const argv_buf = try arena.alloc(?[*:0]u8, self.argv.len + 1);
+ const argv_buf = try arena.allocSentinel(?[*:0]u8, self.argv.len, null);
pub fn createNullDelimitedEnvMap(arena: *mem.Allocator, env_map: *const std.BufMap) ![:null]?[*:0]u8 {
    const envp_count = env_map.count();
    const envp_buf = try arena.alloc(?[*:0]u8, envp_count + 1);
- const envp_buf = try arena.alloc(?[*:0]u8, envp_count + 1);
+ const envp_buf = try arena.allocSentinel(?[*:0]u8, envp_count, null);
src/Cache.zig
Outdated
/// The plan is to enable this for debug builds only, but for now we enable
/// it always to catch a deadlock.
This is now enabled only for debug builds as of 2b97b75
This looks good to me. There's a good bit of cleanup that can be achieved using std.mem.Allocator.allocSentinel(), but the code was already working before it got moved around, so whether or not to do that is up to you.
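The cleanup suggested in this review can be sketched as follows. This is a minimal illustration, not the code from the PR: `buildArgv` is a hypothetical helper name, and the snippet assumes a recent Zig release (0.11 or later), where `std.mem.Allocator` is passed by value, `for` accepts an index range, and `@memcpy` takes two arguments. `allocSentinel(T, n, s)` allocates `n + 1` elements and writes `s` into the extra slot, so neither the manual `+ 1` nor a separate terminator store is needed.

```zig
const std = @import("std");

/// Hypothetical helper (not from the PR): builds a null-terminated argv whose
/// entries are null-terminated copies of `args`. Everything is allocated up
/// front from `arena`, so nothing has to be allocated after a fork().
fn buildArgv(arena: std.mem.Allocator, args: []const []const u8) ![:null]?[*:0]u8 {
    // The sentinel slot (argv[args.len]) is set to null by allocSentinel.
    const argv = try arena.allocSentinel(?[*:0]u8, args.len, null);
    for (args, 0..) |arg, i| {
        // The sentinel slot (buf[arg.len]) is set to 0 by allocSentinel.
        const buf = try arena.allocSentinel(u8, arg.len, 0);
        @memcpy(buf[0..arg.len], arg);
        argv[i] = buf.ptr;
    }
    return argv;
}

test "buildArgv terminates both levels" {
    var arena_state = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer arena_state.deinit();

    const args = [_][]const u8{ "echo", "hi" };
    const argv = try buildArgv(arena_state.allocator(), &args);

    try std.testing.expectEqual(@as(usize, 2), argv.len);
    try std.testing.expect(argv[argv.len] == null);
    try std.testing.expectEqualStrings("echo", std.mem.span(argv[0].?));
}
```

Using an arena here mirrors the snippets under review: a single `deinit` releases every buffer once the child has been spawned.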
2b97b75 to cb290ed
Did the |
    break :m @ptrCast([*:null]?[*:0]u8, os.environ.ptr);
} else {
    // TODO come up with a solution for this.
    @compileError("missing std lib enhancement: ChildProcess implementation has no way to collect the environment variables to forward to the child process");
This has the upshot that Linux libraries that do not link libc cannot use std.process.Child, even when they set an env_map on the Child (this has since moved, but the logic is the same). At first I thought this was a compiler bug, since not all arms of the condition are comptime-known:
const std = @import("std");

fn IterThingOnce(Thing: anytype) type {
    return struct {
        yielded: bool = false,
        optional_thing: ?Thing,

        pub fn next(self: *@This()) ?Thing {
            if (self.optional_thing) |thing| {
                if (self.yielded) {
                    return null;
                } else {
                    defer self.yielded = true;
                    return thing;
                }
            } else if (false) {
                @compileError("This can't happen since `false` is a comptime known literal.");
            } else {
                // Compiles and test passes by exchanging commented out line below.
                @compileError("This seems like it should not happen.");
                // return null;
            }
        }
    };
}

test "Compiler Bug?" {
    var example: IterThingOnce([]const u8) = .{ .optional_thing = "Slatibartfast" };
    // Compare string contents rather than slice pointers.
    try std.testing.expectEqualStrings("Slatibartfast", example.next().?);
    try std.testing.expect(example.next() == null);
}
... error: This seems like it should not happen.
@compileError("This seems like it should not happen.");
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
But, on reflection, that's the point of the else arm triggering a compile error: `self.optional_thing` is only known at runtime, so that branch can be hit at runtime and the compiler has to analyze it.
I guess what I'm wondering is whether that was the intention, i.e. to reject the Linux-library-without-libc case at compile time rather than give it a runtime error only if it does not set up an env_map. I'd prefer the runtime error, since I know what I'm doing and do set the env_map up (I actually go further and set up std.os.environ with my own ~start code in the library entry point).
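To make the distinction in this comment concrete, here is a small sketch of how a comptime-known condition and a runtime condition differ in their effect on `@compileError`. It is an illustration only and assumes a recent Zig release: `pickEnv` and its parameters are hypothetical names, with `have_fallback` standing in for a comptime-known fact such as "libc is linked" and `explicit` standing in for a runtime value such as an `env_map`.

```zig
const std = @import("std");

/// Hypothetical stand-in for the environment lookup discussed above.
fn pickEnv(comptime have_fallback: bool, explicit: ?[]const u8) []const u8 {
    // Runtime condition: the code after this line is always analyzed,
    // because `explicit` may be null when the program runs.
    if (explicit) |env| return env;

    // Comptime-known condition: only the taken arm is analyzed.
    if (have_fallback) {
        return "inherited";
    } else {
        // Reached by the compiler whenever `have_fallback` is false,
        // no matter what `explicit` holds at runtime.
        @compileError("no way to collect the environment");
    }
}

test "the compile error is skipped only when the comptime condition is true" {
    try std.testing.expectEqualStrings("custom", pickEnv(true, "custom"));
    try std.testing.expectEqualStrings("inherited", pickEnv(true, null));
    // Instantiating pickEnv(false, "custom") would fail to compile, even
    // though the explicit value is present and the fallback is never needed.
}
```

Returning an error from the last arm instead of calling `@compileError` would defer the failure to the moment the fallback is actually needed, which is the behavior the comment asks for.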
Please file an issue about this; it's very likely to get lost on a 4-year-old PR. It seems like a case we should fix.
Aha - ok, excellent. I'll do so.
This branch adds debug code for detecting any deadlock in the Cache system. The fix is not in this branch yet but hopefully this new debug code will lead us to it.
I confirmed that when removing the code protecting detection of the same C source files:
The detection code is indeed triggered: