Skip to content

Conversation

@VGalaxies
Copy link
Contributor

@VGalaxies VGalaxies commented Jul 25, 2025

This PR improves compatibility and safety when using pprof-rs within a Go program via CGo.

Motivation

When non-Go code that installs signal handlers is linked with a Go program, it must follow specific rules to avoid crashing the application. This is because Go uses small, dynamically-sized stacks for its goroutines. As outlined in the official Go documentation on signals:

If the non-Go code installs any signal handlers, it must use the SA_ONSTACK flag with sigaction. Failing to do so is likely to cause the program to crash if the signal is received.

Also, the previous implementation set SIGPROF handler to SIG_IGN (ignore) upon shutdown. This would incorrectly remove Go's pprof handler, silently breaking Go's profiling capabilities.

This PR fixes these issues by adding the required flag and properly restoring the original signal handler.

Changes

  • The SA_ONSTACK flag is now included when registering the SIGPROF handler controlled by on_stack param.
  • The original sigaction is saved when the profiler starts.
  • This saved sigaction is restored when the profiler stops, ensuring the original handler (e.g., Go's) is correctly reinstated.

Impact

  • Performance: The performance impact is negligible. The SA_ONSTACK flag has zero cost during normal execution and an insignificant cost during save and restore the original sigaction.
  • Memory: There is no additional memory overhead from this change. The alternate signal stack is already allocated by the Go runtime; In non-Go environments, the on_stack option is disabled by default, so the SA_ONSTACK flag is not used and no extra memory is allocated.

@ti-chi-bot
Copy link

ti-chi-bot bot commented Jul 25, 2025

Welcome @VGalaxies! It looks like this is your first PR to tikv/pprof-rs 🎉

@YangKeao YangKeao self-requested a review July 25, 2025 03:57
src/profiler.rs Outdated
// SA_ONSTACK will deliver the signal on an alternate stack. This is crucial
// to prevent a stack overflow if the signal arrives at a thread with
// a small stack, which is common when use pprof-rs in Go runtimes.
flags |= signal::SaFlags::SA_ONSTACK;
Copy link
Member

@YangKeao YangKeao Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, adding SA_ONSTACK will allocate another stack for the signal handler, which will make the result of pprof-rs meaningless because pprof-rs need to unwind the stack.

Maybe the better way to solve this is just to not use pprof-rs with go runtime?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VGalaxies please check and provide screenshot with this change

Copy link
Contributor Author

@VGalaxies VGalaxies Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an alternate stack does not prevent stack unwinding. When a signal handler is invoked with the SA_SIGINFO flag, the kernel provides a ucontext_t pointer as the third argument (see https://man7.org/linux/man-pages/man2/sigaction.2.html). This ucontext_t contains a complete snapshot of the interrupted thread's CPU registers (including the stack pointer RSP and frame pointer RBP), which point to the original application stack, not the alternate signal stack.

In perf_signal_handler function, we pass this ucontext directly to TraceImpl::trace. The unwinding logic then uses this context to correctly walk the original stack from the point of interruption. Therefore, I think running the handler on an alternate stack does not affect the profiling results.

Below are the results of cpu profile using the modified pprof-rs:

image

The red box section is the tikv stacks, and outside the red box there are some tidb stacks.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh TiKV CPU profile now also contains golang's stack. Maybe we can disable golang's CPU profile HTTP API for our build.

Copy link
Member

@YangKeao YangKeao Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In perf_signal_handler function, we pass this ucontext directly to TraceImpl::trace. The unwinding logic then uses this context to correctly walk the original stack from the point of interruption. Therefore, I think running the handler on an alternate stack does not affect the profiling results.

Thanks for the explanation 👍. pprof-rs doesn't always use the ucontext (and the register inside) to unwind. By default, it'll use backtrace_rs as the implementation, which will ignore the ucontext and just use the current stack. Therefore, the sa_stack only works for the frame_pointer implementation.

I'd suggest to only include sa_stack function when feature frame-pointer is enabled, and add some comments to this method.

BTW, does every platforms (mac, linux, bsd) include this flag? If not, we'd better also check the target_os.

#[cfg(all(target_os = "linux", feature = "frame-pointer"))]

Copy link
Contributor Author

@VGalaxies VGalaxies Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default, it'll use backtrace_rs as the implementation, which will ignore the ucontext and just use the current stack. Therefore, the sa_stack only works for the frame_pointer implementation.

Thanks for the great feedback! You're right, this approach is only effective with the frame-pointer unwinder. The above testing was done within the TiKV release build, where the frame pointer is enabled by default in the Makefile.

BTW, does every platforms (mac, linux, bsd) include this flag? If not, we'd better also check the target_os.

The SA_ONSTACK flag is a standard feature defined by the POSIX.1-2001 standard, so we can expect SA_ONSTACK to be available on all major POSIX-compliant operating systems. It is documented in the man pages for Linux, FreeBSD (which is similar to macOS), OpenBSD, and more...

flags |= signal::SaFlags::SA_ONSTACK;
}
let sigaction = signal::SigAction::new(handler, flags, signal::SigSet::empty());
let old_action = unsafe { signal::sigaction(signal::SIGPROF, &sigaction) }?;
Copy link

@lance6716 lance6716 Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see in the golang's doc

If the non-Go code installs a signal handler for any of the synchronous signals (SIGBUS, SIGFPE, SIGSEGV), then it should record the existing Go signal handler

and there are also specifications I didn't know before, like "signal mask for threads". Did you also checked that this library and our program can satisfy that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed the entire pprof-rs library and found that it only calls signal::sigaction to handle the SIGPROF signal, which is an asynchronous signal. It does not install any handlers for synchronous signals.

Additionally, pprof-rs has no function calls that modify the thread's signal mask. The only related part is signal::SigSet::empty() in the sigaction call. The sa_mask field temporarily blocks specified signals while the perf_signal_handler is running, and by passing an empty set, it means that no additional signals are requested to be blocked.

Signed-off-by: VGalaxies <[email protected]>
Signed-off-by: VGalaxies <[email protected]>
Signed-off-by: VGalaxies <[email protected]>
Signed-off-by: VGalaxies <[email protected]>
Copy link
Member

@YangKeao YangKeao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@YangKeao
Copy link
Member

Please fix the clippy:

image

Then we can merge this PR 🍻 .

@VGalaxies
Copy link
Contributor Author

Please fix the clippy:

image Then we can merge this PR 🍻 .

You're spot on. The unused_mut warning appears because the mut is only required when the frame-pointer feature is enabled, and that feature isn't active in the CI's clippy job.

I wanted to avoid using #[allow(unused_mut)], so I've refactored the code to use variable shadowing instead. This resolves the lint in a way that feels a bit cleaner 😂

@VGalaxies VGalaxies requested a review from YangKeao July 29, 2025 07:01
@YangKeao YangKeao merged commit 0c52c5f into tikv:master Jul 29, 2025
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants