Segfault in libscap reading IPv4 sockets from /proc. #2272

Open
shane-lawrence opened this issue Feb 6, 2025 · 1 comment · May be fixed by #2271
Labels
kind/bug Something isn't working

shane-lawrence commented Feb 6, 2025

Describe the bug

Falco segfaults when reading TIME_WAIT sockets from /proc/###/net/tcp at the end of the buffer.

How to reproduce it
This is difficult to replicate in production because it only occurs when the 1 MB buffer ends with a socket with fdinfo->ino=0.

I created a test in https://github.com/falcosecurity/libs/pull/2271/files and confirmed that it triggers a segfault (when tested without the corresponding fix in that PR).

Expected behaviour
Falco should accurately parse the sockets or gracefully throw an error without crashing.

Screenshots
Logs copied below.

Environment

  • Falco version: 0.39.2
  • System info: tested on multiple systems
  • Cloud provider or hardware configuration: GCP
  • OS: COS, Alpine, distroless, Debian, Ubuntu
  • Kernel: 6.1.100+
  • Installation method: dockerhub Falco image, build from source

Additional context
On a specific subset of large nodes hosting busy web servers, I'm seeing segmentation faults from Falco when server pods redeploy. Initially there were no debug symbols and the debug build wouldn't run at all, so I added debug symbols in falcosecurity/falco#3440. That produced a full stack trace, although some variables were still optimized out. The error was:

Thread 1 "falco" received signal SIGSEGV, Segmentation fault.
0x0000555556219b23 in scap_fd_read_ipv4_sockets_from_proc_fs (dir=dir@entry=0x7fffffff6980 "/host/proc/2054594/net/tcp", l4proto=l4proto@entry=2,
    sockets=sockets@entry=0x5555594d5fe8, error=error@entry=0x7fffffff6480 "\220\357\037YUU")
    at /usr/src/falcosecurity/falco/build/falcosecurity-libs-repo/falcosecurity-libs-prefix/src/falcosecurity-libs/userspace/libscap/linux/scap_fds.c:809

The offending line is

HASH_ADD_INT64((*sockets), ino, fdinfo);
which adds the parsed socket info (fdinfo) to the sockets hash table. Patching it with some custom log statements showed that it was always crashing while parsing sockets that are in a TIME_WAIT state, and those sockets have an inode value of zero.
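
For readers unfamiliar with the file format, here is a minimal sketch of pulling the state and inode columns out of one /proc/net/tcp line. The kernel reports TIME_WAIT entries with state 06 and an inode of 0. The helper name `parse_tcp_line` and the sample lines in the usage note are made up for illustration, and this is not how libscap parses the file.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of libscap): extract the socket state and
 * inode from a single /proc/net/tcp line.
 * Columns: sl local rem st tx:rx tr:tm->when retrnsmt uid timeout inode ...
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_tcp_line(const char *line, unsigned int *state, unsigned long long *inode) {
    /* %*... conversions skip a column without storing it. */
    int n = sscanf(line, "%*d: %*s %*s %x %*s %*s %*s %*u %*u %llu", state, inode);
    return n == 2 ? 0 : -1;
}
```

Called on an illustrative TIME_WAIT line such as `   0: 0100007F:1F90 0100007F:D2F4 06 00000000:00000000 03:00000A5B 00000000     0        0 0 3 0000000000000000`, this should yield state 6 and inode 0.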

I had trouble getting AddressSanitizer to work in wolfi or alpine (I think it's supposed to work on musl now but not out-of-the-box), so I switched to an Ubuntu base image and asan reported this:

==311==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7de7be698800 at pc 0x5754d2eef8a7 bp 0x7ffc43f41760 sp 0x7ffc43f41750
READ of size 1 at 0x7de7be698800 thread T0
    #0 0x5754d2eef8a6 in scap_fd_read_ipv4_sockets_from_proc_fs /usr/src/falcosecurity/libs/userspace/libscap/linux/scap_fds.c:782
    #1 0x5754d2ef30cb in scap_fd_read_sockets /usr/src/falcosecurity/libs/userspace/libscap/linux/scap_fds.c:1152
    #2 0x5754d2ef3fa2 in scap_fd_handle_socket /usr/src/falcosecurity/libs/userspace/libscap/linux/scap_fds.c:369

0x7de7be698800 is located 0 bytes to the right of 1048576-byte region [0x7de7be598800,0x7de7be698800)
allocated by thread T0 here:
    #0 0x7de7e0556887 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x5754d2eef50b in scap_fd_read_ipv4_sockets_from_proc_fs /usr/src/falcosecurity/libs/userspace/libscap/linux/scap_fds.c:682
...
SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/falcosecurity/libs/userspace/libscap/linux/scap_fds.c:782 in scap_fd_read_ipv4_sockets_from_proc_fs

That's the line number in my modified version (with extra debug logs), but it corresponds to userspace/libscap/linux/scap_fds.c:769 upstream. It's clear in that line that we're reading a character before checking if it's out of bounds.

Fixing that line resolves the problem, but it's still not clear to me why the segfault without asan was occurring later in the function (at HASH_ADD_INT64).
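
The read-before-check pattern described above can be sketched as follows. This is an illustration under assumptions, not the actual libscap code or the fix in the PR: `find_eol`, `pos`, and `end` are hypothetical names. The point is only the order of the two conditions, since `&&` evaluates left to right and stops as soon as the bounds check fails.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the libscap function): return a pointer to the
 * next '\n' in [pos, end), or `end` if there is none.
 *
 * The unsafe ordering would be:
 *     while(*pos != '\n' && pos < end) pos++;
 * It dereferences `pos` before comparing it with `end`, so when the buffer
 * is completely full and the last line has no trailing newline, it reads
 * one byte past the allocation.
 */
static const char *find_eol(const char *pos, const char *end) {
    assert(pos <= end);
    /* Bounds check first; the dereference only happens when pos < end. */
    while(pos < end && *pos != '\n') {
        pos++;
    }
    return pos;
}
```

With a buffer that is filled to its last byte and contains no newline, `find_eol(buf, buf + len)` returns `buf + len` without touching memory outside the buffer.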

@shane-lawrence shane-lawrence added the kind/bug Something isn't working label Feb 6, 2025

FedeDP commented Feb 6, 2025

/milestone 0.21.0

Thanks for the in-depth bug research :) and for the fix obviously!

> It's clear in that line that we're reading a character before checking if it's out of bounds.

💯 agree.

@poiana poiana added this to the 0.21.0 milestone Feb 6, 2025