
[Question] mmap os error 12 in Cloud Run #292

Closed · egibs opened this issue Jan 16, 2025 · 13 comments · Fixed by #295

@egibs commented Jan 16, 2025

I'm seeing an interesting panic when running Go code that leverages yara-x in Cloud Run when scanning certain files:

[Image: screenshot of the panic output]

I imagine this has something to do with Cloud Run's filesystem being entirely in-memory, since I'm unable to reproduce it anywhere outside of a Docker container with a tmpfs and a memory limit. Additionally, even with 24Gi+ of memory for the Cloud Run Service, creating a single scanner results in this panic when scanning those files.

For reference, the scanner is being created with roughly 15,000 rules, but it doesn't seem like that should require a reservation of 8GB of memory (0x200000000 bytes) unless I'm completely misunderstanding.

// Initialize the ScanContext.wasm_store pointer that was initially
// dangling.
wasm_store.data_mut().wasm_store =
    NonNull::from(wasm_store.as_ref().deref());

// Global variable that will hold the value for `filesize`. This is
// initialized to 0 because the file size is not known until some
// data is scanned.
let filesize = Global::new(
    wasm_store.as_context_mut(),
    GlobalType::new(ValType::I64, Mutability::Var),
    Val::I64(0),
)
.unwrap();

// Global variable that is set to `true` when the Aho-Corasick pattern
// search phase has been executed.
let pattern_search_done = Global::new(
    wasm_store.as_context_mut(),
    GlobalType::new(ValType::I32, Mutability::Var),
    Val::I32(0),
)
.unwrap();

// Compute the base offset for the bitmap that contains matching
// information for patterns. This bitmap has 1 bit per pattern, the
// N-th bit is set if pattern with PatternId = N matched. The bitmap
// starts right after the bitmap that contains matching information
// for rules.
let matching_patterns_bitmap_base =
    MATCHING_RULES_BITMAP_BASE as u32 + num_rules.div_ceil(8);

// Compute the required memory size in 64KB pages.
let mem_size = u32::div_ceil(
    matching_patterns_bitmap_base + num_patterns.div_ceil(8),
    65536,
);

let matching_patterns_bitmap_base = Global::new(
    wasm_store.as_context_mut(),
    GlobalType::new(ValType::I32, Mutability::Const),
    Val::I32(matching_patterns_bitmap_base as i32),
)
.unwrap();

// Create module's main memory.
let main_memory = wasmtime::Memory::new(
    wasm_store.as_context_mut(),
    MemoryType::new(mem_size, None),
)
.unwrap();
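
Back-of-the-envelope, with made-up numbers (the pattern count and the bitmap base constant below are placeholders, not values from my actual rule set), the bitmaps for ~15,000 rules fit comfortably in a single 64KB page:

// Hypothetical numbers, just to show the order of magnitude:
let num_rules: u32 = 15_000; // 15_000.div_ceil(8) = 1_875 bytes of rules bitmap
let num_patterns: u32 = 50_000; // 50_000.div_ceil(8) = 6_250 bytes of patterns bitmap
let matching_rules_bitmap_base: u32 = 0; // placeholder for the real constant
let matching_patterns_bitmap_base =
    matching_rules_bitmap_base + num_rules.div_ceil(8);
let mem_size = u32::div_ceil(
    matching_patterns_bitmap_base + num_patterns.div_ceil(8),
    65536,
);
assert_eq!(mem_size, 1); // a single 64KB page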

Here's the rough outline of what's happening:

yxc, err := yarax.NewCompiler()
if err != nil {
	return nil, fmt.Errorf("yarax compiler: %w", err)
}

// loop through rule sources and add them to the compiler
for ... {
	...
	// loop through rules and add them to the compiler
	if err := yxc.AddSource(string(bs), yarax.WithOrigin(path)); err != nil {
		return fmt.Errorf("failed to parse %s: %v", path, err)
	}
}

// build rules
yrs := yxc.Build()

// create new scanner with built rules
scanner := yarax.NewScanner(yrs)

fc, err := os.ReadFile(path)
if err != nil {
	return nil, err
}

// scan file contents with scanner
res, err := scanner.Scan(fc)
if err != nil {
	return nil, err
}
...

FWIW, I don't see any difference in behavior between scanner.Scan(fc) and rules.Scan(fc), since the latter just creates a new scanner behind the scenes. Could this just be due to reading these larger files into memory and then scanning them, or is there something else at play?

@egibs (Author) commented Jan 16, 2025

Oh, I glossed over this (though it shouldn't apply in our case):

// For files smaller than ~500MB reading the whole file is faster than
// using a memory-mapped file.
let data = if size < 500_000_000 {
    buffered_file = Vec::with_capacity(size as usize);
    file.read_to_end(&mut buffered_file).map_err(|err| {
        ScanError::OpenError { path: path.to_path_buf(), source: err }
    })?;
    ScannedData::Vec(buffered_file)
} else {
    mapped_file = MmapFile::open(path).map_err(|err| {
        ScanError::MapError { path: path.to_path_buf(), source: err }
    })?;
    ScannedData::Mmap(mapped_file)
};

@plusvic (Member) commented Jan 17, 2025

The panic is occurring here:

let main_memory = wasmtime::Memory::new(
    wasm_store.as_context_mut(),
    MemoryType::new(mem_size, None),
)
.unwrap();

Apparently there's not enough memory for the WASM runtime. This depends on the number of rules, not on the size of the scanned file. Could you replace the .unwrap() with the snippet below so that we can see how many memory pages are being requested? The issue may be that wasmtime tries to reserve at least 8GB regardless of how many pages you request, or maybe we must set not only the minimum number of pages but the maximum as well.

// Create module's main memory.
let main_memory = wasmtime::Memory::new(
    wasm_store.as_context_mut(),
    MemoryType::new(mem_size, None),
)
.unwrap_or_else(|_| {
    panic!("unable to create wasmtime memory with {} pages", mem_size)
});

@plusvic (Member) commented Jan 17, 2025

After some investigation it looks like wasmtime is allocating 8GB of address space (bytecodealliance/wasmtime#3564). This doesn't mean that it really uses 8GB of physical RAM, but it needs a contiguous block of virtual address space that will hold the WASM memory.
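
If I'm reading wasmtime's defaults correctly, the 8GB breaks down roughly as 2GB (guard region before the memory) + 4GB (the maximum addressable by a wasm32 linear memory) + 2GB (guard region after it) = 8GB (0x200000000 bytes). The exact split is an assumption on my part, but the order of magnitude matches.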

The operating system can impose a limit on the size of the process address space. In most Linux systems this is unlimited (you are limited only by physical RAM), but you are hitting the limit imposed by the Cloud Run environment. As you said, this can be reproduced using a Docker container, or more easily by changing the limits in your Linux system. For instance, this sets the limit to 4194304KB (0x100000000 bytes):

ulimit -v 4194304

With a very simple rule the issue is reproduced:

yr scan test.yar test.yar
warning[invariant_expr]: invariant boolean expression
 --> test.yar:1:21
  |
1 | rule t { condition: true }
  |                     ---- this expression is always true
  |
  = note: rule `t` is always `true`
thread '<unnamed>' panicked at lib/src/scanner/mod.rs:270:10:
called `Result::unwrap()` on an `Err` value: mmap failed to reserve 0x104000000 bytes

Caused by:
    Cannot allocate memory (os error 12)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

@plusvic (Member) commented Jan 17, 2025

Please try this change (f1fc21d), experiment with different values, and tell me if it works. I would say that 2GB is enough to accommodate even the largest sets of rules, and it should work in most environments.
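
Conceptually, the change caps the address space reserved per module, along these lines (a sketch of the idea using the wasmtime 26.x API, not the literal diff):

use wasmtime::Config;

let mut config = Config::new();
// Cap the up-front virtual address space reservation for each linear
// memory at 2GB instead of the default 4GB plus guard regions.
// (Sketch only; the value is the one being experimented with.)
config.static_memory_maximum_size(0x8000_0000);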

I could also make it configurable.

@egibs (Author) commented Jan 17, 2025

> Please try this change (f1fc21d), experiment with different values, and tell me if it works. I would say that 2GB is enough to accommodate even the largest sets of rules, and it should work in most environments.
>
> I could also make it configurable.

Will do! Making it configurable would be great if it's not too much of a lift, especially since it would allow us to make changes on the fly if we run into any further issues in more restrictive environments.

@egibs (Author) commented Jan 17, 2025

Looks like only a single page is being requested before the panic:

thread '<unnamed>' panicked at lib/src/scanner/mod.rs:268:13:
unable to create wasmtime memory with 1 pages
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at core/src/panicking.rs:223:5:
panic in a function that cannot unwind
stack backtrace:
   0:     0xffffb599b680 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::hf959716447caf921
   1:     0xffffb59be524 - core::fmt::write::h352d02c3b6ffb499
   2:     0xffffb5998cf0 - std::io::Write::write_fmt::ha7eb8f6143c2961b
   3:     0xffffb599b534 - std::sys::backtrace::BacktraceLock::print::ha04caf6077834efe
   4:     0xffffb599c5bc - std::panicking::default_hook::{{closure}}::h26b7903db5fee9bc
   5:     0xffffb599c404 - std::panicking::default_hook::hc21fda6094353169
   6:     0xffffb599cbf0 - std::panicking::rust_panic_with_hook::hcd058fba18673021
   7:     0xffffb599ca1c - std::panicking::begin_panic_handler::{{closure}}::h5c6a3b6bbd877c39
   8:     0xffffb599bb7c - std::sys::backtrace::__rust_end_short_backtrace::h06b257a687347a5c
   9:     0xffffb599c704 - rust_begin_unwind
  10:     0xffffb5139658 - core::panicking::panic_nounwind_fmt::h3dae08c43cb6d1ef
  11:     0xffffb51396d0 - core::panicking::panic_nounwind::h64986efe0ae55c1b
  12:     0xffffb51397bc - core::panicking::panic_cannot_unwind::hf067d98dc8cc2018
  13:     0xffffb51550f4 - yrx_scanner_create
  14:           0x9177b8 - _cgo_1093191e5e44_Cfunc_yrx_scanner_create
                               at /tmp/go-build/cgo-gcc-prolog:118:11
  15:           0x47d64c - runtime.asmcgocall
                               at /usr/lib/go/src/runtime/asm_arm64.s:1000
thread caused non-unwinding panic. aborting.
thread '<unnamed>' panicked at lib/src/scanner/mod.rs:268:13:
unable to create wasmtime memory with 1 pages
thread '<unnamed>' panicked at lib/src/scanner/mod.rs:268:13:
unable to create wasmtime memory with 1 pages
thread '<unnamed>' panicked at core/src/panicking.rs:223:5:
panic in a function that cannot unwind
stack backtrace:
   0:     0xffffb599b680 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::hf959716447caf921
   1:     0xffffb59be524 - core::fmt::write::h352d02c3b6ffb499
   2:     0xffffb5998cf0 - std::io::Write::write_fmt::ha7eb8f6143c2961b
   3:     0xffffb599b534 - std::sys::backtrace::BacktraceLock::print::ha04caf6077834efe
   4:     0xffffb599c5bc - std::panicking::default_hook::{{closure}}::h26b7903db5fee9bc
   5:     0xffffb599c404 - std::panicking::default_hook::hc21fda6094353169
   6:     0xffffb599cbf0 - std::panicking::rust_panic_with_hook::hcd058fba18673021
   7:     0xffffb599ca1c - std::panicking::begin_panic_handler::{{closure}}::h5c6a3b6bbd877c39
   8:     0xffffb599bb7c - std::sys::backtrace::__rust_end_short_backtrace::h06b257a687347a5c
   9:     0xffffb599c704 - rust_begin_unwind
  10:     0xffffb5139658 - core::panicking::panic_nounwind_fmt::h3dae08c43cb6d1ef
  11:     0xffffb51396d0 - core::panicking::panic_nounwind::h64986efe0ae55c1b
  12:     0xffffb51397bc - core::panicking::panic_cannot_unwind::hf067d98dc8cc2018
  13:     0xffffb51550f4 - yrx_scanner_create
  14:           0x9177b8 - _cgo_1093191e5e44_Cfunc_yrx_scanner_create
                               at /tmp/go-build/cgo-gcc-prolog:118:11
  15:           0x47d64c - runtime.asmcgocall
                               at /usr/lib/go/src/runtime/asm_arm64.s:1000
thread caused non-unwinding panic. aborting.
thread '<unnamed>' panicked at lib/src/scanner/mod.rs:268:13:
unable to create wasmtime memory with 1 pages
thread '<unnamed>' panicked at core/src/panicking.rs:223:5:
panic in a function that cannot unwind

Here's a stack trace excerpt too:

stack backtrace:
   0:     0xffffb599b680 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::hf959716447caf921
   1:     0xffffb59be524 - core::fmt::write::h352d02c3b6ffb499
   2:     0xffffb5998cf0 - std::io::Write::write_fmt::ha7eb8f6143c2961b
   3:     0xffffb599b534 - std::sys::backtrace::BacktraceLock::print::ha04caf6077834efe
   4:     0xffffb599c5bc - std::panicking::default_hook::{{closure}}::h26b7903db5fee9bc
   5:     0xffffb599c404 - std::panicking::default_hook::hc21fda6094353169
   6:     0xffffb599cbf0 - std::panicking::rust_panic_with_hook::hcd058fba18673021
   7:     0xffffb599ca1c - std::panicking::begin_panic_handler::{{closure}}::h5c6a3b6bbd877c39
   8:     0xffffb599bb7c - std::sys::backtrace::__rust_end_short_backtrace::h06b257a687347a5c
   9:     0xffffb599c704 - rust_begin_unwind
  10:     0xffffb5139658 - core::panicking::panic_nounwind_fmt::h3dae08c43cb6d1ef
  11:     0xffffb51396d0 - core::panicking::panic_nounwind::h64986efe0ae55c1b
  12:     0xffffb51397bc - core::panicking::panic_cannot_unwind::hf067d98dc8cc2018
  13:     0xffffb51550f4 - yrx_scanner_create
  14:           0x9177b8 - _cgo_1093191e5e44_Cfunc_yrx_scanner_create
                               at /tmp/go-build/cgo-gcc-prolog:118:11
  15:           0x47d64c - runtime.asmcgocall
                               at /usr/lib/go/src/runtime/asm_arm64.s:1000
thread caused non-unwinding panic. aborting.

SIGABRT: abort
PC=0xffffb4e65058 m=0 sigcode=18446744073709551610
signal arrived during cgo execution

goroutine 51 gp=0x40002f2380 m=0 mp=0x24a8ca0 [syscall]:
runtime.cgocall(0x9177904, 0x40002e9198)
        /usr/lib/go/src/runtime/cgocall.go:167 +0x44 fp=0x40002e9160 sp=0x40002e9120 pc=0x46eaf4
github.com/VirusTotal/yara-x/go._Cfunc_yrx_scanner_create(0x95b4f30, 0x400012f380)
        _cgo_gotypes.go:787 +0x34 fp=0x40002e9190 sp=0x40002e9160 pc=0x56b184
github.com/VirusTotal/yara-x/go.NewScanner.func1(0x4000136e60?, 0x400012f380)
        /root/go/pkg/mod/github.com/!virus!total/yara-x/[email protected]/scanner.go:91 +0x78 fp=0x40002e91d0 sp=0x40002e9190 pc=0x56ff88
github.com/VirusTotal/yara-x/go.NewScanner(0x4000516c88)
        /root/go/pkg/mod/github.com/!virus!total/yara-x/[email protected]/scanner.go:91 +0x5c fp=0x40002e9210 sp=0x40002e91d0 pc=0x56fdfc
github.com/VirusTotal/yara-x/go.(*Rules).Scan(0x40001fbce0?, {0x4002022000, 0xaacd, 0xaace})
        /root/go/pkg/mod/github.com/!virus!total/yara-x/[email protected]/main.go:138 +0x38 fp=0x40002e9280 sp=0x40002e9210 pc=0x56d968
...({0x204a450, 0x4000178410}, {0x1, 0x0, 0x0, 0x0, 0x0, 0x0, {0x0, 0x0, ...}, ...}, ...)
        ...:78 +0x304 fp=0x40002e9910 sp=0x40002e9280 pc=0x8d2eb4
...({0x204a450, 0x4000178410}, {0x1, 0x0, 0x0, 0x0, 0x0, 0x0, {0x0, 0x0, ...}, ...}, ...)
        ...:557 +0xf0 fp=0x40002e9ad0 sp=0x40002e9910 pc=0x8d6660
...

@egibs (Author) commented Jan 17, 2025

Played around with the new commit a bit.

With a 16Gi-equivalent limit in a container, I can run a decent number of concurrent rules.Scan operations. Since each Scan reserves 2GB of address space, I can run about 5 concurrent scans before hitting the panic, since other operations are consuming memory as well.

I dropped the reservation value as low as 32MB and it didn't seem to make much difference in how many concurrent scans I was able to run successfully (5 at 2GB; 6 at 32MB, with crashes at 7 and above). That said, the lower reservation did help, but I think any changes in the API will have to be paired with optimizations on the calling side as well.

Edit: bytecodealliance/wasmtime#3695 (comment) pretty much covers the same issue, and bytecodealliance/wasmtime#9545 is mentioned as the follow-up (with the modified settings I referenced above).

@egibs (Author) commented Jan 17, 2025

Based on this comment about using "8GB guard pages" (bytecodealliance/wasmtime#1501 (comment)):

I tried turning off https://docs.rs/wasmtime/latest/wasmtime/struct.Config.html#method.guard_before_linear_memory, which seemed to help as well (while still using a more conservative concurrency limit):

config.guard_before_linear_memory(false);

Dropping the reservation value is still important, since I encountered the same crash with the default value; I landed on 0x2000000 (32MB) as mentioned above.

I think making these four options configurable should provide enough flexibility (a rough sketch follows the list):

config.guard_before_linear_memory
config.memory_reservation
config.memory_reservation_for_growth
config.memory_guard_size
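
For illustration, wiring them up would look roughly like this (the values are the ones from my experiments, not recommendations; the method names are the wasmtime 28.x spellings listed above):

use wasmtime::Config;

let mut config = Config::new();
// Drop the guard region placed before linear memory.
config.guard_before_linear_memory(false);
// Reserve only 32MB of address space up front per memory.
config.memory_reservation(0x200_0000);
// Keep the extra reservation for growth small as well.
config.memory_reservation_for_growth(0x200_0000);
// Shrink the guard region after linear memory to 64KB.
config.memory_guard_size(0x1_0000);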

@egibs (Author) commented Jan 18, 2025

Using these config options, present in wasmtime 26.0.1 (the version used by the v0.12.0 release), I've had no crashes at all with reasonable concurrency settings (runtime.NumCPU()):

config.dynamic_memory_reserved_for_growth(0x20000000); // 512MB; this defaults to 2GB (0x80000000)
config.guard_before_linear_memory(false);
config.static_memory_forced(false);
config.static_memory_guard_size(0x10000); // 64KB to match dynamic guard size
// Configures the maximum size, in bytes, where a linear memory is considered static, above which it’ll be considered dynamic
config.static_memory_maximum_size(0); // 0 is always dynamic

I'm not sure how these translate to the renamed 28.0.1 settings, but it shouldn't be hard to figure out since the options were only renamed.

I doubt these are anywhere near as performant as the default settings, so I think being able to tune them will be important.

@plusvic (Member) commented Jan 18, 2025

@egibs With the latest changes in https://github.com/VirusTotal/yara-x/tree/addr-space-reservation I believe you should not have any problems.

@egibs (Author) commented Jan 18, 2025

@plusvic -- looks good to me! I tested your new changes against the changes in my previous comment and they were functionally the same.

@plusvic (Member) commented Jan 20, 2025

@egibs I intend to merge #295 to fix this issue. In the end I kept the guard page in front of the memory, as this should not increase virtual address space consumption by much, and it provides additional protection against bugs.

I think this final configuration should be enough in most cases, but let me know if it works for you. For the time being I'm not adding any API for configuring this. I prefer to find a configuration that works in all real-life cases rather than burden users with too many details about the inner workings of the library. If this proves to be insufficient, I can add some configuration knobs.

@egibs (Author) commented Jan 20, 2025

@plusvic -- I'm not seeing any OOMs with the default guard settings, which appear to be 32MiB:

> The default value for this property is 32MiB on 64-bit platforms. This allows eliminating almost all bounds checks on loads/stores with an immediate offset of less than 32MiB. On 32-bit platforms this defaults to 64KiB.

Based on what I'm seeing this is ready for :shipit:.

plusvic added a commit that referenced this issue Jan 20, 2025
Each WASM module tries to reserve a large amount of virtual address space (4GB on 64-bit systems). This doesn't mean that it actually needs this amount of RAM; it only means that it reserves address space in case the memory required by the module grows dynamically. As each YARA-X scanner needs one of these WASM modules, YARA-X ends up reserving a very large amount of virtual address space. This is problematic in some environments where the amount of virtual address space per process is limited (in Docker containers, for instance).

This PR reduces the amount of virtual address space reserved by each WASM module, mainly by limiting the memory of each WASM module to 16MB. This is enough to accommodate very large sets of rules, with millions of rules and patterns.

Closes #292.
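
The core idea, sketched (a 256-page maximum corresponds to 16MB at 64KB per page; the snippet is illustrative, not the literal diff):

// 16MB limit expressed in 64KB WASM pages: 16MB / 64KB = 256 pages.
const MAX_WASM_PAGES: u32 = 256;

// Declaring a maximum lets the runtime reserve address space only for
// what the memory can ever grow to, instead of a multi-GB region.
let main_memory = wasmtime::Memory::new(
    wasm_store.as_context_mut(),
    MemoryType::new(mem_size, Some(MAX_WASM_PAGES)),
)
.unwrap();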