Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

write fuzz inputs to a shared memory region before running a task #20803

Open
andrewrk opened this issue Jul 26, 2024 · 3 comments · Fixed by #22862 · May be fixed by #23416
Open

write fuzz inputs to a shared memory region before running a task #20803

andrewrk opened this issue Jul 26, 2024 · 3 comments · Fixed by #22862 · May be fixed by #23416
Labels
enhancement Solving this issue will likely involve adding new logic or components to the codebase. fuzzing
Milestone

Comments

@andrewrk
Copy link
Member

andrewrk commented Jul 26, 2024

Extracted from #20773.

Currently, a fuzz test failure looks like this:

andy@bark ~/t/abc> zig build test --fuzz 
test
└─ run test failure
/home/andy/local/lib/zig/std/testing.zig:546:14: 0x11575c9 in expect (test)
    if (!ok) return error.TestUnexpectedResult;
             ^
/home/andy/tmp/abc/src/main.zig:28:5: 0x1157691 in test.fuzz example (test)
    try std.testing.expect(!std.mem.eql(u8, "canyoufindme", input_bytes));
    ^
failed with error.TestUnexpectedResult
error: the following command exited with error code 1:
/home/andy/tmp/abc/.zig-cache/o/eea1979fed4d51bc1ca0d161af979e22/test --seed=0x48fe2aeb --listen=- 
error: all fuzz workers crashed
error: the following build command failed with exit code 1:
/home/andy/tmp/abc/.zig-cache/o/bc5565bbec3a56db01acb2ab6b348742/build /home/andy/local/bin/zig /home/andy/local/lib/zig /home/andy/tmp/abc /home/andy/tmp/abc/.zig-cache /home/andy/.cache/zig --seed 0x48fe2aeb -Zb2ede88d1c7627c9 test --fuzz

If you rerun that command that it printed, it does not in fact reproduce the issue:

andy@bark ~/t/abc [1]> /home/andy/tmp/abc/.zig-cache/o/eea1979fed4d51bc1ca0d161af979e22/test --seed=0x48fe2aeb
All 2 tests passed.
1 fuzz tests found.

This is due to lack of communication between parent process (build runner) and fuzzing process (test runner).

However, for performance purposes, we don't want any communication between those processes in the hot path. That means we cannot send a message containing the current input before trying it.

Options are:

Follow the lead from other fuzzers by having a "corpus" directory, which is a list of files memory mapped into the fuzzer process, one per "interesting" input, with filenames corresponding to the run IDs. Advantages to this approach is that it's easy to recover and it could be used to share state across processes. Disadvantage is that it writes to the filesystem in a hot path. Maybe that's OK in practice? I'll have to check.

Another idea that I had is to have the parent process (build runner) create and share a memory mapping with the fuzzing process (test runner). The fuzzer would use this memory to store its most recent input(s) as well as some metadata (for example stats to display on the UI). The parent process can then read from this shared mapping to display the stats in real time as well as to recover inputs when the fuzzer process crashes.

It might not be such a bad idea to send a message when an "interesting" input is found. This message would perhaps be forwarded to other fuzzing processes, perhaps on the same system or perhaps even on other systems. Then again, using a file system directory as a "corpus" directory would also allow other processes, including peers and parents, to notice and pick up interesting inputs.

This issue is a tad bit open ended, but at least to close it, interesting inputs that are found should be displayed in a reproducible manner, where re-running a particular command will in fact reproduce the crash.

@andrewrk andrewrk added enhancement Solving this issue will likely involve adding new logic or components to the codebase. fuzzing labels Jul 26, 2024
@andrewrk andrewrk added this to the 0.14.0 milestone Jul 26, 2024
@andrewrk
Copy link
Member Author

andrewrk commented Aug 8, 2024

I'm thinking the next step here is to use .zig-cache/f for corpus directory, keyed on the fully-qualified unit test names, and then implement AFL's strategy of maintaining a minimal set of inputs that trigger unique execution paths as memory-mapped files in this directory. At some point users may then decide to minimize the inputs and then copy them into the source tree, switching over to providing them via std.testing API.

@gcoakes
Copy link
Contributor

gcoakes commented Aug 11, 2024

fuzzer would use this memory to store its most recent input(s)

The current implementation as of today has a shared, memory-mapped file at .zig-cache/v/<program_counter_digest>. I don't think that is the appropriate place to map the current input since that would prohibit parallel processes from fuzzing the same set of program counters. Also, there is currently an assumption that a single test function will be fuzzed within a given test process. Would it be a good idea for each process to use a shared, memory-mapped file as the backing for Fuzzer.input located at .zig-cache/f/<test_fqn>/<pid>. It could be renamed by the parent process when a crash occurs, or it could be renamed by the fuzzing process when an error occurs.

keyed on the fully-qualified unit test names

Does fully-qualified additionally include a build ID for that build of the test? Or, would you want subsequent builds to retain the same cached "interesting" inputs? If the latter, I think we would need to add a phase in which it reanalyzes the cached inputs according to the current program counters.

@andrewrk andrewrk modified the milestones: 0.14.0, 0.15.0 Feb 10, 2025
andrewrk added a commit that referenced this issue Feb 11, 2025
breaking change to the fuzz testing API; it now passes a type-safe
context parameter to the fuzz function.

libfuzzer is reworked to select inputs from the entire corpus.

I tested that it's roughly as good as it was before in that it can find
the panics in the simple examples, as well as achieve decent coverage on
the tokenizer fuzz test.

however I think the next step here will be figuring out why so many
points of interest are missing from the tokenizer in both Debug and
ReleaseSafe modes.

does not quite close #20803 yet since there are some more important
things to be done, such as opening the previous corpus, continuing
fuzzing after finding bugs, storing the length of the inputs, etc.
andrewrk added a commit that referenced this issue Feb 11, 2025
breaking change to the fuzz testing API; it now passes a type-safe
context parameter to the fuzz function.

libfuzzer is reworked to select inputs from the entire corpus.

I tested that it's roughly as good as it was before in that it can find
the panics in the simple examples, as well as achieve decent coverage on
the tokenizer fuzz test.

however I think the next step here will be figuring out why so many
points of interest are missing from the tokenizer in both Debug and
ReleaseSafe modes.

does not quite close #20803 yet since there are some more important
things to be done, such as opening the previous corpus, continuing
fuzzing after finding bugs, storing the length of the inputs, etc.
andrewrk added a commit that referenced this issue Feb 11, 2025
breaking change to the fuzz testing API; it now passes a type-safe
context parameter to the fuzz function.

libfuzzer is reworked to select inputs from the entire corpus.

I tested that it's roughly as good as it was before in that it can find
the panics in the simple examples, as well as achieve decent coverage on
the tokenizer fuzz test.

however I think the next step here will be figuring out why so many
points of interest are missing from the tokenizer in both Debug and
ReleaseSafe modes.

does not quite close #20803 yet since there are some more important
things to be done, such as opening the previous corpus, continuing
fuzzing after finding bugs, storing the length of the inputs, etc.
@andrewrk
Copy link
Member Author

commit message says

does not quite close #20803 yet

@andrewrk andrewrk reopened this Feb 12, 2025
gooncreeper added a commit to gooncreeper/zig that referenced this issue Mar 31, 2025
This PR significantly improves the capabilities of the fuzzer. For
comparison, here is a ten minute head to head between the old and new
fuzzer implementations (with newly included fuzz tests):
-- Old --
```
Total Runs: 49020931
Unique Runs: 1044131 (2.1%)
Speed (Runs/Second): 81696
Coverage: 2069 / 15866 (13.0%)
```
(note: Unique Runs is highly inflated due of the inefficiency of the
old implementation)
-- New --
```
Total Runs: 537039526
Unique Runs: 1511 (0.0%)
Speed (Runs/Second): 894950
Coverage: 3000 / 15719 (19.1%)
Examples: `while(C)i(){}else|`
          `{y:n()align(b)addrspace`
          `switch(P){else=>`
          `[:l]align(_:r:l)R`
          `(if(b){defer{nosuspend`
          `union(enum(I))`
```
NOTE: You have to rebuild the compiler due to new fuzzing
instrumentation being enabled for memory loads.

The changes made to the fuzzer to accomplish this feat mostly include
tracking memory reads from .rodata to determine new runs, new
mutations (especially the ones that insert const values from .rodata
reads and __sanitizer_conv_const_cmp), and minimizing found inputs.
Additionally, the runs per second has greatly been increased due to
generating smaller inputs and avoiding clearing the 8-bit pc counters.

An additional feature added is that the length of the input file is now
stored and the old input file is rerun upon start, though this does not
close ziglang#20803 since it does not output the input (though it can be
verily easily retrieved from the cache directory.)

Other changes made to the fuzzer include more logical initialization,
using one shared file `in` for inputs, creating corpus files with
proper sizes, and using hexadecimal-numbered corpus files for
simplicity. Additionally, volatile was removed from MemoryMappedList
since all that is needed is a guarantee that compiler has done the
writes, which is already accomplished with atomic ordering.

Furthermore, I added several new fuzz tests to gauge the fuzzer's
efficiency. I also tried to add a test for zstandard decompression,
which it crashed within 60,000 runs (less than a second.)

Bug fixes include:
* Fixed a race conditions when multiple fuzzer processes needed to use
the same coverage file.
* Web interface stats now update even when unique runs is not changing.
* Fixed tokenizer.testPropertiesUpheld to allow stray carriage returns
since they are valid whitespace.
* Closes ziglang#23180

POSSIBLE IMPROVEMENTS:
* Remove the 8-bit pc counting code prefer a call to a sanitizer
function that updates a flag if a new pc hit happened (similar to how
the __sanitizer_cov_load functions already operate).
* Less basic input minimization function. It could also try splitting
inputs into two between each byte to see if they both hit the same pcs.
This is useful as smaller inputs are usually much more efficient.
* Deterministic mutations when a new input is found.
* Culling out corpus inputs that are redundant due to smaller inputs
already hitting their pcs and memory addresses.
* Applying multiple mutations during dry spells.
* Prioritizing some corpus inputs.
* Creating a list of the most successful input splices (which would
likely contain grammar keywords) and creating a custom mutation for
adding them.
* Removing some less-efficient mutations.
* Store effective mutations to the disk for the benefit of future runs.
* Counting __sanitizer_cov `@returnAddress`es in determining unique
runs.
* Optimize __sanitizer_cov_trace_const_cmp methods (the use of an
ArrayHashMap is not too fast).
* Processor affinity
* Exclude fuzzer's .rodata

Nevertheless, I feel like the fuzzer is in a viable place to start
being useful (as demonstrated in ziglang#23413)
gooncreeper added a commit to gooncreeper/zig that referenced this issue Mar 31, 2025
This PR significantly improves the capabilities of the fuzzer. For
comparison, here is a ten minute head to head between the old and new
fuzzer implementations (with newly included fuzz tests):
-- Old --
```
Total Runs: 49020931
Unique Runs: 1044131 (2.1%)
Speed (Runs/Second): 81696
Coverage: 2069 / 15866 (13.0%)
```
(note: Unique Runs is highly inflated due of the inefficiency of the
old implementation)
-- New --
```
Total Runs: 537039526
Unique Runs: 1511 (0.0%)
Speed (Runs/Second): 894950
Coverage: 3000 / 15719 (19.1%)
Examples: `while(C)i(){}else|`
          `{y:n()align(b)addrspace`
          `switch(P){else=>`
          `[:l]align(_:r:l)R`
          `(if(b){defer{nosuspend`
          `union(enum(I))`
```
NOTE: You have to rebuild the compiler due to new fuzzing
instrumentation being enabled for memory loads.

The changes made to the fuzzer to accomplish this feat mostly include
tracking memory reads from .rodata to determine new runs, new
mutations (especially the ones that insert const values from .rodata
reads and __sanitizer_conv_const_cmp), and minimizing found inputs.
Additionally, the runs per second has greatly been increased due to
generating smaller inputs and avoiding clearing the 8-bit pc counters.

An additional feature added is that the length of the input file is now
stored and the old input file is rerun upon start, though this does not
close ziglang#20803 since it does not output the input (though it can be
very easily retrieved from the cache directory.)

Other changes made to the fuzzer include more logical initialization,
using one shared file `in` for inputs, creating corpus files with
proper sizes, and using hexadecimal-numbered corpus files for
simplicity. Additionally, volatile was removed from MemoryMappedList
since all that is needed is a guarantee that compiler has done the
writes, which is already accomplished with atomic ordering.

Furthermore, I added several new fuzz tests to gauge the fuzzer's
efficiency. I also tried to add a test for zstandard decompression,
which it crashed within 60,000 runs (less than a second.)

Bug fixes include:
* Fixed a race conditions when multiple fuzzer processes needed to use
the same coverage file.
* Web interface stats now update even when unique runs is not changing.
* Fixed tokenizer.testPropertiesUpheld to allow stray carriage returns
since they are valid whitespace.
* Closes ziglang#23180

POSSIBLE IMPROVEMENTS:
* Remove the 8-bit pc counting code prefer a call to a sanitizer
function that updates a flag if a new pc hit happened (similar to how
the __sanitizer_cov_load functions already operate).
* Less basic input minimization function. It could also try splitting
inputs into two between each byte to see if they both hit the same pcs.
This is useful as smaller inputs are usually much more efficient.
* Deterministic mutations when a new input is found.
* Culling out corpus inputs that are redundant due to smaller inputs
already hitting their pcs and memory addresses.
* Applying multiple mutations during dry spells.
* Prioritizing some corpus inputs.
* Creating a list of the most successful input splices (which would
likely contain grammar keywords) and creating a custom mutation for
adding them.
* Removing some less-efficient mutations.
* Store effective mutations to the disk for the benefit of future runs.
* Counting __sanitizer_cov `@returnAddress`es in determining unique
runs.
* Optimize __sanitizer_cov_trace_const_cmp methods (the use of an
ArrayHashMap is not too fast).
* Processor affinity
* Exclude fuzzer's .rodata

Nevertheless, I feel like the fuzzer is in a viable place to start
being useful (as demonstrated in ziglang#23413)
gooncreeper added a commit to gooncreeper/zig that referenced this issue Mar 31, 2025
This PR significantly improves the capabilities of the fuzzer. For
comparison, here is a ten minute head to head between the old and new
fuzzer implementations (with newly included fuzz tests):
-- Old --
```
Total Runs: 49020931
Unique Runs: 1044131 (2.1%)
Speed (Runs/Second): 81696
Coverage: 2069 / 15866 (13.0%)
```
(note: Unique Runs is highly inflated due of the inefficiency of the
old implementation)
-- New --
```
Total Runs: 537039526
Unique Runs: 1511 (0.0%)
Speed (Runs/Second): 894950
Coverage: 3000 / 15719 (19.1%)
Examples: `while(C)i(){}else|`
          `{y:n()align(b)addrspace`
          `switch(P){else=>`
          `[:l]align(_:r:l)R`
          `(if(b){defer{nosuspend`
          `union(enum(I))`
```
NOTE: You have to rebuild the compiler due to new fuzzing
instrumentation being enabled for memory loads.

The changes made to the fuzzer to accomplish this feat mostly include
tracking memory reads from .rodata to determine new runs, new
mutations (especially the ones that insert const values from .rodata
reads and __sanitizer_conv_const_cmp), and minimizing found inputs.
Additionally, the runs per second has greatly been increased due to
generating smaller inputs and avoiding clearing the 8-bit pc counters.

An additional feature added is that the length of the input file is now
stored and the old input file is rerun upon start, though this does not
close ziglang#20803 since it does not output the input (though it can be
very easily retrieved from the cache directory.)

Other changes made to the fuzzer include more logical initialization,
using one shared file `in` for inputs, creating corpus files with
proper sizes, and using hexadecimal-numbered corpus files for
simplicity. Additionally, volatile was removed from MemoryMappedList
since all that is needed is a guarantee that compiler has done the
writes, which is already accomplished with atomic ordering.

Furthermore, I added several new fuzz tests to gauge the fuzzer's
efficiency. I also tried to add a test for zstandard decompression,
which it crashed within 60,000 runs (less than a second.)

Bug fixes include:
* Fixed a race conditions when multiple fuzzer processes needed to use
the same coverage file.
* Web interface stats now update even when unique runs is not changing.
* Fixed tokenizer.testPropertiesUpheld to allow stray carriage returns
since they are valid whitespace.
* Closes ziglang#23180

POSSIBLE IMPROVEMENTS:
* Remove the 8-bit pc counting code prefer a call to a sanitizer
function that updates a flag if a new pc hit happened (similar to how
the __sanitizer_cov_load functions already operate).
* Less basic input minimization function. It could also try splitting
inputs into two between each byte to see if they both hit the same pcs.
This is useful as smaller inputs are usually much more efficient.
* Deterministic mutations when a new input is found.
* Culling out corpus inputs that are redundant due to smaller inputs
already hitting their pcs and memory addresses.
* Applying multiple mutations during dry spells.
* Prioritizing some corpus inputs.
* Creating a list of the most successful input splices (which would
likely contain grammar keywords) and creating a custom mutation for
adding them.
* Removing some less-efficient mutations.
* Store effective mutations to the disk for the benefit of future runs.
* Counting __sanitizer_cov `@returnAddress`es in determining unique
runs.
* Optimize __sanitizer_cov_trace_const_cmp methods (the use of an
ArrayHashMap is not too fast).
* Processor affinity
* Exclude fuzzer's .rodata

Nevertheless, I feel like the fuzzer is in a viable place to start
being useful (as demonstrated with the find in ziglang#23413)
gooncreeper added a commit to gooncreeper/zig that referenced this issue Mar 31, 2025
This PR significantly improves the capabilities of the fuzzer. For
comparison, here is a ten minute head to head between the old and new
fuzzer implementations (with newly included fuzz tests):
-- Old --
```
Total Runs: 49020931
Unique Runs: 1044131 (2.1%)
Speed (Runs/Second): 81696
Coverage: 2069 / 15866 (13.0%)
```
(note: Unique Runs is highly inflated due of the inefficiency of the
old implementation)
-- New --
```
Total Runs: 537039526
Unique Runs: 1511 (0.0%)
Speed (Runs/Second): 894950
Coverage: 3000 / 15719 (19.1%)
Examples: `while(C)i(){}else|`
          `{y:n()align(b)addrspace`
          `switch(P){else=>`
          `[:l]align(_:r:l)R`
          `(if(b){defer{nosuspend`
          `union(enum(I))`
```
NOTE: You have to rebuild the compiler due to new fuzzing
instrumentation being enabled for memory loads.

The changes made to the fuzzer to accomplish this feat mostly include
tracking memory reads from .rodata to determine new runs, new
mutations (especially the ones that insert const values from .rodata
reads and __sanitizer_conv_const_cmp), and minimizing found inputs.
Additionally, the runs per second has greatly been increased due to
generating smaller inputs and avoiding clearing the 8-bit pc counters.

An additional feature added is that the length of the input file is now
stored and the old input file is rerun upon start, though this does not
close ziglang#20803 since it does not output the input (though it can be
very easily retrieved from the cache directory.)

Other changes made to the fuzzer include more logical initialization,
using one shared file `in` for inputs, creating corpus files with
proper sizes, and using hexadecimal-numbered corpus files for
simplicity. Additionally, volatile was removed from MemoryMappedList
since all that is needed is a guarantee that compiler has done the
writes, which is already accomplished with atomic ordering.

Furthermore, I added several new fuzz tests to gauge the fuzzer's
efficiency. I also tried to add a test for zstandard decompression,
which it crashed within 60,000 runs (less than a second.)

Bug fixes include:
* Fixed a race conditions when multiple fuzzer processes needed to use
the same coverage file.
* Web interface stats now update even when unique runs is not changing.
* Fixed tokenizer.testPropertiesUpheld to allow stray carriage returns
since they are valid whitespace.
* Closes ziglang#23180

Possible Improvements:
* Remove the 8-bit pc counting code prefer a call to a sanitizer
function that updates a flag if a new pc hit happened (similar to how
the __sanitizer_cov_load functions already operate).
* Less basic input minimization function. It could also try splitting
inputs into two between each byte to see if they both hit the same pcs.
This is useful as smaller inputs are usually much more efficient.
* Deterministic mutations when a new input is found.
* Culling out corpus inputs that are redundant due to smaller inputs
already hitting their pcs and memory addresses.
* Applying multiple mutations during dry spells.
* Prioritizing some corpus inputs.
* Creating a list of the most successful input splices (which would
likely contain grammar keywords) and creating a custom mutation for
adding them.
* Removing some less-efficient mutations.
* Store effective mutations to the disk for the benefit of future runs.
* Counting __sanitizer_cov `@returnAddress`es in determining unique
runs.
* Optimize __sanitizer_cov_trace_const_cmp methods (the use of an
ArrayHashMap is not too fast).
* Processor affinity
* Exclude fuzzer's .rodata

Nevertheless, I feel like the fuzzer is in a viable place to start
being useful (as demonstrated with the find in ziglang#23413)
@gooncreeper gooncreeper linked a pull request Mar 31, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Solving this issue will likely involve adding new logic or components to the codebase. fuzzing
Projects
None yet
2 participants