Commit
This implements the first step of #337

Adds a harness generation flow that, in comparison to the existing default builder:
- Provides repository link for the target project.
- Is C-specific, uses no CPP code language or similar.
- Includes post-processing on the generated code to add certain header files we always want in the harnesses.
- Adds constraints on header files the LLM should include in the harnesses. Does this by providing absolute paths to header files in the OSS-Fuzz containers.
- Uses some new fuzz introspector APIs to help with context.

This PR was made to have no intrusion on the existing workflow, i.e. experiments can continue as they are running now. However, there are several improvements that can be made and I prefer to have these in follow-up PRs:

1) Fixing logic relies on the default prompt builder. This is because the code fixer creates a new prompt builder:
   https://github.com/google/oss-fuzz-gen/blob/09d2235f3957c4d43367ecbd7f3f88147b487abf/llm_toolkit/code_fixer.py#L408
   This in fact means that the C++ default logic is used for fixing JVM targets. I would like to change the flow here in the medium term such that the code fixing logic reuses the prompt builder we used for main harness generation. I think this should be changed so the prompt builder comes closer to a "harness generator" abstraction and has more knowledge of the target under analysis. But I prefer to do this later as the PR is already big.
2) Integrate so we can run experiments in the CI with both or either harness generation flows.
3) Add new features to the prompt builder.

Ref: #337

---------

Signed-off-by: David Korczynski <[email protected]>
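The post-processing step mentioned in the list above (adding header files that should always be present in a harness) could look roughly like the following sketch. It is illustrative only: the function name `add_required_headers` and the header list are assumptions, not taken from this commit.

```python
import re

# Hypothetical list of headers to always include; the commit message does
# not say which headers the real post-processing step adds.
REQUIRED_HEADERS = ["stdint.h", "stddef.h", "stdlib.h", "string.h"]

# Matches both `#include <x.h>` and `#include "x.h"` and captures the path.
_INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def add_required_headers(harness_source, required=REQUIRED_HEADERS):
    """Prepend any required #include lines that the harness does not have yet."""
    present = set(_INCLUDE_RE.findall(harness_source))
    missing = [header for header in required if header not in present]
    if not missing:
        return harness_source
    prefix = "".join(f"#include <{header}>\n" for header in missing)
    return prefix + harness_source


if __name__ == "__main__":
    generated = (
        "#include <stdint.h>\n"
        "int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {\n"
        "  return 0;\n"
        "}\n"
    )
    print(add_required_headers(generated))
```

Running the function a second time on its own output changes nothing, which makes it safe to apply after every generation or fixing round.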
1 parent 3bd98aa, commit 8d8a8bd
Showing 6 changed files with 219 additions and 6 deletions.
@@ -0,0 +1,24 @@
<system>
Hello! I need you to write a fuzzing harness. The target codebase is written purely in the C language so the harness should be in pure C.

The Codebase we are targeting is located in the repository {TARGET_REPO}.

I would like for you to write the harness targeting the function {TARGET_FUNCTION}

The source code for the function is:

{TARGET_FUNCTION_SOURCE_CODE}

The harness should be in libFuzzer style, with the code wrapped in `int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)`. Specifically, do not include `extern "C"` in the fuzzer code.

Please wrap all code in <code> tags and you should include nothing else but the code in your reply. Do not include any other text.

Make sure that strings passed to the target are null-terminated.

There is one rule that your harness must satisfy: all of the header files in this library are: {TARGET_HEADER_FILES}. Do not include any header files that are not in this list, and use the full path to each header file as given in the list.

{FUNCTION_ARG_TYPES_MSG}

The most important part of the harness is that it will build and compile correctly against the target code. Please focus on making the code as simple as possible to ensure it can be built.

{ADDITIONAL_INFORMATION}
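
To illustrate how a template like this might be consumed, here is a minimal sketch. It assumes plain substitution of the `{...}` placeholders, a reply wrapped in `<code>` tags as the prompt requests, and a simple check of the header-file rule. The helper names `fill_template`, `extract_code` and `disallowed_includes` are hypothetical; the project's actual prompt builder may work differently.

```python
import re

_PLACEHOLDER_RE = re.compile(r"\{([A-Z_]+)\}")
_CODE_RE = re.compile(r"<code>(.*?)</code>", re.DOTALL)
_INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def fill_template(template, values):
    """Substitute {KEY} placeholders in a single pass.

    str.format is avoided because inserted C source contains literal braces;
    placeholders without a value are left untouched.
    """
    return _PLACEHOLDER_RE.sub(
        lambda match: values.get(match.group(1), match.group(0)), template)


def extract_code(reply):
    """Return the text inside the first <code>...</code> pair, else the whole reply."""
    match = _CODE_RE.search(reply)
    return match.group(1).strip() if match else reply.strip()


def disallowed_includes(harness_source, allowed_headers):
    """List included headers that are not in the allowed list."""
    allowed = set(allowed_headers)
    return [h for h in _INCLUDE_RE.findall(harness_source) if h not in allowed]


if __name__ == "__main__":
    template = "Target repo: {TARGET_REPO}\nFunction: {TARGET_FUNCTION}\n"
    prompt = fill_template(template, {
        "TARGET_REPO": "https://example.com/project.git",  # placeholder URL
        "TARGET_FUNCTION": "int parse(const char *s)",      # made-up signature
    })
    print(prompt)

    reply = '<code>\n#include "/src/project/parse.h"\n#include <stdio.h>\n</code>'
    harness = extract_code(reply)
    print(disallowed_includes(harness, ["/src/project/parse.h"]))
```

A strict check like `disallowed_includes` also flags standard headers such as `stdio.h`; whether those should be exempt is a design decision this sketch leaves open.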