missing trials when doing local experiment with runners-cpus #2075


Open
zukatsinadze opened this issue Mar 11, 2025 · 2 comments · May be fixed by #2076

Comments

@zukatsinadze

Hi @DonggeLiu @jonathanmetzman

Lately, I've been running lots of local experiments on FuzzBench and noticed that after I added the --runners-cpus flag, reports were sometimes incomplete due to a race condition.

This is my config:

# The number of trials of a fuzzer-benchmark pair.
trials: 5

# The amount of time in seconds that each trial is run for.
# 1 day = 24 * 60 * 60 = 86400
max_total_time: 3600

# The location of the docker registry.
# FIXME: Support custom docker registry.
# See https://github.com/google/fuzzbench/issues/777
docker_registry: gcr.io/fuzzbench

# The local experiment folder that will store most of the experiment data.
# Please use an absolute path.
experiment_filestore: /home/zuka/hexhive/data/local-runs/experiment-data

# The local report folder where HTML reports and summary data will be stored.
# Please use an absolute path.
report_filestore: /home/zuka/hexhive/data/local-runs/report-data

# Flag that indicates this is a local experiment.
local_experiment: true

and I use this command to start the experiment:

PYTHONPATH=. python3 experiment/run_experiment.py \
--experiment-config experiment-config.yaml \
--benchmarks curl_curl_fuzzer_http freetype2_ftfuzzer bloaty_fuzz_target jsoncpp_jsoncpp_fuzzer libxml2_xml sqlite3_ossfuzz vorbis_decode_fuzzer \
--experiment-name libafl-1h-with-seeds \
--fuzzers libafl_default libafl_random libafl_weighted libafl_valprof libafl_covaccount \
--concurrent-builds 15 --runners-cpus 15 --measurers-cpus 1

Adding --runners-cpus, besides restricting the number of usable CPUs, also adds CPU pinning to the docker command. Since there are far more trials than runner CPUs, trials are scheduled in cycles, and most of the time I only get the first cycle of trials in the report (if I run with --runners-cpus 16, I get only 16 trials). For the other trials there were fuzzer logs and corpus archives, but no coverage archives.
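
For context, by "pinning" I mean per-container CPU affinity for the runners. My mental model of it is roughly the following (an illustrative Python sketch using docker's --cpuset-cpus flag; the function name and CPU allocation are made up, this is not the actual FuzzBench startup code):

    import subprocess

    def start_runner_container(image, cpu_ids):
        # Illustrative only: pin a runner container to a fixed set of cores,
        # e.g. cpu_ids = [0, 1, 2] becomes --cpuset-cpus 0,1,2.
        cpuset = ','.join(str(cpu) for cpu in cpu_ids)
        subprocess.run(['docker', 'run', '--cpuset-cpus', cpuset, image],
                       check=True)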

The reason for this is that measurer_main_process ends before the next cycle of trials is started: I see Finished measure loop. in the logs after the first cycle, and the loop is never restarted.

After some more debugging, I found the issue in this piece of code inside measure_manager_loop:

        while not scheduler.all_trials_ended(experiment):
            continue_inner_loop = measure_manager_inner_loop(
                experiment, max_cycle, request_queue, response_queue,
                queued_snapshots)
            if not continue_inner_loop:
                break
            time.sleep(MEASUREMENT_LOOP_WAIT)

After the first cycle ends, measure_manager_inner_loop returns False and the loop breaks out, because there are no unmeasured snapshots in the database yet.

I don't really understand the need for this break, so to fix the issue for my runs I just removed the break logic from the measurer loop and let it run until scheduler.all_trials_ended returns True. If you think this is an acceptable solution, I can create a PR.
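
Concretely, the change I'm running locally looks roughly like this (a minimal sketch of the loop above with the early break removed; not necessarily the final PR diff):

    while not scheduler.all_trials_ended(experiment):
        # Keep measuring until the scheduler reports that every trial has
        # finished, even if a cycle currently finds no unmeasured snapshots.
        measure_manager_inner_loop(experiment, max_cycle, request_queue,
                                   response_queue, queued_snapshots)
        time.sleep(MEASUREMENT_LOOP_WAIT)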

@DonggeLiu
Contributor

Hi @zukatsinadze, thanks for posting this!
We would really appreciate it if you could submit a PR for this.
FuzzBench allows us to run experiments on PRs, so we will be able to compare the results there : )

@zukatsinadze linked a pull request Mar 11, 2025 that will close this issue
@zukatsinadze
Author

Created PR #2076
