pkg/fuzzer: use a MAB to decide on exec fuzz vs exec gen #4632
base: master
Conversation
Let's try to use a plain delta-epsilon MAB for this purpose. To better track its effect, also calculate moving averages of the "new max signal" / "execution time" ratios for exec fuzz and exec gen.
2de14ee to dbd3f2d
Calls      []CallInfo
Extra      CallInfo // stores Signal and Cover collected from background threads
ElapsedSec float64  // total execution time in seconds
I would store this as time.Duration. Floats are a pretty specific thing; if something wants floats (learning), it should convert from the standard time type to floats.
Also, if we add per-syscall execution time, storing it in seconds will be somewhat strange, since most syscalls execute within microseconds.
Okay, this particular approach will likely not work well because the "new signal / execution time" ratio can have a pretty large range of values. The bounded learning rate, even one as small as 0.0005, would not cope with extreme outliers. I'm currently testing how useful it may be if we just look at the probability of generating any new coverage — this is the easiest implementation-wise. The "new signal / execution time" ratio itself still sounds like a much better target, but we need to map it to a smaller range of values to use as a reward for a MAB. Perhaps keep a history of the last N "new signal / execution time" values and see what share of them is less than the currently observed value? Then it would nicely map to [0, 1]. The only caveat is that if N is large, we might need some tricky data structures to determine this rate efficiently (balanced trees should be able to do it in O(log N)).
Maybe we could map coverage and execution time to some logarithmic buckets: 1, 10, 100, or finer-grained 1, 2, 5, 10, 20, 50, 100, ...?
I'm currently running experiments to see whether it brings any fuzzing performance improvements.
Cc #1950