There have been many successful attempts to use AFL with Rust programs; see e.g. @frewsxcv's afl.rs. We can go one step further and make guided fuzzing a common way to test Rust code.
Guided fuzzing requires code instrumentation so that the fuzzing engines get feedback from the code execution and can guide a) mutations and b) corpus expansion. Since Rust is based on LLVM, there is such instrumentation available already:
https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-pcs-with-guards (control flow feedback)
https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-data-flow (data flow feedback)
We may need to make some LLVM flags available via the Rust frontend, that's it.
This LLVM instrumentation is already supported by at least AFL, libFuzzer and honggfuzz. We expect more engines to appear in the near future, and it's important to keep them plug-compatible. This way, using a new engine on a vast body of code will be trivial.
And by fuzzing engine we should understand a wider class of tools, including e.g. concolic execution tools.
In C/C++ we use the following interface for things that need fuzzing (we call these things fuzz targets):
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  DoSomethingInterestingWithMyAPI(Data, Size);
  return 0;  // Non-zero return values are reserved for future use.
}
At least as a start I propose to have something similar in (safe) Rust.
The interface should not depend on any particular fuzzing engine -- engines should be interchangeable.
The interface should also allow for both in-process and out-of-process engines.
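For illustration, here is a minimal sketch of what such a target could look like in Rust: a safe function over a byte slice, plus a thin C-ABI shim with the same shape as the C/C++ entry point above. The names `parse_frame` and `fuzz_target` are hypothetical and not part of any existing crate, and the export attribute is left out on purpose (see the comment in the code).

```rust
use std::slice;

/// Hypothetical code under test: a 2-byte big-endian length prefix
/// followed by a payload of exactly that length.
fn parse_frame(data: &[u8]) -> Option<&[u8]> {
    if data.len() < 2 {
        return None;
    }
    let len = ((data[0] as usize) << 8) | data[1] as usize;
    let payload = &data[2..];
    if payload.len() == len {
        Some(payload)
    } else {
        None
    }
}

/// The engine-independent fuzz target: safe Rust, bytes in, nothing out.
/// A panic here is what an engine would report as a finding.
pub fn fuzz_target(data: &[u8]) {
    if let Some(payload) = parse_frame(data) {
        assert_eq!(payload.len() + 2, data.len());
    }
}

/// C-ABI shim mirroring the C/C++ entry point. To expose the symbol to an
/// in-process engine, an export attribute would be added (`#[no_mangle]`,
/// spelled `#[unsafe(no_mangle)]` on newer editions); it is omitted in
/// this sketch.
#[allow(non_snake_case)]
pub unsafe extern "C" fn LLVMFuzzerTestOneInput(data: *const u8, size: usize) -> i32 {
    let input: &[u8] = if data.is_null() || size == 0 {
        &[]
    } else {
        // Assumption: the engine passes a pointer valid for `size` bytes.
        unsafe { slice::from_raw_parts(data, size) }
    };
    fuzz_target(input);
    0 // Non-zero return values are reserved for future use.
}
```

An out-of-process engine could call `fuzz_target` directly on bytes read from a file or stdin, so the same target serves both modes.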
With some rare exceptions fuzzing needs to be in-process, i.e. the code under test and the fuzzing engine should co-exist in the same process. This typically speeds up fuzzing by 1-2 orders of magnitude and often makes deployment simpler. libFuzzer is in-process, AFL has an in-process (aka persistent) mode, and so does honggfuzz.
A frequent question about fuzzing is how to fuzz the input consisting of several chunks of data, or even tree-like data structures. One of the possible answers is fuzzing protobufs. This can be discussed later and separately from this proposal.
One interesting special case is fuzzing for equivalence (e.g. to compare Rust and C implementations of the same thing). See my recent write up. This might be especially interesting for projects that re-implement commonly used C libraries, such as https://github.com/briansmith/ring.
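A rough sketch of an equivalence target, assuming two implementations that should agree on every input. Both functions here are hypothetical Rust stand-ins; in the Rust-vs-C case one of them would be a call into the C library through FFI.

```rust
/// Reference implementation: byte-at-a-time additive checksum.
fn checksum_reference(data: &[u8]) -> u32 {
    data.iter().fold(0u32, |acc, &b| acc.wrapping_add(b as u32))
}

/// "Optimized" implementation: processes four bytes per iteration,
/// then handles the leftover tail.
fn checksum_fast(data: &[u8]) -> u32 {
    let mut chunks = data.chunks_exact(4);
    let mut acc = 0u32;
    for c in &mut chunks {
        let part = c[0] as u32 + c[1] as u32 + c[2] as u32 + c[3] as u32;
        acc = acc.wrapping_add(part);
    }
    for &b in chunks.remainder() {
        acc = acc.wrapping_add(b as u32);
    }
    acc
}

/// Equivalence fuzz target: any input on which the two implementations
/// disagree triggers a panic, which the engine reports as a finding.
pub fn fuzz_equivalence(data: &[u8]) {
    assert_eq!(
        checksum_reference(data),
        checksum_fast(data),
        "implementations disagree on {:?}",
        data
    );
}
```

The engine needs no notion of "correct output" here; the reference implementation acts as the oracle.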
Fuzzing is often useful as a one-off effort, but it really shines if used continuously.
One of the services that provide infrastructure for continuous fuzzing is https://github.com/google/oss-fuzz and we'd like to see important Rust projects there.
So after some discussion the way to do it would probably be to create a new sanitizer option, fuzz, which runs sancov, and have its corresponding #![sanitizer_runtime] wrap around the linking-to-libFuzzer stuff.
@nagisa is currently looking into this (I'm using a mac right now so it's harder for me to hack on this)
cc @japaric @frewsxcv
I think that there's a superior option for signatures of fuzz targets.
In particular, QuickCheck's Arbitrary trait nicely captures "given whatever data I am fed by a generator, I can produce an instance". As a result, _any function whose arguments all implement Arbitrary_ can be seen as a valid fuzz target.
However, even beyond functions, there are closures. fn implements Fn, and Fn vs FnMut even captures whether fuzzing in parallel within a single address space is kosher. (and FnOnce captures whether you'll need to get another one...)
Last of all, Fn(...) -> T is just sugar for Fn<(...,)>, and Arbitrary is implemented for tuples whose members all implement Arbitrary.
As a result:
// Needs nightly: #![feature(unboxed_closures, fn_traits)]
fn fuzz_once<A: Arbitrary, F: FnOnce<A>>(target: F, data: &[u8])
    -> <F as FnOnce<A>>::Output {
    let mut g = FuzzGen::new(data);
    target.call_once(<A as Arbitrary>::arbitrary(&mut g))
}
fn fuzz_mut<A: Arbitrary, F: FnMut<A>>(target: &mut F, data: &[u8])
    -> <F as FnOnce<A>>::Output {
    let mut g = FuzzGen::new(data);
    target.call_mut(<A as Arbitrary>::arbitrary(&mut g))
}
fn fuzz<A: Arbitrary, F: Fn<A>>(target: &F, data: &[u8])
    -> <F as FnOnce<A>>::Output {
    let mut g = FuzzGen::new(data);
    target.call(<A as Arbitrary>::arbitrary(&mut g))
}
The rest follows naturally.
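To make the idea concrete on stable Rust, here is a minimal sketch under some assumptions: `Arbitrary` and `FuzzGen` below are simplified, hypothetical stand-ins (not QuickCheck's actual trait or any existing generator), and the target takes a single tuple argument instead of relying on the unstable `Fn<A>` form.

```rust
/// Hypothetical generator backed by the raw fuzz input.
pub struct FuzzGen<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> FuzzGen<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        FuzzGen { data, pos: 0 }
    }

    /// Returns the next input byte, or 0 once the input is exhausted.
    fn next_byte(&mut self) -> u8 {
        let b = self.data.get(self.pos).copied().unwrap_or(0);
        self.pos = self.pos.saturating_add(1);
        b
    }
}

/// Simplified stand-in for an Arbitrary-style trait.
pub trait Arbitrary: Sized {
    fn arbitrary(g: &mut FuzzGen) -> Self;
}

impl Arbitrary for u8 {
    fn arbitrary(g: &mut FuzzGen) -> Self {
        g.next_byte()
    }
}

impl Arbitrary for bool {
    fn arbitrary(g: &mut FuzzGen) -> Self {
        g.next_byte() & 1 == 1
    }
}

impl Arbitrary for u16 {
    fn arbitrary(g: &mut FuzzGen) -> Self {
        let hi = g.next_byte() as u16;
        let lo = g.next_byte() as u16;
        (hi << 8) | lo
    }
}

/// Tuples are Arbitrary when both members are; members are built in order.
impl<A: Arbitrary, B: Arbitrary> Arbitrary for (A, B) {
    fn arbitrary(g: &mut FuzzGen) -> Self {
        let a = A::arbitrary(g);
        let b = B::arbitrary(g);
        (a, b)
    }
}

/// Builds the argument from raw bytes and calls the target with it.
pub fn fuzz<A: Arbitrary, R, F: Fn(A) -> R>(target: &F, data: &[u8]) -> R {
    let mut g = FuzzGen::new(data);
    target(A::arbitrary(&mut g))
}
```

With this, a call such as `fuzz(&|(n, flag): (u16, bool)| my_api(n, flag), data)` turns any engine-provided byte slice into typed arguments, where `my_api` is whatever function is under test.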
Current work is in https://github.com/rust-fuzz. AFL has moved there too. @nagisa has working fuzzing via libfuzzer-sys, we just need to wrap it nicely.
This is exciting, but it feels rather "big" to be an issue; should this not go through the RFC process?
We now have https://github.com/rust-fuzz/cargo-fuzz
Yeah, to make it actually part of the distribution it would have to be an rfc. But we can prototype it outside if we want.
Closing. If someone wants to pursue this, please follow the RFC process here https://github.com/rust-lang/rfcs#before-creating-an-rfc.
First-class support for fuzzing was recently proposed for Go: https://go.googlesource.com/proposal/+/master/design/draft-fuzzing.md
For completeness, Go support was proposed a long time ago:
https://docs.google.com/document/u/1/d/1zXR-TFL3BfnceEAWytV8bnzB2Tfp6EPFinWVJ5V4QC8/pub
https://github.com/golang/go/issues/19109
This new one is a respin.