Resmack: Part 1: Grammar Fuzzing Thoughts
Note: This post is part of a series on developing resmack. See the resmack tag for all posts in the series.
I am working on a new grammar fuzzer, resmack-rust (renamed to resmack-rust to free up the name resmack) that is intended to be fast, handle dynamic expressions, and easy to use. This post contains a few of my current thoughts from resmack’s development.
Python Is Slow
A large part of wanting to improve on my other grammar fuzzer,
gramfuzz, was due to massive, but
expected, disappointment in not being able to use my python-based tools
as part of a fast, coverage-based fuzzer. I always knew Python’s speed wouldn’t
be able to stack up against something written in C
/Rust
/*, but it was much
worse than I thought, even with all of the tricks and optimizations that I
used to boost pfp’s data generation speed. See
the performance section
of a talk I gave about using pfp
with libFuzzer:
Method | Exec/s | Chart |
---|---|---|
libFuzzer + pfp | ~8k/s | ▓▓▓▓▓▓▓▓ |
libFuzzer | ~75k/s | ████████████████████████████████████████ |
Rust is Fast
Rust had been on my to-do list for far too long. I had attempted a few times to pick it up, but didn’t generate the velocity needed to break through the “wow this is different/hard” learning curve. I had been itching to have another go at Rust.
I have also been enjoying Brandon Falk’s Twitter discussions and live-streams of the work he is doing. Reading through his fzero fuzzer grammar fuzzer project is what reminded me of my solution for gracefully handling recursion in gramfuzz.
Brandon’s fzero grammar fuzzer is written in Rust, and was created in response to f1 fuzzer. It is very fast. I use his fuzzer as a reference point for resmack as I develop it. I run my fuzzer and fzero on the same grammar, with both reporting MiB/s of generated data.
Below is a trivial grammar that I run resmack with:
let rules = rules
.add_rule("Special", "SPECIAL ONE")
.add_rule("RefdRule", or!("Hello", "Blah", reff!("Special")))
.add_rule("TestRule", and!(reff!("RefdRule"), "World"))
.add_rule("TestRule2", and!(reff!("TestRule"), "World"))
.add_rule("TestRule2", and!(reff!("TestRule"), "World"))
.add_rule("TestRule2", int!(min = 5, max = 1337))
.add_rule("TestRule2", and!(or!(1, 2, 3, 4, 5, string!(min = 5, max = 10, charset = "abcdefg"))))
.add_rule("TestRule2", and!(1000.5))
.add_rule("TestRule2", "---World");
And below is a similar grammar that I use with fzero:
{
"<start>": [["<test>"]],
"<test>": [["<TestRule2>"]],
"<int>": [["<digit>"], ["<onenine>", "<digits>"], ["-", "<digits>"],
["-", "<onenine>", "<digits>"]],
"<characters>": [["<character-1>"]],
"<character>": [["0"], ["1"], ["2"], ["3"], ["4"], ["5"], ["6"], ["7"],
["8"], ["9"], ["a"], ["b"], ["c"], ["d"], ["e"], ["f"],
["g"], ["h"], ["i"], ["j"], ["k"], ["l"], ["m"], ["n"],
["o"], ["p"], ["q"], ["r"], ["s"], ["t"], ["u"], ["v"],
["w"], ["x"], ["y"], ["z"], ["A"], ["B"], ["C"], ["D"],
["E"], ["F"], ["G"], ["H"], ["I"], ["J"], ["K"], ["L"],
["M"], ["N"], ["O"], ["P"], ["Q"], ["R"], ["S"], ["T"],
["U"], ["V"], ["W"], ["X"], ["Y"], ["Z"], ["!"], ["#"],
["$"], ["%"], ["&"], ["\""], ["("], [")"], ["*"], ["+"],
[","], ["-"], ["."], ["/"], [":"], [";"], ["<"], ["="],
[">"], ["?"], ["@"], ["["], ["]"], ["^"], ["_"], ["`"],
["{"], ["|"], ["}"], ["~"], [" "]],
"<Special>": [["SPECIAL ONE"]],
"<RefdRule>": [["Hello"], ["Blah"], ["<Special>"]],
"<TestRule>": [["<RefdRule>", "World"]],
"<TestRule2>": [["<TestRule2_1>"], ["<TestRule2_2>"], ["<TestRule2_3>"], ["<TestRule2_4>"], ["<TestRule_5>"], ["<TestRule_6>"]],
"<TestRule2_1>": [["<TestRule>", "World"]],
"<TestRule2_2>": [["<TestRule>", "World"]],
"<TestRule2_3>": [["<int>"]],
"<TestRule2_4>": [["1"], ["2"], ["3"], ["4"], ["5"], ["<string>"]],
"<TestRule2_5>": [["1000.5"]],
"<TestRule2_6>": [["---World"]]
}
and just for kicks, here’s the same grammar implemented in gramfuzz:
Def("Special", "SPECIAL ONE")
Def("RefdRule", Or("Hello", "Blah", Ref("Special")))
Def("TestRule", Ref("RefdRule"), "World")
Def("TestRule2", Ref("TestRule"), "World")
Def("TestRule2", Ref("TestRule"), "World")
Def("TestRule2", Or(1, 2, 3, 4, 5, String(min=0, max=10, charset="abcdefg")))
Def("TestRule2", 1000.5)
Def("TestRule2", "---World")
Below are the throughput numbers that I am currently getting with each of the grammars above:
Fuzzer | MiB/s | SAFE_ONLY=false |
---|---|---|
resmack | 89 MiB/s | 92 MiB/s |
fzero | 352 MiB/s | 515 MiB/s |
gramfuzz | 0.71 MiB/s | N/A |
fzero is about four times faster when using safe code. The biggest part of what makes fzero so fast is that it generates rust code that is then compiled into a standalone fuzzing binary, with all rules and rule values being static:
fn fragment_59(&mut self, depth: usize) {
if depth >= 20 { return; }
match self.rand() % 3 {
0 => self.fragment_16(depth + 1),
1 => self.fragment_18(depth + 1),
2 => self.fragment_20(depth + 1),
_ => unreachable!(),
}
}
fn fragment_16(&mut self, depth: usize) {
if depth >= 20 { return; }
self.buf.extend(&[72, 101, 108, 108, 111]); // Hello
}
This allows massive performance gains to be had by taking advantage of
compiler optimizations and having static rule values. I want resmack to include
dynamic rules, where one rule can reference another, local, named rule to
generate a valid checksum. I could probably implement resmack in a similar
way that fzero is implemented (generate rust code that is then compiled),
but right now it’s not on the roadmap. Perhaps it should be though. My first
thought is to use macros to define dynamic rule implementations that override
the build()
function. Maybe it would work.
Note on SAFE_ONLY
in resmack/fzero
fzero has a flag SAFE_ONLY
that causes the generated rust code to use an
unsafe code block to copy chosen values into the output Vec<u8>
instead
of using buf.extend(...)
:
unsafe {
let old_size = output.len();
let new_size = old_size + item.len();
if new_size > output.capacity() {
output.reserve(new_size - old_size);
}
std::ptr::copy_nonoverlapping(
item.as_ptr(),
output.as_mut_ptr().offset(old_size as isize),
item.len(),
);
output.set_len(new_size);
}
The actual code generated by fzero has all item.len()
and item
references
replaced with actual, inline values:
unsafe {
let old_size = self.buf.len();
let new_size = old_size + 5;
if new_size > self.buf.capacity() {
self.buf.reserve(new_size - old_size);
}
std::ptr::copy_nonoverlapping(
[72, 101, 108, 108, 111].as_ptr(), // Hello
self.buf.as_mut_ptr().offset(old_size as isize),
5,
);
self.buf.set_len(new_size);
}
~Doubling performance is not seen when using the same unsafe copy in resmack.
Getting Rid of Rule Categories
When I wrote gramfuzz I didn’t want to have to specify a top-level rule to
encapsulate all sub rules, such as <expression>
or <start>
. I added the
concept of “categories” that I could bucket sets of rules into, with the
intention of being able to tell the fuzzer to generate random rules from
category X
.
I’ve gotten rid of this concept - it’s the same as specifying the top-level rule that I was against for some reason, and it adds one more level of indirection that isn’t needed.
PRNG Performance
Resmack started off using Rust’s rand
crate. I quickly learned that the rand
crate is slow, especially after
trying a few other PRNG implementations.
I bounced around a few times between different PRNG implementations that worked
with the rand
crate, before ultimately giving up on rand
altogether.
Resmack currently uses
the xoshiro128** (star star)
algorithm,
that I directly ported into Rust. I found it gave a good tradeoff between speed
and having a good distribution of values.
Since I’ve been comparing resmack’s performance to fzero, I also wanted to
compare resmack’s Xoshiro128** PRNG implementation with the one fzero uses.
The chart below includes my Xoshiro128**
rust implementation, fzero’s PRNG,
and the other rand
-compatible implementations that I tested.
These graphs can be generated using cargo bench --bench prngs -- --verbose
.
The generated image is located at ./target/criterion/prngs/report/violin.svg
I was originally going to include OsRng
in the benchmark, but it was ~600x
slower than Xoshiro128** and seriously skewed the graph:
Full table of PRNG benchmarks:
Implementation | Slope | Mean | Median | Outliers |
---|---|---|---|---|
Xoshiro128** |
1.8877 ns | 1.8903 ns | 1.8803 ns | 18/100 - 6 high mild, 12 high severe |
FZero |
2.6668 ns | 2.6690 ns | 2.6631 ns | 14/100 - 5 high mild, 9 high severe |
XorShiftRng |
2.8246 ns | 2.8316 ns | 2.8223 ns | 14/100 - 1 high mild, 13 high severe |
rand_xorshift |
2.8426 ns | 2.8439 ns | 2.8235 ns | 18/100 - 5 high mild, 13 high severe |
rand::Hc128 |
5.2896 ns | 5.3053 ns | 5.2824 ns | 16/100 - 3 high mild, 13 high severe |
rand::IsaacRng |
7.4043 ns | 7.4208 ns | 7.4029 ns | 13/100 - 4 high mild, 9 high severe |
rand::ChaCha |
7.9252 ns | 7.9373 ns | 7.9167 ns | 13/100 - 6 high mild, 7 high severe |
rand::StdRng |
7.9738 ns | 7.9932 ns | 7.9627 ns | 15/100 - 6 high mild, 9 high severe |
rand::OsRng |
616.79 ns | 617.63 ns | 616.51 ns | 9/100 - 2 high mild, 7 high severe |