Resmack: Part 1: Grammar Fuzzing Thoughts

Note: This post is part of a series on developing resmack. See the resmack tag for all posts in the series.

Part 2 Full Fuzzer Detour →

I am working on a new grammar fuzzer, resmack-rust (renamed to resmack-rust to free up the name resmack) that is intended to be fast, handle dynamic expressions, and easy to use. This post contains a few of my current thoughts from resmack’s development.

Python Is Slow

A large part of wanting to improve on my other grammar fuzzer, gramfuzz, was due to massive, but expected, disappointment in not being able to use my python-based tools as part of a fast, coverage-based fuzzer. I always knew Python’s speed wouldn’t be able to stack up against something written in C/Rust/*, but it was much worse than I thought, even with all of the tricks and optimizations that I used to boost pfp’s data generation speed. See the performance section of a talk I gave about using pfp with libFuzzer:

Method Exec/s Chart
libFuzzer + pfp ~8k/s ▓▓▓▓▓▓▓▓
libFuzzer ~75k/s ████████████████████████████████████████

Rust is Fast

Rust had been on my to-do list for far too long. I had attempted a few times to pick it up, but didn’t generate the velocity needed to break through the “wow this is different/hard” learning curve. I had been itching to have another go at Rust.

I have also been enjoying Brandon Falk’s Twitter discussions and live-streams of the work he is doing. Reading through his fzero fuzzer grammar fuzzer project is what reminded me of my solution for gracefully handling recursion in gramfuzz.

Brandon’s fzero grammar fuzzer is written in Rust, and was created in response to f1 fuzzer. It is very fast. I use his fuzzer as a reference point for resmack as I develop it. I run my fuzzer and fzero on the same grammar, with both reporting MiB/s of generated data.

Below is a trivial grammar that I run resmack with:

let rules = rules
    .add_rule("Special", "SPECIAL ONE")
    .add_rule("RefdRule", or!("Hello", "Blah", reff!("Special")))
    .add_rule("TestRule", and!(reff!("RefdRule"), "World"))
    .add_rule("TestRule2", and!(reff!("TestRule"), "World"))
    .add_rule("TestRule2", and!(reff!("TestRule"), "World"))
    .add_rule("TestRule2", int!(min = 5, max = 1337))
    .add_rule("TestRule2", and!(or!(1, 2, 3, 4, 5, string!(min = 5, max = 10, charset = "abcdefg"))))
    .add_rule("TestRule2", and!(1000.5))
    .add_rule("TestRule2", "---World");

And below is a similar grammar that I use with fzero:

{
    "<start>": [["<test>"]],
    "<test>": [["<TestRule2>"]],
    "<int>": [["<digit>"], ["<onenine>", "<digits>"], ["-", "<digits>"],
              ["-", "<onenine>", "<digits>"]],
    "<characters>": [["<character-1>"]],
    "<character>": [["0"], ["1"], ["2"], ["3"], ["4"], ["5"], ["6"], ["7"],
                    ["8"], ["9"], ["a"], ["b"], ["c"], ["d"], ["e"], ["f"],
                    ["g"], ["h"], ["i"], ["j"], ["k"], ["l"], ["m"], ["n"],
                    ["o"], ["p"], ["q"], ["r"], ["s"], ["t"], ["u"], ["v"],
                    ["w"], ["x"], ["y"], ["z"], ["A"], ["B"], ["C"], ["D"],
                    ["E"], ["F"], ["G"], ["H"], ["I"], ["J"], ["K"], ["L"],
                    ["M"], ["N"], ["O"], ["P"], ["Q"], ["R"], ["S"], ["T"],
                    ["U"], ["V"], ["W"], ["X"], ["Y"], ["Z"], ["!"], ["#"],
                    ["$"], ["%"], ["&"], ["\""], ["("], [")"], ["*"], ["+"],
                    [","], ["-"], ["."], ["/"], [":"], [";"], ["<"], ["="],
                    [">"], ["?"], ["@"], ["["], ["]"], ["^"], ["_"], ["`"],
                    ["{"], ["|"], ["}"], ["~"], [" "]],
    "<Special>": [["SPECIAL ONE"]],
    "<RefdRule>": [["Hello"], ["Blah"], ["<Special>"]],
    "<TestRule>": [["<RefdRule>", "World"]],
    "<TestRule2>": [["<TestRule2_1>"], ["<TestRule2_2>"], ["<TestRule2_3>"], ["<TestRule2_4>"], ["<TestRule_5>"], ["<TestRule_6>"]],
    "<TestRule2_1>": [["<TestRule>", "World"]],
    "<TestRule2_2>": [["<TestRule>", "World"]],
    "<TestRule2_3>": [["<int>"]],
    "<TestRule2_4>": [["1"], ["2"], ["3"], ["4"], ["5"], ["<string>"]],
    "<TestRule2_5>": [["1000.5"]],
    "<TestRule2_6>": [["---World"]]
}

and just for kicks, here’s the same grammar implemented in gramfuzz:

Def("Special", "SPECIAL ONE")
Def("RefdRule", Or("Hello", "Blah", Ref("Special")))
Def("TestRule", Ref("RefdRule"), "World")
Def("TestRule2", Ref("TestRule"), "World")
Def("TestRule2", Ref("TestRule"), "World")
Def("TestRule2", Or(1, 2, 3, 4, 5, String(min=0, max=10, charset="abcdefg")))
Def("TestRule2", 1000.5)
Def("TestRule2", "---World")

Below are the throughput numbers that I am currently getting with each of the grammars above:

Fuzzer MiB/s SAFE_ONLY=false
resmack 89 MiB/s 92 MiB/s
fzero 352 MiB/s 515 MiB/s
gramfuzz 0.71 MiB/s N/A

fzero is about four times faster when using safe code. The biggest part of what makes fzero so fast is that it generates rust code that is then compiled into a standalone fuzzing binary, with all rules and rule values being static:

fn fragment_59(&mut self, depth: usize) {
    if depth >= 20 { return; }
    match self.rand() % 3 {
        0 => self.fragment_16(depth + 1),
        1 => self.fragment_18(depth + 1),
        2 => self.fragment_20(depth + 1),
        _ => unreachable!(),
    }
}
fn fragment_16(&mut self, depth: usize) {
    if depth >= 20 { return; }
    self.buf.extend(&[72, 101, 108, 108, 111]); // Hello
}

This allows massive performance gains to be had by taking advantage of compiler optimizations and having static rule values. I want resmack to include dynamic rules, where one rule can reference another, local, named rule to generate a valid checksum. I could probably implement resmack in a similar way that fzero is implemented (generate rust code that is then compiled), but right now it’s not on the roadmap. Perhaps it should be though. My first thought is to use macros to define dynamic rule implementations that override the build() function. Maybe it would work.

Note on SAFE_ONLY in resmack/fzero

fzero has a flag SAFE_ONLY that causes the generated rust code to use an unsafe code block to copy chosen values into the output Vec<u8> instead of using buf.extend(...):

unsafe {
    let old_size = output.len();
    let new_size = old_size + item.len();

    if new_size > output.capacity() {
        output.reserve(new_size - old_size);
    }

    std::ptr::copy_nonoverlapping(
        item.as_ptr(),
        output.as_mut_ptr().offset(old_size as isize),
        item.len(),
    );
    output.set_len(new_size);
}

The actual code generated by fzero has all item.len() and item references replaced with actual, inline values:

unsafe {
    let old_size = self.buf.len();
    let new_size = old_size + 5;

    if new_size > self.buf.capacity() {
        self.buf.reserve(new_size - old_size);
    }

    std::ptr::copy_nonoverlapping(
        [72, 101, 108, 108, 111].as_ptr(), // Hello
        self.buf.as_mut_ptr().offset(old_size as isize),
        5,
    );
    self.buf.set_len(new_size);
}

~Doubling performance is not seen when using the same unsafe copy in resmack.

Getting Rid of Rule Categories

When I wrote gramfuzz I didn’t want to have to specify a top-level rule to encapsulate all sub rules, such as <expression> or <start>. I added the concept of “categories” that I could bucket sets of rules into, with the intention of being able to tell the fuzzer to generate random rules from category X.

I’ve gotten rid of this concept - it’s the same as specifying the top-level rule that I was against for some reason, and it adds one more level of indirection that isn’t needed.

PRNG Performance

Resmack started off using Rust’s rand crate. I quickly learned that the rand crate is slow, especially after trying a few other PRNG implementations.

I bounced around a few times between different PRNG implementations that worked with the rand crate, before ultimately giving up on rand altogether.

Resmack currently uses the xoshiro128** (star star) algorithm, that I directly ported into Rust. I found it gave a good tradeoff between speed and having a good distribution of values.

Since I’ve been comparing resmack’s performance to fzero, I also wanted to compare resmack’s Xoshiro128** PRNG implementation with the one fzero uses. The chart below includes my Xoshiro128** rust implementation, fzero’s PRNG, and the other rand-compatible implementations that I tested.

image

These graphs can be generated using cargo bench --bench prngs -- --verbose. The generated image is located at ./target/criterion/prngs/report/violin.svg

I was originally going to include OsRng in the benchmark, but it was ~600x slower than Xoshiro128** and seriously skewed the graph:

image

Full table of PRNG benchmarks:

Implementation Slope Mean Median Outliers
Xoshiro128** 1.8877 ns 1.8903 ns 1.8803 ns 18/100 - 6 high mild, 12 high severe
FZero 2.6668 ns 2.6690 ns 2.6631 ns 14/100 - 5 high mild, 9 high severe
XorShiftRng 2.8246 ns 2.8316 ns 2.8223 ns 14/100 - 1 high mild, 13 high severe
rand_xorshift 2.8426 ns 2.8439 ns 2.8235 ns 18/100 - 5 high mild, 13 high severe
rand::Hc128 5.2896 ns 5.3053 ns 5.2824 ns 16/100 - 3 high mild, 13 high severe
rand::IsaacRng 7.4043 ns 7.4208 ns 7.4029 ns 13/100 - 4 high mild, 9 high severe
rand::ChaCha 7.9252 ns 7.9373 ns 7.9167 ns 13/100 - 6 high mild, 7 high severe
rand::StdRng 7.9738 ns 7.9932 ns 7.9627 ns 15/100 - 6 high mild, 9 high severe
rand::OsRng 616.79 ns 617.63 ns 616.51 ns 9/100 - 2 high mild, 7 high severe
comments powered by Disqus