A Fuzzing Lesson (AFL) April 13, 2020

Do not be misled by the title; AFL does not stand for A Fuzzing Lesson. It stands for American Fuzzy Lop: one of the cutest animals out there. You might wonder what this rabbit is doing here since this is not a biology class. In fact, the state-of-the-art fuzzer that you are going to learn about is named after this cute rabbit.

AFL is a security-oriented fuzzer written in C that combines compile-time instrumentation with a genetic algorithm to discover crash-inducing inputs. In contrast to other fuzzers, AFL’s focus is practicality: low-performance overhead, very few configuration options, and handling real-world programs.

Design Goals of AFL

We start out by discussing the design goals of AFL which make it widely usable.

1. AFL is Fast. AFL uses a combination of tricks that make it faster compared to other fuzzers. First, it takes advantage of low-level instrumentation which imposes negligible overhead. Second, it prioritizes mutation strategies that lead to discovering more paths. Finally, it re-evaluates the queue of generated inputs on specific intervals using a fast algorithm that selects a smaller subset of test cases that still cover every exercised tuple of the form (branch_source, branch_destination) so far. Every queue entry is assigned a score proportional to its execution latency and file size. Then, the candidate with the lowest score is selected for each tuple.

2. AFL Focuses on Code Coverage. Coverage-guided fuzzing (aka grey-box fuzzing) tries to maximize the code coverage of a program such that more and more code branches are exercised. AFL instrumentation captures branch coverage and hit counts. It maintains an array shared_mem which is a 64 KB region passed to the instrumented binary. It keeps the information regarding the branch source and branch destination for each branch as tuples. So, it distinguishes between the following execution paths and tries to discover new paths in each iteration. For example, it can distinguish between these execution paths:

(1) A --> B --> C --> D
(2) A --> B --> D --> C

since the recorded tuples are different:

(1) (A,B) (B,C) (C,D)
(2) (A,B) (B,D) (D,C)

3. AFL is Easy to Use. AFL is designed to be highly practical. Compared to other fuzzers, AFL requires virtually no configuration and fine-tuning. You can jump to the end of this article for a quick guide describing how to run AFL. Thereafter, you will be set to start using AFL to hunt bugs in programs!

4. AFL Can be Chained to Other Tools. Although AFL can be used without additional options, one can use additional tools to enhance the effectiveness of AFL and find bugs that might go unnoticed when using the vanilla fuzzer. For instance, AddressSanitizer can be added as a compiler option to enable detecting invalid memory accesses.

5. AFL Can Minimize Generated Inputs. AFL provides a tool, afl-tmin and afl-cmin, for test case minimization. It takes an input file and tries to trim as much data as possible while producing the same crashing state or instrumentation output. In other words, it minimizes the input without altering the execution path. This is especially useful when reporting or investigating a crash.

Kinds of Input Mutations

AFL is based on a set of mutation strategies that are shown to utilize CPU efficiently and generate interesting test cases. Here are some of the main mutation strategies for generating new inputs that make AFL sophisticated:

1. Bit Flips: This simple strategy involves sequential, ordered flips for 1-4 bits. The observed results demonstrate 10 to 70 new additional paths per million inputs generated.

               
AFL Rocks!   ==>    AFL Rhcks!
              3/div>

Although the number of the additional paths seems small, since AFL usually explores around 1000 inputs per second per core, such paths appear fairly quickly. For instance, on an 8-core machine, it takes 125 seconds to explore a million inputs. Moreover, discovering one unseen path can lead to discovering more unseen paths, as AFL uses the newly discovered paths as feedback to quickly exercise unseen branches.

2. Byte Flips: A natural extension to the previous strategy is flipping bytes. This strategy involves flipping 8-, 16-, or 32-bit wide bitflip. Nearly 30 additional paths are discovered per million generated inputs. Note that this strategy is much cheaper compared to flipping bits due to the underlying hardware support for flipping bytes. However, it limits the opportunities to find additional paths.

               C   >
AFL Rocks!   ==>    AFL Rhcks!
              Pv>

3. Simple Arithmetic: For triggering more complex conditions deterministically, AFL increments or decrements constant integers by 1-35. The effectiveness of this strategy varies depending on the input format. For instance, in JPEG, it helps discover around 2 additional paths per million inputs.

4. Known Integers: Another deterministic strategy in AFL involves replacing the numbers in the input file with a hardcoded set of integer numbers that are likely to trigger edge, crashing cases. Examples of such integers include 0, -1, MAX_INT, and MIN_INT-1. This strategy leads to discovering 2-5 additional paths per one million generated inputs.

char *t                       ==>     char *t
= malloc(128*sizeof(char));           = malloc(MAX_INT*sizeof(char));

5. Stacked Tweaks: This strategy randomly stacks some (or all) of the following operations and applies them: single bit-flip and byte-flip, block duplication, block deletion, etc. The number of operations that are stacked is randomly chosen as a power-of-two between 1 and 64, and the block size for block operations is usually 1 KB.

6. Splicing: In contrast to previous strategies, this one relies on two distinct inputs. Then, they are spliced in a random location. This strategy sometimes discovers 20% additional paths.

Real Bugs Found by AFL

AFL has been able to find many interesting bugs to date, around 500 of which have been security vulnerabilities, and [have been officially reported](https://github.com/mrash/afl-cve). Here is a sampling of the kinds of bugs together with links to the corresponding vulnerabilities found by AFL:

Assertion violation [example: CVE-2016-9399]
Buffer overflow [example: CVE-2015-8126]
Integer overflow [example: CVE-2017-6839]
Null dereference [example: CVE-2017-7475]
Infinite loop [example: CVE-2017-5852]

Quick Guide to Run AFL

Part 1: Running the Fuzzer

Follow the step-by-step instructions to install and run AFL.

1. Install AFL:

$ apt-get install -y afl

2. Download fuzzgoat source code, extract it, and build it with afl-clang. This step will inject instrumentations into the target program.

$ git clone https://github.com/fuzzstati0n/fuzzgoat
$ cd fuzzgoat
$ export CC=afl-clang
$ make -j8

3. Next, create an input folder where the seed file will reside, and an output folder where the generated crashing inputs will go.

$ mkdir input output

4. We are going to use a valid ELF file which is already on the system as the seed input.

$ cp /bin/ps input/

5. Finally, run the fuzzer:

$ afl-fuzz -i input -o output -- ./fuzzgoat @@

You will immediately see the following GUI starting in the terminal. After a few seconds, you should see some crashing inputs. You can find them under the output/crashes folder. You can find more details about the status screen in AFL’s documentation.

Part 2: Minimizing the Crashing Inputs

Next, we are going to try out afl-tmin that performs test minimizing. This tool minimizes the test case to the bare minimum that results in the same execution path. It iterates over the actual bytes in the test case and removes smaller and smaller blocks of bytes. You can observe this tool in action using the following command.

$ afl-tmin -i output/crashes/[crash-file] -o test.min -- ./fuzzgoat