• Input length exploration. A programs may explore cer-
tain states only when the length of the input exceeds
some threshold, but neither symbolic execution nor
gradient descent can tell the fuzzer when to increase the
length of the input. Angora detects when the length of
the input may affect a path constraint and then increases
the input length adequately (Section 3.6).
Angora outperformed state-of-the-art fuzzers substan-
tially. Table 1 compares the bugs found by Angora with
other fuzzers on the LAVA-M data set [9]. Angora found
more bugs in each program in the data set. Particularly,
in who Angora found 1541 bugs, which is eight times as
many bugs as found by the second-best fuzzer, Steelix.
Moreover, Angora found 103 bugs that the LAVA authors
injected but could not trigger. We also tested Angora on
eight popular, mature open source programs. Angora found
6, 52, 29, 40 and 48 new bugs in file, jhead, nm, objdump
and size, respectively (Table 5). We measured the coverage
of Angora and evaluated how its key techniques contribute
to its impressive performance.
2. Background: American Fuzzy Lop (AFL)
Fuzzing is an automated testing technique to find
bugs. American Fuzzy Lop (AFL) [1] is a state-of-the-
art mutation-based graybox fuzzer. AFL employs light-
weight compile-time instrumentation and genetic algorithms
to automatically discover test cases that likely trigger new
internal states in the targeted program. As a coverage-based
fuzzer, AFL generates inputs to traverse different paths in
the program to trigger bugs.
2.1. Branch coverage
AFL measures a path by a set of branches. During each
run, AFL counts how many times each branch executes.
It represents a branch as a tuple (l
prev
, l
cur
), where l
prev
and l
cur
are the IDs of the basic blocks before and after
the conditional statement, respectively. AFL gets the branch
coverage information by using lightweight instrumentation.
The instrumentation is injected at each branch point at
compile time. For each run, AFL allocates a path trace table
to count how many times each branch of every conditional
statement executes. The index to the table is the hash of a
branch, h(l
prev
, l
cur
), where h is a hash function.
AFL also keeps a global branch coverage table across
different runs. Each entry contains an 8-bit vector that
records how many times the branch executes in different
runs. Each bit in this vector b represents a range: b
0
, . . . , b
7
represent the ranges [1], [2], [3], [4, 7], [8, 15], [16, 31],
[32, 127], [128, ∞), respectively. For example, if b
3
is set,
then it indicates that there exists a run where this branch
executed between 4 and 7 times, inclusively.
AFL compares the path trace table and branch coverage
table to determine, heuristically, whether a new input trig-
gers a new internal state of the program. An input triggers
a new internal state if either of the following happens:
• The program executes a new branch, i.e., the path
trace table has an entry for this branch but the branch
coverage table has no entry for this branch.
• There exists a branch where the number of times, n,
this branch executed in the current run is different from
any previous runs. AFL determines this approximately
by examining whether the bit representing the range of
n was set in the corresponding bit vector in the branch
coverage table.
2.2. Mutation strategies
AFL applies the following mutations on the input ran-
domly [3].
• Bit or byte flips.
• Attempts to set “interesting” bytes, words, or dwords.
• Addition or subtraction of small integers to bytes,
words, or dwords.
• Completely random single-byte sets.
• Block deletion, block duplication via overwrite or in-
sertion, or block memset.
• Splice two distinct input files at a random location.
3. Design
3.1. Overview
AFL and other similar fuzzers use branch coverage as
the metric. However, they fail to consider the call context
when calculating branch coverage. Our experience shows
that without context, branch coverage would fail to explore
program states adequately. Therefore, we propose context-
sensitive branch coverage as the metric of coverage (Sec-
tion 3.2).
Algorithm 1 shows Angora’s two stages: instrumentation
and the fuzzing loop. During each iteration of the fuzzing
loop, Angora selects an unexplored branch and searches
for an input that explores this branch. We introduce the
following key techniques to find the input efficiently.
• For most conditional statements, its predicate is influ-
enced by only a few bytes in the input, so it would
be unproductive to mutate the entire input. Therefore,
when exploring a branch, Angora determines which
input bytes flow into the corresponding predicate and
focuses on mutating these bytes only (Section 3.3).
• After determining which input bytes to mutate, Angora
needs to decide how to mutate them. Using random
or heuristics-based mutations is unlikely to find sat-
isfactory values efficiently. Instead, we view the path
constraint on a branch as a constraint on a blackbox
function over the input, and we adapt the gradient de-
scent algorithm for solving the constraint (Section 3.4).
• During gradient descent, we evaluate the blackbox
function over its arguments, where some arguments
consist of multiple bytes. For example, when four
consecutive bytes in the input that are always used
together as an integer flow into a conditional statement,