diff mbox

[14/14] fuzz/x86_emulate: Add an option to limit the number of instructions executed

Message ID 20170825164343.29015-14-george.dunlap@citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

George Dunlap Aug. 25, 2017, 4:43 p.m. UTC
AFL considers a testcase to be a useful addition not only if there are
tuples exercised by that testcase which were not exercised otherwise,
but also if the *number* of times an individual tuple is exercised
changes significantly; in particular, if the number of the highes bit
changes (i.e., if it is run 1, 2-3, 4-7, 8-15, &c).

Unfortunately, one simple way to increase these stats it to execute
the same (or similar) instructions multiple times.  Such long
testcases take exponentially longer to fuzz: the fuzzer spends more
time flipping bits looking for meaningful changes, and each execution
takes longer because it is doing more things.  So long paths which add
nothing to the actual code coverage but effectively "distract" the
fuzzer, making it less effective.

Experiments have shown that not allowing infinite number of
instruction retries for the old (non-compact) format does indeed speed
up and increase code coverage.  However, it has also shown that on the
new, more compact format, having no instruction limit causes the highest
throughput in code coverage.

So leave the option in, but have it default to 0 (no limit).

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c | 9 ++++++++-
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c   | 7 ++++++-
 2 files changed, 14 insertions(+), 2 deletions(-)

Comments

Wei Liu Sept. 19, 2017, 10:05 a.m. UTC | #1
On Fri, Sep 15, 2017 at 02:55:03PM +0100, George Dunlap wrote:
> On 09/15/2017 02:38 PM, Wei Liu wrote:
> > On Fri, Aug 25, 2017 at 05:43:43PM +0100, George Dunlap wrote:
> >> AFL considers a testcase to be a useful addition not only if there are
> >> tuples exercised by that testcase which were not exercised otherwise,
> >> but also if the *number* of times an individual tuple is exercised
> >> changes significantly; in particular, if the number of the highes bit
> >> changes (i.e., if it is run 1, 2-3, 4-7, 8-15, &c).
> >>
> >> Unfortunately, one simple way to increase these stats it to execute
> >> the same (or similar) instructions multiple times.  Such long
> >> testcases take exponentially longer to fuzz: the fuzzer spends more
> >> time flipping bits looking for meaningful changes, and each execution
> >> takes longer because it is doing more things.  So long paths which add
> >> nothing to the actual code coverage but effectively "distract" the
> >> fuzzer, making it less effective.
> >>
> >> Experiments have shown that not allowing infinite number of
> >> instruction retries for the old (non-compact) format does indeed speed
> >> up and increase code coverage.  However, it has also shown that on the
> >> new, more compact format, having no instruction limit causes the highest
> >> throughput in code coverage.
> >>
> >> So leave the option in, but have it default to 0 (no limit).
> > 
> > How does limiting the number of loops help afl produce better input?
> > Wouldn't afl still try to flip bits beyond the limit (say, the >=n+1
> > instructions when the limit is n)? I assume it will give up at some
> > point, but when?
> 
> * Limiting the number of loops means longer testcases don't look
> different than shorter testcases
> 
> * If a testcase doesn't look different than existing testcases, then AFL
> will discard it rather than adding it to the queue
> 
> * Larger "queue" testcases exhibit a quadratic effect for search time:
> They are longer to fuzz (since they're bigger) and each testcase is
> longer to run.
> 
> * So limiting the number of loops causes the testcases in the queue to
> be shorter, which in turn decreases the time to fuzz and execute test
> cases in the queue, which causes higher throughput.
> 
> > I guess my point is having a limit but doesn't tell afl about it seems
> > a bit sub-optimal to me. I'm not quite sure if I understand correctly
> > how afl works though.
> 
> Well I think the key thing is to look at the graph.  Either 1) there's
> something wrong with my methodology, or 2) for the original layout,
> limiting the number of testcases helps improve AFL's performance.
> 

I was just curious how it got that behaviour, either way I don't think I
would care that much because they are all internal implementation
details which are subject to change at any time.
diff mbox

Patch

diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index 86c1241784..5cc6ba39ff 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -15,6 +15,7 @@  static uint8_t input[INPUT_SIZE];
 
 extern bool opt_compact;
 extern bool opt_rerun;
+extern int opt_instruction_limit;
 
 int main(int argc, char **argv)
 {
@@ -34,11 +35,13 @@  int main(int argc, char **argv)
             OPT_MIN_SIZE,
             OPT_COMPACT,
             OPT_RERUN,
+            OPT_INSTRUCTION_LIMIT,
         };
         static const struct option lopts[] = {
             { "min-input-size", no_argument, NULL, OPT_MIN_SIZE },
             { "compact", required_argument, NULL, OPT_COMPACT },
             { "rerun", no_argument, NULL, OPT_RERUN },
+            { "instruction-limit", required_argument, NULL, OPT_INSTRUCTION_LIMIT },
             { 0, 0, 0, 0 }
         };
         int c = getopt_long_only(argc, argv, "", lopts, NULL);
@@ -61,8 +64,12 @@  int main(int argc, char **argv)
             opt_rerun = true;
             break;
 
+        case OPT_INSTRUCTION_LIMIT:
+            opt_instruction_limit = atoi(optarg);
+            break;
+
         case '?':
-            printf("Usage: %s [--compact=0|1] [--rerun] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
+            printf("Usage: %s [--compact=0|1] [--rerun] [--instruction-limit=N] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
             exit(-1);
             break;
 
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 7a07e7e37a..46c382db11 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -960,10 +960,13 @@  void setup_fuzz_state(struct fuzz_state *state, const uint8_t *data_p, size_t si
     state->data_num = size;
 }
 
+int opt_instruction_limit = 0;
+
 int runtest(struct fuzz_state *state) {
     int rc;
 
     struct x86_emulate_ctxt *ctxt = &state->ctxt;
+    int icount = 0;
     
     state->ops = all_fuzzer_ops;
 
@@ -988,7 +991,9 @@  int runtest(struct fuzz_state *state) {
 
         rc = x86_emulate(ctxt, &state->ops);
         printf("Emulation result: %d\n", rc);
-    } while ( rc == X86EMUL_OKAY );
+    } while ( rc == X86EMUL_OKAY &&
+              (!opt_instruction_limit ||
+               (++icount < opt_instruction_limit)) );
 
     save_fpu_state(state->fxsave);