[v3,06/12] fuzz/x86_emulate: Take multiple test files for inputs

Message ID 20171010162011.9629-6-george.dunlap@citrix.com (mailing list archive)
State New, archived

Commit Message

George Dunlap Oct. 10, 2017, 4:20 p.m. UTC
Finding aggregate coverage for a set of test files means running each
afl-generated test case through the harness.  At the moment, this is
done by re-executing afl-harness-cov with each input file.  When a
large number of test cases have been generated, this can take a
significant amount of time; a recent test with 30k total files
generated by 4 parallel fuzzers took over 7 minutes.

The vast majority of this time is taken up with 'exec', however.
Since the harness is already designed to loop over multiple inputs for
llvm "persistent mode", just allow it to take a large number of inputs
on the same command line when *not* running in llvm "persistent mode".
Then the command can be efficiently executed like this:

  ls */queue/id* | xargs $path/afl-harness-cov

For the above-mentioned test on 30k files, the time to generate
coverage data was reduced from 7 minutes to under 30 seconds.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v3:
- Combine some variable declarations
- Make sure that count is set only once no matter how it's compiled
v2:
- Make check for batch processing more clear

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/README.afl                             |  7 +++++++
 tools/fuzz/x86_instruction_emulator/afl-harness.c | 25 +++++++++++++++--------
 2 files changed, 24 insertions(+), 8 deletions(-)

Comments

Andrew Cooper Oct. 10, 2017, 4:56 p.m. UTC | #1
On 10/10/17 17:20, George Dunlap wrote:
> @@ -65,12 +68,15 @@ int main(int argc, char **argv)
>  #ifdef __AFL_HAVE_MANUAL_CONTROL
>      __AFL_INIT();
>  
> -    while ( __AFL_LOOP(1000) )
> +    for( count = 0; __AFL_LOOP(1000); )
> +#else
> +    for( count = 0; count < max; count++ )
>  #endif
>      {
>          if ( fp != stdin ) /* If not using stdin, open the provided file. */
>          {
> -            fp = fopen(argv[optind], "rb");
> +            printf("Opening file %s\n", argv[optind]);
> +            fp = fopen(argv[optind + count], "rb");

I presume the printf() wants adjusting to match the fopen() ?

~Andrew

>              if ( fp == NULL )
>              {
>                  perror("fopen");
> @@ -100,7 +106,10 @@ int main(int argc, char **argv)
>          if ( !feof(fp) )
>          {
>              printf("Input too large\n");
> -            exit(-1);
> +            /* Don't exit if we're doing batch processing */
> +            if ( max == 1 )
> +                exit(-1);
> +            continue;
>          }
>  
>          if ( fp != stdin )
George Dunlap Oct. 10, 2017, 4:58 p.m. UTC | #2
On 10/10/2017 05:56 PM, Andrew Cooper wrote:
> On 10/10/17 17:20, George Dunlap wrote:
>> @@ -65,12 +68,15 @@ int main(int argc, char **argv)
>>  #ifdef __AFL_HAVE_MANUAL_CONTROL
>>      __AFL_INIT();
>>  
>> -    while ( __AFL_LOOP(1000) )
>> +    for( count = 0; __AFL_LOOP(1000); )
>> +#else
>> +    for( count = 0; count < max; count++ )
>>  #endif
>>      {
>>          if ( fp != stdin ) /* If not using stdin, open the provided file. */
>>          {
>> -            fp = fopen(argv[optind], "rb");
>> +            printf("Opening file %s\n", argv[optind]);
>> +            fp = fopen(argv[optind + count], "rb");
> 
> I presume the printf() wants adjusting to match the fopen() ?

Oh!  I thought I'd fixed that.  Indeed it does.

I can fix that on check-in, if we don't find anything bigger worth
re-sending for.

 -George
Andrew Cooper Oct. 10, 2017, 5:56 p.m. UTC | #3
On 10/10/17 17:58, George Dunlap wrote:
> On 10/10/2017 05:56 PM, Andrew Cooper wrote:
>> On 10/10/17 17:20, George Dunlap wrote:
>>> @@ -65,12 +68,15 @@ int main(int argc, char **argv)
>>>  #ifdef __AFL_HAVE_MANUAL_CONTROL
>>>      __AFL_INIT();
>>>  
>>> -    while ( __AFL_LOOP(1000) )
>>> +    for( count = 0; __AFL_LOOP(1000); )
>>> +#else
>>> +    for( count = 0; count < max; count++ )
>>>  #endif
>>>      {
>>>          if ( fp != stdin ) /* If not using stdin, open the provided file. */
>>>          {
>>> -            fp = fopen(argv[optind], "rb");
>>> +            printf("Opening file %s\n", argv[optind]);
>>> +            fp = fopen(argv[optind + count], "rb");
>> I presume the printf() wants adjusting to match the fopen() ?
> Oh!  I thought I'd fixed that.  Indeed it does.
>
> I can fix that on check-in, if we don't find anything bigger worth
> re-sending for.

I can't see anything else needing fixing.  Acked-by: Andrew Cooper
<andrew.cooper3@citrix.com>

~Andrew
Patch

diff --git a/tools/fuzz/README.afl b/tools/fuzz/README.afl
index 8b58b8cdea..a59564985a 100644
--- a/tools/fuzz/README.afl
+++ b/tools/fuzz/README.afl
@@ -49,6 +49,13 @@  generate coverage data.  To do this, use the target `afl-cov`:
 
     $ make afl-cov #produces afl-harness-cov
 
+In order to speed up the process of checking total coverage,
+`afl-harness-cov` can take several test inputs on its command-line;
+the speed-up effect should be similar to that of using afl-clang-fast.
+You can use xargs to do this most efficiently, like so:
+
+    $ ls queue/id* | xargs $path/afl-harness-cov
+
 NOTE: Please also note that the coverage instrumentation hard-codes
 the absolute path for the instrumentation read and write files in the
 binary; so coverage data will always show up in the build directory no
diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index 57b4542556..26b710cb3f 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -16,6 +16,7 @@  int main(int argc, char **argv)
 {
     size_t size;
     FILE *fp = NULL;
+    int max, count;
 
     setbuf(stdin, NULL);
     setbuf(stdout, NULL);
@@ -42,8 +43,7 @@  int main(int argc, char **argv)
             break;
 
         case '?':
-        usage:
-            printf("Usage: %s $FILE | [--min-input-size]\n", argv[0]);
+            printf("Usage: %s $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
             exit(-1);
             break;
 
@@ -54,10 +54,13 @@  int main(int argc, char **argv)
         }
     }
 
-    if ( optind == argc ) /* No positional parameters.  Use stdin. */
+    max = argc - optind;
+
+    if ( !max ) /* No positional parameters.  Use stdin. */
+    {
+        max = 1;
         fp = stdin;
-    else if ( optind != (argc - 1) )
-        goto usage;
+    }
 
     if ( LLVMFuzzerInitialize(&argc, &argv) )
         exit(-1);
@@ -65,12 +68,15 @@  int main(int argc, char **argv)
 #ifdef __AFL_HAVE_MANUAL_CONTROL
     __AFL_INIT();
 
-    while ( __AFL_LOOP(1000) )
+    for( count = 0; __AFL_LOOP(1000); )
+#else
+    for( count = 0; count < max; count++ )
 #endif
     {
         if ( fp != stdin ) /* If not using stdin, open the provided file. */
         {
-            fp = fopen(argv[optind], "rb");
+            printf("Opening file %s\n", argv[optind]);
+            fp = fopen(argv[optind + count], "rb");
             if ( fp == NULL )
             {
                 perror("fopen");
@@ -100,7 +106,10 @@  int main(int argc, char **argv)
         if ( !feof(fp) )
         {
             printf("Input too large\n");
-            exit(-1);
+            /* Don't exit if we're doing batch processing */
+            if ( max == 1 )
+                exit(-1);
+            continue;
         }
 
         if ( fp != stdin )