Message ID | 20170925142648.25959-8-george.dunlap@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote: > --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c > +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c > @@ -16,6 +16,8 @@ int main(int argc, char **argv) > { > size_t size; > FILE *fp = NULL; > + int count = 0; > + int max; Generally speaking these should be unsigned int, but I see how this collides with the types of the variables max is being calculated from. In any event both could go on a single line. > @@ -66,11 +70,14 @@ int main(int argc, char **argv) > __AFL_INIT(); > > while ( __AFL_LOOP(1000) ) > +#else > + for( count = 0; count < max; count++ ) Initially I've thought the initializer on count was pointless further up because of the re-initialization here. Of course that's needed because of the #if/#else this sits in. Hence I wonder whether omitting the assignment here wouldn't be appropriate - it wouldn't really be wromng for a compiler to warn about this redundancy. Either way Acked-by: Jan Beulich <jbeulich@suse.com> Jan
On 10/04/2017 09:24 AM, Jan Beulich wrote: >>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote: >> --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c >> +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c >> @@ -16,6 +16,8 @@ int main(int argc, char **argv) >> { >> size_t size; >> FILE *fp = NULL; >> + int count = 0; >> + int max; > > Generally speaking these should be unsigned int, but I see how this > collides with the types of the variables max is being calculated from. > In any event both could go on a single line. Ack > >> @@ -66,11 +70,14 @@ int main(int argc, char **argv) >> __AFL_INIT(); >> >> while ( __AFL_LOOP(1000) ) >> +#else >> + for( count = 0; count < max; count++ ) > > Initially I've thought the initializer on count was pointless further > up because of the re-initialization here. Of course that's needed > because of the #if/#else this sits in. Hence I wonder whether omitting > the assignment here wouldn't be appropriate - it wouldn't really be > wromng for a compiler to warn about this redundancy. Could do that I suppose. The other option would be to change the other loop to `for ( count = 0; __AFL_LOOP(1000); )` Let me know if you have any preference. > Either way > Acked-by: Jan Beulich <jbeulich@suse.com> Thanks, -George
>>> On 04.10.17 at 18:58, <george.dunlap@citrix.com> wrote: > On 10/04/2017 09:24 AM, Jan Beulich wrote: >>>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote: >>> @@ -66,11 +70,14 @@ int main(int argc, char **argv) >>> __AFL_INIT(); >>> >>> while ( __AFL_LOOP(1000) ) >>> +#else >>> + for( count = 0; count < max; count++ ) >> >> Initially I've thought the initializer on count was pointless further >> up because of the re-initialization here. Of course that's needed >> because of the #if/#else this sits in. Hence I wonder whether omitting >> the assignment here wouldn't be appropriate - it wouldn't really be >> wromng for a compiler to warn about this redundancy. > > Could do that I suppose. The other option would be to change the other > loop to `for ( count = 0; __AFL_LOOP(1000); )` > > Let me know if you have any preference. No preference, your alternative is fine. Jan
diff --git a/tools/fuzz/README.afl b/tools/fuzz/README.afl index 0d955b2687..e8c23d734c 100644 --- a/tools/fuzz/README.afl +++ b/tools/fuzz/README.afl @@ -49,6 +49,13 @@ generate coverage data. To do this, use the target `afl-cov`: $ make afl-cov #produces afl-harness-cov +In order to speed up the process of checking total coverage, +`afl-harness-cov` can take several test inputs on its command-line; +the speed-up effect should be similar to that of using afl-clang-fast. +You can use xargs to do this most efficiently, like so: + + $ ls queue/id* | xargs $path/afl-harness-cov + NOTE: Please also note that the coverage instrumentation hard-codes the absolute path for the instrumentation read and write files in the binary; so coverage data will always show up in the build directory no diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c index b4d15451b5..669f698711 100644 --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c @@ -16,6 +16,8 @@ int main(int argc, char **argv) { size_t size; FILE *fp = NULL; + int count = 0; + int max; setbuf(stdin, NULL); setbuf(stdout, NULL); @@ -42,8 +44,7 @@ int main(int argc, char **argv) break; case '?': - usage: - printf("Usage: %s $FILE | [--min-input-size]\n", argv[0]); + printf("Usage: %s $FILE [$FILE...] | [--min-input-size]\n", argv[0]); exit(-1); break; @@ -54,10 +55,13 @@ int main(int argc, char **argv) } } - if ( optind == argc ) /* No positional parameters. Use stdin. */ + max = argc - optind; + + if ( !max ) /* No positional parameters. Use stdin. */ + { + max = 1; fp = stdin; - else if ( optind != (argc - 1) ) - goto usage; + } if ( LLVMFuzzerInitialize(&argc, &argv) ) exit(-1); @@ -66,11 +70,14 @@ int main(int argc, char **argv) __AFL_INIT(); while ( __AFL_LOOP(1000) ) +#else + for( count = 0; count < max; count++ ) #endif { if ( fp != stdin ) /* If not using stdin, open the provided file. */ { - fp = fopen(argv[optind], "rb"); + printf("Opening file %s\n", argv[optind]); + fp = fopen(argv[optind + count], "rb"); if ( fp == NULL ) { perror("fopen"); @@ -89,7 +96,10 @@ int main(int argc, char **argv) if ( !feof(fp) ) { printf("Input too large\n"); - exit(-1); + /* Don't exit if we're doing batch processing */ + if ( max == 1 ) + exit(-1); + continue; } if ( fp != stdin )
Finding aggregate coverage for a set of test files means running each afl-generated test case through the harness. At the moment, this is done by re-executing afl-harness-cov with each input file. When a large number of test cases have been generated, this can take a significant amonut of time; a recent test with 30k total files generated by 4 parallel fuzzers took over 7 minutes. The vast majority of this time is taken up with 'exec', however. Since the harness is already designed to loop over multiple inputs for llvm "persistent mode", just allow it to take a large number of inputs on the same when *not* running in llvm "persistent mode".. Then the command can be efficiently executed like this: ls */queue/id* | xargs $path/afl-harness-cov For the above-mentioned test on 30k files, the time to generate coverage data was reduced from 7 minutes to under 30 seconds. Signed-off-by: George Dunlap <george.dunlap@citrix.com> --- v2: - Make check for batch processing more clear CC: Ian Jackson <ian.jackson@citrix.com> CC: Wei Liu <wei.liu2@citrix.com> CC: Andrew Cooper <andrew.cooper3@citrix.com> CC: Jan Beulich <jbeulich@suse.com> --- tools/fuzz/README.afl | 7 +++++++ tools/fuzz/x86_instruction_emulator/afl-harness.c | 24 ++++++++++++++++------- 2 files changed, 24 insertions(+), 7 deletions(-)