From: George Dunlap
Date: Fri, 25 Aug 2017 17:43:40 +0100
Subject: [Xen-devel] [PATCH 11/14] fuzz/x86_emulate: Make input more compact
Message-ID: <20170825164343.29015-11-george.dunlap@citrix.com>
In-Reply-To: <20170825164343.29015-1-george.dunlap@citrix.com>
References: <20170825164343.29015-1-george.dunlap@citrix.com>
Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper
X-Patchwork-Id: 9922511
List-Id: Xen developer discussion

At the moment, AFL reckons that for any given input, 87% of it is
completely irrelevant: that is, AFL can change those bytes as much as
it wants without affecting the result of the test, and yet it cannot
remove them.

This is largely because we interpret the blob handed to us as a large
struct, including CR values, MSR values, segment registers, and a full
cpu_user_regs.

Instead, modify our interpretation to have a "set state" stanza at the
front. Begin by reading a 16-bit offset; if it falls within one of the
state ranges (cr[], msr[], segments[], or cpu_user_regs), read a value
for that piece of state and repeat. Continue until the offset falls
outside all of those ranges, and take that as "start emulating".

This allows AFL to compact any given test case much further, to the
point where it now reckons that not a single byte of the test file is
irrelevant.

Testing has shown that this option allows AFL both to reach coverage
much faster and to achieve higher total coverage than with the old
format.

Make this an option (rather than a unilateral change) to enable
side-by-side performance comparison of the old and new formats.

Signed-off-by: George Dunlap
---
I'll reply to this e-mail with a graph of some tests I ran.
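For readers skimming the description, the "set state" stanza can be sketched in isolation roughly as follows. This is a minimal, standalone illustration and not the code in this patch: struct sketch_state, struct cursor, take(), decode_prefix(), NR_CR and NR_MSR are all hypothetical names and sizes chosen for the example, and only two state ranges are modelled. Selectors are read in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define NR_CR  5
#define NR_MSR 4

/* Hypothetical stand-in for the emulator state being fuzzed. */
struct sketch_state {
    uint64_t cr[NR_CR];
    uint64_t msr[NR_MSR];
};

/* Read position within the fuzzer-supplied blob. */
struct cursor {
    const uint8_t *p;
    size_t left;
};

/* Copy n bytes out of the input; return 0 (consuming nothing) if short. */
static int take(struct cursor *c, void *dst, size_t n)
{
    if ( c->left < n )
        return 0;
    memcpy(dst, c->p, n);
    c->p += n;
    c->left -= n;
    return 1;
}

/*
 * Consume "selector, value" records until a selector names no state
 * slot or the input runs out.  Returns the number of bytes consumed;
 * whatever follows would be treated as the instruction stream.
 */
static size_t decode_prefix(struct sketch_state *st,
                            const uint8_t *data, size_t size)
{
    struct cursor c = { data, size };

    assert(st != NULL);

    for ( ; ; )
    {
        uint16_t sel;

        if ( !take(&c, &sel, sizeof(sel)) )
            break;

        /* First range: control registers. */
        if ( sel < NR_CR )
        {
            if ( !take(&c, &st->cr[sel], sizeof(st->cr[0])) )
                break;
            continue;
        }
        sel -= NR_CR;

        /* Second range: MSRs. */
        if ( sel < NR_MSR )
        {
            if ( !take(&c, &st->msr[sel], sizeof(st->msr[0])) )
                break;
            continue;
        }

        /* Out of every range: the selector acts as a terminator. */
        break;
    }

    return size - c.left;
}
```

With this layout an input only carries bytes for the state it actually sets, which is why a minimiser can drop records without shifting the meaning of everything after them.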
CC: Ian Jackson
CC: Wei Liu
CC: Andrew Cooper
CC: Jan Beulich
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c | 13 +++-
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c   | 94 ++++++++++++++++++++---
 2 files changed, 97 insertions(+), 10 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index 79f8aec653..12b3765dcc 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -4,6 +4,7 @@
 #include
 #include
 #include
+#include
 
 extern int LLVMFuzzerInitialize(int *argc, char ***argv);
 extern int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size);
@@ -12,6 +13,8 @@ extern unsigned int fuzz_minimal_input_size(void);
 #define INPUT_SIZE 4096
 static uint8_t input[INPUT_SIZE];
 
+extern bool opt_compact;
+
 int main(int argc, char **argv)
 {
     size_t size;
@@ -22,13 +25,17 @@ int main(int argc, char **argv)
     setbuf(stdin, NULL);
     setbuf(stdout, NULL);
 
+    opt_compact = true;
+
     while ( 1 )
     {
         enum {
             OPT_MIN_SIZE,
+            OPT_COMPACT,
         };
         static const struct option lopts[] = {
             { "min-input-size", no_argument, NULL, OPT_MIN_SIZE },
+            { "compact", required_argument, NULL, OPT_COMPACT },
             { 0, 0, 0, 0 }
         };
         int c = getopt_long_only(argc, argv, "", lopts, NULL);
@@ -43,8 +50,12 @@ int main(int argc, char **argv)
             exit(0);
             break;
 
+        case OPT_COMPACT:
+            opt_compact = atoi(optarg);
+            break;
+
         case '?':
-            printf("Usage: %s $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
+            printf("Usage: %s [--compact=0|1] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
             exit(-1);
             break;
 
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 89d1714125..48b02f2bf6 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -53,6 +53,15 @@ struct fuzz_state
 };
 #define DATA_OFFSET offsetof(struct fuzz_state, corpus)
 
+bool opt_compact;
+
+unsigned int fuzz_minimal_input_size(void)
+{
+    if ( opt_compact )
+        return sizeof(unsigned long) + 1;
+    else
+        return DATA_OFFSET + 1;
+}
 
 static inline int davail(struct fuzz_state *s, size_t size)
 {
@@ -647,9 +656,81 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
 
-    /* Fuzz all of the state in one go */
-    if (!dread(s, s, DATA_OFFSET))
-        exit(-1);
+    if ( !opt_compact )
+    {
+        /* Fuzz all of the state in one go */
+        if (!dread(s, s, DATA_OFFSET))
+            exit(-1);
+        return;
+    }
+
+    /* Modify only select bits of state */
+
+    /* Always read 'options' */
+    if ( !dread(s, &s->options, sizeof(s->options)) )
+        return;
+
+    while(1) {
+        uint16_t offset;
+
+        /* Read 16 bits to decide what bit of state to modify */
+        if ( !dread(s, &offset, sizeof(offset)) )
+            return;
+
+        /*
+         * Then decide if it's "pointing to" different bits of the
+         * state
+         */
+
+        /* cr[]? */
+        if ( offset < 5 )
+        {
+            if ( !dread(s, s->cr + offset, sizeof(*s->cr)) )
+                return;
+            printf("Setting CR %d to %lx\n", offset, s->cr[offset]);
+            continue;
+        }
+
+        offset -= 5;
+
+        /* msr[]? */
+        if ( offset < MSR_INDEX_MAX )
+        {
+            if ( !dread(s, s->msr + offset, sizeof(*s->msr)) )
+                return;
+            printf("Setting MSR i%d (%x) to %lx\n", offset,
+                   msr_index[offset], s->msr[offset]);
+            continue;
+        }
+
+        offset -= MSR_INDEX_MAX;
+
+        /* segments[]? */
+        if ( offset < SEG_NUM )
+        {
+            if ( !dread(s, s->segments + offset, sizeof(*s->segments)) )
+                return;
+            printf("Setting Segment %d\n", offset);
+            continue;
+
+        }
+
+        offset -= SEG_NUM;
+
+        /* regs? */
+        if ( offset < sizeof(struct cpu_user_regs)
+             && offset + sizeof(uint64_t) <= sizeof(struct cpu_user_regs) )
+        {
+            if ( !dread(s, ((char *)ctxt->regs) + offset, sizeof(uint64_t)) )
+                return;
+            printf("Setting cpu_user_regs offset %x\n", offset);
+            continue;
+        }
+
+        /* None of the above -- take that as "start emulating" */
+
+        return;
+    }
 }
 
 #define CANONICALIZE(x) \
@@ -821,7 +902,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
     /* Reset all global state variables */
     memset(&input, 0, sizeof(input));
 
-    if ( size <= DATA_OFFSET )
+    if ( size < fuzz_minimal_input_size() )
     {
         printf("Input too small\n");
         return 1;
@@ -858,11 +939,6 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
     return 0;
 }
 
-unsigned int fuzz_minimal_input_size(void)
-{
-    return DATA_OFFSET + 1;
-}
-
 /*
  * Local variables:
  * mode: C