From patchwork Tue May 21 15:39:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sam Caccavale X-Patchwork-Id: 10954153 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CF54213AD for ; Tue, 21 May 2019 15:40:03 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BB69628A85 for ; Tue, 21 May 2019 15:40:03 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AF3B228AA2; Tue, 21 May 2019 15:40:03 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,DKIM_ADSP_ALL, DKIM_INVALID,DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1037E28AA0 for ; Tue, 21 May 2019 15:40:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728767AbfEUPj6 (ORCPT ); Tue, 21 May 2019 11:39:58 -0400 Received: from smtp-fw-9102.amazon.com ([207.171.184.29]:19222 "EHLO smtp-fw-9102.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728067AbfEUPj5 (ORCPT ); Tue, 21 May 2019 11:39:57 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1558453191; x=1589989191; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=AMfjMIyVVLp3f0BRoXOD3C8gCuDAnUH9nAaKFrrh6kw=; b=KL50Y/mMH7IUAE3rFD7uUBU5ClNj4OCDTGTogz/hFifiytGnuUaALSDT lbLNpW7Nm8BRkBFOUoZ2pXY2TIF7ovLhX/K24RZZ2YO5vkz7UMNbz4pYO IkSgRKdcUVJR5Gqo32UuqvHTPMxY0rqGLZTGzer2647DP7FcoGCvyc0IZ M=; X-IronPort-AV: E=Sophos;i="5.60,495,1549929600"; d="scan'208";a="675574214" Received: from sea3-co-svc-lb6-vlan3.sea.amazon.com (HELO email-inbound-relay-2a-69849ee2.us-west-2.amazon.com) ([10.47.22.38]) by smtp-border-fw-out-9102.sea19.amazon.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 21 May 2019 15:39:48 +0000 Received: from EX13MTAUEA001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-2a-69849ee2.us-west-2.amazon.com (8.14.7/8.14.7) with ESMTP id x4LFdhwK026958 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=FAIL); Tue, 21 May 2019 15:39:47 GMT Received: from EX13D08UEE004.ant.amazon.com (10.43.62.182) by EX13MTAUEA001.ant.amazon.com (10.43.61.243) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 21 May 2019 15:39:46 +0000 Received: from EX13MTAUWA001.ant.amazon.com (10.43.160.58) by EX13D08UEE004.ant.amazon.com (10.43.62.182) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 21 May 2019 15:39:45 +0000 Received: from uc2253769c0055c.ant.amazon.com (10.28.85.98) by mail-relay.amazon.com (10.43.160.118) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 21 May 2019 15:39:41 +0000 From: Sam Caccavale CC: , , , , , , , , , , , , , , , , , , , , Subject: [PATCH 1/3] Build target for emulate.o as a userspace binary Date: Tue, 21 May 2019 17:39:22 +0200 Message-ID: <20190521153924.15110-2-samcacc@amazon.de> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190521153924.15110-1-samcacc@amazon.de> References: <20190521153924.15110-1-samcacc@amazon.de> MIME-Version: 1.0 To: unlisted-recipients:; (no To-header on input) Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This commit contains the minimal set of functionality to build afl-harness around arch/x86/emulate.c which allows exercising code in that source file, like x86_emulate_insn. Resolving the dependencies was done via GCC's -H flag by get_headers.py. --- tools/Makefile | 9 ++ .../fuzz/x86_instruction_emulation/.gitignore | 2 + tools/fuzz/x86_instruction_emulation/Makefile | 57 +++++++ .../fuzz/x86_instruction_emulation/README.md | 12 ++ .../x86_instruction_emulation/afl-harness.c | 149 ++++++++++++++++++ tools/fuzz/x86_instruction_emulation/common.h | 87 ++++++++++ .../x86_instruction_emulation/emulator_ops.c | 58 +++++++ .../x86_instruction_emulation/emulator_ops.h | 117 ++++++++++++++ .../scripts/get_headers.py | 95 +++++++++++ .../scripts/make_deps | 4 + tools/fuzz/x86_instruction_emulation/stubs.c | 56 +++++++ tools/fuzz/x86_instruction_emulation/stubs.h | 52 ++++++ 12 files changed, 698 insertions(+) create mode 100644 tools/fuzz/x86_instruction_emulation/.gitignore create mode 100644 tools/fuzz/x86_instruction_emulation/Makefile create mode 100644 tools/fuzz/x86_instruction_emulation/README.md create mode 100644 tools/fuzz/x86_instruction_emulation/afl-harness.c create mode 100644 tools/fuzz/x86_instruction_emulation/common.h create mode 100644 tools/fuzz/x86_instruction_emulation/emulator_ops.c create mode 100644 tools/fuzz/x86_instruction_emulation/emulator_ops.h create mode 100644 tools/fuzz/x86_instruction_emulation/scripts/get_headers.py create mode 100755 tools/fuzz/x86_instruction_emulation/scripts/make_deps create mode 100644 tools/fuzz/x86_instruction_emulation/stubs.c create mode 100644 tools/fuzz/x86_instruction_emulation/stubs.h diff --git a/tools/Makefile b/tools/Makefile index 3dfd72ae6c1a..4d68817b7e49 100644 --- a/tools/Makefile +++ b/tools/Makefile @@ -94,6 +94,12 @@ freefall: FORCE kvm_stat: FORCE $(call descend,kvm/$@) +fuzz: FORCE + $(call descend,fuzz/x86_instruction_emulation) + +fuzz_deps: FORCE + $(call descend,fuzz/x86_instruction_emulation,fuzz_deps) + all: acpi cgroup cpupower gpio hv firewire liblockdep \ perf selftests spi turbostat usb \ virtio vm bpf x86_energy_perf_policy \ @@ -171,6 +177,9 @@ tmon_clean: freefall_clean: $(call descend,laptop/freefall,clean) +fuzz_clean: + $(call descend,fuzz/x86_instruction_emulation,clean) + build_clean: $(call descend,build,clean) diff --git a/tools/fuzz/x86_instruction_emulation/.gitignore b/tools/fuzz/x86_instruction_emulation/.gitignore new file mode 100644 index 000000000000..7d44f7ce266e --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/.gitignore @@ -0,0 +1,2 @@ +*.o +*-harness diff --git a/tools/fuzz/x86_instruction_emulation/Makefile b/tools/fuzz/x86_instruction_emulation/Makefile new file mode 100644 index 000000000000..d2854a332605 --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/Makefile @@ -0,0 +1,57 @@ +ROOT_DIR=../../.. +THIS_DIR=tools/fuzz/x86_instruction_emulation + +include ../../scripts/Makefile.include + +.DEFAULT_GOAL := all + +INCLUDES := $(patsubst -I./%,-I./$(ROOT_DIR)/%, $(LINUXINCLUDE)) +INCLUDES := $(patsubst ./include/%,./$(ROOT_DIR)/include/%, $(INCLUDES)) +INCLUDES += -include ./$(ROOT_DIR)/include/linux/compiler_types.h + +$(ROOT_DIR)/.config: + make -C $(ROOT_DIR) menuconfig + sed -i -r 's/^#? *CONFIG_KVM(.*)=.*/CONFIG_KVM\1=y/' $(ROOT_DIR)/.config + + +ifdef DEBUG +KBUILD_CFLAGS += -DDEBUG +endif +KBUILD_CFLAGS += -g -O0 + +# This may still not a final solution-- but multiple build flags need to be +# filtered out of KBUILD_FLAGS so a solution is needed. +UNNEEDED_FLAGS:= -mno-sse +KBUILD_CFLAGS := $(filter-out $(UNNEEDED_FLAGS),$(KBUILD_CFLAGS)) + +DEPS := emulator_ops.h stubs.h common.h + +%.o: %.c $(DEPS) + $(CC) $(KBUILD_CFLAGS) $(INCLUDES) -g -c -o $@ $< + +KERNEL_OBJS := arch/x86/kvm/emulate.o \ + arch/x86/lib/retpoline.o \ + lib/find_bit.o +KERNEL_OBJS := $(patsubst %,$(ROOT_DIR)/%, $(KERNEL_OBJS)) +# I don't know how to make KERNEL_OBJS a dependency of this. I cannot call +# make from here since it causes a cycle and I would rather not hardcode +# these dependencies into _another_ makefile. +$(KERNEL_OBJS): $(ROOT_DIR)/.config + $(error Run `./tools/fuzz/x86_instruction_emulation/scripts/make_deps' first (setting envvar CC if desired).) + +LOCAL_OBJS := emulator_ops.o stubs.o +afl-harness: afl-harness.o $(LOCAL_OBJS) $(KERNEL_OBJS) + @$(CC) -v $(KBUILD_CFLAGS) $(LOCAL_OBJS) $(KERNEL_OBJS) $< $(INCLUDES) -Istubs.h -o $@ -no-pie + +all: afl-harness + +.PHONY: fuzz_deps +fuzz_deps: + cd $(ROOT_DIR) && \ + python3 $(THIS_DIR)/scripts/get_headers.py \ + $(THIS_DIR)/afl-harness.c \ + arch/x86/kvm/emulate.c + +.PHONY: clean +clean: + $(RM) -r *.o afl-harness diff --git a/tools/fuzz/x86_instruction_emulation/README.md b/tools/fuzz/x86_instruction_emulation/README.md new file mode 100644 index 000000000000..a78ac51b0152 --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/README.md @@ -0,0 +1,12 @@ +# Building + +From the root of linux, run: +1. `./tools/fuzz/x86_instruction_emulation/scripts/make_deps` + - Optionally, set the envvar CC with your desired compiler. +2. `make tools/fuzz` + +## TODO + +I'd like to add the object files built in `make_deps` as real +dependencies of `make tools/fuzz` but it causes a cycle in Make to +add them to `tools/fuzz/x86_instruction_emulation/Makefile`. \ No newline at end of file diff --git a/tools/fuzz/x86_instruction_emulation/afl-harness.c b/tools/fuzz/x86_instruction_emulation/afl-harness.c new file mode 100644 index 000000000000..b3b09d7f15f2 --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/afl-harness.c @@ -0,0 +1,149 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * x86 Instruction Emulation Fuzzing Wrapper + * + * Authors: + * Sam Caccavale + * + * Supporting code from xen: + * xen/tools/fuzz/x86_instruction_emulation/afl-harness.c + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + * From: xen/master f68f35fd2016e36ee30f8b3e7dfd46c554407ac1 + */ + +#include +#include +#include +#include +#include +#include +#include "emulator_ops.h" + +/* Arbitrary, but limiting fuzz input size is important. */ +#define MAX_INPUT_SIZE 4096 +#define INSTRUCTION_BYTES (MAX_INPUT_SIZE - MIN_INPUT_SIZE) + +int main(int argc, char **argv) +{ + size_t size; + FILE *fp = NULL; + int max, count; + struct state *state; + + setbuf(stdin, NULL); + setbuf(stdout, NULL); + + while (1) { + enum { OPT_INPUT_SIZE, + }; + static const struct option lopts[] = { + { "input-size", no_argument, NULL, OPT_INPUT_SIZE }, + { 0, 0, 0, 0 } + }; + int c = getopt_long_only(argc, argv, "", lopts, NULL); + + if (c == -1) + break; + + switch (c) { + case OPT_INPUT_SIZE: + printf("Min: %u\n", MIN_INPUT_SIZE); + printf("Max: %u\n", MAX_INPUT_SIZE); + exit(0); + break; + + case '?': + printf("Usage: %s $FILE [$FILE...] | [--input-size]\n", + argv[0]); + exit(-1); + break; + + default: + printf("Bad getopt return %d (%c)\n", c, c); + exit(-1); + break; + } + } + + max = argc - optind; + + if (!max) { /* No positional parameters. Use stdin. */ + max = 1; + fp = stdin; + } + + state = create_emulator(); + state->data = malloc(INSTRUCTION_BYTES); + +#ifdef __AFL_HAVE_MANUAL_CONTROL + __AFL_INIT(); + + /* + * This is the number of times AFL's forkserver should reuse a + * process to fuzz the target. 1000 is the recommended starting + * point. Future tweaking may or may not yeild better results. + */ + for (count = 0; __AFL_LOOP(1000);) +#else + for (count = 0; count < max; count++) +#endif + { + if (fp != stdin) { /* If not stdin, open the provided file. */ + printf("Opening file %s\n", argv[optind + count]); + fp = fopen(argv[optind + count], "rb"); + if (fp == NULL) { + perror("fopen"); + exit(-1); + } + } +#ifdef __AFL_HAVE_MANUAL_CONTROL + else { + /* + * This will ensure we're dealing with a clean stream + * state after the afl-fuzz process messes with the + * open file handle. + */ + fseek(fp, 0, SEEK_SET); + } +#endif + reset_emulator(state); + + size = fread(state, 1, MIN_INPUT_SIZE, fp); + if (size != MIN_INPUT_SIZE) { + printf("Input does not populate state\n"); + if (max == 1) + exit(-1); + } + + size = fread(state->data, 1, INSTRUCTION_BYTES, fp); + state->data_available = size; + + if (ferror(fp)) { + perror("fread"); + exit(-1); + } + + /* Only run the test if the input file was < than INPUT_SIZE */ + if (feof(fp)) { + initialize_emulator(state); + emulate_until_complete(state); + } else { + printf("Input too large\n"); + /* Don't exit if we're doing batch processing */ + if (max == 1) + exit(-1); + } + + if (fp != stdin) { + fclose(fp); + fp = NULL; + } + } + + free(state->data); + free_emulator(state); + return 0; +} diff --git a/tools/fuzz/x86_instruction_emulation/common.h b/tools/fuzz/x86_instruction_emulation/common.h new file mode 100644 index 000000000000..9aec2bea3d50 --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/common.h @@ -0,0 +1,87 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#include + +#define PRIx64 "llx" +#define MSR_INDEX_MAX 16 + +#ifdef DEBUG +#define DEBUG_PRINT 1 +#else +#define DEBUG_PRINT 0 +#endif + +#define debug(...) \ + do { \ + if (DEBUG_PRINT) \ + fprintf(stderr, __VA_ARGS__); \ + } while (0) + +enum x86_segment { + /* General purpose. */ + x86_seg_es, + x86_seg_cs, + x86_seg_ss, + x86_seg_ds, + x86_seg_fs, + x86_seg_gs, + /* System: Valid to use for implicit table references. */ + x86_seg_tr, + x86_seg_ldtr, + x86_seg_gdtr, + x86_seg_idtr, + /* No Segment: For accesses which are already linear. */ + x86_seg_none +}; + +#define NR_SEG x86_seg_none + +struct segment_register { + uint16_t sel; + union { + uint16_t attr; + struct { + uint16_t type : 4; + uint16_t s : 1; + uint16_t dpl : 2; + uint16_t p : 1; + uint16_t avl : 1; + uint16_t l : 1; + uint16_t db : 1; + uint16_t g : 1; + uint16_t pad : 4; + }; + }; + uint32_t limit; + uint64_t base; +}; + +enum user_regs { + REGS_RAX, + REGS_RCX, + REGS_RDX, + REGS_RBX, + REGS_RSP, + REGS_RBP, + REGS_RSI, + REGS_RDI, + REGS_R8, + REGS_R9, + REGS_R10, + REGS_R11, + REGS_R12, + REGS_R13, + REGS_R14, + REGS_R15, + NR_REGS +}; +#define NR_VCPU_REGS NR_REGS + +static inline void print_n_bytes(unsigned char *bytes, size_t n) +{ + int i; + + for (i = 0; i < n; i++) + fprintf(stderr, " %02x", bytes[i]); + fprintf(stderr, "\n"); +} diff --git a/tools/fuzz/x86_instruction_emulation/emulator_ops.c b/tools/fuzz/x86_instruction_emulation/emulator_ops.c new file mode 100644 index 000000000000..55ae4e8fbd96 --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/emulator_ops.c @@ -0,0 +1,58 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * x86 Instruction Emulation Fuzzing Wrapper + * + * Authors: + * Sam Caccavale + * + * Supporting code from xen: + * xen/tools/fuzz/x86_instruction_emulation/fuzz_emul.c + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + * From: xen/master f68f35fd2016e36ee30f8b3e7dfd46c554407ac1 + */ + +#include +#include +#include +#include + +#include "stubs.h" +#include "emulator_ops.h" + +#include + +#include +#include +#include +#include + +void initialize_emulator(struct state *state) +{ +} + +int step_emulator(struct state *state) +{ + return 0; +} + +int emulate_until_complete(struct state *state) +{ + return 0; +} + +struct state *create_emulator(void) +{ + struct state *state = malloc(sizeof(struct state)); + return state; +} + +void reset_emulator(struct state *state) +{ +} + +void free_emulator(struct state *state) +{ +} diff --git a/tools/fuzz/x86_instruction_emulation/emulator_ops.h b/tools/fuzz/x86_instruction_emulation/emulator_ops.h new file mode 100644 index 000000000000..5ae072d5f205 --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/emulator_ops.h @@ -0,0 +1,117 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef EMULATOR_OPS_H +#define EMULATOR_OPS_H + +#include "common.h" +#include "stubs.h" +#include + +struct vcpu { + unsigned long cr[5]; + uint64_t msr[MSR_INDEX_MAX]; + uint64_t regs[NR_REGS]; + uint64_t rflags; + struct segment_register segments[NR_SEG]; +}; + +/* + * Internal state of the emulate harness. Calculated initially from the input + * corpus, and later mutated by the emulation callbacks. + */ +struct state { + /* Bitmask which enables/disables hooks. */ + unsigned long options; + + /* Internal representation of emulated CPU. */ + struct vcpu vcpu; + + /* + * Input bytes are consumed at s.data[eip + s.other_bytes_consumed] + * while eip + size_requested + other_bytes_consumed < data_available + * + * emul_fetch consumes bytes for use as x86 instructions as eip grows + * + * get_bytes_and_increment consumes bytes and increments + * other_bytes_consumed. These bytes can be used as return values for + * memory reads, random chances to fail, or other purposes. + * + * Only these two functions should be used to access this data. + * + * This causes .data to be an interspliced source of instructions and + * other data. Xen's instruction emulation does this to provide + * deterministic randomness on fuzz runs at the cost of complexity to + * crash output. + * + * Other alternatives (two separate streams, getting ins bytes from + * low, and random bytes from high, etc) yield byte streams which may + * not bear an as close correlation with AFL's input. This was chosen + * since the only drawback to this approach is remedied by simple + * bookkeeping of instruction bytes when debugging crashes. This is + * enabled by the TODO flag. + */ + unsigned char *data; + + /* Real amount of data backing state->data[]. */ + size_t data_available; + + /* + * Amount of bytes consumed for purposes other than instructions. + * E.G. whether a memory access should fault. + */ + size_t other_bytes_consumed; + + /* Emulation context */ + struct x86_emulate_ctxt ctxt; +}; + +#define MIN_INPUT_SIZE (offsetof(struct state, data)) + +#define container_of(ptr, type, member) \ + ({ \ + const typeof(((type *)0)->member) * __mptr = (ptr); \ + (type *)((char *)__mptr - offsetof(type, member)); \ + }) + +#define get_state(h) container_of(h, struct state, ctxt) + +void buffer_stderr(void) __attribute__((constructor)); + +/* + * Allocates space for, and creates a `struct state`. The user should set + * state->data to their instruction stream and do any modification of the + * state-vpcu if desired. + */ +extern struct state *create_emulator(void); + +/* + * memset's all fields of state except data and data_available. + */ +extern void reset_emulator(struct state *state); + +/* + * free_emulator does not free the state->data member, the user should free + * it before freeing the emulator. The alternative implementation either + * restrains data to a fixed size or has the user pass in a (pointer, size) + * pair which would have to be copied. Copying this would be slow to fuzz. + */ +extern void free_emulator(struct state *state); + +/* + * Uses the state->option field to disable certain functionality and + * initializes the state->ctxt to a valid state. This is optional if + * intentionally testing the emulator in an invalid state. + */ +extern void initialize_emulator(struct state *state); + +/* + * Advances the emulator one instruction and handles any exceptions. + */ +extern int step_emulator(struct state *state); + +/* + * Steps the emulator until no instructions remain or something fails. + */ +extern int emulate_until_complete(struct state *state); + +#endif /* ifdef EMULATOR_OPS_H */ diff --git a/tools/fuzz/x86_instruction_emulation/scripts/get_headers.py b/tools/fuzz/x86_instruction_emulation/scripts/get_headers.py new file mode 100644 index 000000000000..94e63cab4e81 --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/scripts/get_headers.py @@ -0,0 +1,95 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0+ + +""" +This script compiles files given via arguments with GCC's -H flag. This +produces the list of headers used in compilation which we (primitively) +check for coresponding source files to recur on. In the end, we have the +subset of source and header files needed to compile/link the source files +that were given as arguments. + +There are a lot of hacks here including: +Fickle regex to parse the -H output +Attempting to use the `.FILE_NAME.o.cmd` that linux produces when building +Being totally unrelocatable and requiring some linux build env vars +Running `subproces.run` on input read from a file on disk + +If anything breaks it is likely related to one of the above hacks. +""" + +import os +import sys +import subprocess +from pathlib import Path + +if len(sys.argv) < 2: + print("Supply a starting file(s)") + exit(-1) + +KBUILD_FLAGS = os.environ.get("KBUILD_CFLAGS", "").split() +LINUXINCLUDE = os.environ.get("LINUXINCLUDE", "").split() + +if [] in (KBUILD_FLAGS, LINUXINCLUDE): + print("Environment variable KBUILD_FLAGS or LINUXINCLUDE is not set.") + print("\tThese are emitted by the root Makefile-- are you running") + print("\tthis script from somewhere other than `make tools/fuzz_deps`?") + exit(-1) + +queue = [Path(source_file) for source_file in sys.argv[1:]] +all_headers = set() +source_files = set() + +# While there are source files in the queue left to process +while queue: + current_file = queue.pop() # Grab a file + + source_files.add(current_file) # Add it to the set of processed files + queue_size = len(queue) + print(f"\nFinding headers included in { current_file }:") + + # If there is a `.FILE_NAME.o.cmd` file, then use that to build FILE_NAME.c + # if not, then `gcc -H FILE_NAME $KBUILD_FLAGS $LINUXINCLUDE` probably works + gcc_command = ["gcc", "-H", current_file] + KBUILD_FLAGS + LINUXINCLUDE + cmd_file = current_file.parent / ("." + current_file.name[:-1] + "o.cmd") + if cmd_file.exists(): + with open(cmd_file, "r") as f: + gcc_command = f.readline().split(":= ")[1].split() + ["-H"] + + # Build the file, pipe the output to sed to get just the list of headers + p1 = subprocess.run(gcc_command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT) + p2 = subprocess.run(["sed", "/^\.[^/]/!d"], input=p1.stdout, stdout=subprocess.PIPE) + + headers = p2.stdout.decode("utf-8").split("\n") + try: + # Output is of type r'\.+ filename' so split on whitespace and take the latter + headers = {Path(header.split()[1]) for header in headers if header} + except IndexError: # Fickle regex sometimes produces lines without whitespace + print("Breaking output:") + print({header for header in headers if header and len(header.split()) < 2}) + raise + + # This loop is for adding new source files back to the queue + for header in headers: # Ensure all the headers are actually files. + assert header.exists() + + # If we have already seen the header file then we have already seen its source + # so there is no need to add it. This is probably redudant with the next check. + if header in all_headers: + continue + + # If we found a header file with a coresponding source file that we have not + # processed, then add it to the queue to process + source = header.parent / (header.name[:-1] + "c") + if source.exists() and source not in source_files: + queue.append(source) + + print( + f"\t{ len(headers - all_headers) } new headers found," + f"{ len(headers | all_headers) } total." + ) + print(f"\tQueue size was { queue_size } and is now { len(queue) }.") + all_headers.update(headers) # Add all the newly found headers to all_headers + +print("\n\nObjects included:") +print("".join(sorted(f"\t{ str(path) }\n" for path in source_files))) +exit(0) diff --git a/tools/fuzz/x86_instruction_emulation/scripts/make_deps b/tools/fuzz/x86_instruction_emulation/scripts/make_deps new file mode 100755 index 000000000000..4e2ab38c886f --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/scripts/make_deps @@ -0,0 +1,4 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0+ + +make ${CC:+ "CC=$CC"} arch/x86/kvm/emulate.o arch/x86/lib/retpoline.o lib/find_bit.o diff --git a/tools/fuzz/x86_instruction_emulation/stubs.c b/tools/fuzz/x86_instruction_emulation/stubs.c new file mode 100644 index 000000000000..d2ffeb3ec599 --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/stubs.c @@ -0,0 +1,56 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * These functions/symbols are required to build emulate.o but belong in + * linux tree files with include many other headers/unnecessary symbols + * and cause building emulate.o to become far more complicated than just + * stubbing them out here. In order to stub them without modifying the + * included source, we're declaring many of them here. + */ +#include + +#include "stubs.h" + +#define VMWARE_BACKDOOR_PMC_HOST_TSC 0x10000 +#define VMWARE_BACKDOOR_PMC_REAL_TIME 0x10001 +#define VMWARE_BACKDOOR_PMC_APPARENT_TIME 0x10002 + +/* + * This is easy enough to stub out its full functionality. + */ +bool is_vmware_backdoor_pmc(u32 pmc_idx) +{ + switch (pmc_idx) { + case VMWARE_BACKDOOR_PMC_HOST_TSC: + case VMWARE_BACKDOOR_PMC_REAL_TIME: + case VMWARE_BACKDOOR_PMC_APPARENT_TIME: + return true; + } + return false; +} + +/* + * Printk has no side effects so we don't need to worry about it. + */ +int printk(const char *s, ...) +{ + return 0; +} + +/* + * This is required by source included from emulate.c and would be linked in. + */ +bool enable_vmware_backdoor; + +struct exception_table_entry; +struct pt_regs; + +/* + * TODO: we likely need to implement this to handle emulated exceptions. + */ +bool ex_handler_default(const struct exception_table_entry *fixup, + struct pt_regs *regs, int trapnr, + unsigned long error_code, unsigned long fault_addr) +{ + return true; +} diff --git a/tools/fuzz/x86_instruction_emulation/stubs.h b/tools/fuzz/x86_instruction_emulation/stubs.h new file mode 100644 index 000000000000..02c2f6f9bc26 --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/stubs.h @@ -0,0 +1,52 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef STUBS_H +#define STUBS_H + + +/* + * Several typedefs in linux/types.h collide with ones in standard libs. + * + * kvm_emulate.h uses many other types in linux/types.h heavily, so we must + * include it but having access to the standard libs is also important for the + * harnesses. + * + * A solution would be to not include any kernel files in the harness but then + * useful introspection into state->ctxt is impossible as x86_emulate_ctxt is + * defined in kvm_emulate.h and uses a ton of linux/types.h types. + * + * Instead, we use dummy defines to avoid the collision and linux seems + * content to use the equivalent types from the standard libs. + */ +#define fd_set fake_fd_set +#define dev_t fake_dev_t +#define nlink_t fake_nlink_t +#define timer_t fake_timer_t +#define loff_t fake_loff_t +#define u_int64_t fake_u_int64_t +#define int64_t fake_int64_t +#define blkcnt_t fake_blkcnt_t +#define uint64_t fake_uint64_t +#include +#undef fd_set +#undef dev_t +#undef nlint_t +#undef timer_t +#undef loff_t +#undef u_int64_t +#undef int64_t +#undef blkcnt_t +#undef uint64_t + +/* + * These are identically defined in string.h and included kernel headers. + */ +#ifdef __always_inline +#undef __always_inline +#endif + +#ifdef __attribute_const__ +#undef __attribute_const__ +#endif + +#endif /* STUBS_H */ From patchwork Tue May 21 15:39:23 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sam Caccavale X-Patchwork-Id: 10954155 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A66FA76 for ; Tue, 21 May 2019 15:40:17 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 93B69201B1 for ; Tue, 21 May 2019 15:40:17 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8801C28779; Tue, 21 May 2019 15:40:17 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,DKIM_ADSP_ALL, DKIM_INVALID,DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 49E3C201B1 for ; Tue, 21 May 2019 15:40:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728816AbfEUPkM (ORCPT ); Tue, 21 May 2019 11:40:12 -0400 Received: from smtp-fw-6002.amazon.com ([52.95.49.90]:3398 "EHLO smtp-fw-6002.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728067AbfEUPkM (ORCPT ); Tue, 21 May 2019 11:40:12 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1558453209; x=1589989209; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=LI26iuFc1hayM1XvY3zlLM7H/SJeHaDj4s+5vkyXg9Q=; b=XEi48TSr6SpP0f+7xtB2djv7wf/2dGY2zqFPSiNyH3DwYHxssUdy9NeF HVbq3DwjDZDlQS0te5NGeGkZnnlUd5hE8T/NuQU4h7PjAK3rEevywbUsB WezSy9NFdcAtKtw46fcx9BPUrYRstQ8b0HHLpwT7FTPjXxnTXXkHpWTSg Q=; X-IronPort-AV: E=Sophos;i="5.60,495,1549929600"; d="scan'208";a="403087828" Received: from iad6-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-2b-4ff6265a.us-west-2.amazon.com) ([10.124.125.6]) by smtp-border-fw-out-6002.iad6.amazon.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 21 May 2019 15:40:07 +0000 Received: from EX13MTAUEA001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan2.pdx.amazon.com [10.236.137.194]) by email-inbound-relay-2b-4ff6265a.us-west-2.amazon.com (8.14.7/8.14.7) with ESMTP id x4LFe29n055106 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=FAIL); Tue, 21 May 2019 15:40:05 GMT Received: from EX13D08UEE001.ant.amazon.com (10.43.62.126) by EX13MTAUEA001.ant.amazon.com (10.43.61.243) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 21 May 2019 15:39:52 +0000 Received: from EX13MTAUWA001.ant.amazon.com (10.43.160.58) by EX13D08UEE001.ant.amazon.com (10.43.62.126) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 21 May 2019 15:39:52 +0000 Received: from uc2253769c0055c.ant.amazon.com (10.28.85.98) by mail-relay.amazon.com (10.43.160.118) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 21 May 2019 15:39:47 +0000 From: Sam Caccavale CC: , , , , , , , , , , , , , , , , , , , , Subject: [PATCH 2/3] Emulate simple x86 instructions in userspace Date: Tue, 21 May 2019 17:39:23 +0200 Message-ID: <20190521153924.15110-3-samcacc@amazon.de> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190521153924.15110-1-samcacc@amazon.de> References: <20190521153924.15110-1-samcacc@amazon.de> MIME-Version: 1.0 To: unlisted-recipients:; (no To-header on input) Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Added the minimal subset of code to run afl-harness with a binary file as input. These bytes are used to populate the vcpu structure and then as an instruction stream for the emulator. It does not attempt to handle exceptions an only supports very simple ops. --- .../x86_instruction_emulation/emulator_ops.c | 324 +++++++++++++++++- .../scripts/afl-many | 28 ++ .../scripts/bin_fuzz | 23 ++ 3 files changed, 374 insertions(+), 1 deletion(-) create mode 100755 tools/fuzz/x86_instruction_emulation/scripts/afl-many create mode 100755 tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz diff --git a/tools/fuzz/x86_instruction_emulation/emulator_ops.c b/tools/fuzz/x86_instruction_emulation/emulator_ops.c index 55ae4e8fbd96..38bba4c97e2f 100644 --- a/tools/fuzz/x86_instruction_emulation/emulator_ops.c +++ b/tools/fuzz/x86_instruction_emulation/emulator_ops.c @@ -29,17 +29,331 @@ #include #include +/* + * Due to some quirk of building, the way printing to an unbuffered stream + * is implemented smashes the stack. For now, we'll just buffer stderr but + * a fix is needed. + */ +char buff[BUFSIZ]; +void buffer_stderr(void) +{ + setvbuf(stderr, buff, _IOLBF, BUFSIZ); +} + +#define number_of_registers(c) (c->mode == X86EMUL_MODE_PROT64 ? 16 : 8) + +ulong emul_read_gpr(struct x86_emulate_ctxt *ctxt, unsigned int reg) +{ + assert(reg < number_of_registers(ctxt)); + return get_state(ctxt)->vcpu.regs[reg]; +} + +void emul_write_gpr(struct x86_emulate_ctxt *ctxt, unsigned int reg, ulong val) +{ + assert(reg < number_of_registers(ctxt)); + get_state(ctxt)->vcpu.regs[reg] = val; +} + +#undef number_of_registers /* (ctxt->mode == X86EMUL_MODE_PROT64 ? 16 : 8) */ + +/* All read ops: */ + +#define emul_offset (state->ctxt.eip + state->other_bytes_consumed) +#define decode_offset (state->ctxt._eip + state->other_bytes_consumed) +/* + * There is a very loose abstraction around ctxt.eip, ctxt._eip, and + * state.other_bytes_consumed. + * + * eip is incremented when an instruction is emulated by x86_emulate_insn + * _eip is incremeneted when an instruction is decoded by x86_decode_insn + * + * state.other_bytes_consumed is incremented when bytes are consumed by + * get_bytes_and_increment for any use besides an x86 instruction. + * + * The bytes between emul_offset and decode_offset are instructions yet + * to be executed. + */ + +#define _data_available(bytes) (decode_offset + bytes < state->data_available) + +static int _get_bytes(void *dst, struct state *state, unsigned int bytes, + char *callee) +{ + if (!_data_available(bytes)) { + fprintf(stderr, "Tried retrieving %d bytes\n", bytes); + fprintf(stderr, "%s failed to retrieve bytes for %s.\n", + __func__, callee); + return X86EMUL_UNHANDLEABLE; + } + + memcpy(dst, &state->data[decode_offset], bytes); + return X86EMUL_CONTINUE; +} + +/* + * The only function that any x86_emulate_ops should call to retrieve bytes. + * See comments in struct state definition for more information. + */ +static int get_bytes_and_increment(void *dst, struct state *state, + unsigned int bytes, char *callee) +{ + int rc = _get_bytes(dst, state, bytes, callee); + + if (rc == X86EMUL_CONTINUE) + state->other_bytes_consumed += bytes; + + return rc; +} + +/* + * This is called by x86_decode_insn to fetch bytes and is the only function + * that gets bytes from state.data without incrementing .other_byes_consumed + * since the emulator will increment ._eip during x86_decode_insn and ._eip + * is used as an index into state.data. + */ +int emul_fetch(struct x86_emulate_ctxt *ctxt, unsigned long addr, void *val, + unsigned int bytes, struct x86_exception *fault) +{ + if (_get_bytes(val, get_state(ctxt), bytes, "emul_fetch") != + X86EMUL_CONTINUE) { + return X86EMUL_UNHANDLEABLE; + } + + return X86EMUL_CONTINUE; +} + +ulong emul_get_cr(struct x86_emulate_ctxt *ctxt, int cr) +{ + return get_state(ctxt)->vcpu.cr[cr]; +} + +int emul_set_cr(struct x86_emulate_ctxt *ctxt, int cr, ulong val) +{ + get_state(ctxt)->vcpu.cr[cr] = val; + return 0; +} + +unsigned int emul_get_hflags(struct x86_emulate_ctxt *ctxt) +{ + return get_state(ctxt)->vcpu.rflags; +} + +void emul_set_hflags(struct x86_emulate_ctxt *ctxt, unsigned int hflags) +{ + get_state(ctxt)->vcpu.rflags = hflags; +} + +/* End of emulator ops */ + +#define SET(h) .h = emul_##h +const struct x86_emulate_ops all_emulator_ops = { + SET(read_gpr), + SET(write_gpr), + SET(fetch), + SET(get_cr), + SET(set_cr), + SET(get_hflags), + SET(set_hflags), +}; +#undef SET + +enum { + HOOK_read_gpr, + HOOK_write_gpr, + HOOK_fetch, + HOOK_get_cr, + HOOK_set_cr, + HOOK_get_hflags, + HOOK_set_hflags +}; + +/* Put reg_x into a definitely valid state. */ +#define CANONICALIZE(reg_x) \ + do { \ + uint64_t _y = (reg_x); \ + if (_y & (1ULL << 47)) \ + _y |= (~0ULL) << 48; \ + else \ + _y &= (1ULL << 48) - 1; \ + debug("Canonicalized %" PRIx64 " to %" PRIx64 "\n", \ + reg_x, y); \ + (reg_x) = _y; \ + } while (0) + +/* + * Canonicalizes reg if options << CANONICALIZE_##reg is set. + * This weighs the emulator to use canonical values for important registers. + * + * Expects options and regs to be defined. + */ +#define CANONICALIZE_MAYBE(reg) \ + do { \ + if (!(options & (1 << CANONICALIZE_##reg))) \ + CANONICALIZE(regs->reg); \ + } while (0) + +/* + * Disable an x86_emulate_op if options << HOOK_op is set. + * + * Expects options to be defined. + */ +#define MAYBE_DISABLE_HOOK(h) \ + do { \ + if (options & (1 << HOOK_##h)) { \ + vcpu->ctxt.ops.h = NULL; \ + debug("Disabling hook " #h "\n"); \ + } \ + } while (0) + +/* + * FROM XEN: + * + * Constrain input to architecturally-possible states where + * the emulator relies on these + * + * In general we want the emulator to be as absolutely robust as + * possible; which means that we want to minimize the number of things + * it assumes about the input state. Tesing this means minimizing and + * removing as much of the input constraints as possible. + * + * So we only add constraints that (in general) have been proven to + * cause crashes in the emulator. + * + * For future reference: other constraints which might be necessary at + * some point: + * + * - EFER.LMA => !EFLAGS.NT + * - In VM86 mode, force segment... + * - ...access rights to 0xf3 + * - ...limits to 0xffff + * - ...bases to below 1Mb, 16-byte aligned + * - ...selectors to (base >> 4) + */ +static void sanitize_input(struct state *s) +{ + /* Some hooks can't be disabled. */ + // options &= ~((1<error_code = 0; + // regs->entry_vector = 0; + + // CANONICALIZE_MAYBE(rip); + // CANONICALIZE_MAYBE(rsp); + // CANONICALIZE_MAYBE(rbp); + + /* + * CR0.PG can't be set if CR0.PE isn't set. Set is more interesting, so + * set PE if PG is set. + */ + if (s->vcpu.cr[0] & X86_CR0_PG) + s->vcpu.cr[0] |= X86_CR0_PE; + + /* EFLAGS.VM not available in long mode */ + if (s->ctxt.mode == X86EMUL_MODE_PROT64) + s->vcpu.rflags &= ~X86_EFLAGS_VM; + + /* EFLAGS.VM implies 16-bit mode */ + if (s->vcpu.rflags & X86_EFLAGS_VM) { + s->vcpu.segments[x86_seg_cs].db = 0; + s->vcpu.segments[x86_seg_ss].db = 0; + } +} + +static void init_x86_emulate_ctxt(struct x86_emulate_ctxt *ctxt) +{ + ctxt->ops = &all_emulator_ops; + ctxt->eflags = 0; + ctxt->tf = (ctxt->eflags & X86_EFLAGS_TF) != 0; + ctxt->eip = 0; + + ctxt->mode = X86EMUL_MODE_PROT64; // TODO: eventually vary this? + + init_decode_cache(ctxt); +} + void initialize_emulator(struct state *state) { + state->ctxt.ops = &all_emulator_ops; + + /* See also sanitize_input, some hooks can't be disabled. */ + // MAYBE_DISABLE_HOOK(read_gpr); + + sanitize_input(state); + init_x86_emulate_ctxt(&state->ctxt); +} + +static const char *const x86emul_mode_string[] = { + [X86EMUL_MODE_REAL] = "X86EMUL_MODE_REAL", + [X86EMUL_MODE_VM86] = "X86EMUL_MODE_VM86", + [X86EMUL_MODE_PROT16] = "X86EMUL_MODE_PROT16", + [X86EMUL_MODE_PROT32] = "X86EMUL_MODE_PROT32", + [X86EMUL_MODE_PROT64] = "X86EMUL_MODE_PROT64", +}; + +static void dump_state_after(const char *desc, struct state *state) +{ + debug(" -- State after %s --\n", desc); + debug("mode: %s\n", x86emul_mode_string[state->ctxt.mode]); + debug(" cr0: %lx\n", state->vcpu.cr[0]); + debug(" cr3: %lx\n", state->vcpu.cr[3]); + debug(" cr4: %lx\n", state->vcpu.cr[4]); + + debug("Decode _eip: %lu\n", state->ctxt._eip); + debug("Emulate eip: %lu\n", state->ctxt.eip); + + debug("\n"); } int step_emulator(struct state *state) { - return 0; + int rc, prev_eip = state->ctxt.eip; + int decode_size = state->data_available - decode_offset; + + if (decode_size < 15) { + rc = x86_decode_insn(&state->ctxt, &state->data[decode_offset], + decode_size); + } else { + rc = x86_decode_insn(&state->ctxt, NULL, 0); + } + debug("Decode result: %d\n", rc); + if (rc != X86EMUL_CONTINUE) + return rc; + + debug("Instruction: "); + print_n_bytes(&state->data[emul_offset], + state->ctxt._eip - state->ctxt.eip); + + rc = x86_emulate_insn(&state->ctxt); + debug("Emulation result: %d\n", rc); + dump_state_after("emulating", state); + + if (state->ctxt.have_exception) { + fprintf(stderr, "Emulator propagated exception: { "); + fprintf(stderr, "vector: %d, ", state->ctxt.exception.vector); + fprintf(stderr, "error code: %d }\n", + state->ctxt.exception.error_code); + rc = X86EMUL_UNHANDLEABLE; + } else if (prev_eip == state->ctxt.eip) { + fprintf(stderr, "ctxt.eip not advanced.\n"); + rc = X86EMUL_UNHANDLEABLE; + } + + if (decode_offset == state->data_available) + debug("emulator is done\n"); + + return rc; } int emulate_until_complete(struct state *state) { + int count = 0; + + do { + count++; + } while (step_emulator(state) == X86EMUL_CONTINUE); + + debug("Emulated %d instructions\n", count); return 0; } @@ -51,8 +365,16 @@ struct state *create_emulator(void) void reset_emulator(struct state *state) { + unsigned char *data = state->data; + size_t data_available = state->data_available; + + memset(state, 0, sizeof(struct state)); + + state->data = data; + state->data_available = data_available; } void free_emulator(struct state *state) { + free(state); } diff --git a/tools/fuzz/x86_instruction_emulation/scripts/afl-many b/tools/fuzz/x86_instruction_emulation/scripts/afl-many new file mode 100755 index 000000000000..ab15258573a2 --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/scripts/afl-many @@ -0,0 +1,28 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0+ +# This is for running AFL over NPROC or `nproc` cores with normal AFL options. + +export AFL_NO_AFFINITY=1 + +while [ -z "$sync_dir" ]; do + while getopts ":o:" opt; do + case "${opt}" in + o) + sync_dir="${OPTARG}" + ;; + *) + ;; + esac + done + ((OPTIND++)) + [ $OPTIND -gt $# ] && break +done + +for i in $(seq 1 $(( ${NPROC:-$(nproc)} - 1)) ); do + taskset -c "$i" ./afl-fuzz -S "slave$i" $@ >/dev/null 2>&1 & +done +taskset -c 0 ./afl-fuzz -M master $@ >/dev/null 2>&1 & + +sleep 5 +watch -n1 "echo \"Executing './afl-fuzz $@' on ${NPROC:-$(nproc)} cores.\" && ./afl-whatsup -s ${sync_dir}" +pkill afl-fuzz diff --git a/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz b/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz new file mode 100755 index 000000000000..e570b17f9404 --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz @@ -0,0 +1,23 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0+ +# This runs the afl-harness at $1, $2 times (or 100) +# It runs uniq and sorts the output to give an idea of what is causing the +# most crashes. Useful for deciding what to implement next. + +if [ "$#" -lt 1 ]; then + echo "Usage: './bin_fuzz path_to_afl-harness [number of times to run]" + exit +fi + +mkdir -p fuzz +rm -f fuzz/*.in fuzz/*.out + +for i in $(seq 1 1 ${2:-100}) +do + { + head -c 500 /dev/urandom | tee fuzz/$i.in | ./$1 + } > fuzz/$i.out 2>&1 + +done + +find ./fuzz -name '*.out' -exec tail -1 {} \; | sed 's/.* Segmen/Segman/' | sed -r 's/^(\s[0-9a-f]{2})+$/misc instruction output/' | sort | uniq -c | sort -rn From patchwork Tue May 21 15:39:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sam Caccavale X-Patchwork-Id: 10954157 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DD11B13AD for ; Tue, 21 May 2019 15:40:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CB336201B1 for ; Tue, 21 May 2019 15:40:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BEFE028708; Tue, 21 May 2019 15:40:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,DKIM_ADSP_ALL, DKIM_INVALID,DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5F1FA201B1 for ; Tue, 21 May 2019 15:40:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728861AbfEUPkT (ORCPT ); Tue, 21 May 2019 11:40:19 -0400 Received: from smtp-fw-4101.amazon.com ([72.21.198.25]:15059 "EHLO smtp-fw-4101.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728067AbfEUPkS (ORCPT ); Tue, 21 May 2019 11:40:18 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1558453217; x=1589989217; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=5B9enZJwF3veiFRzLvmeNGG7DvSS35w6K7vV8jk0rNY=; b=ub+Iqxp1E3xolTbZyTAHaPxBNMwh8fDeL8Hxny8LIMKp56Yzq7ejYZhL xMyrSMILX+bq664eR8CckK7kYxbXFBzXWGZA2ipi4f0bt1bWofCiv50Oz Qied2bI6CiU3AgEdEAsaugQsIpUNqYdoezrstW2mKs8IcgcsCjj9yHb+A s=; X-IronPort-AV: E=Sophos;i="5.60,495,1549929600"; d="scan'208";a="767022279" Received: from iad6-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-2b-baacba05.us-west-2.amazon.com) ([10.124.125.6]) by smtp-border-fw-out-4101.iad4.amazon.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 21 May 2019 15:40:14 +0000 Received: from EX13MTAUEA001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-2b-baacba05.us-west-2.amazon.com (8.14.7/8.14.7) with ESMTP id x4LFeBCx117178 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=FAIL); Tue, 21 May 2019 15:40:12 GMT Received: from EX13D08UEE003.ant.amazon.com (10.43.62.118) by EX13MTAUEA001.ant.amazon.com (10.43.61.243) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 21 May 2019 15:39:59 +0000 Received: from EX13MTAUWA001.ant.amazon.com (10.43.160.58) by EX13D08UEE003.ant.amazon.com (10.43.62.118) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 21 May 2019 15:39:59 +0000 Received: from uc2253769c0055c.ant.amazon.com (10.28.85.98) by mail-relay.amazon.com (10.43.160.118) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 21 May 2019 15:39:54 +0000 From: Sam Caccavale CC: , , , , , , , , , , , , , , , , , , , , Subject: [PATCH 3/3] Demonstrating unit testing via simple-harness Date: Tue, 21 May 2019 17:39:24 +0200 Message-ID: <20190521153924.15110-4-samcacc@amazon.de> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190521153924.15110-1-samcacc@amazon.de> References: <20190521153924.15110-1-samcacc@amazon.de> MIME-Version: 1.0 To: unlisted-recipients:; (no To-header on input) Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Simple-harness.c uses inline asm support to generate asm and then has the emulator emulate this code. This may be useful as a form of testing for the emulator. --- tools/fuzz/x86_instruction_emulation/Makefile | 7 ++- .../simple-harness.c | 49 +++++++++++++++++++ 2 files changed, 54 insertions(+), 2 deletions(-) create mode 100644 tools/fuzz/x86_instruction_emulation/simple-harness.c diff --git a/tools/fuzz/x86_instruction_emulation/Makefile b/tools/fuzz/x86_instruction_emulation/Makefile index d2854a332605..bb29149ae0f7 100644 --- a/tools/fuzz/x86_instruction_emulation/Makefile +++ b/tools/fuzz/x86_instruction_emulation/Makefile @@ -43,7 +43,10 @@ LOCAL_OBJS := emulator_ops.o stubs.o afl-harness: afl-harness.o $(LOCAL_OBJS) $(KERNEL_OBJS) @$(CC) -v $(KBUILD_CFLAGS) $(LOCAL_OBJS) $(KERNEL_OBJS) $< $(INCLUDES) -Istubs.h -o $@ -no-pie -all: afl-harness +simple-harness: simple-harness.o $(LOCAL_OBJS) $(KERNEL_OBJS) + @$(CC) -v $(KBUILD_CFLAGS) $(LOCAL_OBJS) $(KERNEL_OBJS) $< $(INCLUDES) -Istubs.h -o $@ -no-pie + +all: afl-harness simple-harness .PHONY: fuzz_deps fuzz_deps: @@ -54,4 +57,4 @@ fuzz_deps: .PHONY: clean clean: - $(RM) -r *.o afl-harness + $(RM) -r *.o afl-harness simple-harness diff --git a/tools/fuzz/x86_instruction_emulation/simple-harness.c b/tools/fuzz/x86_instruction_emulation/simple-harness.c new file mode 100644 index 000000000000..9601aafb9423 --- /dev/null +++ b/tools/fuzz/x86_instruction_emulation/simple-harness.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include "emulator_ops.h" +#include + +extern void foo(void) +{ + asm volatile("__start:mov $0xdeadbeef, %rax;" + "xor %rax, %rax;" + "__end:"); +} + +int main(int argc, char **argv) +{ + extern unsigned char __start; + extern unsigned char __end; + struct state *state = create_emulator(); + int rc; + + /* Ensures the emulator is in a valid state. */ + initialize_emulator(state); + + /* Provide the emulator with instructions to emulate. */ + state->data = &__start; + state->data_available = &__end - &__start; + + /* Execute mov $0xdeadbeef, %rax */ + rc = step_emulator(state); + /* Check that the emulator succeeded. */ + assert(rc == X86EMUL_CONTINUE); + /* Check that 0xdeadbeef was moved to rax. */ + assert(state->ctxt._regs[REGS_RAX] == 0xdeadbeef); + + /* Execute xor %rax, %rax */ + rc = step_emulator(state); + /* Check that the emulator succeeded. */ + assert(rc == X86EMUL_CONTINUE); + /* Check that xoring rax with itself cleared rax. */ + assert(state->ctxt._regs[REGS_RAX] == 0); + + /* Free the emulator. */ + free_emulator(state); + + return 0; +}