diff mbox series

[v3,22/22] fuzz: add documentation to docs/devel/

Message ID 20190918231846.22538-23-alxndr@bu.edu (mailing list archive)
State New, archived
Headers show
Series Add virtual device fuzzing support | expand

Commit Message

Alexander Bulekov Sept. 18, 2019, 11:19 p.m. UTC
Signed-off-by: Alexander Oleinik <alxndr@bu.edu>
---
 docs/devel/fuzzing.txt | 114 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 114 insertions(+)
 create mode 100644 docs/devel/fuzzing.txt

Comments

Darren Kenny Sept. 23, 2019, 2:54 p.m. UTC | #1
Hi Alexander,

Some comments, and questions below...

On Wed, Sep 18, 2019 at 11:19:48PM +0000, Oleinik, Alexander wrote:
>Signed-off-by: Alexander Oleinik <alxndr@bu.edu>
>---
> docs/devel/fuzzing.txt | 114 +++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 114 insertions(+)
> create mode 100644 docs/devel/fuzzing.txt
>
>diff --git a/docs/devel/fuzzing.txt b/docs/devel/fuzzing.txt
>new file mode 100644
>index 0000000000..53a1f858f5
>--- /dev/null
>+++ b/docs/devel/fuzzing.txt
>@@ -0,0 +1,114 @@
>+= Fuzzing =
>+
>+== Introduction ==
>+
>+This document describes the virtual-device fuzzing infrastructure in QEMU and
>+how to use it to implement additional fuzzers.
>+
>+== Basics ==
>+
>+Fuzzing operates by passing inputs to an entry point/target function. The
>+fuzzer tracks the code coverage triggered by the input. Based on these
>+findings, the fuzzer mutates the input and repeats the fuzzing.
>+
>+To fuzz QEMU, we rely on libfuzzer. Unlike other fuzzers such as AFL, libfuzzer
>+is an _in-process_ fuzzer. For the developer, this means that it is their
>+responsibility to ensure that state is reset between fuzzing-runs.
>+
>+== Building the fuzzers ==
>+
>+NOTE: If possible, build a 32-bit binary. When forking, the page map ends up
>+being much smaller. On 64-bit AddressSanitizer consumes 20 Terabytes of virtual
>+memory.

It might be worth having a little more on this, since I would
imagine most people would be more interested in 64-bit these days,
but are unlikely to have 20Tb available :)

But if there is little to be gained by the 64-bit vs 32-bit if
fuzzing a device, then it might be worth explaining why.

>+
>+To build the fuzzers, install a recent version of clang:
>+Configure with (substitute the clang binaries with the version you installed):
>+
>+    CC=clang-8 CXX=clang++-8 /path/to/configure --enable-fuzzing
>+
>+Fuzz targets are built similarily to system/softmmu:

Typo: s/similarily/similarly/

>+
>+    make i386-softmmu/fuzz
>+
>+This builds ./i386-softmmu/qemu-fuzz-i386

Reading this, it kind of implies that it should be also possible to
build i386-softmmu/all still - which from what I can see it isn't,
the build fails with an undefined 'main' - most likely the main.o
object is not being linked in for this target?

Unfortunately, also doing the above build command is also failing
for me after applying this set of patches:

  qemu-upstream-libfuzz/tests/fuzz/fuzz.c:149:28: error: expected expression
      fuzz_qts = qtest_setup(void);
                             ^
  1 error generated.
  make[1]: *** [tests/fuzz/fuzz.o] Error 1

This is due to the (void) in the calling of the function, will comment on the
patch separately.

>+
>+The first option to this command is: --FUZZER_NAME
>+To list all of the available fuzzers run qemu-fuzz-i386 with no arguments.
>+
>+Libfuzzer parses all arguments that do not begin with "--". Information about
>+these is available by passing -help=1
>+
>+Now the only thing left to do is wait for the fuzzer to trigger potential
>+crashes.

I would suggest adding an example of doing a run for this here.

>+
>+== Adding a new fuzzer ==
>+Coverage over virtual devices can be improved by adding additional fuzzers.
>+Fuzzers are kept in tests/fuzz/ and should be added to
>+tests/fuzz/Makefile.include
>+
>+Fuzzers can rely on both qtest and libqos to communicate with virtual devices.
>+
>+1. Create a new source file. For example ``tests/fuzz/fuzz-foo-device.c``.
>+
>+2. Write the fuzzing code using the libqtest/libqos API. See existing fuzzers
>+for reference.
>+
>+3. Register the fuzzer in ``tests/fuzz/Makefile.include`` by appending the
>+corresponding object to fuzz-obj-y
>+
>+Fuzzers can be more-or-less thought of as special qtest programs which can
>+modify the qtest commands and/or qtest command arguments based on inputs
>+provided by libfuzzer. Libfuzzer passes a byte array and length. Commonly the
>+fuzzer loops over the byte-array interpreting it as a list of qtest commands,
>+addresses, or values.
>+
>+
>+= Implmentation Details =
>+
>+== The Fuzzer's Lifecycle ==
>+
>+The fuzzer has two entrypoints that libfuzzer calls.
>+
>+LLVMFuzzerInitialize: called prior to fuzzing. Used to initialize all of the
>+necessary state
>+
>+LLVMFuzzerTestOneInput: called for each fuzzing run. Processes the input and
>+resets the state at the end of each run.

It might be worth explicitly mentioning that Libfuzzer provides it's
own main() function which calls these.

>+
>+In more detail:
>+
>+LLVMFuzzerInitialize parses the arguments to the fuzzer (must start with two
>+dashes, so they are ignored by libfuzzer main()). Currently, the arguments
>+select the fuzz target. Then, the qtest client is initialized. If the target
>+requires qos, qgraph is set up and the QOM/LIBQOS modules are initailized.
>+Then the QGraph is walked and the QEMU cmd_line is determined and saved.
>+
>+After this, the vl.c:real_main is called to set up the guest. After this, the
>+fuzzer saves the initial vm/device state to ram, after which the initilization
>+is complete.

Is this still true with this patchset? I don't see real_main being
defined any more in vl.c.

>+
>+LLVMFuzzerTestOneInput: Uses qtest/qos functions to act based on the fuzz
>+input. It is also responsible for manually calling the main loop/main_loop_wait
>+to ensure that bottom halves are executed. Finally, it calls reset() which
>+restores state from the ramfile and/or resets the guest.
>+
>+
>+Since the same process is reused for many fuzzing runs, QEMU state needs to
>+be reset at the end of each run. There are currently two implemented
>+options for resetting state:
>+1. Reboot the guest between runs.
>+   Pros: Straightforward and fast for simple fuzz targets.
>+   Cons: Depending on the device, does not reset all device state. If the
>+   device requires some initialization prior to being ready for fuzzing
>+   (common for QOS-based targets), this initialization needs to be done after
>+   each reboot.
>+   Example target: --virtio-net-ctrl-fuzz
>+2. Run each test case in a separate forked process and copy the coverage
>+   information back to the parent. This is fairly similar to AFL's "deferred"
>+   fork-server mode [3]
>+   Pros: Relatively fast. Devices only need to be initialized once. No need
>+   to do slow reboots or vmloads.
>+   Cons: Not officially supported by libfuzzer and the implementation is very
>+   flimsy. Does not work well for devices that rely on dedicated threads.
>+   Example target: --qtest-fork-fuzz
>+

Thanks,

Darren.
diff mbox series

Patch

diff --git a/docs/devel/fuzzing.txt b/docs/devel/fuzzing.txt
new file mode 100644
index 0000000000..53a1f858f5
--- /dev/null
+++ b/docs/devel/fuzzing.txt
@@ -0,0 +1,114 @@ 
+= Fuzzing =
+
+== Introduction ==
+
+This document describes the virtual-device fuzzing infrastructure in QEMU and
+how to use it to implement additional fuzzers.
+
+== Basics ==
+
+Fuzzing operates by passing inputs to an entry point/target function. The
+fuzzer tracks the code coverage triggered by the input. Based on these
+findings, the fuzzer mutates the input and repeats the fuzzing. 
+
+To fuzz QEMU, we rely on libfuzzer. Unlike other fuzzers such as AFL, libfuzzer
+is an _in-process_ fuzzer. For the developer, this means that it is their
+responsibility to ensure that state is reset between fuzzing-runs.
+
+== Building the fuzzers ==
+
+NOTE: If possible, build a 32-bit binary. When forking, the page map ends up
+being much smaller. On 64-bit AddressSanitizer consumes 20 Terabytes of virtual
+memory.
+
+To build the fuzzers, install a recent version of clang:
+Configure with (substitute the clang binaries with the version you installed):
+
+    CC=clang-8 CXX=clang++-8 /path/to/configure --enable-fuzzing
+
+Fuzz targets are built similarily to system/softmmu:
+
+    make i386-softmmu/fuzz
+
+This builds ./i386-softmmu/qemu-fuzz-i386
+
+The first option to this command is: --FUZZER_NAME
+To list all of the available fuzzers run qemu-fuzz-i386 with no arguments.
+
+Libfuzzer parses all arguments that do not begin with "--". Information about
+these is available by passing -help=1
+
+Now the only thing left to do is wait for the fuzzer to trigger potential
+crashes.
+
+== Adding a new fuzzer ==
+Coverage over virtual devices can be improved by adding additional fuzzers. 
+Fuzzers are kept in tests/fuzz/ and should be added to
+tests/fuzz/Makefile.include
+
+Fuzzers can rely on both qtest and libqos to communicate with virtual devices.
+
+1. Create a new source file. For example ``tests/fuzz/fuzz-foo-device.c``.
+
+2. Write the fuzzing code using the libqtest/libqos API. See existing fuzzers
+for reference.
+
+3. Register the fuzzer in ``tests/fuzz/Makefile.include`` by appending the
+corresponding object to fuzz-obj-y
+
+Fuzzers can be more-or-less thought of as special qtest programs which can
+modify the qtest commands and/or qtest command arguments based on inputs
+provided by libfuzzer. Libfuzzer passes a byte array and length. Commonly the
+fuzzer loops over the byte-array interpreting it as a list of qtest commands,
+addresses, or values.
+
+
+= Implmentation Details =
+
+== The Fuzzer's Lifecycle ==
+
+The fuzzer has two entrypoints that libfuzzer calls.
+
+LLVMFuzzerInitialize: called prior to fuzzing. Used to initialize all of the
+necessary state
+
+LLVMFuzzerTestOneInput: called for each fuzzing run. Processes the input and
+resets the state at the end of each run.
+
+In more detail:
+
+LLVMFuzzerInitialize parses the arguments to the fuzzer (must start with two
+dashes, so they are ignored by libfuzzer main()). Currently, the arguments
+select the fuzz target. Then, the qtest client is initialized. If the target
+requires qos, qgraph is set up and the QOM/LIBQOS modules are initailized.
+Then the QGraph is walked and the QEMU cmd_line is determined and saved.
+
+After this, the vl.c:real_main is called to set up the guest. After this, the
+fuzzer saves the initial vm/device state to ram, after which the initilization
+is complete.
+
+LLVMFuzzerTestOneInput: Uses qtest/qos functions to act based on the fuzz
+input. It is also responsible for manually calling the main loop/main_loop_wait
+to ensure that bottom halves are executed. Finally, it calls reset() which
+restores state from the ramfile and/or resets the guest.
+
+
+Since the same process is reused for many fuzzing runs, QEMU state needs to
+be reset at the end of each run. There are currently two implemented
+options for resetting state: 
+1. Reboot the guest between runs.
+   Pros: Straightforward and fast for simple fuzz targets. 
+   Cons: Depending on the device, does not reset all device state. If the
+   device requires some initialization prior to being ready for fuzzing
+   (common for QOS-based targets), this initialization needs to be done after
+   each reboot.
+   Example target: --virtio-net-ctrl-fuzz
+2. Run each test case in a separate forked process and copy the coverage
+   information back to the parent. This is fairly similar to AFL's "deferred"
+   fork-server mode [3]
+   Pros: Relatively fast. Devices only need to be initialized once. No need
+   to do slow reboots or vmloads.
+   Cons: Not officially supported by libfuzzer and the implementation is very
+   flimsy. Does not work well for devices that rely on dedicated threads.
+   Example target: --qtest-fork-fuzz
+