diff mbox series

[RFC,3/3] fuzz: Add callbacks for dma-access functions

Message ID 20200611055651.13784-4-alxndr@bu.edu (mailing list archive)
State New, archived
Headers show
Series fuzz: add generic fuzzer | expand

Commit Message

Alexander Bulekov June 11, 2020, 5:56 a.m. UTC
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
---
 exec.c                                | 17 ++++++++++++++++-
 include/exec/memory.h                 |  8 ++++++++
 include/exec/memory_ldst_cached.inc.h |  9 +++++++++
 include/sysemu/dma.h                  |  5 ++++-
 memory_ldst.inc.c                     | 12 ++++++++++++
 5 files changed, 49 insertions(+), 2 deletions(-)

Comments

Stefan Hajnoczi June 23, 2020, 2:14 p.m. UTC | #1
On Thu, Jun 11, 2020 at 01:56:51AM -0400, Alexander Bulekov wrote:
> Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> ---
>  exec.c                                | 17 ++++++++++++++++-
>  include/exec/memory.h                 |  8 ++++++++
>  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
>  include/sysemu/dma.h                  |  5 ++++-
>  memory_ldst.inc.c                     | 12 ++++++++++++
>  5 files changed, 49 insertions(+), 2 deletions(-)

Please rename dma_read_cb() to fuzz_dma_read_cb() so the purpose of the
function is clear.

The ifdefs can be avoided by defining an empty function when CONFIG_FUZZ
is undefined. In a header file:

  #ifdef CONFIG_FUZZ
  void fuzz_dma_read_cb(size_t addr, size_t len);
  #else
  static inline void fuzz_dma_read_cb(size_t addr, size_t len)
  {
      /* Do nothing */
  }
  #endif

Now the compiler should eliminate the deadcode:

  #ifdef CONFIG_FUZZ
  if (as->root == get_system_memory()) {
      dma_read_cb(addr, len);
  }
  #endif

becomes:

  if (as->root == get_system_memory()) {
      fuzz_dma_read_cb(addr, len);
  }

Hopefully gcc and clang will eliminate this and emit no instructions
when CONFIG_FUZZ is undefined. If not, you can simply pass in 'as' and
'is_write' too:

  void fuzz_dma_read_cb(AddressSpace *as, bool is_write, size_t addr, size_t len)

This way the conditional is moved inside fuzz_dma_read_cb() and deadcode
elimination becomes trivial for the compiler:

  fuzz_read_cb(as, is_write, addr, len);
Alexander Bulekov June 23, 2020, 2:55 p.m. UTC | #2
On 200623 1514, Stefan Hajnoczi wrote:
> On Thu, Jun 11, 2020 at 01:56:51AM -0400, Alexander Bulekov wrote:
> > Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> > ---
> >  exec.c                                | 17 ++++++++++++++++-
> >  include/exec/memory.h                 |  8 ++++++++
> >  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
> >  include/sysemu/dma.h                  |  5 ++++-
> >  memory_ldst.inc.c                     | 12 ++++++++++++
> >  5 files changed, 49 insertions(+), 2 deletions(-)
> 
> Please rename dma_read_cb() to fuzz_dma_read_cb() so the purpose of the
> function is clear.
> 
> The ifdefs can be avoided by defining an empty function when CONFIG_FUZZ
> is undefined. In a header file:
> 
>   #ifdef CONFIG_FUZZ
>   void fuzz_dma_read_cb(size_t addr, size_t len);
>   #else
>   static inline void fuzz_dma_read_cb(size_t addr, size_t len)
>   {
>       /* Do nothing */
>   }
>   #endif
> 
> Now the compiler should eliminate the deadcode:
> 
>   #ifdef CONFIG_FUZZ
>   if (as->root == get_system_memory()) {
>       dma_read_cb(addr, len);
>   }
>   #endif
> 
> becomes:
> 
>   if (as->root == get_system_memory()) {
>       fuzz_dma_read_cb(addr, len);
>   }
> 
> Hopefully gcc and clang will eliminate this and emit no instructions
> when CONFIG_FUZZ is undefined. If not, you can simply pass in 'as' and
> 'is_write' too:
> 
>   void fuzz_dma_read_cb(AddressSpace *as, bool is_write, size_t addr, size_t len)
> 
> This way the conditional is moved inside fuzz_dma_read_cb() and deadcode
> elimination becomes trivial for the compiler:
> 
>   fuzz_read_cb(as, is_write, addr, len);

Do you think it would be better to have a "trace_dma_read" or
"trace_device_read_from_guest_memory"? I haven't looked under the hood
too much for the tracepoint api, but these would just be standard
tracepoints(disabled for the majority of builds). When we build the
fuzzer, we could compile with --wrap="trace_dma_read" and implement
a __wrap_trace_dma_read in the generic fuzzer. I looked at the symbols
for a qemu build and it looks like trace_* are actual functions, rather
than preprocessor magic, so maybe this could work?
-Alex
Philippe Mathieu-Daudé June 24, 2020, 9:46 a.m. UTC | #3
On 6/11/20 7:56 AM, Alexander Bulekov wrote:
> Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> ---
>  exec.c                                | 17 ++++++++++++++++-
>  include/exec/memory.h                 |  8 ++++++++
>  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
>  include/sysemu/dma.h                  |  5 ++++-
>  memory_ldst.inc.c                     | 12 ++++++++++++
>  5 files changed, 49 insertions(+), 2 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index be4be2df3a..2ed724ab54 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -3247,7 +3247,10 @@ MemTxResult address_space_read_full(AddressSpace *as, hwaddr addr,
>  {
>      MemTxResult result = MEMTX_OK;
>      FlatView *fv;
> -
> +#ifdef CONFIG_FUZZ
> +    if(as->root == get_system_memory())

Since it is local to exec.c, you can directly use system_memory.

But why restrict this to the system memory anyway?

> +        dma_read_cb(addr, len);
> +#endif
Stefan Hajnoczi June 26, 2020, 3:44 p.m. UTC | #4
On Tue, Jun 23, 2020 at 10:55:00AM -0400, Alexander Bulekov wrote:
> On 200623 1514, Stefan Hajnoczi wrote:
> > On Thu, Jun 11, 2020 at 01:56:51AM -0400, Alexander Bulekov wrote:
> > > Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> > > ---
> > >  exec.c                                | 17 ++++++++++++++++-
> > >  include/exec/memory.h                 |  8 ++++++++
> > >  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
> > >  include/sysemu/dma.h                  |  5 ++++-
> > >  memory_ldst.inc.c                     | 12 ++++++++++++
> > >  5 files changed, 49 insertions(+), 2 deletions(-)
> > 
> > Please rename dma_read_cb() to fuzz_dma_read_cb() so the purpose of the
> > function is clear.
> > 
> > The ifdefs can be avoided by defining an empty function when CONFIG_FUZZ
> > is undefined. In a header file:
> > 
> >   #ifdef CONFIG_FUZZ
> >   void fuzz_dma_read_cb(size_t addr, size_t len);
> >   #else
> >   static inline void fuzz_dma_read_cb(size_t addr, size_t len)
> >   {
> >       /* Do nothing */
> >   }
> >   #endif
> > 
> > Now the compiler should eliminate the deadcode:
> > 
> >   #ifdef CONFIG_FUZZ
> >   if (as->root == get_system_memory()) {
> >       dma_read_cb(addr, len);
> >   }
> >   #endif
> > 
> > becomes:
> > 
> >   if (as->root == get_system_memory()) {
> >       fuzz_dma_read_cb(addr, len);
> >   }
> > 
> > Hopefully gcc and clang will eliminate this and emit no instructions
> > when CONFIG_FUZZ is undefined. If not, you can simply pass in 'as' and
> > 'is_write' too:
> > 
> >   void fuzz_dma_read_cb(AddressSpace *as, bool is_write, size_t addr, size_t len)
> > 
> > This way the conditional is moved inside fuzz_dma_read_cb() and deadcode
> > elimination becomes trivial for the compiler:
> > 
> >   fuzz_read_cb(as, is_write, addr, len);
> 
> Do you think it would be better to have a "trace_dma_read" or
> "trace_device_read_from_guest_memory"? I haven't looked under the hood
> too much for the tracepoint api, but these would just be standard
> tracepoints(disabled for the majority of builds). When we build the
> fuzzer, we could compile with --wrap="trace_dma_read" and implement
> a __wrap_trace_dma_read in the generic fuzzer. I looked at the symbols
> for a qemu build and it looks like trace_* are actual functions, rather
> than preprocessor magic, so maybe this could work?

I think plain old functions are fine for what you are doing.

QEMU's tracing does not provide callbacks that are invoked when a trace
event is emitted. The fuzzing code wouldn't be able to hook
trace_device_read_from_guest_memory, you could only analyze a trace
after the fact.

Stefan
Alexander Bulekov July 9, 2020, 11:48 p.m. UTC | #5
On 200623 1514, Stefan Hajnoczi wrote:
> On Thu, Jun 11, 2020 at 01:56:51AM -0400, Alexander Bulekov wrote:
> > Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> > ---
> >  exec.c                                | 17 ++++++++++++++++-
> >  include/exec/memory.h                 |  8 ++++++++
> >  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
> >  include/sysemu/dma.h                  |  5 ++++-
> >  memory_ldst.inc.c                     | 12 ++++++++++++
> >  5 files changed, 49 insertions(+), 2 deletions(-)
> 
> Please rename dma_read_cb() to fuzz_dma_read_cb() so the purpose of the
> function is clear.
> 
> The ifdefs can be avoided by defining an empty function when CONFIG_FUZZ
> is undefined. In a header file:
> 
>   #ifdef CONFIG_FUZZ
>   void fuzz_dma_read_cb(size_t addr, size_t len);
>   #else
>   static inline void fuzz_dma_read_cb(size_t addr, size_t len)
>   {
>       /* Do nothing */
>   }
>   #endif
> 

If I understand correctly, this still has the problem that normal
qemu-system builds under --enable-fuzzing are broken. I'm not sure if
there is some nice solution for this. For example, a sort-of ugly
solution could add this to softmmu/main.c (ie something that is linked
for the qemu-system build, but not for qemu-fuzz).

#ifdef CONFIG_FUZZ
#include "something.h"
static void fuzz_dma_read_cb(size_t addr, size_t len)
{
    /* Do nothing */
}
#endif

> Now the compiler should eliminate the deadcode:
> 
>   #ifdef CONFIG_FUZZ
>   if (as->root == get_system_memory()) {
>       dma_read_cb(addr, len);
>   }
>   #endif
> 
> becomes:
> 
>   if (as->root == get_system_memory()) {
>       fuzz_dma_read_cb(addr, len);
>   }
> 
> Hopefully gcc and clang will eliminate this and emit no instructions
> when CONFIG_FUZZ is undefined. If not, you can simply pass in 'as' and
> 'is_write' too:
> 
>   void fuzz_dma_read_cb(AddressSpace *as, bool is_write, size_t addr, size_t len)
> 
Yes I'll add in is_write. Hopefully that will make life easier for the
compiler.
Thanks

> This way the conditional is moved inside fuzz_dma_read_cb() and deadcode
> elimination becomes trivial for the compiler:
> 
>   fuzz_read_cb(as, is_write, addr, len);
Stefan Hajnoczi July 13, 2020, 11:41 a.m. UTC | #6
On Thu, Jul 09, 2020 at 07:48:55PM -0400, Alexander Bulekov wrote:
> On 200623 1514, Stefan Hajnoczi wrote:
> > On Thu, Jun 11, 2020 at 01:56:51AM -0400, Alexander Bulekov wrote:
> > > Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> > > ---
> > >  exec.c                                | 17 ++++++++++++++++-
> > >  include/exec/memory.h                 |  8 ++++++++
> > >  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
> > >  include/sysemu/dma.h                  |  5 ++++-
> > >  memory_ldst.inc.c                     | 12 ++++++++++++
> > >  5 files changed, 49 insertions(+), 2 deletions(-)
> > 
> > Please rename dma_read_cb() to fuzz_dma_read_cb() so the purpose of the
> > function is clear.
> > 
> > The ifdefs can be avoided by defining an empty function when CONFIG_FUZZ
> > is undefined. In a header file:
> > 
> >   #ifdef CONFIG_FUZZ
> >   void fuzz_dma_read_cb(size_t addr, size_t len);
> >   #else
> >   static inline void fuzz_dma_read_cb(size_t addr, size_t len)
> >   {
> >       /* Do nothing */
> >   }
> >   #endif
> > 
> 
> If I understand correctly, this still has the problem that normal
> qemu-system builds under --enable-fuzzing are broken. I'm not sure if
> there is some nice solution for this. For example, a sort-of ugly
> solution could add this to softmmu/main.c (ie something that is linked
> for the qemu-system build, but not for qemu-fuzz).
> 
> #ifdef CONFIG_FUZZ
> #include "something.h"
> static void fuzz_dma_read_cb(size_t addr, size_t len)
> {
>     /* Do nothing */
> }
> #endif

Another ugly solution is using weak symbols in the main code and a
strong symbol in the fuzzer target:
https://en.wikipedia.org/wiki/Weak_symbol

Stefan
Alexander Bulekov July 13, 2020, 11:52 a.m. UTC | #7
On 200713 1241, Stefan Hajnoczi wrote:
> On Thu, Jul 09, 2020 at 07:48:55PM -0400, Alexander Bulekov wrote:
> > On 200623 1514, Stefan Hajnoczi wrote:
> > > On Thu, Jun 11, 2020 at 01:56:51AM -0400, Alexander Bulekov wrote:
> > > > Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> > > > ---
> > > >  exec.c                                | 17 ++++++++++++++++-
> > > >  include/exec/memory.h                 |  8 ++++++++
> > > >  include/exec/memory_ldst_cached.inc.h |  9 +++++++++
> > > >  include/sysemu/dma.h                  |  5 ++++-
> > > >  memory_ldst.inc.c                     | 12 ++++++++++++
> > > >  5 files changed, 49 insertions(+), 2 deletions(-)
> > > 
> > > Please rename dma_read_cb() to fuzz_dma_read_cb() so the purpose of the
> > > function is clear.
> > > 
> > > The ifdefs can be avoided by defining an empty function when CONFIG_FUZZ
> > > is undefined. In a header file:
> > > 
> > >   #ifdef CONFIG_FUZZ
> > >   void fuzz_dma_read_cb(size_t addr, size_t len);
> > >   #else
> > >   static inline void fuzz_dma_read_cb(size_t addr, size_t len)
> > >   {
> > >       /* Do nothing */
> > >   }
> > >   #endif
> > > 
> > 
> > If I understand correctly, this still has the problem that normal
> > qemu-system builds under --enable-fuzzing are broken. I'm not sure if
> > there is some nice solution for this. For example, a sort-of ugly
> > solution could add this to softmmu/main.c (ie something that is linked
> > for the qemu-system build, but not for qemu-fuzz).
> > 
> > #ifdef CONFIG_FUZZ
> > #include "something.h"
> > static void fuzz_dma_read_cb(size_t addr, size_t len)
> > {
> >     /* Do nothing */
> > }
> > #endif
> 
> Another ugly solution is using weak symbols in the main code and a
> strong symbol in the fuzzer target:
> https://en.wikipedia.org/wiki/Weak_symbol

Ok - I'll try that out. I think we'll also need a check in the actual
dma_read_cb function to confirm that we are actually the general-fuzzer.
We don't want to be hooking accesses while e.g. running the non-general
virtio-net fuzzer.

-Alex

> Stefan
diff mbox series

Patch

diff --git a/exec.c b/exec.c
index be4be2df3a..2ed724ab54 100644
--- a/exec.c
+++ b/exec.c
@@ -3247,7 +3247,10 @@  MemTxResult address_space_read_full(AddressSpace *as, hwaddr addr,
 {
     MemTxResult result = MEMTX_OK;
     FlatView *fv;
-
+#ifdef CONFIG_FUZZ
+    if(as->root == get_system_memory())
+        dma_read_cb(addr, len);
+#endif
     if (len > 0) {
         RCU_READ_LOCK_GUARD();
         fv = address_space_to_flatview(as);
@@ -3556,6 +3559,10 @@  void *address_space_map(AddressSpace *as,
         }
 
         *plen = l;
+#ifdef CONFIG_FUZZ
+        if(as->root == get_system_memory() && !is_write)
+            dma_read_cb(addr, *plen);
+#endif
         return bounce.buffer;
     }
 
@@ -3563,6 +3570,10 @@  void *address_space_map(AddressSpace *as,
     memory_region_ref(mr);
     *plen = flatview_extend_translation(fv, addr, len, mr, xlat,
                                         l, is_write, attrs);
+#ifdef CONFIG_FUZZ
+    if(as->root == get_system_memory() && !is_write)
+        dma_read_cb(addr, *plen);
+#endif
     ptr = qemu_ram_ptr_length(mr->ram_block, xlat, plen, true);
 
     return ptr;
@@ -3635,6 +3646,10 @@  int64_t address_space_cache_init(MemoryRegionCache *cache,
 
     assert(len > 0);
 
+#ifdef CONFIG_FUZZ
+    if(as->root == get_system_memory() && !is_write)
+        dma_read_cb(addr, len);
+#endif
     l = len;
     cache->fv = address_space_get_flatview(as);
     d = flatview_to_dispatch(cache->fv);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 3e00cdbbfa..e9178b3e0a 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -49,6 +49,10 @@ 
 
 extern bool global_dirty_log;
 
+#ifdef CONFIG_FUZZ
+extern void dma_read_cb(size_t addr, size_t len);
+#endif
+
 typedef struct MemoryRegionOps MemoryRegionOps;
 typedef struct MemoryRegionMmio MemoryRegionMmio;
 
@@ -2427,6 +2431,10 @@  address_space_read_cached(MemoryRegionCache *cache, hwaddr addr,
                           void *buf, hwaddr len)
 {
     assert(addr < cache->len && len <= cache->len - addr);
+
+#ifdef CONFIG_FUZZ
+    dma_read_cb(addr, len);
+#endif
     if (likely(cache->ptr)) {
         memcpy(buf, cache->ptr + addr, len);
     } else {
diff --git a/include/exec/memory_ldst_cached.inc.h b/include/exec/memory_ldst_cached.inc.h
index fd4bbb40e7..dc3ce14a97 100644
--- a/include/exec/memory_ldst_cached.inc.h
+++ b/include/exec/memory_ldst_cached.inc.h
@@ -28,6 +28,9 @@  static inline uint32_t ADDRESS_SPACE_LD_CACHED(l)(MemoryRegionCache *cache,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
 {
     assert(addr < cache->len && 4 <= cache->len - addr);
+#ifdef CONFIG_FUZZ
+    dma_read_cb(cache->xlat + addr, 4);
+#endif
     if (likely(cache->ptr)) {
         return LD_P(l)(cache->ptr + addr);
     } else {
@@ -39,6 +42,9 @@  static inline uint64_t ADDRESS_SPACE_LD_CACHED(q)(MemoryRegionCache *cache,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
 {
     assert(addr < cache->len && 8 <= cache->len - addr);
+#ifdef CONFIG_FUZZ
+    dma_read_cb(cache->xlat + addr, 8);
+#endif
     if (likely(cache->ptr)) {
         return LD_P(q)(cache->ptr + addr);
     } else {
@@ -50,6 +56,9 @@  static inline uint32_t ADDRESS_SPACE_LD_CACHED(uw)(MemoryRegionCache *cache,
     hwaddr addr, MemTxAttrs attrs, MemTxResult *result)
 {
     assert(addr < cache->len && 2 <= cache->len - addr);
+#ifdef CONFIG_FUZZ
+    dma_read_cb(cache->xlat + addr, 2);
+#endif
     if (likely(cache->ptr)) {
         return LD_P(uw)(cache->ptr + addr);
     } else {
diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h
index 80c5bc3e02..f32d7db7aa 100644
--- a/include/sysemu/dma.h
+++ b/include/sysemu/dma.h
@@ -105,8 +105,11 @@  static inline int dma_memory_rw(AddressSpace *as, dma_addr_t addr,
                                 void *buf, dma_addr_t len,
                                 DMADirection dir)
 {
+#ifdef CONFIG_FUZZ
+    if (dir == DMA_DIRECTION_TO_DEVICE)
+        dma_read_cb(addr, len);
+#endif
     dma_barrier(as, dir);
-
     return dma_memory_rw_relaxed(as, addr, buf, len, dir);
 }
 
diff --git a/memory_ldst.inc.c b/memory_ldst.inc.c
index c54aee4a95..1935436aff 100644
--- a/memory_ldst.inc.c
+++ b/memory_ldst.inc.c
@@ -42,6 +42,9 @@  static inline uint32_t glue(address_space_ldl_internal, SUFFIX)(ARG1_DECL,
                                         MO_32 | devend_memop(endian), attrs);
     } else {
         /* RAM case */
+#ifdef CONFIG_FUZZ
+        dma_read_cb(addr, 4);
+#endif
         ptr = qemu_map_ram_ptr(mr->ram_block, addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
@@ -110,6 +113,9 @@  static inline uint64_t glue(address_space_ldq_internal, SUFFIX)(ARG1_DECL,
                                         MO_64 | devend_memop(endian), attrs);
     } else {
         /* RAM case */
+#ifdef CONFIG_FUZZ
+        dma_read_cb(addr, 8);
+#endif
         ptr = qemu_map_ram_ptr(mr->ram_block, addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
@@ -175,6 +181,9 @@  uint32_t glue(address_space_ldub, SUFFIX)(ARG1_DECL,
         r = memory_region_dispatch_read(mr, addr1, &val, MO_8, attrs);
     } else {
         /* RAM case */
+#ifdef CONFIG_FUZZ
+        dma_read_cb(addr, 1);
+#endif
         ptr = qemu_map_ram_ptr(mr->ram_block, addr1);
         val = ldub_p(ptr);
         r = MEMTX_OK;
@@ -212,6 +221,9 @@  static inline uint32_t glue(address_space_lduw_internal, SUFFIX)(ARG1_DECL,
                                         MO_16 | devend_memop(endian), attrs);
     } else {
         /* RAM case */
+#ifdef CONFIG_FUZZ
+        dma_read_cb(addr, 2);
+#endif
         ptr = qemu_map_ram_ptr(mr->ram_block, addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN: