
[v2] rcu: reduce more than 7MB heap memory by malloc_trim()

Message ID 1511419276-31212-1-git-send-email-yang.zhong@intel.com (mailing list archive)
State New, archived

Commit Message

Yang Zhong Nov. 23, 2017, 6:41 a.m. UTC
Because of some issues in glibc's memory alloc/free mechanism
for small chunks of memory, when QEMU frequently allocates and
frees small chunks, glibc does not reuse them from its free
list and still allocates from the OS, which makes the heap
grow bigger and bigger.

This patch introduces malloc_trim(), which returns free heap
memory to the OS.

Below are test results from smaps file.
(1)without patch
55f0783e1000-55f07992a000 rw-p 00000000 00:00 0  [heap]
Size:              21796 kB
Rss:               14260 kB
Pss:               14260 kB

(2)with patch
55cc5fadf000-55cc61008000 rw-p 00000000 00:00 0  [heap]
Size:              21668 kB
Rss:                6940 kB
Pss:                6940 kB

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
---
 configure  | 4 ++++
 util/rcu.c | 6 ++++++
 2 files changed, 10 insertions(+)

Comments

Stefan Hajnoczi Nov. 23, 2017, 11:19 a.m. UTC | #1
On Thu, Nov 23, 2017 at 02:41:16PM +0800, Yang Zhong wrote:
> diff --git a/configure b/configure
> index 0e856bb..5b463d4 100755
> --- a/configure
> +++ b/configure
> @@ -6012,6 +6012,10 @@ if test "$opengl" = "yes" ; then
>    fi
>  fi
>  
> +if test "$tcmalloc" = "yes" || test "$jemalloc" = "yes" ; then
> +  echo "CONFIG_NONGLIBMALLOC=y" >> $config_host_mak

malloc(3) is provided by glibc, not glib, so the name
CONFIG_NONGLIBMALLOC is confusing.

I suggest calling it CONFIG_MALLOC_TRIM instead:

  # Even if malloc_trim() is available, these non-libc memory allocators
  # do not support it.
  if test "$tcmalloc" = "yes" || test "$jemalloc" = "yes" ; then
      if test "$malloc_trim" = "yes" ; then
          echo "Disabling malloc_trim with non-libc memory allocator"
      fi
      malloc_trim="no"
  fi

  if test "$malloc_trim" != "no" ; then
      cat > $TMPC << EOF
  #include <malloc.h>
  int main(void) { malloc_trim(0); return 0; }
  EOF
      if compile_prog "" "" ; then
          malloc_trim="yes"
      else
          malloc_trim="no"
      fi
  fi

  ...

  if test "$malloc_trim" = "yes" ; then
      echo "CONFIG_MALLOC_TRIM=y" >> $config_host_mak
  fi

Then the code in rcu.c just has to #ifdef CONFIG_MALLOC_TRIM and there's
no need for Linux-specific checks.  If other operating systems support
malloc_trim() then QEMU will use it by default.
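
For illustration, the guard described above could look roughly like the
sketch below. The helper name heap_trim() is hypothetical and not part of
the patch. CONFIG_MALLOC_TRIM is defined by hand here only so the sketch
compiles on its own on a glibc system; in a real build it would come from
configure.

```c
#include <stddef.h>

/* Normally emitted by configure; defined by hand here (an assumption)
 * so that this sketch compiles stand-alone on a glibc system. */
#ifndef CONFIG_MALLOC_TRIM
#define CONFIG_MALLOC_TRIM 1
#endif

#ifdef CONFIG_MALLOC_TRIM
#include <malloc.h>
#endif

/*
 * Hypothetical helper, not part of the patch.
 * Asks the allocator to return free heap memory to the OS, leaving
 * `pad` bytes of free space untrimmed at the top of the heap.
 * Returns 1 if memory was released, 0 otherwise (including when
 * malloc_trim() is not available).
 */
static int heap_trim(size_t pad)
{
#ifdef CONFIG_MALLOC_TRIM
    return malloc_trim(pad);
#else
    (void)pad;
    return 0;
#endif
}
```

With a helper like this, the call site needs no OS-specific #if at all;
the only condition is whether configure found a usable malloc_trim().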
Yang Zhong Nov. 24, 2017, 6:29 a.m. UTC | #2
On Thu, Nov 23, 2017 at 11:19:43AM +0000, Stefan Hajnoczi wrote:
> malloc(3) is provided by glibc, not glib, so the name
> CONFIG_NONGLIBMALLOC is confusing.
> 
> I suggest calling it CONFIG_MALLOC_TRIM instead:
> 
>   # Even if malloc_trim() is available, these non-libc memory allocators
>   # do not support it.
>   if test "$tcmalloc" = "yes" || test "$jemalloc" = "yes" ; then
>       if test "$malloc_trim" = "yes" ; then
>           echo "Disabling malloc_trim with non-libc memory allocator"
>       fi
>       malloc_trim="no"
>   fi
> 
>   if test "$malloc_trim" != "no" ; then
>       cat > $TMPC << EOF
>   #include <malloc.h>
>   int main(void) { malloc_trim(0); return 0; }
>   EOF
>       if compile_prog "" "" ; then
>           malloc_trim="yes"
>       else
>           malloc_trim="no"
>       fi
>   fi
> 
>   ...
> 
>   if test "$malloc_trim" = "yes" ; then
>       echo "CONFIG_MALLOC_TRIM=y" >> $config_host_mak
>   fi
> 
> Then the code in rcu.c just has to #ifdef CONFIG_MALLOC_TRIM and there's
> no need for Linux-specific checks.  If other operating systems support
> malloc_trim() then QEMU will use it by default.

  Hello Stefan,

  Thanks for your detailed information!

  I tested with this new patch and it works fine, thanks again!

  I will send a V3 patch soon, please help review it again. Thanks!

  Regards,

  Yang
Marc-André Lureau Dec. 7, 2017, 3:33 p.m. UTC | #3
Hi

On Thu, Nov 23, 2017 at 7:41 AM, Yang Zhong <yang.zhong@intel.com> wrote:
> Because of some issues in glibc's memory alloc/free mechanism
> for small chunks of memory, when QEMU frequently allocates and
> frees small chunks, glibc does not reuse them from its free
> list and still allocates from the OS, which makes the heap
> grow bigger and bigger.
>
> This patch introduces malloc_trim(), which returns free heap
> memory to the OS.
>
> Below are test results from smaps file.
> (1)without patch
> 55f0783e1000-55f07992a000 rw-p 00000000 00:00 0  [heap]
> Size:              21796 kB
> Rss:               14260 kB
> Pss:               14260 kB
>
> (2)with patch
> 55cc5fadf000-55cc61008000 rw-p 00000000 00:00 0  [heap]
> Size:              21668 kB
> Rss:                6940 kB
> Pss:                6940 kB

Have you opened a bug against glibc malloc (or discussed this issue
with the glibc malloc developers)?

Or is there a justification for QEMU having its own trim heuristic
on top of malloc?

Yang Zhong Dec. 8, 2017, 5:14 a.m. UTC | #4
On Thu, Dec 07, 2017 at 04:33:10PM +0100, Marc-André Lureau wrote:
> Have you opened a bug against glibc malloc (or discussed this issue
> with the glibc malloc developers)?
> 
> Or is there a justification for QEMU having its own trim heuristic
> on top of malloc?

  Hello Marc-André,

  Thanks for your comments!
  I did not open a bug with the glibc community. This may not be a new
  issue for them, because there are lots of complaints about malloc/free
  issues on the internet. If needed, I can file this bug with them.

  QEMU (v2.3) introduced the RCU mechanism for VM performance, but the
  side effect is that heap memory never shrinks, as in the case below:
  http://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg02748.html

  glibc keeps freed memory in its free list for the next malloc, but in
  practice glibc still allocates memory from the OS in QEMU. There are
  lots of memory holes in the heap, especially with the batched malloc
  and free of the RCU mechanism, which makes the heap grow bigger and
  bigger and never shrink.

  From testing with mallinfo() and malloc_trim(), I found that
  malloc_trim() not only trims free memory at the top of the heap, but
  can also free the holes inside the heap.

  This is the V2 patch thread; we are now discussing in the V3 patch
  thread:
  http://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg04519.html

  Regards,

  Yang
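
The "holes in the heap" behaviour described in this reply can be tried
with a small stand-alone program along the lines of the sketch below. It
is only an illustration under assumptions: the function name, chunk size,
and chunk count are made up, and it is not the test that was run for this
patch. It allocates many small chunks, frees most of them while keeping
every 64th one alive so the freed space sits below the top of the heap,
and then calls malloc_trim(0). Whether memory is actually released
depends on the glibc version and its tunables.

```c
#include <malloc.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative values, chosen arbitrarily for this sketch. */
enum { N_CHUNKS = 100000, CHUNK_SIZE = 256 };

/*
 * Hypothetical demo function, not part of the patch.
 * Returns the result of malloc_trim(0): 1 if memory was returned to
 * the OS, 0 if not.
 */
static int fragment_and_trim(void)
{
    static void *chunks[N_CHUNKS];
    int i, released;

    /* Allocate and touch many small chunks so the heap really grows. */
    for (i = 0; i < N_CHUNKS; i++) {
        chunks[i] = malloc(CHUNK_SIZE);
        if (!chunks[i]) {
            abort();
        }
        memset(chunks[i], 0xA5, CHUNK_SIZE);
    }

    /* Free all but every 64th chunk; the survivors pin the top of the
     * heap, so the freed memory forms holes inside it. */
    for (i = 0; i < N_CHUNKS; i++) {
        if (i % 64 != 0) {
            free(chunks[i]);
            chunks[i] = NULL;
        }
    }

    /* Ask glibc to hand free pages back to the OS. */
    released = malloc_trim(0);

    /* Clean up the chunks that were kept alive. */
    for (i = 0; i < N_CHUNKS; i += 64) {
        free(chunks[i]);
        chunks[i] = NULL;
    }
    return released;
}
```

Comparing the [heap] Rss value in /proc/<pid>/smaps before and after the
malloc_trim() call is one way to see the effect, similar to the smaps
figures quoted in the commit message.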
 


Patch

diff --git a/configure b/configure
index 0e856bb..5b463d4 100755
--- a/configure
+++ b/configure
@@ -6012,6 +6012,10 @@  if test "$opengl" = "yes" ; then
   fi
 fi
 
+if test "$tcmalloc" = "yes" || test "$jemalloc" = "yes" ; then
+  echo "CONFIG_NONGLIBMALLOC=y" >> $config_host_mak
+fi
+
 if test "$avx2_opt" = "yes" ; then
   echo "CONFIG_AVX2_OPT=y" >> $config_host_mak
 fi
diff --git a/util/rcu.c b/util/rcu.c
index ca5a63e..f3e96a8 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -32,6 +32,9 @@ 
 #include "qemu/atomic.h"
 #include "qemu/thread.h"
 #include "qemu/main-loop.h"
+#if defined(CONFIG_LINUX)
+#include <malloc.h>
+#endif
 
 /*
  * Global grace period counter.  Bit 0 is always one in rcu_gp_ctr.
@@ -272,6 +275,9 @@  static void *call_rcu_thread(void *opaque)
             node->func(node);
         }
         qemu_mutex_unlock_iothread();
+#if defined(CONFIG_LINUX) && !defined(CONFIG_NONGLIBMALLOC)
+        malloc_trim(4 * 1024 * 1024);
+#endif
     }
     abort();
 }
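
As a rough stand-alone illustration of where the trim call sits in the
hunk above: the callbacks of one batch are run first, and malloc_trim()
is called once per batch rather than once per free. The sketch below
uses a hypothetical struct deferred_node and helper names; QEMU's real
struct rcu_head and call_rcu_thread() differ. The 4 MB argument mirrors
the patch and is the amount of free space left untrimmed at the top of
the heap.

```c
#include <malloc.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a deferred-free node. */
struct deferred_node {
    struct deferred_node *next;
    void (*func)(struct deferred_node *node);
};

/* Callback that simply frees the node itself. */
static void free_node(struct deferred_node *node)
{
    free(node);
}

/* Prepend a freshly allocated node to the batch and return the new head. */
static struct deferred_node *push_node(struct deferred_node *head)
{
    struct deferred_node *node = malloc(sizeof(*node));

    if (!node) {
        abort();
    }
    node->next = head;
    node->func = free_node;
    return node;
}

/*
 * Run every callback in the batch, then trim the heap once.
 * Returns the number of callbacks that were run.
 */
static size_t drain_batch_and_trim(struct deferred_node *head)
{
    size_t n = 0;

    while (head) {
        /* Read the link before the callback frees the node. */
        struct deferred_node *next = head->next;

        head->func(head);
        head = next;
        n++;
    }

    /* One trim per batch, keeping 4 MB of slack at the top of the heap. */
    malloc_trim(4 * 1024 * 1024);
    return n;
}
```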