[RFC] exec: eliminate ram naming issue as migration

Message ID ED26CBA2FAD1BF48A8719AEF02201E365144EAAB@SHSMSX103.ccr.corp.intel.com (mailing list archive)
State New, archived
Commit Message

Tan, Jianfeng Feb. 24, 2018, 3:11 a.m. UTC
> -----Original Message-----
> From: Tan, Jianfeng
> Sent: Saturday, February 24, 2018 11:08 AM
> To: 'Igor Mammedov'
> Cc: Paolo Bonzini; Jason Wang; Maxime Coquelin; qemu-devel@nongnu.org;
> Michael S . Tsirkin
> Subject: RE: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> migration
> 
> Hi Igor and all,
> 
> > -----Original Message-----
> > From: Igor Mammedov [mailto:imammedo@redhat.com]
> > Sent: Thursday, February 8, 2018 7:30 PM
> > To: Tan, Jianfeng
> > Cc: Paolo Bonzini; Jason Wang; Maxime Coquelin; qemu-
> devel@nongnu.org;
> > Michael S . Tsirkin
> > Subject: Re: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> > migration
> >
> [...]
> > > > It could be solved by adding memdev option to machine,
> > > > which would allow to specify backend object. And then on
> > > > top make -mem-path alias new option to clean thing up.
> > >
> > > Do you mean?
> > >
> > > src vm: -m xG
> > > dst vm: -m xG,memdev=pc.ram -object memory-backend-
> file,id=pc.ram,size=xG,mem-path=xxx,share=on ...
> > Yep, I've meant something like it
> >
> > src vm: -m xG,memdev=SHARED_RAM -object memory-backend-
> file,id=SHARED_RAM,size=xG,mem-path=xxx,share=on
> > dst vm: -m xG,memdev=SHARED_RAM -object memory-backend-
> file,id=SHARED_RAM,size=xG,mem-path=xxx,share=on
> 
> After a second thought, I find adding a backend for nonnuma pc RAM is
> roundabout way.
> 
> And we actually have an existing way to add a file-backed RAM: commit
> c902760fb25f ("Add option to use file backed guest memory"). Basically, this
> commit adds two options, --mem-path and --mem-prealloc, without specify
> a backend explicitly.
> 
> So how about just adding a new option --mem-share to decide if that's a
> private memory or shared memory? That seems much straightforward way
> to me; after this change we can migrate like:
> 
> src vm: -m xG
> dst vm: -m xG --mem-path xxx --mem-share
> 

The patch is attached FYI. I look forward to your thoughts.

Comments

Igor Mammedov Feb. 26, 2018, 12:55 p.m. UTC | #1
On Sat, 24 Feb 2018 03:11:30 +0000
"Tan, Jianfeng" <jianfeng.tan@intel.com> wrote:

> > -----Original Message-----
> > From: Tan, Jianfeng
> > Sent: Saturday, February 24, 2018 11:08 AM
> > To: 'Igor Mammedov'
> > Cc: Paolo Bonzini; Jason Wang; Maxime Coquelin; qemu-devel@nongnu.org;
> > Michael S . Tsirkin
> > Subject: RE: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> > migration
> > 
> > Hi Igor and all,
> >   
> > > -----Original Message-----
> > > From: Igor Mammedov [mailto:imammedo@redhat.com]
> > > Sent: Thursday, February 8, 2018 7:30 PM
> > > To: Tan, Jianfeng
> > > Cc: Paolo Bonzini; Jason Wang; Maxime Coquelin; qemu-  
> > devel@nongnu.org;  
> > > Michael S . Tsirkin
> > > Subject: Re: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> > > migration
> > >  
> > [...]  
> > > > > It could be solved by adding memdev option to machine,
> > > > > which would allow to specify backend object. And then on
> > > > > top make -mem-path alias new option to clean thing up.  
> > > >
> > > > Do you mean?
> > > >
> > > > src vm: -m xG
> > > > dst vm: -m xG,memdev=pc.ram -object memory-backend-  
> > file,id=pc.ram,size=xG,mem-path=xxx,share=on ...  
> > > Yep, I've meant something like it
> > >
> > > src vm: -m xG,memdev=SHARED_RAM -object memory-backend-  
> > file,id=SHARED_RAM,size=xG,mem-path=xxx,share=on  
> > > dst vm: -m xG,memdev=SHARED_RAM -object memory-backend-  
> > file,id=SHARED_RAM,size=xG,mem-path=xxx,share=on
> > 
> > After a second thought, I find adding a backend for nonnuma pc RAM is
> > roundabout way.
> > 
> > And we actually have an existing way to add a file-backed RAM: commit
> > c902760fb25f ("Add option to use file backed guest memory"). Basically, this
> > commit adds two options, --mem-path and --mem-prealloc, without specify
> > a backend explicitly.
> > 
> > So how about just adding a new option --mem-share to decide if that's a
> > private memory or shared memory? That seems much straightforward way
The above options are legacy (we can't remove them for compat reasons);
their replacement is the 'memory-backend-file' backend, which has all of
the above, including the 'share' property.

So just add a 'memdev' property to machine and reuse memory-backend-file
with it instead of duplicating functionality in the legacy code.

> > to me; after this change we can migrate like:
> > 
> > src vm: -m xG
> > dst vm: -m xG --mem-path xxx --mem-share
Even though it might work for now, that's still an invalid configuration
for migration: the src side must include the same
  "--mem-path xxx --mem-share"
options as dst.

It'd be better to fix the management application to start QEMU
properly on the SRC side.

 
> Attach the patch FYI. Look forward to your thoughts.
> 
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index 31612ca..5eaf367 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -127,6 +127,7 @@ extern bool enable_mlock;
>  extern uint8_t qemu_extra_params_fw[2];
>  extern QEMUClockType rtc_clock;
>  extern const char *mem_path;
> +extern int mem_share;
>  extern int mem_prealloc;
>  
>  #define MAX_NODES 128
> diff --git a/numa.c b/numa.c
> index 7b9c33a..322289f 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -456,7 +456,7 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
>      if (mem_path) {
>  #ifdef __linux__
>          Error *err = NULL;
> -        memory_region_init_ram_from_file(mr, owner, name, ram_size, false,
> +        memory_region_init_ram_from_file(mr, owner, name, ram_size, mem_share,
>                                           mem_path, &err);
>          if (err) {
>              error_report_err(err);
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 678181c..c968c53 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -389,6 +389,15 @@ STEXI
>  Allocate guest RAM from a temporarily created file in @var{path}.
>  ETEXI
>  
> +DEF("mem-share", 0, QEMU_OPTION_memshare,
> +    "-mem-share   make guest memory shareable (use with -mem-path)\n",
> +    QEMU_ARCH_ALL)
> +STEXI
> +@item -mem-share
> +@findex -mem-share
> +Make file-backed guest RAM shareable when using -mem-path.
> +ETEXI
> +
>  DEF("mem-prealloc", 0, QEMU_OPTION_mem_prealloc,
>      "-mem-prealloc   preallocate guest memory (use with -mem-path)\n",
>      QEMU_ARCH_ALL)
> diff --git a/vl.c b/vl.c
> index 444b750..0ff06c2 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -140,6 +140,7 @@ int display_opengl;
>  const char* keyboard_layout = NULL;
>  ram_addr_t ram_size;
>  const char *mem_path = NULL;
> +int mem_share = 0;
>  int mem_prealloc = 0; /* force preallocation of physical target memory */
>  bool enable_mlock = false;
>  int nb_nics;
> @@ -3395,6 +3396,9 @@ int main(int argc, char **argv, char **envp)
>              case QEMU_OPTION_mempath:
>                  mem_path = optarg;
>                  break;
> +            case QEMU_OPTION_memshare:
> +                mem_share = 1;
> +                break;
>              case QEMU_OPTION_mem_prealloc:
>                  mem_prealloc = 1;
>                  break;
Paolo Bonzini Feb. 26, 2018, 2:43 p.m. UTC | #2
On 26/02/2018 13:55, Igor Mammedov wrote:
>>> So how about just adding a new option --mem-share to decide if that's a
>>> private memory or shared memory? That seems much straightforward way
> Above options are legacy (which we can't remove for compat reasons),
> their replacement is 'memory-backend-file' backend which has all of
> the above including 'share' property.

More precisely, we have added "-object memory-backend-file" to avoid
proliferation of options related to memory.  Besides unifying the cases
of 1 and >1 NUMA node, using -object also has the advantage of
supporting memory hotplug.

You wrote "I find adding a backend for nonnuma pc RAM is roundabout way"
but basically the command line says "this VM has only one NUMA node,
backed by this memory object" which is a precise description of what the
VM memory looks like.

> So just add 'memdev' property to machine and reuse memory-backend-file
> with it instead of duplicating functionality in the legacy code.

That would however also have a different RAMBlock id, effectively
producing the same output as "-numa node,memdev=...".

I think this should be solved at the libvirt level.  Libvirt should
write in the migration XML cookie whether the VM is using -object or
-mem-path to declare its memory, and newly-started VMs should always use
-object.  This won't fix the problem for VMs that are already running,
but it will fix it the next time they are started.

Paolo
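
To make the RAMBlock id point above concrete, here is a toy C sketch of
a destination looking up an incoming RAM block by its id string. It is
not QEMU code: struct toy_ram_block and toy_find_block() are hypothetical
names, and the "/objects/mem0" id is only an illustrative guess at what a
backend-created block might be called. The only point is that a lookup by
name fails when the source announces "pc.ram" and the destination has
registered its RAM under some other id.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a RAM block; not QEMU's RAMBlock. */
struct toy_ram_block {
    const char *idstr;  /* name the block is known by in the migration stream */
    size_t size;        /* size in bytes (unused by the lookup itself) */
};

/*
 * Return the destination block whose id matches the id announced by the
 * source, or NULL if the destination has no block with that name.
 */
const struct toy_ram_block *toy_find_block(const struct toy_ram_block *blocks,
                                           size_t nblocks, const char *idstr)
{
    for (size_t i = 0; i < nblocks; i++) {
        if (strcmp(blocks[i].idstr, idstr) == 0) {
            return &blocks[i];
        }
    }
    return NULL;
}
```

With a destination table holding only an "/objects/mem0" entry,
toy_find_block(dst, 1, "pc.ram") returns NULL, which is the mismatch
being discussed; the same call with "/objects/mem0" returns the entry.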
Tan, Jianfeng Feb. 27, 2018, 4:36 a.m. UTC | #3
> -----Original Message-----
> From: Igor Mammedov [mailto:imammedo@redhat.com]
> Sent: Monday, February 26, 2018 8:56 PM
> To: Tan, Jianfeng
> Cc: Paolo Bonzini; Jason Wang; Maxime Coquelin; qemu-devel@nongnu.org;
> Michael S . Tsirkin
> Subject: Re: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> migration
> 
> On Sat, 24 Feb 2018 03:11:30 +0000
> "Tan, Jianfeng" <jianfeng.tan@intel.com> wrote:
> 
> > > -----Original Message-----
> > > From: Tan, Jianfeng
> > > Sent: Saturday, February 24, 2018 11:08 AM
> > > To: 'Igor Mammedov'
> > > Cc: Paolo Bonzini; Jason Wang; Maxime Coquelin; qemu-
> devel@nongnu.org;
> > > Michael S . Tsirkin
> > > Subject: RE: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> > > migration
> > >
> > > Hi Igor and all,
> > >
> > > > -----Original Message-----
> > > > From: Igor Mammedov [mailto:imammedo@redhat.com]
> > > > Sent: Thursday, February 8, 2018 7:30 PM
> > > > To: Tan, Jianfeng
> > > > Cc: Paolo Bonzini; Jason Wang; Maxime Coquelin; qemu-
> > > devel@nongnu.org;
> > > > Michael S . Tsirkin
> > > > Subject: Re: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> > > > migration
> > > >
> > > [...]
> > > > > > It could be solved by adding memdev option to machine,
> > > > > > which would allow to specify backend object. And then on
> > > > > > top make -mem-path alias new option to clean thing up.
> > > > >
> > > > > Do you mean?
> > > > >
> > > > > src vm: -m xG
> > > > > dst vm: -m xG,memdev=pc.ram -object memory-backend-
> > > file,id=pc.ram,size=xG,mem-path=xxx,share=on ...
> > > > Yep, I've meant something like it
> > > >
> > > > src vm: -m xG,memdev=SHARED_RAM -object memory-backend-
> > > file,id=SHARED_RAM,size=xG,mem-path=xxx,share=on
> > > > dst vm: -m xG,memdev=SHARED_RAM -object memory-backend-
> > > file,id=SHARED_RAM,size=xG,mem-path=xxx,share=on
> > >
> > > After a second thought, I find adding a backend for nonnuma pc RAM is
> > > roundabout way.
> > >
> > > And we actually have an existing way to add a file-backed RAM: commit
> > > c902760fb25f ("Add option to use file backed guest memory"). Basically,
> this
> > > commit adds two options, --mem-path and --mem-prealloc, without
> specify
> > > a backend explicitly.
> > >
> > > So how about just adding a new option --mem-share to decide if that's a
> > > private memory or shared memory? That seems much straightforward
> way
> Above options are legacy (which we can't remove for compat reasons),
> their replacement is 'memory-backend-file' backend which has all of
> the above including 'share' property.

OK, so those options are legacy. I had no idea about that. Thanks! That makes sense.

> 
> So just add 'memdev' property to machine and reuse memory-backend-file
> with it instead of duplicating functionality in the legacy code.

To "-m" or "-machine"?
Tan, Jianfeng Feb. 27, 2018, 4:55 a.m. UTC | #4
> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> Sent: Monday, February 26, 2018 10:43 PM
> To: Igor Mammedov; Tan, Jianfeng
> Cc: Jason Wang; Maxime Coquelin; qemu-devel@nongnu.org; Michael S .
> Tsirkin
> Subject: Re: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> migration
> 
> On 26/02/2018 13:55, Igor Mammedov wrote:
> >>> So how about just adding a new option --mem-share to decide if that's a
> >>> private memory or shared memory? That seems much straightforward
> way
> > Above options are legacy (which we can't remove for compat reasons),
> > their replacement is 'memory-backend-file' backend which has all of
> > the above including 'share' property.
> 
> More precisely, we have added "-object memory-backend-file" to avoid
> proliferation of options related to memory.  Besides unifying the cases
> of 1 and >1 NUMA node, using -object also has the advantage of
> supporting memory hotplug.
> 
> You wrote "I find adding a backend for nonnuma pc RAM is roundabout way"
> but basically the command line says "this VM has only one NUMA node,
> backed by this memory object" which is a precise description of what the
> VM memory looks like.
> 
> > So just add 'memdev' property to machine and reuse memory-backend-file
> > with it instead of duplicating functionality in the legacy code.
> 
> That would however also have a different RAMBlock id, effectively
> producing the same output as "-numa node,memdev=...".

Is it possible that we force that RAMBlock id to be "pc.ram"?

> 
> I think this should be solved at the libvirt level.  Libvirt should
> write in the migration XML cookie whether the VM is using -object or
> -mem-path to declare its memory, and newly-started VMs should always use
> -object.  This won't fix the problem for VMs that are already running,
> but it will fix it the next time they are started.

In this case, we return to the start point :-)

Thanks,
Jianfeng
Igor Mammedov Feb. 28, 2018, 3:40 p.m. UTC | #5
On Tue, 27 Feb 2018 04:36:45 +0000
"Tan, Jianfeng" <jianfeng.tan@intel.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov [mailto:imammedo@redhat.com]
> > Sent: Monday, February 26, 2018 8:56 PM
> > To: Tan, Jianfeng
> > Cc: Paolo Bonzini; Jason Wang; Maxime Coquelin; qemu-devel@nongnu.org;
> > Michael S . Tsirkin
> > Subject: Re: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> > migration
> > 
> > On Sat, 24 Feb 2018 03:11:30 +0000
> > "Tan, Jianfeng" <jianfeng.tan@intel.com> wrote:
> >   
> > > > -----Original Message-----
> > > > From: Tan, Jianfeng
> > > > Sent: Saturday, February 24, 2018 11:08 AM
> > > > To: 'Igor Mammedov'
> > > > Cc: Paolo Bonzini; Jason Wang; Maxime Coquelin; qemu-  
> > devel@nongnu.org;  
> > > > Michael S . Tsirkin
> > > > Subject: RE: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> > > > migration
> > > >
> > > > Hi Igor and all,
> > > >  
> > > > > -----Original Message-----
> > > > > From: Igor Mammedov [mailto:imammedo@redhat.com]
> > > > > Sent: Thursday, February 8, 2018 7:30 PM
> > > > > To: Tan, Jianfeng
> > > > > Cc: Paolo Bonzini; Jason Wang; Maxime Coquelin; qemu-  
> > > > devel@nongnu.org;  
> > > > > Michael S . Tsirkin
> > > > > Subject: Re: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> > > > > migration
> > > > >  
> > > > [...]  
> > > > > > > It could be solved by adding memdev option to machine,
> > > > > > > which would allow to specify backend object. And then on
> > > > > > > top make -mem-path alias new option to clean thing up.  
> > > > > >
> > > > > > Do you mean?
> > > > > >
> > > > > > src vm: -m xG
> > > > > > dst vm: -m xG,memdev=pc.ram -object memory-backend-  
> > > > file,id=pc.ram,size=xG,mem-path=xxx,share=on ...  
> > > > > Yep, I've meant something like it
> > > > >
> > > > > src vm: -m xG,memdev=SHARED_RAM -object memory-backend-  
> > > > file,id=SHARED_RAM,size=xG,mem-path=xxx,share=on  
> > > > > dst vm: -m xG,memdev=SHARED_RAM -object memory-backend-  
> > > > file,id=SHARED_RAM,size=xG,mem-path=xxx,share=on
> > > >
> > > > After a second thought, I find adding a backend for nonnuma pc RAM is
> > > > roundabout way.
> > > >
> > > > And we actually have an existing way to add a file-backed RAM: commit
> > > > c902760fb25f ("Add option to use file backed guest memory"). Basically,  
> > this  
> > > > commit adds two options, --mem-path and --mem-prealloc, without  
> > specify  
> > > > a backend explicitly.
> > > >
> > > > So how about just adding a new option --mem-share to decide if that's a
> > > > private memory or shared memory? That seems much straightforward  
> > way
> > Above options are legacy (which we can't remove for compat reasons),
> > their replacement is 'memory-backend-file' backend which has all of
> > the above including 'share' property.  
> 
> OK, such options are legacy. I've no idea of that. Thanks! That makes sense.
> 
> > 
> > So just add 'memdev' property to machine and reuse memory-backend-file
> > with it instead of duplicating functionality in the legacy code.  
> 
> To "-m" or "-machine"?
"-machine". I plan to convert -m to machine options as well (it's somewhere on my TODO list).

But as Paolo pointed out, that will only help to avoid using -numa
and won't help with your case, which should be solved at an upper layer
(i.e. starting QEMU on src with shared memory from the beginning).
Patch

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 31612ca..5eaf367 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -127,6 +127,7 @@  extern bool enable_mlock;
 extern uint8_t qemu_extra_params_fw[2];
 extern QEMUClockType rtc_clock;
 extern const char *mem_path;
+extern int mem_share;
 extern int mem_prealloc;
 
 #define MAX_NODES 128
diff --git a/numa.c b/numa.c
index 7b9c33a..322289f 100644
--- a/numa.c
+++ b/numa.c
@@ -456,7 +456,7 @@  static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
     if (mem_path) {
 #ifdef __linux__
         Error *err = NULL;
-        memory_region_init_ram_from_file(mr, owner, name, ram_size, false,
+        memory_region_init_ram_from_file(mr, owner, name, ram_size, mem_share,
                                          mem_path, &err);
         if (err) {
             error_report_err(err);
diff --git a/qemu-options.hx b/qemu-options.hx
index 678181c..c968c53 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -389,6 +389,15 @@  STEXI
 Allocate guest RAM from a temporarily created file in @var{path}.
 ETEXI
 
+DEF("mem-share", 0, QEMU_OPTION_memshare,
+    "-mem-share   make guest memory shareable (use with -mem-path)\n",
+    QEMU_ARCH_ALL)
+STEXI
+@item -mem-share
+@findex -mem-share
+Make file-backed guest RAM shareable when using -mem-path.
+ETEXI
+
 DEF("mem-prealloc", 0, QEMU_OPTION_mem_prealloc,
     "-mem-prealloc   preallocate guest memory (use with -mem-path)\n",
     QEMU_ARCH_ALL)
diff --git a/vl.c b/vl.c
index 444b750..0ff06c2 100644
--- a/vl.c
+++ b/vl.c
@@ -140,6 +140,7 @@  int display_opengl;
 const char* keyboard_layout = NULL;
 ram_addr_t ram_size;
 const char *mem_path = NULL;
+int mem_share = 0;
 int mem_prealloc = 0; /* force preallocation of physical target memory */
 bool enable_mlock = false;
 int nb_nics;
@@ -3395,6 +3396,9 @@  int main(int argc, char **argv, char **envp)
             case QEMU_OPTION_mempath:
                 mem_path = optarg;
                 break;
+            case QEMU_OPTION_memshare:
+                mem_share = 1;
+                break;
             case QEMU_OPTION_mem_prealloc:
                 mem_prealloc = 1;
                 break;
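
For readers wondering what the boolean passed to
memory_region_init_ram_from_file() changes in practice: as far as I
understand it, the flag ends up choosing between a MAP_SHARED and a
MAP_PRIVATE mapping of the backing file. The sketch below is not QEMU
code. write_reaches_file() and the temporary file name are made up for
illustration, and it assumes a Linux/POSIX host. It maps a scratch file
both ways, stores one byte through the mapping, and checks whether the
store is visible when the file is read back through its descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a fresh temporary file either shared or private,
 * write one byte through the mapping, then read the file via pread().
 * Returns 1 if the file contains the written byte, 0 if it does not,
 * and -1 on any error.
 */
int write_reaches_file(int shared)
{
    char path[] = "/tmp/mem-share-demo-XXXXXX";
    const size_t len = 4096;
    unsigned char byte = 0;
    unsigned char *p;
    ssize_t n;

    int fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path);                     /* file disappears once fd is closed */

    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             shared ? MAP_SHARED : MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = 0xA5;                      /* store through the mapping */
    n = pread(fd, &byte, 1, 0);       /* read the file itself, not the mapping */

    munmap(p, len);
    close(fd);

    if (n != 1) {
        return -1;
    }
    return byte == 0xA5;
}
```

Calling write_reaches_file(1) and write_reaches_file(0) from a small
main() shows the difference: with a shared mapping the store lands in the
file, so another process mapping the same file (a vhost-user backend, for
example) can see guest memory; with a private mapping the store stays in
a copy-on-write page local to the process.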