[01/15] postcopy: Transmit and compare individual page sizes

Message ID 20170106182823.1960-2-dgilbert@redhat.com (mailing list archive)
State New, archived

Commit Message

Dr. David Alan Gilbert Jan. 6, 2017, 6:28 p.m. UTC
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

When using postcopy with hugepages, we require the source
and destination page sizes for any RAMBlock to match.

Transmit them as part of the RAM information header and
fail if there's a difference.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/ram.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

Comments

Alexey Perevalov Jan. 25, 2017, 8:01 a.m. UTC | #1
Hi David,

I tested your whole patch set with Andrea's kernel
git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git

It works and significantly reduces the downtime.

I'm new to QEMU and to this mailing list.

I have some remarks on the current patch.

The postcopy capability must be enabled on both the source and the
destination side

migrate_set_capability postcopy-ram on

and serialization/deserialization relies on it.

So if the destination host doesn't set the postcopy capability, ram_load
will skip reading remote_page_size, and with multiple RAMBlocks the next
read of len will be incorrect. Hopefully len is usually 0, but it could be
bigger and overrun the buffer ;).
Maybe it's better to pass the postcopy capability attribute from the
source to the destination to avoid such an assumption.
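
To make the concern above concrete, here is a minimal, self-contained sketch of how a conditionally written field can desynchronise a reader. It is not QEMU code: Stream, write_block(), read_block() and second_block_len() are hypothetical stand-ins for the QEMUFile calls in the patch, and the block names and sizes are made up for illustration. The writer always appends a 64-bit page size; the reader consumes it only when told to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory stream; a stand-in for QEMUFile, not the real API. */
typedef struct {
    uint8_t buf[256];
    size_t len;   /* number of bytes written so far */
    size_t pos;   /* read cursor */
} Stream;

static void put_byte(Stream *s, uint8_t v)
{
    assert(s->len < sizeof(s->buf));
    s->buf[s->len++] = v;
}

static void put_be64(Stream *s, uint64_t v)
{
    /* Most significant byte first, as in a big-endian wire format. */
    for (int shift = 56; shift >= 0; shift -= 8) {
        put_byte(s, (uint8_t)(v >> shift));
    }
}

static uint8_t get_byte(Stream *s)
{
    assert(s->pos < s->len);
    return s->buf[s->pos++];
}

static uint64_t get_be64(Stream *s)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | get_byte(s);
    }
    return v;
}

/* Writer: id length, id bytes, used_length, then an optional page size. */
static void write_block(Stream *s, const char *id, uint64_t used_length,
                        uint64_t page_size, bool send_page_size)
{
    size_t n = strlen(id);
    assert(n <= UINT8_MAX);
    put_byte(s, (uint8_t)n);
    for (size_t i = 0; i < n; i++) {
        put_byte(s, (uint8_t)id[i]);
    }
    put_be64(s, used_length);
    if (send_page_size) {
        put_be64(s, page_size);
    }
}

/* Reader: consumes one block header, skipping the page size if not expected. */
static void read_block(Stream *s, bool expect_page_size)
{
    uint8_t n = get_byte(s);
    for (uint8_t i = 0; i < n; i++) {
        (void)get_byte(s);
    }
    (void)get_be64(s);            /* used_length */
    if (expect_page_size) {
        (void)get_be64(s);        /* remote page size */
    }
}

/* Returns the id length the reader sees at the start of the second block. */
static uint8_t second_block_len(bool reader_expects_page_size)
{
    Stream s;
    memset(&s, 0, sizeof(s));
    write_block(&s, "pc.ram", UINT64_C(1) << 30, UINT64_C(2) << 20, true);
    write_block(&s, "vga.vram", UINT64_C(16) << 20, UINT64_C(2) << 20, true);
    read_block(&s, reader_expects_page_size);
    return get_byte(&s);
}
```

When both sides agree, the reader sees the real length of the second id (8). When the reader skips the field, the "length" it reads is the top byte of the big-endian page size, which is 0 for any realistic page size; that matches the "usually len is 0" observation.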


On 01/06/2017 09:28 PM, Dave Gilbert wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> When using postcopy with hugepages, we require the source
> and destination page sizes for any RAMBlock to match.
>
> Transmit them as part of the RAM information header and
> fail if there's a difference.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>   migration/ram.c | 15 +++++++++++++++
>   1 file changed, 15 insertions(+)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index a1c8089..39998f5 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1970,6 +1970,9 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>           qemu_put_byte(f, strlen(block->idstr));
>           qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
>           qemu_put_be64(f, block->used_length);
> +        if (migrate_postcopy_ram() && block->page_size != qemu_host_page_size) {
> +            qemu_put_be64(f, block->page_size);
> +        }
>       }
>   
>       rcu_read_unlock();
> @@ -2536,6 +2539,18 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>                               error_report_err(local_err);
>                           }
>                       }
> +                    /* For postcopy we need to check hugepage sizes match */
> +                    if (migrate_postcopy_ram() &&
> +                        block->page_size != qemu_host_page_size) {
> +                        uint64_t remote_page_size = qemu_get_be64(f);
> +                        if (remote_page_size != block->page_size) {
> +                            error_report("Mismatched RAM page size %s "
> +                                         "(local) %" PRId64 "!= %" PRId64,
> +                                         id, block->page_size,
> +                                         remote_page_size);
> +                            ret = -EINVAL;
> +                        }
> +                    }
>                       ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
>                                             block->idstr);
>                   } else {
Juan Quintela Jan. 25, 2017, 9:47 a.m. UTC | #2
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> When using postcopy with hugepages, we require the source
> and destination page sizes for any RAMBlock to match.
>
> Transmit them as part of the RAM information header and
> fail if there's a difference.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  migration/ram.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index a1c8089..39998f5 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1970,6 +1970,9 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>          qemu_put_byte(f, strlen(block->idstr));
>          qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
>          qemu_put_be64(f, block->used_length);
> +        if (migrate_postcopy_ram() && block->page_size != qemu_host_page_size) {
> +            qemu_put_be64(f, block->page_size);
> +        }

Hi

1- can different blocks have different page_size?
2- can we remove the migrate_postcopy_ram() check and just send it for
   newer versions and not for older ones?

>      }
>  
>      rcu_read_unlock();
> @@ -2536,6 +2539,18 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>                              error_report_err(local_err);
>                          }
>                      }
> +                    /* For postcopy we need to check hugepage sizes match */
> +                    if (migrate_postcopy_ram() &&
> +                        block->page_size != qemu_host_page_size) {
> +                        uint64_t remote_page_size = qemu_get_be64(f);
> +                        if (remote_page_size != block->page_size) {
> +                            error_report("Mismatched RAM page size %s "
> +                                         "(local) %" PRId64 "!= %" PRId64,
> +                                         id, block->page_size,
> +                                         remote_page_size);
> +                            ret = -EINVAL;
> +                        }
> +                    }
>                      ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
>                                            block->idstr);
>                  } else {
Dr. David Alan Gilbert Jan. 25, 2017, 4:15 p.m. UTC | #3
* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > When using postcopy with hugepages, we require the source
> > and destination page sizes for any RAMBlock to match.
> >
> > Transmit them as part of the RAM information header and
> > fail if there's a difference.
> >
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  migration/ram.c | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> >
> > diff --git a/migration/ram.c b/migration/ram.c
> > index a1c8089..39998f5 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -1970,6 +1970,9 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >          qemu_put_byte(f, strlen(block->idstr));
> >          qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
> >          qemu_put_be64(f, block->used_length);
> > +        if (migrate_postcopy_ram() && block->page_size != qemu_host_page_size) {
> > +            qemu_put_be64(f, block->page_size);
> > +        }
> 
> Hi
> 
> 1- can different blocks have different page_size?

Yes, you can specify a backing file for any DIMM or NUMA node;
for a (simple but odd) example:

  qemu -m 2G,slots=4,maxmem=8G -object memory-backend-file,id=huge,prealloc=yes,mem-path=/dev/hugepages/foo,size=1G -device pc-dimm,memdev=huge,id=dimm0

  ends up with 2G of normal base RAM, and 1G of (2MB) hugepages.
  You can do something similar with NUMA configurations where all your
main RAM comes from hugepages but things like ROMs and vram are normal
4k pages; that's a config I've seen from an end user.

> 2- can we remove the migrate_postcopy_ram() check and just send it for
>    newer versions and not for older ones?

You mean bind it so that hugepage postcopy only works on newer machine types?
We could do that, although it's a shame to restrict the machine type
if we don't need to.
The next patch (hmm, I could swap them around) will catch a gross-level
error before we get to this point, but it won't catch the case
where you've got the same set of page sizes just on different blocks.
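
As an illustration of that last case, here is a small sketch that is not taken from the series. It assumes the gross check compares a summary formed by OR-ing together the page sizes of all blocks; that is an assumption about the next patch, and BlockInfo, page_size_summary() and blocks_match() are hypothetical names. It shows two configurations with identical summaries that still differ block by block.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-block description; not QEMU's RAMBlock. */
typedef struct {
    const char *id;
    uint64_t page_size;   /* assumed to be a power of two */
} BlockInfo;

/* Gross check: OR of all page sizes, so each distinct size sets one bit. */
static uint64_t page_size_summary(const BlockInfo *blocks, size_t n)
{
    uint64_t summary = 0;
    for (size_t i = 0; i < n; i++) {
        summary |= blocks[i].page_size;
    }
    return summary;
}

/* Per-block check: assumes both sides list the same blocks in the same order. */
static bool blocks_match(const BlockInfo *src, const BlockInfo *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (src[i].page_size != dst[i].page_size) {
            return false;
        }
    }
    return true;
}
```

With a source of { pc.ram: 4k, dimm0: 2MB } and a destination of { pc.ram: 2MB, dimm0: 4k }, the summaries are equal, so only the per-block comparison notices the mismatch.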

Dave

> 
> >      }
> >  
> >      rcu_read_unlock();
> > @@ -2536,6 +2539,18 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> >                              error_report_err(local_err);
> >                          }
> >                      }
> > +                    /* For postcopy we need to check hugepage sizes match */
> > +                    if (migrate_postcopy_ram() &&
> > +                        block->page_size != qemu_host_page_size) {
> > +                        uint64_t remote_page_size = qemu_get_be64(f);
> > +                        if (remote_page_size != block->page_size) {
> > +                            error_report("Mismatched RAM page size %s "
> > +                                         "(local) %" PRId64 "!= %" PRId64,
> > +                                         id, block->page_size,
> > +                                         remote_page_size);
> > +                            ret = -EINVAL;
> > +                        }
> > +                    }
> >                      ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
> >                                            block->idstr);
> >                  } else {
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Dr. David Alan Gilbert Jan. 25, 2017, 6:38 p.m. UTC | #4
* Alexey Perevalov (a.perevalov@samsung.com) wrote:
> Hi David,

Hi Alexey,

> I tested your whole patch set with Andrea's kernel
> git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
> 
> It works and significantly reduces the downtime.
> 
> I'm new to QEMU and to this mailing list.

Welcome!

> I have some remarks on the current patch.
> 
> The postcopy capability must be enabled on both the source and the
> destination side
> 
> migrate_set_capability postcopy-ram on
> 
> and serialization/deserialization relies on it.
> 
> So if the destination host doesn't set the postcopy capability, ram_load
> will skip reading remote_page_size, and with multiple RAMBlocks the next
> read of len will be incorrect. Hopefully len is usually 0, but it could be
> bigger and overrun the buffer ;).
> Maybe it's better to pass the postcopy capability attribute from the
> source to the destination to avoid such an assumption.

We already pass an 'advise' command from the source to the destination
to tell it we're using postcopy - that's effectively us passing the
capability.  You can see in the 2nd patch I modify its contents;
that happens before the RAM code we see here, so we should always be
safe in the case of the source having enabled postcopy but the
destination not having done so.

'len' is always read as a byte, and the buffer is 256 bytes long - so
even if we read garbage off the stream we can never overrun the buffer.
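
The bound can be checked with a small sketch. It is not the ram_load code itself: read_id() and ID_BUF_SIZE are hypothetical names, and a plain byte array stands in for the stream. Because the length is a single unsigned byte, it is at most 255, so both the copy and the terminating NUL stay inside a 256-byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ID_BUF_SIZE = 256 };

/*
 * Hypothetical helper: reads a length-prefixed id from src into id and
 * returns its length. src must hold the length byte plus that many bytes.
 */
static size_t read_id(const uint8_t *src, size_t src_len, char id[ID_BUF_SIZE])
{
    assert(src_len >= 1);
    uint8_t len = src[0];                 /* one byte: 0..255, whatever arrives */
    assert(src_len >= (size_t)len + 1);   /* the source really has the payload */
    memcpy(id, src + 1, len);             /* copies at most 255 bytes */
    id[len] = '\0';                       /* index at most 255: still in bounds */
    return len;
}
```

Even the worst case, a length byte of 0xFF, fills id[0..254] and puts the terminator at id[255].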

However, you have made me think of a related case;  if the destination
has postcopy capability set, but the source does NOT have it set
then we'll incorrectly read this data.
I can fix that by changing the migrate_postcopy_ram() on
the receive side to:
  postcopy_state_get() >= POSTCOPY_INCOMING_ADVISE

so we'll only do it if the source has it enabled.
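
A rough sketch of the difference between the two conditions follows. It is not the actual change: IncomingState is a simplified local stand-in for the incoming postcopy state (its members and their order are assumptions), and both predicate functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the incoming postcopy state; order is assumed. */
typedef enum {
    INCOMING_NONE = 0,   /* no 'advise' received from the source */
    INCOMING_ADVISE,     /* source announced that it uses postcopy */
    INCOMING_LISTENING,
    INCOMING_RUNNING
} IncomingState;

/* Condition as posted: depends only on the destination's own capability. */
static bool expect_page_size_old(bool local_capability,
                                 uint64_t block_page_size,
                                 uint64_t host_page_size)
{
    return local_capability && block_page_size != host_page_size;
}

/* Proposed condition: depends on what the source actually told us. */
static bool expect_page_size_new(IncomingState state,
                                 uint64_t block_page_size,
                                 uint64_t host_page_size)
{
    return state >= INCOMING_ADVISE && block_page_size != host_page_size;
}
```

With the capability set on the destination only, no 'advise' arrives, so the state stays at INCOMING_NONE: the old condition would still try to read a page size the source never sent, while the new one skips it.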

Thanks,

Dave

> 
> 
> On 01/06/2017 09:28 PM, Dave Gilbert wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > When using postcopy with hugepages, we require the source
> > and destination page sizes for any RAMBlock to match.
> > 
> > Transmit them as part of the RAM information header and
> > fail if there's a difference.
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >   migration/ram.c | 15 +++++++++++++++
> >   1 file changed, 15 insertions(+)
> > 
> > diff --git a/migration/ram.c b/migration/ram.c
> > index a1c8089..39998f5 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -1970,6 +1970,9 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >           qemu_put_byte(f, strlen(block->idstr));
> >           qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
> >           qemu_put_be64(f, block->used_length);
> > +        if (migrate_postcopy_ram() && block->page_size != qemu_host_page_size) {
> > +            qemu_put_be64(f, block->page_size);
> > +        }
> >       }
> >       rcu_read_unlock();
> > @@ -2536,6 +2539,18 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> >                               error_report_err(local_err);
> >                           }
> >                       }
> > +                    /* For postcopy we need to check hugepage sizes match */
> > +                    if (migrate_postcopy_ram() &&
> > +                        block->page_size != qemu_host_page_size) {
> > +                        uint64_t remote_page_size = qemu_get_be64(f);
> > +                        if (remote_page_size != block->page_size) {
> > +                            error_report("Mismatched RAM page size %s "
> > +                                         "(local) %" PRId64 "!= %" PRId64,
> > +                                         id, block->page_size,
> > +                                         remote_page_size);
> > +                            ret = -EINVAL;
> > +                        }
> > +                    }
> >                       ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
> >                                             block->idstr);
> >                   } else {
> 
> 
> -- 
> Best regards,
> Alexey Perevalov
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Patch

diff --git a/migration/ram.c b/migration/ram.c
index a1c8089..39998f5 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1970,6 +1970,9 @@  static int ram_save_setup(QEMUFile *f, void *opaque)
         qemu_put_byte(f, strlen(block->idstr));
         qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
         qemu_put_be64(f, block->used_length);
+        if (migrate_postcopy_ram() && block->page_size != qemu_host_page_size) {
+            qemu_put_be64(f, block->page_size);
+        }
     }
 
     rcu_read_unlock();
@@ -2536,6 +2539,18 @@  static int ram_load(QEMUFile *f, void *opaque, int version_id)
                             error_report_err(local_err);
                         }
                     }
+                    /* For postcopy we need to check hugepage sizes match */
+                    if (migrate_postcopy_ram() &&
+                        block->page_size != qemu_host_page_size) {
+                        uint64_t remote_page_size = qemu_get_be64(f);
+                        if (remote_page_size != block->page_size) {
+                            error_report("Mismatched RAM page size %s "
+                                         "(local) %" PRId64 " != %" PRId64,
+                                         id, block->page_size,
+                                         remote_page_size);
+                            ret = -EINVAL;
+                        }
+                    }
                     ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
                                           block->idstr);
                 } else {