diff mbox series

[1/3] multifd: bugfix for migration using compression methods

Message ID 20241218091413.140396-2-yuan1.liu@intel.com (mailing list archive)
State New
Headers show
Series bugfixes for migration using compression methods | expand

Commit Message

Liu, Yuan1 Dec. 18, 2024, 9:14 a.m. UTC
When compression is enabled on the migration channel and
the pages processed are all zero pages, these pages will
not be sent and updated on the target side, resulting in
incorrect memory data on the source and target sides.

The root cause is that all compression methods call
multifd_send_prepare_common to determine whether to compress
dirty pages, but multifd_send_prepare_common does not update
the IOV of MultiFDPacket_t when all dirty pages are zero pages.

The solution is to always update the IOV of MultiFDPacket_t
regardless of whether the dirty pages are all zero pages.

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Jason Zeng <jason.zeng@intel.com>
---
 migration/multifd-nocomp.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Peter Xu Dec. 18, 2024, 5:03 p.m. UTC | #1
On Wed, Dec 18, 2024 at 05:14:11PM +0800, Yuan Liu wrote:
> When compression is enabled on the migration channel and
> the pages processed are all zero pages, these pages will
> not be sent and updated on the target side, resulting in
> incorrect memory data on the source and target sides.
> 
> The root cause is that all compression methods call
> multifd_send_prepare_common to determine whether to compress
> dirty pages, but multifd_send_prepare_common does not update
> the IOV of MultiFDPacket_t when all dirty pages are zero pages.
> 
> The solution is to always update the IOV of MultiFDPacket_t
> regardless of whether the dirty pages are all zero pages.
> 
> Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
> Reviewed-by: Jason Zeng <jason.zeng@intel.com>

Ouch.. thanks for digging this out.

Reviewed-by: Peter Xu <peterx@redhat.com>

Is this the correct Fixes tag (and copy stable for 9.0+)?

Fixes: 303e6f54f9 ("migration/multifd: Implement zero page transmission on the multifd thread.")

> ---
>  migration/multifd-nocomp.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
> index 55191152f9..2e4aaac285 100644
> --- a/migration/multifd-nocomp.c
> +++ b/migration/multifd-nocomp.c
> @@ -362,6 +362,7 @@ int multifd_ram_flush_and_sync(void)
>  bool multifd_send_prepare_common(MultiFDSendParams *p)
>  {
>      MultiFDPages_t *pages = &p->data->u.ram;
> +    multifd_send_prepare_header(p);
>      multifd_send_zero_page_detect(p);
>  
>      if (!pages->normal_num) {
> @@ -369,8 +370,6 @@ bool multifd_send_prepare_common(MultiFDSendParams *p)
>          return false;
>      }
>  
> -    multifd_send_prepare_header(p);
> -
>      return true;
>  }
>  
> -- 
> 2.43.0
>
Liu, Yuan1 Dec. 19, 2024, 8:42 a.m. UTC | #2
> -----Original Message-----
> From: Peter Xu <peterx@redhat.com>
> Sent: Thursday, December 19, 2024 1:03 AM
> To: Liu, Yuan1 <yuan1.liu@intel.com>
> Cc: farosas@suse.de; qemu-devel@nongnu.org; Zeng, Jason
> <jason.zeng@intel.com>; Wang, Yichen <yichen.wang@bytedance.com>
> Subject: Re: [PATCH 1/3] multifd: bugfix for migration using compression
> methods
> 
> On Wed, Dec 18, 2024 at 05:14:11PM +0800, Yuan Liu wrote:
> > When compression is enabled on the migration channel and
> > the pages processed are all zero pages, these pages will
> > not be sent and updated on the target side, resulting in
> > incorrect memory data on the source and target sides.
> >
> > The root cause is that all compression methods call
> > multifd_send_prepare_common to determine whether to compress
> > dirty pages, but multifd_send_prepare_common does not update
> > the IOV of MultiFDPacket_t when all dirty pages are zero pages.
> >
> > The solution is to always update the IOV of MultiFDPacket_t
> > regardless of whether the dirty pages are all zero pages.
> >
> > Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
> > Reviewed-by: Jason Zeng <jason.zeng@intel.com>
> 
> Ouch.. thanks for digging this out.
> 
> Reviewed-by: Peter Xu <peterx@redhat.com>
> 
> Is this the correct Fixes tag (and copy stable for 9.0+)?
> 
> Fixes: 303e6f54f9 ("migration/multifd: Implement zero page transmission on
> the multifd thread.")

Yes, this patch is used to fix the tag containing 303e6f54f9, and I see it already exists in v9.0.0 and later versions.

By the way, The three bugfix patches are based on mainline and the commit id is 8032c78e55 (origin/staging, origin/master, origin/HEAD) Merge tag 'firmware-20241216-pull-request' of https://gitlab.com/kraxel/qemu into staging

> > ---
> >  migration/multifd-nocomp.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
> > index 55191152f9..2e4aaac285 100644
> > --- a/migration/multifd-nocomp.c
> > +++ b/migration/multifd-nocomp.c
> > @@ -362,6 +362,7 @@ int multifd_ram_flush_and_sync(void)
> >  bool multifd_send_prepare_common(MultiFDSendParams *p)
> >  {
> >      MultiFDPages_t *pages = &p->data->u.ram;
> > +    multifd_send_prepare_header(p);
> >      multifd_send_zero_page_detect(p);
> >
> >      if (!pages->normal_num) {
> > @@ -369,8 +370,6 @@ bool multifd_send_prepare_common(MultiFDSendParams
> *p)
> >          return false;
> >      }
> >
> > -    multifd_send_prepare_header(p);
> > -
> >      return true;
> >  }
> >
> > --
> > 2.43.0
> >
> 
> --
> Peter Xu
diff mbox series

Patch

diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
index 55191152f9..2e4aaac285 100644
--- a/migration/multifd-nocomp.c
+++ b/migration/multifd-nocomp.c
@@ -362,6 +362,7 @@  int multifd_ram_flush_and_sync(void)
 bool multifd_send_prepare_common(MultiFDSendParams *p)
 {
     MultiFDPages_t *pages = &p->data->u.ram;
+    multifd_send_prepare_header(p);
     multifd_send_zero_page_detect(p);
 
     if (!pages->normal_num) {
@@ -369,8 +370,6 @@  bool multifd_send_prepare_common(MultiFDSendParams *p)
         return false;
     }
 
-    multifd_send_prepare_header(p);
-
     return true;
 }