[0/3] bugfixes for migration using compression methods

Message ID 20241218091413.140396-1-yuan1.liu@intel.com (mailing list archive)

Message

Liu, Yuan1 Dec. 18, 2024, 9:14 a.m. UTC
This patch set fixes bugs that leave incorrect memory data on the
target after migration when compression is enabled.

The bug can be reproduced as follows:
1. Run "stress-ng --class memory --all 1" on the source side. The
   stress-ng tool comes from https://github.com/ColinIanKing/stress-ng.git

2. Enable a multifd compression method and start the migration,
   e.g. migrate_set_parameter multifd-compression qpl

3. After the migration completes, the guest kernel crashes, either
   on its own or at shutdown.
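
For anyone driving the reproduction over QMP rather than the HMP monitor,
step 2 might look roughly like the sketch below. It only builds the JSON
commands and does not send them. The migrate-set-capabilities step and the
destination URI are assumptions that are not part of the steps above, and
the helper name build_migration_commands is hypothetical.

```python
import json


def build_migration_commands(method, uri):
    """Build QMP commands that select a multifd compression method and
    start a migration. Nothing is sent; the caller decides the transport."""
    return [
        # Assumption: multifd itself has to be enabled before the
        # compression method takes effect.
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [
             {"capability": "multifd", "state": True}]}},
        # QMP counterpart of "migrate_set_parameter multifd-compression qpl".
        {"execute": "migrate-set-parameters",
         "arguments": {"multifd-compression": method}},
        # Hypothetical destination; replace with the real target URI.
        {"execute": "migrate", "arguments": {"uri": uri}},
    ]


for cmd in build_migration_commands("qpl", "tcp:192.0.2.10:4444"):
    print(json.dumps(cmd))
```

The target side would normally need matching multifd settings before the
incoming migration starts; that part is left out of the sketch.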

The root cause of each bug and its solution are described in detail in
the corresponding patch.

My verification method is as follows:
1. Start the VM and run the stress-ng test command on the source side.
2. Start the VM with the "-S" parameter on the target side, so that
   the vCPUs stay paused after migration.
3. After the migration succeeds, use the dump-guest-memory command
   to export the memory data of the source and target VMs respectively.
4. Use "cmp -l source_memory target_memory" to verify the memory data.
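
Where the comparison in step 4 needs to be scripted, a minimal sketch
along the following lines could stand in for "cmp -l". It compares the two
dump files as raw bytes, chunk by chunk, and reports the first few
differing offsets. The function name diff_offsets and its parameters are
hypothetical, and offsets are 0-based here whereas cmp -l counts from 1.

```python
def diff_offsets(path_a, path_b, chunk_size=1 << 20, limit=16):
    """Return up to `limit` tuples (offset, byte_a, byte_b) where the two
    files differ. A byte is None where one file is shorter than the other."""
    diffs = []
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while len(diffs) < limit:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break  # both files are exhausted
            if a != b:
                # Only walk byte by byte inside chunks that differ.
                for i in range(max(len(a), len(b))):
                    x = a[i] if i < len(a) else None
                    y = b[i] if i < len(b) else None
                    if x != y:
                        diffs.append((offset + i, x, y))
                        if len(diffs) >= limit:
                            break
            offset += max(len(a), len(b))
    return diffs
```

An empty result from diff_offsets("source_memory", "target_memory") would
mean the two dumps are byte-for-byte identical.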

Yuan Liu (3):
  multifd: bugfix for migration using compression methods
  multifd: bugfix for incorrect migration data with QPL compression
  multifd: bugfix for incorrect migration data with qatzip compression

 migration/multifd-nocomp.c | 3 +--
 migration/multifd-qatzip.c | 1 +
 migration/multifd-qpl.c    | 1 +
 3 files changed, 3 insertions(+), 2 deletions(-)

Comments

Peter Xu Dec. 18, 2024, 5:12 p.m. UTC | #1
On Wed, Dec 18, 2024 at 05:14:10PM +0800, Yuan Liu wrote:
> My verification method as follows
> 1. Start the VM and run the stress-ng test command on the source side.
> 2. Start the VM with "-S" parameter on the target side, it is
>    used to pause the vCPUs after migration.
> 3. After the migration is successful, use the dump-guest-memory command
>    to export the memory data of the source and target VMs respectively.
> 4. Use "cmp -l source_memory target_memory" to verify memory data.

This looks like a good way to test memory integrity.  I wonder if we can
do that in some, or all, of our migration qtests.

I didn't check the latter 2 patches but I assume they can also have a
proper Fixes tag.

The other thing is that uadk also seems broken in that regard.  We could
add a patch for it, but testing it may be challenging for any of us.  In
any case, I am copying Shameer.

Michael Tokarev Jan. 12, 2025, 1:12 p.m. UTC | #2
18.12.2024 12:14, Yuan Liu wrote:
> Yuan Liu (3):
>    multifd: bugfix for migration using compression methods
>    multifd: bugfix for incorrect migration data with QPL compression
>    multifd: bugfix for incorrect migration data with qatzip compression

Should only the first patch be applied to the qemu-stable branches, or all 3?
The first one was Cc'd to qemu-stable, but the other two were not.

Thanks,

/mjt

Liu, Yuan1 Jan. 13, 2025, 12:58 a.m. UTC | #3
> -----Original Message-----
> From: Michael Tokarev <mjt@tls.msk.ru>
> Sent: Sunday, January 12, 2025 9:13 PM
> To: Liu, Yuan1 <yuan1.liu@intel.com>; peterx@redhat.com; farosas@suse.de
> Cc: qemu-devel@nongnu.org; Zeng, Jason <jason.zeng@intel.com>; Wang,
> Yichen <yichen.wang@bytedance.com>; qemu-stable <qemu-stable@nongnu.org>
> Subject: Re: [PATCH 0/3] bugfixes for migration using compression methods
> 
> Should only the first patch be applied to the qemu-stable branches, or all 3?
> The first one was Cc'd to qemu-stable, but the other two were not.

I think all three patches should be applied; they fix three different bugs.