Message ID | b1f864a65fafd4fdab1f89230df52e46ae41f2ac.1741124640.git.maciej.szmigiero@oracle.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Multifd | expand |
On 3/4/25 23:03, Maciej S. Szmigiero wrote: > From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> > > Wire VFIO multifd transfer specific setup and cleanup functions into > general VFIO load/save setup and cleanup methods. > > Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Thanks, C. > --- > hw/vfio/migration.c | 24 ++++++++++++++++++++++-- > 1 file changed, 22 insertions(+), 2 deletions(-) > > diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c > index dc1fe4e717a4..3c8286ae6230 100644 > --- a/hw/vfio/migration.c > +++ b/hw/vfio/migration.c > @@ -453,6 +453,10 @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp) > uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE; > int ret; > > + if (!vfio_multifd_setup(vbasedev, false, errp)) { > + return -EINVAL; > + } > + > qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE); > > vfio_query_stop_copy_size(vbasedev, &stop_copy_size); > @@ -509,6 +513,9 @@ static void vfio_save_cleanup(void *opaque) > Error *local_err = NULL; > int ret; > > + /* Currently a NOP, done for symmetry with load_cleanup() */ > + vfio_multifd_cleanup(vbasedev); > + > /* > * Changing device state from STOP_COPY to STOP can take time. Do it here, > * after migration has completed, so it won't increase downtime. > @@ -674,15 +681,28 @@ static void vfio_save_state(QEMUFile *f, void *opaque) > static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp) > { > VFIODevice *vbasedev = opaque; > + VFIOMigration *migration = vbasedev->migration; > + int ret; > > - return vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, > - vbasedev->migration->device_state, errp); > + if (!vfio_multifd_setup(vbasedev, true, errp)) { > + return -EINVAL; > + } > + > + ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, > + migration->device_state, errp); > + if (ret) { > + return ret; > + } > + > + return 0; > } > > static int vfio_load_cleanup(void *opaque) > { > VFIODevice *vbasedev = opaque; > > + vfio_multifd_cleanup(vbasedev); > + > vfio_migration_cleanup(vbasedev); > trace_vfio_load_cleanup(vbasedev->name); > >
On Tue, Mar 04, 2025 at 11:03:52PM +0100, Maciej S. Szmigiero wrote: > @@ -509,6 +513,9 @@ static void vfio_save_cleanup(void *opaque) > Error *local_err = NULL; > int ret; > > + /* Currently a NOP, done for symmetry with load_cleanup() */ > + vfio_multifd_cleanup(vbasedev); So I just notice this when looking at the cleanup path. It can be super confusing to cleanup the load threads in save().. IIUC we should drop it. > + > /* > * Changing device state from STOP_COPY to STOP can take time. Do it here, > * after migration has completed, so it won't increase downtime.
On 5.03.2025 17:22, Peter Xu wrote: > On Tue, Mar 04, 2025 at 11:03:52PM +0100, Maciej S. Szmigiero wrote: >> @@ -509,6 +513,9 @@ static void vfio_save_cleanup(void *opaque) >> Error *local_err = NULL; >> int ret; >> >> + /* Currently a NOP, done for symmetry with load_cleanup() */ >> + vfio_multifd_cleanup(vbasedev); > > So I just notice this when looking at the cleanup path. It can be super > confusing to cleanup the load threads in save().. IIUC we should drop it. > It's a NOP since in the save operation migration->multifd is going to be NULL so that "g_clear_pointer(&migration->multifd, vfio_multifd_free)" inside it won't do anything. Cedric suggested calling it anyway since vfio_save_setup() calls vfio_multifd_setup() so to be consistent we should call vfio_multifd_cleanup() on cleanup too. I think calling it makes sense since otherwise that vfio_multifd_setup() calls looks unbalanced. Thanks, Maciej
On Wed, Mar 05, 2025 at 05:27:19PM +0100, Maciej S. Szmigiero wrote: > On 5.03.2025 17:22, Peter Xu wrote: > > On Tue, Mar 04, 2025 at 11:03:52PM +0100, Maciej S. Szmigiero wrote: > > > @@ -509,6 +513,9 @@ static void vfio_save_cleanup(void *opaque) > > > Error *local_err = NULL; > > > int ret; > > > + /* Currently a NOP, done for symmetry with load_cleanup() */ > > > + vfio_multifd_cleanup(vbasedev); > > > > So I just notice this when looking at the cleanup path. It can be super > > confusing to cleanup the load threads in save().. IIUC we should drop it. > > > > It's a NOP since in the save operation migration->multifd is going to be > NULL so that "g_clear_pointer(&migration->multifd, vfio_multifd_free)" > inside it won't do anything. > > Cedric suggested calling it anyway since vfio_save_setup() calls > vfio_multifd_setup() so to be consistent we should call > vfio_multifd_cleanup() on cleanup too. > > I think calling it makes sense since otherwise that vfio_multifd_setup() > calls looks unbalanced. IMHO we should split vfio_multifd_setup() into two functions: - vfio_multifd_supported(): covering the first half of the fn, detect whether it's supported all over and return the result. - vfio_load_setup_multifd(): covering almost only vfio_multifd_new(). Then: - the 1st function should be used in both save_setup() and load_setup(). Meanwhile vfio_load_setup_multifd() should only be invoked in load_setup(). - we rename vfio_multifd_cleanup() to vfio_multifd_load_cleanup(), because that's really only about load.. - vfio_multifd_setup() (or after it renamed..) can drop the redundant alloc_multifd parameter.
On 3/5/25 17:39, Peter Xu wrote: > On Wed, Mar 05, 2025 at 05:27:19PM +0100, Maciej S. Szmigiero wrote: >> On 5.03.2025 17:22, Peter Xu wrote: >>> On Tue, Mar 04, 2025 at 11:03:52PM +0100, Maciej S. Szmigiero wrote: >>>> @@ -509,6 +513,9 @@ static void vfio_save_cleanup(void *opaque) >>>> Error *local_err = NULL; >>>> int ret; >>>> + /* Currently a NOP, done for symmetry with load_cleanup() */ >>>> + vfio_multifd_cleanup(vbasedev); >>> >>> So I just notice this when looking at the cleanup path. It can be super >>> confusing to cleanup the load threads in save().. IIUC we should drop it. >>> >> >> It's a NOP since in the save operation migration->multifd is going to be >> NULL so that "g_clear_pointer(&migration->multifd, vfio_multifd_free)" >> inside it won't do anything. >> >> Cedric suggested calling it anyway since vfio_save_setup() calls >> vfio_multifd_setup() so to be consistent we should call >> vfio_multifd_cleanup() on cleanup too. >> >> I think calling it makes sense since otherwise that vfio_multifd_setup() >> calls looks unbalanced. > > IMHO we should split vfio_multifd_setup() into two functions: > > - vfio_multifd_supported(): covering the first half of the fn, detect > whether it's supported all over and return the result. > > - vfio_load_setup_multifd(): covering almost only vfio_multifd_new(). > > Then: > > - the 1st function should be used in both save_setup() and > load_setup(). Meanwhile vfio_load_setup_multifd() should only be > invoked in load_setup(). > > - we rename vfio_multifd_cleanup() to vfio_multifd_load_cleanup(), > because that's really only about load.. > > - vfio_multifd_setup() (or after it renamed..) can drop the redundant > alloc_multifd parameter. I think this is close to the initial proposal of Maciej in v5 and I asked to do it that way in v6, moslty because we don't agree on the need of 'bool multifd_transfer'. Since it's minor, we can refactor afterwards. For now, let's keep it as it is please. Thanks, C.
On Wed, Mar 05, 2025 at 11:39:23AM -0500, Peter Xu wrote: > On Wed, Mar 05, 2025 at 05:27:19PM +0100, Maciej S. Szmigiero wrote: > > On 5.03.2025 17:22, Peter Xu wrote: > > > On Tue, Mar 04, 2025 at 11:03:52PM +0100, Maciej S. Szmigiero wrote: > > > > @@ -509,6 +513,9 @@ static void vfio_save_cleanup(void *opaque) > > > > Error *local_err = NULL; > > > > int ret; > > > > + /* Currently a NOP, done for symmetry with load_cleanup() */ > > > > + vfio_multifd_cleanup(vbasedev); > > > > > > So I just notice this when looking at the cleanup path. It can be super > > > confusing to cleanup the load threads in save().. IIUC we should drop it. > > > > > > > It's a NOP since in the save operation migration->multifd is going to be > > NULL so that "g_clear_pointer(&migration->multifd, vfio_multifd_free)" > > inside it won't do anything. > > > > Cedric suggested calling it anyway since vfio_save_setup() calls > > vfio_multifd_setup() so to be consistent we should call > > vfio_multifd_cleanup() on cleanup too. > > > > I think calling it makes sense since otherwise that vfio_multifd_setup() > > calls looks unbalanced. > > IMHO we should split vfio_multifd_setup() into two functions: > > - vfio_multifd_supported(): covering the first half of the fn, detect > whether it's supported all over and return the result. > > - vfio_load_setup_multifd(): covering almost only vfio_multifd_new(). > > Then: > > - the 1st function should be used in both save_setup() and > load_setup(). Meanwhile vfio_load_setup_multifd() should only be > invoked in load_setup(). > > - we rename vfio_multifd_cleanup() to vfio_multifd_load_cleanup(), > because that's really only about load.. > > - vfio_multifd_setup() (or after it renamed..) can drop the redundant > alloc_multifd parameter. PS: I'm OK if you and Cedric prefer having this discussed after merging the initial version, e.g. during hard-freeze. It would still be good to clean it up when at it though, if you agree.
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index dc1fe4e717a4..3c8286ae6230 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -453,6 +453,10 @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp) uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE; int ret; + if (!vfio_multifd_setup(vbasedev, false, errp)) { + return -EINVAL; + } + qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE); vfio_query_stop_copy_size(vbasedev, &stop_copy_size); @@ -509,6 +513,9 @@ static void vfio_save_cleanup(void *opaque) Error *local_err = NULL; int ret; + /* Currently a NOP, done for symmetry with load_cleanup() */ + vfio_multifd_cleanup(vbasedev); + /* * Changing device state from STOP_COPY to STOP can take time. Do it here, * after migration has completed, so it won't increase downtime. @@ -674,15 +681,28 @@ static void vfio_save_state(QEMUFile *f, void *opaque) static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp) { VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + int ret; - return vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, - vbasedev->migration->device_state, errp); + if (!vfio_multifd_setup(vbasedev, true, errp)) { + return -EINVAL; + } + + ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, + migration->device_state, errp); + if (ret) { + return ret; + } + + return 0; } static int vfio_load_cleanup(void *opaque) { VFIODevice *vbasedev = opaque; + vfio_multifd_cleanup(vbasedev); + vfio_migration_cleanup(vbasedev); trace_vfio_load_cleanup(vbasedev->name);