Message ID | 20230831125702.11263-5-avihaih@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Series | vfio/migration: Block VFIO migration with postcopy and background snapshot |
When trying to do a VFIO postcopy migration, we now get the expected internal error:

"unable to execute QEMU command 'migrate': 0000:b1:00.2: VFIO migration is not supported with postcopy migration"

Tested-by: Yanghang Liu <yanghliu@redhat.com>

Best Regards,
YangHang Liu

On Thu, Aug 31, 2023 at 8:57 PM Avihai Horon <avihaih@nvidia.com> wrote:
>
> VFIO migration is not compatible with postcopy migration. A VFIO device
> in the destination can't handle page faults for pages that have not been
> sent yet.
>
> Doing such migration will cause the VM to crash in the destination:
>
>   qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
>   qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
>   qemu: hardware error: vfio: DMA mapping failed, unable to continue
>
> To prevent this, block VFIO migration with postcopy migration.
>
> Reported-by: Yanghang Liu <yanghliu@redhat.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>  hw/vfio/migration.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 71855468fe..20994dc1d6 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)
>
>  /* ---------------------------------------------------------------------- */
>
> +static int vfio_save_prepare(void *opaque, Error **errp)
> +{
> +    VFIODevice *vbasedev = opaque;
> +
> +    /*
> +     * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
> +     */
> +    if (runstate_check(RUN_STATE_SAVE_VM)) {
> +        return 0;
> +    }
> +
> +    if (migrate_postcopy_ram()) {
> +        error_setg(
> +            errp, "%s: VFIO migration is not supported with postcopy migration",
> +            vbasedev->name);
> +        return -EOPNOTSUPP;
> +    }
> +
> +    return 0;
> +}
> +
>  static int vfio_save_setup(QEMUFile *f, void *opaque)
>  {
>      VFIODevice *vbasedev = opaque;
> @@ -640,6 +661,7 @@ static bool vfio_switchover_ack_needed(void *opaque)
>  }
>
>  static const SaveVMHandlers savevm_vfio_handlers = {
> +    .save_prepare = vfio_save_prepare,
>      .save_setup = vfio_save_setup,
>      .save_cleanup = vfio_save_cleanup,
>      .state_pending_estimate = vfio_state_pending_estimate,
> --
> 2.26.3
>
On Thu, Aug 31, 2023 at 03:57:01PM +0300, Avihai Horon wrote:
> VFIO migration is not compatible with postcopy migration. A VFIO device
> in the destination can't handle page faults for pages that have not been
> sent yet.
>
> Doing such migration will cause the VM to crash in the destination:
>
>   qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
>   qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
>   qemu: hardware error: vfio: DMA mapping failed, unable to continue
>
> To prevent this, block VFIO migration with postcopy migration.
>
> Reported-by: Yanghang Liu <yanghliu@redhat.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
>  hw/vfio/migration.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 71855468fe..20994dc1d6 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)
>
>  /* ---------------------------------------------------------------------- */
>
> +static int vfio_save_prepare(void *opaque, Error **errp)
> +{
> +    VFIODevice *vbasedev = opaque;
> +
> +    /*
> +     * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
> +     */
> +    if (runstate_check(RUN_STATE_SAVE_VM)) {
> +        return 0;
> +    }

Just purely curious: will it really work to save a snapshot for the GPU
assigned use case?

> +
> +    if (migrate_postcopy_ram()) {
> +        error_setg(
> +            errp, "%s: VFIO migration is not supported with postcopy migration",
> +            vbasedev->name);
> +        return -EOPNOTSUPP;
> +    }
> +
> +    return 0;
> +}
On 01/09/2023 18:51, Peter Xu wrote:
> On Thu, Aug 31, 2023 at 03:57:01PM +0300, Avihai Horon wrote:
>> VFIO migration is not compatible with postcopy migration. A VFIO device
>> in the destination can't handle page faults for pages that have not been
>> sent yet.
>>
>> Doing such migration will cause the VM to crash in the destination:
>>
>>   qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
>>   qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
>>   qemu: hardware error: vfio: DMA mapping failed, unable to continue
>>
>> To prevent this, block VFIO migration with postcopy migration.
>>
>> Reported-by: Yanghang Liu <yanghliu@redhat.com>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>>  hw/vfio/migration.c | 22 ++++++++++++++++++++++
>>  1 file changed, 22 insertions(+)
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 71855468fe..20994dc1d6 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)
>>
>>  /* ---------------------------------------------------------------------- */
>>
>> +static int vfio_save_prepare(void *opaque, Error **errp)
>> +{
>> +    VFIODevice *vbasedev = opaque;
>> +
>> +    /*
>> +     * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
>> +     */
>> +    if (runstate_check(RUN_STATE_SAVE_VM)) {
>> +        return 0;
>> +    }
> Just purely curious: will it really work to save a snapshot for the GPU
> assigned use case?

I have never tried that.
Adding Tarun, maybe he can answer that.

Thanks.

>> +
>> +    if (migrate_postcopy_ram()) {
>> +        error_setg(
>> +            errp, "%s: VFIO migration is not supported with postcopy migration",
>> +            vbasedev->name);
>> +        return -EOPNOTSUPP;
>> +    }
>> +
>> +    return 0;
>> +}
> --
> Peter Xu
>
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 71855468fe..20994dc1d6 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)
 
 /* ---------------------------------------------------------------------- */
 
+static int vfio_save_prepare(void *opaque, Error **errp)
+{
+    VFIODevice *vbasedev = opaque;
+
+    /*
+     * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
+     */
+    if (runstate_check(RUN_STATE_SAVE_VM)) {
+        return 0;
+    }
+
+    if (migrate_postcopy_ram()) {
+        error_setg(
+            errp, "%s: VFIO migration is not supported with postcopy migration",
+            vbasedev->name);
+        return -EOPNOTSUPP;
+    }
+
+    return 0;
+}
+
 static int vfio_save_setup(QEMUFile *f, void *opaque)
 {
     VFIODevice *vbasedev = opaque;
@@ -640,6 +661,7 @@ static bool vfio_switchover_ack_needed(void *opaque)
 }
 
 static const SaveVMHandlers savevm_vfio_handlers = {
+    .save_prepare = vfio_save_prepare,
     .save_setup = vfio_save_setup,
     .save_cleanup = vfio_save_cleanup,
     .state_pending_estimate = vfio_state_pending_estimate,
VFIO migration is not compatible with postcopy migration. A VFIO device
in the destination can't handle page faults for pages that have not been
sent yet.

Doing such migration will cause the VM to crash in the destination:

  qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
  qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
  qemu: hardware error: vfio: DMA mapping failed, unable to continue

To prevent this, block VFIO migration with postcopy migration.

Reported-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
 hw/vfio/migration.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
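
For readers who want to try the decision logic outside of QEMU, here is a
minimal standalone sketch of the check this patch adds. It is not QEMU code:
the mock_* types and functions are hypothetical stand-ins, the two booleans
replace runstate_check(RUN_STATE_SAVE_VM) and migrate_postcopy_ram(), and a
plain character buffer replaces Error **errp. The loop in
mock_run_save_prepare() is only an assumption about how a caller might
consult an optional .save_prepare hook; it is not taken from the migration
core.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the global migration state (not a QEMU type). */
struct mock_migration_state {
    bool saving_snapshot; /* stands in for runstate_check(RUN_STATE_SAVE_VM) */
    bool postcopy_ram;    /* stands in for migrate_postcopy_ram() */
};

/* Hypothetical stand-in for VFIODevice; only the name is needed here. */
struct mock_device {
    const char *name;
};

/* Optional per-device hook, modelled on the .save_prepare member. */
typedef int (*mock_save_prepare_fn)(const struct mock_device *dev,
                                    const struct mock_migration_state *s,
                                    char *err, size_t errlen);

struct mock_handlers {
    mock_save_prepare_fn save_prepare; /* may be NULL */
};

/* Same order of checks as vfio_save_prepare() in the patch. */
static int mock_vfio_save_prepare(const struct mock_device *dev,
                                  const struct mock_migration_state *s,
                                  char *err, size_t errlen)
{
    assert(dev != NULL && s != NULL);

    /* A snapshot does not use postcopy, so it is allowed either way. */
    if (s->saving_snapshot) {
        return 0;
    }

    if (s->postcopy_ram) {
        snprintf(err, errlen,
                 "%s: VFIO migration is not supported with postcopy migration",
                 dev->name);
        return -EOPNOTSUPP;
    }

    return 0;
}

/*
 * Assumed caller: run every registered hook and stop at the first failure,
 * so the migration is refused before any device state is sent.
 */
static int mock_run_save_prepare(const struct mock_handlers *handlers,
                                 const struct mock_device *devs, size_t n,
                                 const struct mock_migration_state *s,
                                 char *err, size_t errlen)
{
    for (size_t i = 0; i < n; i++) {
        if (!handlers[i].save_prepare) {
            continue; /* device without a prepare hook */
        }
        int ret = handlers[i].save_prepare(&devs[i], s, err, errlen);
        if (ret < 0) {
            return ret;
        }
    }
    return 0;
}
```

With postcopy_ram set and saving_snapshot clear, mock_run_save_prepare()
returns -EOPNOTSUPP and fills the buffer with the same message text as the
patch. With saving_snapshot set, or with postcopy_ram clear, it returns 0 and
leaves the buffer untouched.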