[V3,00/13] allow cpr-reboot for vfio

Message ID: 1707418446-134863-1-git-send-email-steven.sistare@oracle.com

Message

Steven Sistare Feb. 8, 2024, 6:53 p.m. UTC
Allow cpr-reboot for vfio if the guest is in the suspended runstate.  The
guest drivers' suspend methods flush outstanding requests and re-initialize
the devices, and thus there is no device state to save and restore.  The
user is responsible for suspending the guest before initiating cpr, such as
by issuing guest-suspend-ram to the qemu guest agent.

Most of the patches in this series enhance migration notifiers so they can
return an error status and message.  The last few patches register a notifier
for vfio that returns an error if the guest is not suspended.
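
As a rough illustration of the notifier change described above (this is not
the code from the patches, and every type and function name below is a
hypothetical stand-in rather than a QEMU identifier), a notifier that can
return an error status and message, plus a device-side notifier that refuses
to proceed unless the guest is suspended, might be sketched as:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not QEMU's actual types. */
typedef enum { GUEST_RUNNING, GUEST_SUSPENDED } GuestState;
typedef enum { EVENT_SETUP, EVENT_DONE, EVENT_FAILED } EventType;

typedef struct Notifier Notifier;
struct Notifier {
    /* Returns 0 on success; on failure returns -1 and fills errbuf. */
    int (*notify)(Notifier *n, EventType ev, char *errbuf, size_t len);
    Notifier *next;
};

/* Assumed global guest state, standing in for the real runstate query. */
static GuestState guest_state = GUEST_RUNNING;

/* Device-side check: object at setup time unless the guest is suspended. */
static int passthrough_notify(Notifier *n, EventType ev,
                              char *errbuf, size_t len)
{
    (void)n;
    if (ev == EVENT_SETUP && guest_state != GUEST_SUSPENDED) {
        snprintf(errbuf, len, "device requires a suspended guest");
        return -1;
    }
    return 0;
}

/* Walk the list and stop at the first notifier that reports an error. */
static int notify_all(Notifier *head, EventType ev,
                      char *errbuf, size_t len)
{
    for (Notifier *n = head; n != NULL; n = n->next) {
        int ret = n->notify(n, ev, errbuf, len);
        if (ret != 0) {
            return ret;
        }
    }
    return 0;
}

/* Caller side: the operation does not start if any notifier objects. */
static bool start_migration(Notifier *head, char *errbuf, size_t len)
{
    return notify_all(head, EVENT_SETUP, errbuf, len) == 0;
}
```

With a single passthrough_notify entry on the list, start_migration() fails
and reports the message while guest_state is GUEST_RUNNING, and succeeds once
it is set to GUEST_SUSPENDED.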

Changes in V3:
  * update to tip, add RB's
  * replace MigrationStatus with new enum MigrationEventType
  * simplify migrate_fd_connect error recovery
  * support vfio iommufd containers
  * add patches:
      migration: stop vm for cpr
      migration: update cpr-reboot description

Steve Sistare (13):
  notify: pass error to notifier with return
  migration: remove error from notifier data
  migration: convert to NotifierWithReturn
  migration: MigrationEvent for notifiers
  migration: remove postcopy_after_devices
  migration: MigrationNotifyFunc
  migration: per-mode notifiers
  migration: refactor migrate_fd_connect failures
  migration: notifier error checking
  migration: stop vm for cpr
  vfio: register container for cpr
  vfio: allow cpr-reboot migration if suspended
  migration: update cpr-reboot description

 hw/net/virtio-net.c                   |  13 ++--
 hw/vfio/common.c                      |   2 +-
 hw/vfio/container.c                   |  11 ++-
 hw/vfio/cpr.c                         |  39 +++++++++++
 hw/vfio/iommufd.c                     |   6 ++
 hw/vfio/meson.build                   |   1 +
 hw/vfio/migration.c                   |  15 ++--
 hw/vfio/trace-events                  |   2 +-
 hw/virtio/vhost-user.c                |  10 +--
 hw/virtio/virtio-balloon.c            |   3 +-
 include/hw/vfio/vfio-common.h         |   5 +-
 include/hw/vfio/vfio-container-base.h |   1 +
 include/hw/virtio/virtio-net.h        |   2 +-
 include/migration/misc.h              |  31 ++++++--
 include/qemu/notify.h                 |   8 ++-
 migration/migration.c                 | 128 +++++++++++++++++++++++-----------
 migration/migration.h                 |   2 -
 migration/postcopy-ram.c              |   3 +-
 migration/postcopy-ram.h              |   1 -
 migration/ram.c                       |   3 +-
 net/vhost-vdpa.c                      |  14 ++--
 qapi/migration.json                   |  36 ++++++----
 ui/spice-core.c                       |  17 +++--
 util/notify.c                         |   5 +-
 24 files changed, 244 insertions(+), 114 deletions(-)
 create mode 100644 hw/vfio/cpr.c

Comments

Peter Xu Feb. 20, 2024, 7:49 a.m. UTC | #1
On Thu, Feb 08, 2024 at 10:53:53AM -0800, Steve Sistare wrote:
> Allow cpr-reboot for vfio if the guest is in the suspended runstate.  The
> guest drivers' suspend methods flush outstanding requests and re-initialize
> the devices, and thus there is no device state to save and restore.  The
> user is responsible for suspending the guest before initiating cpr, such as
> by issuing guest-suspend-ram to the qemu guest agent.
> 
> Most of the patches in this series enhance migration notifiers so they can
> return an error status and message.  The last few patches register a notifier
> for vfio that returns an error if the guest is not suspended.
> 
> Changes in V3:
>   * update to tip, add RB's
>   * replace MigrationStatus with new enum MigrationEventType
>   * simplify migrate_fd_connect error recovery
>   * support vfio iommufd containers
>   * add patches:
>       migration: stop vm for cpr
>       migration: update cpr-reboot description

This doesn't apply to master anymore; please rebase when you repost, thanks.

Steven Sistare Feb. 20, 2024, 10:32 p.m. UTC | #2
On 2/20/2024 2:49 AM, Peter Xu wrote:
> On Thu, Feb 08, 2024 at 10:53:53AM -0800, Steve Sistare wrote:
>> Allow cpr-reboot for vfio if the guest is in the suspended runstate.  The
>> guest drivers' suspend methods flush outstanding requests and re-initialize
>> the devices, and thus there is no device state to save and restore.  The
>> user is responsible for suspending the guest before initiating cpr, such as
>> by issuing guest-suspend-ram to the qemu guest agent.
>>
>> Most of the patches in this series enhance migration notifiers so they can
>> return an error status and message.  The last few patches register a notifier
>> for vfio that returns an error if the guest is not suspended.
>>
>> Changes in V3:
>>   * update to tip, add RB's
>>   * replace MigrationStatus with new enum MigrationEventType
>>   * simplify migrate_fd_connect error recovery
>>   * support vfio iommufd containers
>>   * add patches:
>>       migration: stop vm for cpr
>>       migration: update cpr-reboot description
> 
> This doesn't apply to master anymore; please rebase when you repost, thanks.

Will do.  Before I do, any comments on "migration: update cpr-reboot description"?
After we converge on that short description, I will submit a longer treatment in
docs/devel/migration, which I see you have recently populated.

- Steve

Peter Xu Feb. 21, 2024, 2:13 a.m. UTC | #3
On Tue, Feb 20, 2024 at 05:32:34PM -0500, Steven Sistare wrote:
> On 2/20/2024 2:49 AM, Peter Xu wrote:
> > On Thu, Feb 08, 2024 at 10:53:53AM -0800, Steve Sistare wrote:
> >> Allow cpr-reboot for vfio if the guest is in the suspended runstate.  The
> >> guest drivers' suspend methods flush outstanding requests and re-initialize
> >> the devices, and thus there is no device state to save and restore.  The
> >> user is responsible for suspending the guest before initiating cpr, such as
> >> by issuing guest-suspend-ram to the qemu guest agent.
> >>
> >> Most of the patches in this series enhance migration notifiers so they can
> >> return an error status and message.  The last few patches register a notifier
> >> for vfio that returns an error if the guest is not suspended.
> >>
> >> Changes in V3:
> >>   * update to tip, add RB's
> >>   * replace MigrationStatus with new enum MigrationEventType
> >>   * simplify migrate_fd_connect error recovery
> >>   * support vfio iommufd containers
> >>   * add patches:
> >>       migration: stop vm for cpr
> >>       migration: update cpr-reboot description
> > 
> > This doesn't apply to master anymore; please rebase when you repost, thanks.
> 
> Will do.  Before I do, any comments on "migration: update cpr-reboot description"?
> After we converge on that short description, I will submit a longer treatment in
> docs/devel/migration, which I see you have recently populated.

Sounds good; yes I hope we have a file there, as it'll pop up later in
https://www.qemu.org/docs/master/devel/migration/.

You can add a short sentence to forbid postcopy if that's the plan.  Other
than that it looks good.

Thanks,