
[RFC,v1,0/4] Add migration support for VFIO device

Message ID 1539713558-2453-1-git-send-email-kwankhede@nvidia.com (mailing list archive)

Message

Kirti Wankhede Oct. 16, 2018, 6:12 p.m. UTC
Add migration support for VFIO device

This patch set includes the following patches:
- Define KABI for VFIO device for migration support.
- Generic migration functionality for VFIO device.
  * This patch set adds functionality only for PCI devices, but can be
    extended to other VFIO devices.
  * Added all the basic functions required for pre-copy, stop-and-copy and
    resume phases of migration.
  * Added a state change notifier; from that notifier function, the VFIO
    device's state change is conveyed to the VFIO vendor driver.
  * During the save setup phase and the resume/load setup phase, the
    migration region is queried from the vendor driver and mmapped by QEMU.
    This region is used to read data from and write data to the vendor
    driver.
  * .save_live_pending, .save_live_iterate and .is_active_iterate are
    implemented to use QEMU's functionality of iteration during pre-copy
    phase.
  * In .save_live_complete_precopy, that is, in the stop-and-copy phase,
    data is read from the vendor driver iteratively until the pending
    bytes reported by the vendor driver reach zero.
  * .save_cleanup and .load_cleanup are implemented to unmap the migration
    region that was set up during the setup phase.
  * Added function to get dirty pages bitmap from vendor driver.
- Add vfio_listerner_log_sync to mark dirty pages.
- Make VFIO PCI device migration capable.
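
To illustrate the stop-and-copy step from the list above, here is a minimal,
self-contained sketch of the "read until pending bytes reach zero" loop. It
is not code from this series: MockVendor, mock_pending_bytes,
mock_fill_region, drain_device_state and REGION_SIZE are all hypothetical
names, and a plain in-memory buffer stands in for the mmapped migration
region and for the migration stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a vendor driver; not part of the patch set. */
typedef struct MockVendor {
    const uint8_t *state;  /* device state that has to be transferred */
    size_t pending;        /* bytes the migration code has not read yet */
    size_t offset;         /* read position inside state */
} MockVendor;

/* Assumed size of the window that plays the role of the migration region. */
enum { REGION_SIZE = 8 };

/* Report how many bytes the mock driver still has to hand over. */
static size_t mock_pending_bytes(const MockVendor *v)
{
    return v->pending;
}

/* Copy the next chunk (at most REGION_SIZE bytes) into the region. */
static size_t mock_fill_region(MockVendor *v, uint8_t *region)
{
    size_t n = v->pending < REGION_SIZE ? v->pending : REGION_SIZE;

    memcpy(region, v->state + v->offset, n);
    v->offset += n;
    v->pending -= n;
    return n;
}

/*
 * Stop-and-copy sketch: keep reading chunks until the driver reports zero
 * pending bytes. Returns the number of bytes written to out, or (size_t)-1
 * if out is too small to hold the next chunk.
 */
static size_t drain_device_state(MockVendor *v, uint8_t *out, size_t out_cap)
{
    uint8_t region[REGION_SIZE];
    size_t total = 0;
    size_t pending;

    assert(v != NULL && out != NULL);

    while ((pending = mock_pending_bytes(v)) > 0) {
        size_t want = pending < REGION_SIZE ? pending : REGION_SIZE;

        if (want > out_cap - total) {
            return (size_t)-1;  /* check before consuming driver data */
        }
        size_t n = mock_fill_region(v, region);
        memcpy(out + total, region, n);
        total += n;
    }
    return total;
}
```

The point of the sketch is only the loop condition: the guest is stopped at
this stage, so the pending count can only shrink and the loop terminates
once the driver has nothing left to report.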

Thanks,
Kirti

Kirti Wankhede (4):
  VFIO KABI for migration interface
  Add migration functions for VFIO devices
  Add vfio_listerner_log_sync to mark dirty pages
  Make vfio-pci device migration capable.

 hw/vfio/Makefile.objs         |   2 +-
 hw/vfio/common.c              |  32 ++
 hw/vfio/migration.c           | 716 ++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.c                 |  13 +-
 include/hw/vfio/vfio-common.h |  23 ++
 linux-headers/linux/vfio.h    |  91 ++++++
 6 files changed, 869 insertions(+), 8 deletions(-)
 create mode 100644 hw/vfio/migration.c

Comments

Cornelia Huck Oct. 17, 2018, 8:49 a.m. UTC | #1
On Tue, 16 Oct 2018 23:42:34 +0530
Kirti Wankhede <kwankhede@nvidia.com> wrote:

> Add migration support for VFIO device

I'd love to take a deeper look at this; but sadly, I'm currently low on
spare time, and therefore will only add some general remarks.

> 
> This Patch set include patches as below:
> - Define KABI for VFIO device for migration support.
> - Generic migration functionality for VFIO device.
>   * This patch set adds functionality only for PCI devices, but can be
>     extended to other VFIO devices.

I've been thinking about how best to add migration to vfio-ccw; it
might be quite different from what we do for vfio-pci, but I hope that
we can still use some common infrastructure.

Basically, we probably need something like the following:

- Wait until outstanding channel programs are finished, and don't allow
  new ones. Don't report back to userspace (or maybe do; we just don't
  want to fence off too many new requests.)
- Collect subchannel and related state, copy it over. That probably
  fits in with the interface you're proposing.
- Re-setup on the other side and get going again. The problem here
  might be state that the vfio-ccw driver can't be aware of (like
  replaying storage server configurations). Maybe we can send CRWs
  (notifications) to the guest and leverage the existing suspend/resume
  driver code for channel devices.

For vfio-ap, I frankly have no idea how migration will work, given the
setup with the matrix device and the interesting information in the SIE
control blocks (but maybe it can just replay from the info in the
matrix device?) Keys etc. are likely to be the bigger problem here.

Cc:ing some s390 folks for awareness.

>   * Added all the basic functions required for pre-copy, stop-and-copy and
>     resume phases of migration.
>   * Added state change notifier and from that notifier function, VFIO
>     device's state changed is conveyed to VFIO vendor driver.

I assume that this will, in general, be transparent for the guest; but
maybe there'll be need for some interaction for special cases?

Kirti Wankhede Oct. 17, 2018, 8:59 p.m. UTC | #2
On 10/17/2018 2:19 PM, Cornelia Huck wrote:
> On Tue, 16 Oct 2018 23:42:34 +0530
> Kirti Wankhede <kwankhede@nvidia.com> wrote:
> 
>> Add migration support for VFIO device
> 
> I'd love to take a deeper look at this; but sadly, I'm currently low on
> spare time, and therefore will only add some general remarks.
> 

Thanks. Those would be really helpful for building common infrastructure
that can be used across different types of vfio devices.

>>
>> This Patch set include patches as below:
>> - Define KABI for VFIO device for migration support.
>> - Generic migration functionality for VFIO device.
>>   * This patch set adds functionality only for PCI devices, but can be
>>     extended to other VFIO devices.
> 
> I've been thinking about how best to add migration to vfio-ccw; it
> might be quite different from what we do for vfio-pci, but I hope that
> we can still use some common infrastructure.
> 
> Basically, we probably need something like the following:
> 
> - Wait until outstanding channel programs are finished, and don't allow
>   new ones. Don't report back to userspace (or maybe do; we just don't
>   want to fence of too many new requests.)
> - Collect subchannel and related state, copy it over. That probably
>   fits in with the interface you're proposing.
> - Re-setup on the other side and get going again. The problem here
>   might be state that the vfio-ccw driver can't be aware of (like
>   replaying storage server configurations). Maybe we can send CRWs
>   (notifications) to the guest and leverage the existing suspend/resume
>   driver code for channel devices.
> 
> For vfio-ap, I frankly have no idea how migration will work, given the
> setup with the matrix device and the interesting information in the SIE
> control blocks (but maybe it just can replay from the info in the
> matrix device?) Keys etc. are likely to be the bigger problem here.
> 
> Cc:ing some s390 folks for awareness.
> 
>>   * Added all the basic functions required for pre-copy, stop-and-copy and
>>     resume phases of migration.
>>   * Added state change notifier and from that notifier function, VFIO
>>     device's state changed is conveyed to VFIO vendor driver.
> 
> I assume that this will, in general, be transparent for the guest; but
> maybe there'll be need for some interaction for special cases?

Ideally, this should be transparent to the guest for live migration.
The guest shouldn't see any functional change during live migration.

Thanks,
Kirti

Tian, Kevin Oct. 18, 2018, 2:41 a.m. UTC | #3
> From: Kirti Wankhede
> Sent: Wednesday, October 17, 2018 2:13 AM
> 
> Add migration support for VFIO device
> 
> This Patch set include patches as below:
> - Define KABI for VFIO device for migration support.
> - Generic migration functionality for VFIO device.
>   * This patch set adds functionality only for PCI devices, but can be
>     extended to other VFIO devices.
>   * Added all the basic functions required for pre-copy, stop-and-copy and
>     resume phases of migration.
>   * Added state change notifier and from that notifier function, VFIO
>     device's state changed is conveyed to VFIO vendor driver.
>   * During save setup phase and resume/load setup phase, migration region
>     is queried from vendor driver and is mmaped by QEMU. This region is
>     used to read/write data from and to vendor driver.
>   * .save_live_pending, .save_live_iterate and .is_active_iterate are
>     implemented to use QEMU's functionality of iteration during pre-copy
>     phase.
>   * In .save_live_complete_precopy, that is in stop-and-copy phase,
>     iteration to read data from vendor driver is implemented till pending
>     bytes returned by vendor driver are not zero.
>   * .save_cleanup and .load_cleanup are implemented to unmap migration
>     region that was setup duing setup phase.
>   * Added function to get dirty pages bitmap from vendor driver.
> - Add vfio_listerner_log_sync to mark dirty pages.
> - Make VFIO PCI device migration capable.
> 

I didn't see a kernel-side change implementing the new KABI. If there is
one, can you point me to it?

By the way, how is this work related to the previous effort on adding live
migration to VFIO?

https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg01134.html
https://www.spinics.net/linux/fedora/libvir/msg170669.html 

Thanks
Kevin