mbox series

[v2,0/3] Allow the group FD to remain open when unplugging a device

Message ID 0-v2-15417f29324e+1c-vfio_group_disassociate_jgg@nvidia.com (mailing list archive)
Headers show
Series Allow the group FD to remain open when unplugging a device | expand

Message

Jason Gunthorpe Oct. 7, 2022, 2:04 p.m. UTC
Testing has shown that virtnodedevd will leave the group FD open for long
periods, even after all the cdevs have been destroyed. This blocks
destruction of the VFIO device and is undesirable.

That approach was selected to accomodate SPAPR which has an broken
lifecyle model for the iommu_group. However, we can accomodate SPAPR by
realizing that it doesn't use the iommu core at all, so rules about
iommu_group lifetime do not apply to it.

Giving the KVM code its own kref on the iommu_group allows the VFIO core
code to release its iommu_group reference earlier and we can remove the
sleep that only existed for SPAPR.

v2:
 - Use vfio_file_is_group() istead of open coding
 - Do not delete vfio_group_detach_container() from
   vfio_group_fops_release()
v1: https://lore.kernel.org/r/0-v1-90bf0950c42c+39-vfio_group_disassociate_jgg@nvidia.com

Jason Gunthorpe (3):
  vfio: Add vfio_file_is_group()
  vfio: Hold a reference to the iommu_group in kvm for SPAPR
  vfio: Make the group FD disassociate from the iommu_group

 drivers/vfio/pci/vfio_pci_core.c |  2 +-
 drivers/vfio/vfio.h              |  1 -
 drivers/vfio/vfio_main.c         | 85 +++++++++++++++++++++++---------
 include/linux/vfio.h             |  1 +
 virt/kvm/vfio.c                  | 45 ++++++++++++-----
 5 files changed, 95 insertions(+), 39 deletions(-)


base-commit: c82e81ab2569559ad873b3061217c2f37560682b

Comments

Matthew Rosato Oct. 7, 2022, 3:12 p.m. UTC | #1
On 10/7/22 10:04 AM, Jason Gunthorpe wrote:
> Testing has shown that virtnodedevd will leave the group FD open for long
> periods, even after all the cdevs have been destroyed. This blocks
> destruction of the VFIO device and is undesirable.
> 
> That approach was selected to accomodate SPAPR which has an broken
> lifecyle model for the iommu_group. However, we can accomodate SPAPR by
> realizing that it doesn't use the iommu core at all, so rules about
> iommu_group lifetime do not apply to it.
> 
> Giving the KVM code its own kref on the iommu_group allows the VFIO core
> code to release its iommu_group reference earlier and we can remove the
> sleep that only existed for SPAPR.
> 
> v2:
>  - Use vfio_file_is_group() istead of open coding
>  - Do not delete vfio_group_detach_container() from
>    vfio_group_fops_release()
> v1: https://lore.kernel.org/r/0-v1-90bf0950c42c+39-vfio_group_disassociate_jgg@nvidia.com

OK, just wanted to close the loop here.  Besides my testing v1+fixup yesterday which looked good, I also just now tested vfio-pci on s390 intentionally leaving the device bound to the vfio-pci driver over multiple VM starts/shutdowns (which was broken in v1).  This is working for me now too.  Thanks Jason!

> 
> Jason Gunthorpe (3):
>   vfio: Add vfio_file_is_group()
>   vfio: Hold a reference to the iommu_group in kvm for SPAPR
>   vfio: Make the group FD disassociate from the iommu_group
> 
>  drivers/vfio/pci/vfio_pci_core.c |  2 +-
>  drivers/vfio/vfio.h              |  1 -
>  drivers/vfio/vfio_main.c         | 85 +++++++++++++++++++++++---------
>  include/linux/vfio.h             |  1 +
>  virt/kvm/vfio.c                  | 45 ++++++++++++-----
>  5 files changed, 95 insertions(+), 39 deletions(-)
> 
> 
> base-commit: c82e81ab2569559ad873b3061217c2f37560682b
Alex Williamson Oct. 7, 2022, 7:53 p.m. UTC | #2
On Fri,  7 Oct 2022 11:04:38 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> Testing has shown that virtnodedevd will leave the group FD open for long
> periods, even after all the cdevs have been destroyed. This blocks
> destruction of the VFIO device and is undesirable.
> 
> That approach was selected to accomodate SPAPR which has an broken
> lifecyle model for the iommu_group. However, we can accomodate SPAPR by
> realizing that it doesn't use the iommu core at all, so rules about
> iommu_group lifetime do not apply to it.
> 
> Giving the KVM code its own kref on the iommu_group allows the VFIO core
> code to release its iommu_group reference earlier and we can remove the
> sleep that only existed for SPAPR.
> 
> v2:
>  - Use vfio_file_is_group() istead of open coding
>  - Do not delete vfio_group_detach_container() from
>    vfio_group_fops_release()
> v1: https://lore.kernel.org/r/0-v1-90bf0950c42c+39-vfio_group_disassociate_jgg@nvidia.com
> 
> Jason Gunthorpe (3):
>   vfio: Add vfio_file_is_group()
>   vfio: Hold a reference to the iommu_group in kvm for SPAPR
>   vfio: Make the group FD disassociate from the iommu_group
> 
>  drivers/vfio/pci/vfio_pci_core.c |  2 +-
>  drivers/vfio/vfio.h              |  1 -
>  drivers/vfio/vfio_main.c         | 85 +++++++++++++++++++++++---------
>  include/linux/vfio.h             |  1 +
>  virt/kvm/vfio.c                  | 45 ++++++++++++-----
>  5 files changed, 95 insertions(+), 39 deletions(-)

Applied to vfio next branch for v6.1, along with my trivial follow-up.
Thanks for the group effort on this one!

Alex