diff mbox series

[2/2] vfio/ccw: Don't initialize HOST_IOMMU_DEVICE with mdev

Message ID 20240722070713.1342711-3-zhenzhong.duan@intel.com (mailing list archive)
State New, archived
Headers show
Series Don't initialize HOST_IOMMU_DEVICE with mdev | expand

Commit Message

Zhenzhong Duan July 22, 2024, 7:07 a.m. UTC
mdevs aren't "physical" devices and when asking for backing IOMMU info,
it fails the entire provisioning of the guest. Fix that by setting
vbasedev->mdev true so skipping HostIOMMUDevice initialization in the
presence of mdevs.

Fixes: 930589520128 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/vfio/ccw.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Joao Martins July 22, 2024, 9:18 a.m. UTC | #1
On 22/07/2024 08:07, Zhenzhong Duan wrote:
> mdevs aren't "physical" devices and when asking for backing IOMMU info,
> it fails the entire provisioning of the guest. Fix that by setting
> vbasedev->mdev true so skipping HostIOMMUDevice initialization in the
> presence of mdevs.
> 
> Fixes: 930589520128 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>

> ---
>  hw/vfio/ccw.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> index 1f8e1272c7..70934b01d5 100644
> --- a/hw/vfio/ccw.c
> +++ b/hw/vfio/ccw.c
> @@ -675,6 +675,9 @@ static void vfio_ccw_instance_init(Object *obj)
>      VFIOCCWDevice *vcdev = VFIO_CCW(obj);
>      VFIODevice *vbasedev = &vcdev->vdev;
>  
> +    /* CCW device is mdev type device */
> +    vbasedev->mdev = true;
> +
>      /*
>       * All vfio-ccw devices are believed to operate in a way compatible with
>       * discarding of memory in RAM blocks, ie. pages pinned in the host are
Eric Farman July 22, 2024, 2:57 p.m. UTC | #2
On Mon, 2024-07-22 at 15:07 +0800, Zhenzhong Duan wrote:
> mdevs aren't "physical" devices and when asking for backing IOMMU info,
> it fails the entire provisioning of the guest. Fix that by setting
> vbasedev->mdev true so skipping HostIOMMUDevice initialization in the
> presence of mdevs.

Hmm, picking the two commits that Cedric mentioned in his cover-letter reply [1] doesn't "fail the entire provisioning of the guest" for me.

Applying this patch on top of that causes the call from vfio_attach_device() to hiod_legacy_vfio_realize() to be skipped, which seems odd. What am I missing?

[1] https://lore.kernel.org/qemu-devel/4c9a184b-514c-4276-95ca-9ed86623b9a4@redhat.com/

> 
> Fixes: 930589520128 ("vfio/iommufd: Implement HostIOMMUDeviceClass::realize() handler")
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>  hw/vfio/ccw.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> index 1f8e1272c7..70934b01d5 100644
> --- a/hw/vfio/ccw.c
> +++ b/hw/vfio/ccw.c
> @@ -675,6 +675,9 @@ static void vfio_ccw_instance_init(Object *obj)
>      VFIOCCWDevice *vcdev = VFIO_CCW(obj);
>      VFIODevice *vbasedev = &vcdev->vdev;
>  
> +    /* CCW device is mdev type device */
> +    vbasedev->mdev = true;
> +
>      /*
>       * All vfio-ccw devices are believed to operate in a way compatible with
>       * discarding of memory in RAM blocks, ie. pages pinned in the host are
Joao Martins July 22, 2024, 3:09 p.m. UTC | #3
On 22/07/2024 15:57, Eric Farman wrote:
> On Mon, 2024-07-22 at 15:07 +0800, Zhenzhong Duan wrote:
>> mdevs aren't "physical" devices and when asking for backing IOMMU info,
>> it fails the entire provisioning of the guest. Fix that by setting
>> vbasedev->mdev true so skipping HostIOMMUDevice initialization in the
>> presence of mdevs.
> 
> Hmm, picking the two commits that Cedric mentioned in his cover-letter reply [1] doesn't "fail the entire provisioning of the guest" for me.
> 
> Applying this patch on top of that causes the call from vfio_attach_device() to hiod_legacy_vfio_realize() to be skipped, which seems odd. What am I missing?
> 
> [1] https://lore.kernel.org/qemu-devel/4c9a184b-514c-4276-95ca-9ed86623b9a4@redhat.com/
>

If you are using IOMMUFD it will fail the entire provisioning i.e. GET_HW_INFO
fails because there's no actual device/IOMMU you can probe hardware information
from and you can't start a guest. This happened at least for me in x86 vfio-pci
mdevs (or at least I reproduced it when trying to test mdev_tty)

But if you don't support IOMMUFD, then it probably makes no difference as type1
doesn't do anything particularly special besides initializing some static data.
The realize is skipped because you technically don't have a physical host IOMMU
directly behind the mdev, but rather some parent function related software
entity doing that for you.

Zhengzhong noticed there were some other mdevs aside from vfio-pci and in an
attempt to prevent regression elsewhere it posted for the other mdevs in qemu.

	Joao
Cédric Le Goater July 22, 2024, 3:36 p.m. UTC | #4
On 7/22/24 17:09, Joao Martins wrote:
> On 22/07/2024 15:57, Eric Farman wrote:
>> On Mon, 2024-07-22 at 15:07 +0800, Zhenzhong Duan wrote:
>>> mdevs aren't "physical" devices and when asking for backing IOMMU info,
>>> it fails the entire provisioning of the guest. Fix that by setting
>>> vbasedev->mdev true so skipping HostIOMMUDevice initialization in the
>>> presence of mdevs.
>>
>> Hmm, picking the two commits that Cedric mentioned in his cover-letter reply [1] doesn't "fail the entire provisioning of the guest" for me.
>>
>> Applying this patch on top of that causes the call from vfio_attach_device() to hiod_legacy_vfio_realize() to be skipped, which seems odd. What am I missing?
>>
>> [1] https://lore.kernel.org/qemu-devel/4c9a184b-514c-4276-95ca-9ed86623b9a4@redhat.com/
>>
> 
> If you are using IOMMUFD it will fail the entire provisioning i.e. GET_HW_INFO
> fails because there's no actual device/IOMMU you can probe hardware information
> from and you can't start a guest. This happened at least for me in x86 vfio-pci
> mdevs (or at least I reproduced it when trying to test mdev_tty)
> 
> But if you don't support IOMMUFD, then it probably makes no difference as type1
> doesn't do anything particularly special besides initializing some static data.
> The realize is skipped because you technically don't have a physical host IOMMU
> directly behind the mdev, but rather some parent function related software
> entity doing that for you.
> 
> Zhengzhong noticed there were some other mdevs aside from vfio-pci and in an
> attempt to prevent regression elsewhere it posted for the other mdevs in qemu.


yes. I confirm with :

   -device vfio-ap,id=hostdev0,sysfsdev=/sys/bus/mdev/devices/8eb8351a-e656-4187-b773-fea4e926310d,iommufd=iommufd0 \
   -object iommufd,id=iommufd0 \
   -trace 'iommufd*'

iommufd_cdev_getfd  /dev/vfio/devices/vfio4 (fd=28)
iommufd_backend_connect fd=27 owned=1 users=1
iommufd_cdev_connect_and_bind  [iommufd=27] Successfully bound device 8eb8351a-e656-4187-b773-fea4e926310d (fd=28): output devid=1
iommufd_backend_alloc_ioas  iommufd=27 ioas=2
iommufd_cdev_alloc_ioas  [iommufd=27] new IOMMUFD container with ioasid=2
iommufd_cdev_attach_ioas_hwpt  [iommufd=27] Successfully attached device 8eb8351a-e656-4187-b773-fea4e926310d (28) to id=2
iommufd_backend_map_dma  iommufd=27 ioas=2 iova=0x0 size=0x200000000 addr=0x3fd9ff00000 readonly=0 (0)
iommufd_cdev_device_info  8eb8351a-e656-4187-b773-fea4e926310d (28) num_irqs=1 num_regions=0 flags=33
iommufd_cdev_detach_ioas_hwpt  [iommufd=27] Successfully detached 8eb8351a-e656-4187-b773-fea4e926310d
iommufd_backend_unmap_dma  iommufd=27 ioas=2 iova=0x0 size=0x200000000 (0)
iommufd_backend_free_id  iommufd=27 id=2 (0)
iommufd_backend_disconnect fd=-1 users=0

qemu-kvm: -device vfio-ap,id=hostdev0,sysfsdev=/sys/bus/mdev/devices/8eb8351a-e656-4187-b773-fea4e926310d,iommufd=iommufd0: vfio 8eb8351a-e656-4187-b773-fea4e926310d: Failed to get hardware info: No such file or directory



Thanks,

C.
Eric Farman July 22, 2024, 5:45 p.m. UTC | #5
On Mon, 2024-07-22 at 17:36 +0200, Cédric Le Goater wrote:
> On 7/22/24 17:09, Joao Martins wrote:
> > On 22/07/2024 15:57, Eric Farman wrote:
> > > On Mon, 2024-07-22 at 15:07 +0800, Zhenzhong Duan wrote:
> > > > mdevs aren't "physical" devices and when asking for backing IOMMU info,
> > > > it fails the entire provisioning of the guest. Fix that by setting
> > > > vbasedev->mdev true so skipping HostIOMMUDevice initialization in the
> > > > presence of mdevs.
> > > 
> > > Hmm, picking the two commits that Cedric mentioned in his cover-letter reply [1] doesn't "fail the entire provisioning of the guest" for me.
> > > 
> > > Applying this patch on top of that causes the call from vfio_attach_device() to hiod_legacy_vfio_realize() to be skipped, which seems odd. What am I missing?
> > > 
> > > [1] https://lore.kernel.org/qemu-devel/4c9a184b-514c-4276-95ca-9ed86623b9a4@redhat.com/
> > > 
> > 
> > If you are using IOMMUFD
> > 

Which is not the case in defconfig.

> >  it will fail the entire provisioning i.e. GET_HW_INFO
> > fails because there's no actual device/IOMMU you can probe hardware information
> > from and you can't start a guest. This happened at least for me in x86 vfio-pci
> > mdevs (or at least I reproduced it when trying to test mdev_tty)
> > 
> > But if you don't support IOMMUFD, then it probably makes no difference as type1
> > doesn't do anything particularly special besides initializing some static data.

This was my concern. The static data doesn't look particularly exciting, but it does seem strange to
be skipping over it in the non-iommufd case now.

> > The realize is skipped because you technically don't have a physical host IOMMU
> > directly behind the mdev, but rather some parent function related software
> > entity doing that for you.
> > 
> > Zhengzhong noticed there were some other mdevs aside from vfio-pci and in an
> > attempt to prevent regression elsewhere it posted for the other mdevs in qemu.
> 
> 
> yes. I confirm with :
> 
>    -device vfio-ap,id=hostdev0,sysfsdev=/sys/bus/mdev/devices/8eb8351a-e656-4187-b773-fea4e926310d,iommufd=iommufd0 \
>    -object iommufd,id=iommufd0 \
>    -trace 'iommufd*'
> 
> iommufd_cdev_getfd  /dev/vfio/devices/vfio4 (fd=28)

Ah, right... Need to enable iommufd AND vfio_device_cdev to get into this potential situation. I
guess this is better than random failures down the road.

Acked-by: Eric Farman <farman@linux.ibm.com>

> iommufd_backend_connect fd=27 owned=1 users=1
> iommufd_cdev_connect_and_bind  [iommufd=27] Successfully bound device 8eb8351a-e656-4187-b773-fea4e926310d (fd=28): output devid=1
> iommufd_backend_alloc_ioas  iommufd=27 ioas=2
> iommufd_cdev_alloc_ioas  [iommufd=27] new IOMMUFD container with ioasid=2
> iommufd_cdev_attach_ioas_hwpt  [iommufd=27] Successfully attached device 8eb8351a-e656-4187-b773-fea4e926310d (28) to id=2
> iommufd_backend_map_dma  iommufd=27 ioas=2 iova=0x0 size=0x200000000 addr=0x3fd9ff00000 readonly=0 (0)
> iommufd_cdev_device_info  8eb8351a-e656-4187-b773-fea4e926310d (28) num_irqs=1 num_regions=0 flags=33
> iommufd_cdev_detach_ioas_hwpt  [iommufd=27] Successfully detached 8eb8351a-e656-4187-b773-fea4e926310d
> iommufd_backend_unmap_dma  iommufd=27 ioas=2 iova=0x0 size=0x200000000 (0)
> iommufd_backend_free_id  iommufd=27 id=2 (0)
> iommufd_backend_disconnect fd=-1 users=0
> 
> qemu-kvm: -device vfio-ap,id=hostdev0,sysfsdev=/sys/bus/mdev/devices/8eb8351a-e656-4187-b773-fea4e926310d,iommufd=iommufd0: vfio 8eb8351a-e656-4187-b773-fea4e926310d: Failed to get hardware info: No such file or directory
> 
> 
> 
> Thanks,
> 
> C.
> 
>
Zhenzhong Duan July 23, 2024, 2:52 a.m. UTC | #6
>-----Original Message-----
>From: Eric Farman <farman@linux.ibm.com>
>Subject: Re: [PATCH 2/2] vfio/ccw: Don't initialize HOST_IOMMU_DEVICE
>with mdev
>
>On Mon, 2024-07-22 at 17:36 +0200, Cédric Le Goater wrote:
>> On 7/22/24 17:09, Joao Martins wrote:
>> > On 22/07/2024 15:57, Eric Farman wrote:
>> > > On Mon, 2024-07-22 at 15:07 +0800, Zhenzhong Duan wrote:
>> > > > mdevs aren't "physical" devices and when asking for backing IOMMU
>info,
>> > > > it fails the entire provisioning of the guest. Fix that by setting
>> > > > vbasedev->mdev true so skipping HostIOMMUDevice initialization in
>the
>> > > > presence of mdevs.
>> > >
>> > > Hmm, picking the two commits that Cedric mentioned in his cover-
>letter reply [1] doesn't "fail the entire provisioning of the guest" for me.
>> > >
>> > > Applying this patch on top of that causes the call from
>vfio_attach_device() to hiod_legacy_vfio_realize() to be skipped, which
>seems odd. What am I missing?
>> > >
>> > > [1] https://lore.kernel.org/qemu-devel/4c9a184b-514c-4276-95ca-
>9ed86623b9a4@redhat.com/
>> > >
>> >
>> > If you are using IOMMUFD
>> >
>
>Which is not the case in defconfig.
>
>> >  it will fail the entire provisioning i.e. GET_HW_INFO
>> > fails because there's no actual device/IOMMU you can probe hardware
>information
>> > from and you can't start a guest. This happened at least for me in x86
>vfio-pci
>> > mdevs (or at least I reproduced it when trying to test mdev_tty)
>> >
>> > But if you don't support IOMMUFD, then it probably makes no difference
>as type1
>> > doesn't do anything particularly special besides initializing some static
>data.
>
>This was my concern. The static data doesn't look particularly exciting, but it
>does seem strange to
>be skipping over it in the non-iommufd case now.

Thanks Joao and Cédric for helping explain and confirm.

Yes, after this fix HostIOMMUDevice is totally bypassed for mdev even in non-iommufd case.
In non-iommufd case, the only supported HostIOMMUDevice capability is aw_bits which is calculated through bcontainer->iova_ranges which is always NULL for mdev.
So HOST_IOMMU_DEVICE_CAP_AW_BITS_MAX(64) is returned which is larger enough that vIOMMU can safely ignore. Then we can safely bypass entire HostIOMMUDevice for mdev.

Thanks
Zhenzhong

>
>> > The realize is skipped because you technically don't have a physical host
>IOMMU
>> > directly behind the mdev, but rather some parent function related
>software
>> > entity doing that for you.
>> >
>> > Zhengzhong noticed there were some other mdevs aside from vfio-pci
>and in an
>> > attempt to prevent regression elsewhere it posted for the other mdevs in
>qemu.
>>
>>
>> yes. I confirm with :
>>
>>    -device vfio-ap,id=hostdev0,sysfsdev=/sys/bus/mdev/devices/8eb8351a-
>e656-4187-b773-fea4e926310d,iommufd=iommufd0 \
>>    -object iommufd,id=iommufd0 \
>>    -trace 'iommufd*'
>>
>> iommufd_cdev_getfd  /dev/vfio/devices/vfio4 (fd=28)
>
>Ah, right... Need to enable iommufd AND vfio_device_cdev to get into this
>potential situation. I
>guess this is better than random failures down the road.
>
>Acked-by: Eric Farman <farman@linux.ibm.com>
>
>> iommufd_backend_connect fd=27 owned=1 users=1
>> iommufd_cdev_connect_and_bind  [iommufd=27] Successfully bound
>device 8eb8351a-e656-4187-b773-fea4e926310d (fd=28): output devid=1
>> iommufd_backend_alloc_ioas  iommufd=27 ioas=2
>> iommufd_cdev_alloc_ioas  [iommufd=27] new IOMMUFD container with
>ioasid=2
>> iommufd_cdev_attach_ioas_hwpt  [iommufd=27] Successfully attached
>device 8eb8351a-e656-4187-b773-fea4e926310d (28) to id=2
>> iommufd_backend_map_dma  iommufd=27 ioas=2 iova=0x0
>size=0x200000000 addr=0x3fd9ff00000 readonly=0 (0)
>> iommufd_cdev_device_info  8eb8351a-e656-4187-b773-fea4e926310d
>(28) num_irqs=1 num_regions=0 flags=33
>> iommufd_cdev_detach_ioas_hwpt  [iommufd=27] Successfully detached
>8eb8351a-e656-4187-b773-fea4e926310d
>> iommufd_backend_unmap_dma  iommufd=27 ioas=2 iova=0x0
>size=0x200000000 (0)
>> iommufd_backend_free_id  iommufd=27 id=2 (0)
>> iommufd_backend_disconnect fd=-1 users=0
>>
>> qemu-kvm: -device vfio-
>ap,id=hostdev0,sysfsdev=/sys/bus/mdev/devices/8eb8351a-e656-4187-
>b773-fea4e926310d,iommufd=iommufd0: vfio 8eb8351a-e656-4187-b773-
>fea4e926310d: Failed to get hardware info: No such file or directory
>>
>>
>>
>> Thanks,
>>
>> C.
>>
>>
diff mbox series

Patch

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 1f8e1272c7..70934b01d5 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -675,6 +675,9 @@  static void vfio_ccw_instance_init(Object *obj)
     VFIOCCWDevice *vcdev = VFIO_CCW(obj);
     VFIODevice *vbasedev = &vcdev->vdev;
 
+    /* CCW device is mdev type device */
+    vbasedev->mdev = true;
+
     /*
      * All vfio-ccw devices are believed to operate in a way compatible with
      * discarding of memory in RAM blocks, ie. pages pinned in the host are