mbox series

[v3,0/3] VIRTIO-IOMMU: Introduce an aw-bits option

Message ID 20240208101128.655167-1-eric.auger@redhat.com (mailing list archive)
Headers show
Series VIRTIO-IOMMU: Introduce an aw-bits option | expand

Message

Eric Auger Feb. 8, 2024, 10:10 a.m. UTC
In [1] and [2] we attempted to fix a case where a VFIO-PCI device
protected with a virtio-iommu is assigned to an x86 guest. On x86
the physical IOMMU may have an address width (gaw) of 39 or 48 bits
whereas the virtio-iommu exposes a 64b input address space by default.
Hence the guest may try to use the full 64b space and DMA MAP
failures may be encountered. To work around this issue we endeavoured
to pass usable host IOVA regions (excluding the out of range space) from
VFIO to the virtio-iommu device so that the virtio-iommu driver can
query those latter during the probe request and let the guest iommu
kernel subsystem carve them out. 

However if there are several devices in the same iommu group,
only the reserved regions of the first one are taken into
account by the iommu subsystem of the guest. This generally
works on baremetal because devices are not going to
expose different reserved regions. However in our case, this
may prevent from taking into account the host iommu geometry.

So the simplest solution to this problem looks to introduce an
input address width option, aw-bits, which matches what is
done on the intel-iommu. By default, from now on it is set
to 39 bits with pc_q35 and 48 with arm virt. This replaces the
previous default value of 64b. So we need to introduce a compat
for machines older than 9.0 to behave similarly. We use
hw_compat_8_2 to acheive that goal.

Outstanding series [2] remains useful to let resv regions beeing
communicated on time before the probe request.

[1] [PATCH v4 00/12] VIRTIO-IOMMU/VFIO: Don't assume 64b IOVA space
    https://lore.kernel.org/all/20231019134651.842175-1-eric.auger@redhat.com/
    - This is merged -

[2] [RFC 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices
    https://lore.kernel.org/all/20240117080414.316890-1-eric.auger@redhat.com/
    - This is pending for review on the ML -

This series can be found at:
https://github.com/eauger/qemu/tree/virtio-iommu-aw-bits-v3
previous
https://github.com/eauger/qemu/tree/virtio-iommu-aw-bits-v2

Applied on top of [3]
[PATCH v2] virtio-iommu: Use qemu_real_host_page_mask as default page_size_mask
https://lore.kernel.org/all/20240117132039.332273-1-eric.auger@redhat.com/

History:
v2 -> v3:
- Collected Zhenzhong and Cédric's R-b + Yanghang's T-b
- use &error_abort instead of NULL error handle
  on object_property_get_uint() call (Cédric)
- use VTD_HOST_AW_39BIT (Cédric)

v1 -> v2
- Limit aw to 48b on ARM
- Check aw is within [32,64]
- Use hw_compat_8_2


Eric Auger (3):
  virtio-iommu: Add an option to define the input range width
  virtio-iommu: Trace domain range limits as unsigned int
  hw: Set virtio-iommu aw-bits default value on pc_q35 and arm virt

 include/hw/virtio/virtio-iommu.h | 1 +
 hw/arm/virt.c                    | 6 ++++++
 hw/core/machine.c                | 5 ++++-
 hw/i386/pc.c                     | 6 ++++++
 hw/virtio/virtio-iommu.c         | 7 ++++++-
 hw/virtio/trace-events           | 2 +-
 6 files changed, 24 insertions(+), 3 deletions(-)

Comments

Jean-Philippe Brucker Feb. 8, 2024, 2:42 p.m. UTC | #1
On Thu, Feb 08, 2024 at 11:10:16AM +0100, Eric Auger wrote:
> In [1] and [2] we attempted to fix a case where a VFIO-PCI device
> protected with a virtio-iommu is assigned to an x86 guest. On x86
> the physical IOMMU may have an address width (gaw) of 39 or 48 bits
> whereas the virtio-iommu exposes a 64b input address space by default.
> Hence the guest may try to use the full 64b space and DMA MAP
> failures may be encountered. To work around this issue we endeavoured
> to pass usable host IOVA regions (excluding the out of range space) from
> VFIO to the virtio-iommu device so that the virtio-iommu driver can
> query those latter during the probe request and let the guest iommu
> kernel subsystem carve them out. 
> 
> However if there are several devices in the same iommu group,
> only the reserved regions of the first one are taken into
> account by the iommu subsystem of the guest. This generally
> works on baremetal because devices are not going to
> expose different reserved regions. However in our case, this
> may prevent from taking into account the host iommu geometry.
> 
> So the simplest solution to this problem looks to introduce an
> input address width option, aw-bits, which matches what is
> done on the intel-iommu. By default, from now on it is set
> to 39 bits with pc_q35 and 48 with arm virt. This replaces the
> previous default value of 64b. So we need to introduce a compat
> for machines older than 9.0 to behave similarly. We use
> hw_compat_8_2 to acheive that goal.

For the series:

Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

> 
> Outstanding series [2] remains useful to let resv regions beeing
> communicated on time before the probe request.
> 
> [1] [PATCH v4 00/12] VIRTIO-IOMMU/VFIO: Don't assume 64b IOVA space
>     https://lore.kernel.org/all/20231019134651.842175-1-eric.auger@redhat.com/
>     - This is merged -
> 
> [2] [RFC 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices
>     https://lore.kernel.org/all/20240117080414.316890-1-eric.auger@redhat.com/
>     - This is pending for review on the ML -
> 
> This series can be found at:
> https://github.com/eauger/qemu/tree/virtio-iommu-aw-bits-v3
> previous
> https://github.com/eauger/qemu/tree/virtio-iommu-aw-bits-v2
> 
> Applied on top of [3]
> [PATCH v2] virtio-iommu: Use qemu_real_host_page_mask as default page_size_mask
> https://lore.kernel.org/all/20240117132039.332273-1-eric.auger@redhat.com/
> 
> History:
> v2 -> v3:
> - Collected Zhenzhong and Cédric's R-b + Yanghang's T-b
> - use &error_abort instead of NULL error handle
>   on object_property_get_uint() call (Cédric)
> - use VTD_HOST_AW_39BIT (Cédric)
> 
> v1 -> v2
> - Limit aw to 48b on ARM
> - Check aw is within [32,64]
> - Use hw_compat_8_2
> 
> 
> Eric Auger (3):
>   virtio-iommu: Add an option to define the input range width
>   virtio-iommu: Trace domain range limits as unsigned int
>   hw: Set virtio-iommu aw-bits default value on pc_q35 and arm virt
> 
>  include/hw/virtio/virtio-iommu.h | 1 +
>  hw/arm/virt.c                    | 6 ++++++
>  hw/core/machine.c                | 5 ++++-
>  hw/i386/pc.c                     | 6 ++++++
>  hw/virtio/virtio-iommu.c         | 7 ++++++-
>  hw/virtio/trace-events           | 2 +-
>  6 files changed, 24 insertions(+), 3 deletions(-)
> 
> -- 
> 2.41.0
>
Eric Auger Feb. 8, 2024, 3:39 p.m. UTC | #2
Hi Jean,

On 2/8/24 15:42, Jean-Philippe Brucker wrote:
> On Thu, Feb 08, 2024 at 11:10:16AM +0100, Eric Auger wrote:
>> In [1] and [2] we attempted to fix a case where a VFIO-PCI device
>> protected with a virtio-iommu is assigned to an x86 guest. On x86
>> the physical IOMMU may have an address width (gaw) of 39 or 48 bits
>> whereas the virtio-iommu exposes a 64b input address space by default.
>> Hence the guest may try to use the full 64b space and DMA MAP
>> failures may be encountered. To work around this issue we endeavoured
>> to pass usable host IOVA regions (excluding the out of range space) from
>> VFIO to the virtio-iommu device so that the virtio-iommu driver can
>> query those latter during the probe request and let the guest iommu
>> kernel subsystem carve them out. 
>>
>> However if there are several devices in the same iommu group,
>> only the reserved regions of the first one are taken into
>> account by the iommu subsystem of the guest. This generally
>> works on baremetal because devices are not going to
>> expose different reserved regions. However in our case, this
>> may prevent from taking into account the host iommu geometry.
>>
>> So the simplest solution to this problem looks to introduce an
>> input address width option, aw-bits, which matches what is
>> done on the intel-iommu. By default, from now on it is set
>> to 39 bits with pc_q35 and 48 with arm virt. This replaces the
>> previous default value of 64b. So we need to introduce a compat
>> for machines older than 9.0 to behave similarly. We use
>> hw_compat_8_2 to acheive that goal.
> For the series:
>
> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

Thanks!

and after reading your v2 last reply we case we need to support smaller
aw ranges, I will do the change when the use case arises.

Eric
>
>> Outstanding series [2] remains useful to let resv regions beeing
>> communicated on time before the probe request.
>>
>> [1] [PATCH v4 00/12] VIRTIO-IOMMU/VFIO: Don't assume 64b IOVA space
>>     https://lore.kernel.org/all/20231019134651.842175-1-eric.auger@redhat.com/
>>     - This is merged -
>>
>> [2] [RFC 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices
>>     https://lore.kernel.org/all/20240117080414.316890-1-eric.auger@redhat.com/
>>     - This is pending for review on the ML -
>>
>> This series can be found at:
>> https://github.com/eauger/qemu/tree/virtio-iommu-aw-bits-v3
>> previous
>> https://github.com/eauger/qemu/tree/virtio-iommu-aw-bits-v2
>>
>> Applied on top of [3]
>> [PATCH v2] virtio-iommu: Use qemu_real_host_page_mask as default page_size_mask
>> https://lore.kernel.org/all/20240117132039.332273-1-eric.auger@redhat.com/
>>
>> History:
>> v2 -> v3:
>> - Collected Zhenzhong and Cédric's R-b + Yanghang's T-b
>> - use &error_abort instead of NULL error handle
>>   on object_property_get_uint() call (Cédric)
>> - use VTD_HOST_AW_39BIT (Cédric)
>>
>> v1 -> v2
>> - Limit aw to 48b on ARM
>> - Check aw is within [32,64]
>> - Use hw_compat_8_2
>>
>>
>> Eric Auger (3):
>>   virtio-iommu: Add an option to define the input range width
>>   virtio-iommu: Trace domain range limits as unsigned int
>>   hw: Set virtio-iommu aw-bits default value on pc_q35 and arm virt
>>
>>  include/hw/virtio/virtio-iommu.h | 1 +
>>  hw/arm/virt.c                    | 6 ++++++
>>  hw/core/machine.c                | 5 ++++-
>>  hw/i386/pc.c                     | 6 ++++++
>>  hw/virtio/virtio-iommu.c         | 7 ++++++-
>>  hw/virtio/trace-events           | 2 +-
>>  6 files changed, 24 insertions(+), 3 deletions(-)
>>
>> -- 
>> 2.41.0
>>
Michael S. Tsirkin Feb. 13, 2024, 10:35 a.m. UTC | #3
On Thu, Feb 08, 2024 at 11:10:16AM +0100, Eric Auger wrote:
> In [1] and [2] we attempted to fix a case where a VFIO-PCI device
> protected with a virtio-iommu is assigned to an x86 guest. On x86
> the physical IOMMU may have an address width (gaw) of 39 or 48 bits
> whereas the virtio-iommu exposes a 64b input address space by default.
> Hence the guest may try to use the full 64b space and DMA MAP
> failures may be encountered. To work around this issue we endeavoured
> to pass usable host IOVA regions (excluding the out of range space) from
> VFIO to the virtio-iommu device so that the virtio-iommu driver can
> query those latter during the probe request and let the guest iommu
> kernel subsystem carve them out. 
> 
> However if there are several devices in the same iommu group,
> only the reserved regions of the first one are taken into
> account by the iommu subsystem of the guest. This generally
> works on baremetal because devices are not going to
> expose different reserved regions. However in our case, this
> may prevent from taking into account the host iommu geometry.
> 
> So the simplest solution to this problem looks to introduce an
> input address width option, aw-bits, which matches what is
> done on the intel-iommu. By default, from now on it is set
> to 39 bits with pc_q35 and 48 with arm virt. This replaces the
> previous default value of 64b. So we need to introduce a compat
> for machines older than 9.0 to behave similarly. We use
> hw_compat_8_2 to acheive that goal.
> 
> Outstanding series [2] remains useful to let resv regions beeing
> communicated on time before the probe request.
> 
> [1] [PATCH v4 00/12] VIRTIO-IOMMU/VFIO: Don't assume 64b IOVA space
>     https://lore.kernel.org/all/20231019134651.842175-1-eric.auger@redhat.com/
>     - This is merged -
> 
> [2] [RFC 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices
>     https://lore.kernel.org/all/20240117080414.316890-1-eric.auger@redhat.com/
>     - This is pending for review on the ML -
> 
> This series can be found at:
> https://github.com/eauger/qemu/tree/virtio-iommu-aw-bits-v3
> previous
> https://github.com/eauger/qemu/tree/virtio-iommu-aw-bits-v2
> 
> Applied on top of [3]
> [PATCH v2] virtio-iommu: Use qemu_real_host_page_mask as default page_size_mask
> https://lore.kernel.org/all/20240117132039.332273-1-eric.auger@redhat.com/

So, I applied this without that patch until we agree whether there are
compat issues in that one. Seems to work without or did I miss anything?


> History:
> v2 -> v3:
> - Collected Zhenzhong and Cédric's R-b + Yanghang's T-b
> - use &error_abort instead of NULL error handle
>   on object_property_get_uint() call (Cédric)
> - use VTD_HOST_AW_39BIT (Cédric)
> 
> v1 -> v2
> - Limit aw to 48b on ARM
> - Check aw is within [32,64]
> - Use hw_compat_8_2
> 
> 
> Eric Auger (3):
>   virtio-iommu: Add an option to define the input range width
>   virtio-iommu: Trace domain range limits as unsigned int
>   hw: Set virtio-iommu aw-bits default value on pc_q35 and arm virt
> 
>  include/hw/virtio/virtio-iommu.h | 1 +
>  hw/arm/virt.c                    | 6 ++++++
>  hw/core/machine.c                | 5 ++++-
>  hw/i386/pc.c                     | 6 ++++++
>  hw/virtio/virtio-iommu.c         | 7 ++++++-
>  hw/virtio/trace-events           | 2 +-
>  6 files changed, 24 insertions(+), 3 deletions(-)
> 
> -- 
> 2.41.0
Eric Auger Feb. 13, 2024, 10:38 a.m. UTC | #4
Hi Michael,
On 2/13/24 11:35, Michael S. Tsirkin wrote:
> On Thu, Feb 08, 2024 at 11:10:16AM +0100, Eric Auger wrote:
>> In [1] and [2] we attempted to fix a case where a VFIO-PCI device
>> protected with a virtio-iommu is assigned to an x86 guest. On x86
>> the physical IOMMU may have an address width (gaw) of 39 or 48 bits
>> whereas the virtio-iommu exposes a 64b input address space by default.
>> Hence the guest may try to use the full 64b space and DMA MAP
>> failures may be encountered. To work around this issue we endeavoured
>> to pass usable host IOVA regions (excluding the out of range space) from
>> VFIO to the virtio-iommu device so that the virtio-iommu driver can
>> query those latter during the probe request and let the guest iommu
>> kernel subsystem carve them out. 
>>
>> However if there are several devices in the same iommu group,
>> only the reserved regions of the first one are taken into
>> account by the iommu subsystem of the guest. This generally
>> works on baremetal because devices are not going to
>> expose different reserved regions. However in our case, this
>> may prevent from taking into account the host iommu geometry.
>>
>> So the simplest solution to this problem looks to introduce an
>> input address width option, aw-bits, which matches what is
>> done on the intel-iommu. By default, from now on it is set
>> to 39 bits with pc_q35 and 48 with arm virt. This replaces the
>> previous default value of 64b. So we need to introduce a compat
>> for machines older than 9.0 to behave similarly. We use
>> hw_compat_8_2 to acheive that goal.
>>
>> Outstanding series [2] remains useful to let resv regions beeing
>> communicated on time before the probe request.
>>
>> [1] [PATCH v4 00/12] VIRTIO-IOMMU/VFIO: Don't assume 64b IOVA space
>>     https://lore.kernel.org/all/20231019134651.842175-1-eric.auger@redhat.com/
>>     - This is merged -
>>
>> [2] [RFC 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices
>>     https://lore.kernel.org/all/20240117080414.316890-1-eric.auger@redhat.com/
>>     - This is pending for review on the ML -
>>
>> This series can be found at:
>> https://github.com/eauger/qemu/tree/virtio-iommu-aw-bits-v3
>> previous
>> https://github.com/eauger/qemu/tree/virtio-iommu-aw-bits-v2
>>
>> Applied on top of [3]
>> [PATCH v2] virtio-iommu: Use qemu_real_host_page_mask as default page_size_mask
>> https://lore.kernel.org/all/20240117132039.332273-1-eric.auger@redhat.com/
> So, I applied this without that patch until we agree whether there are
> compat issues in that one. Seems to work without or did I miss anything?
To me it works without compat by checking the input stream version and
if <= 3 apply the previous default.

maybe I miss something but I tested mig between v2 and v3 and that worked

Eric
>
>
>> History:
>> v2 -> v3:
>> - Collected Zhenzhong and Cédric's R-b + Yanghang's T-b
>> - use &error_abort instead of NULL error handle
>>   on object_property_get_uint() call (Cédric)
>> - use VTD_HOST_AW_39BIT (Cédric)
>>
>> v1 -> v2
>> - Limit aw to 48b on ARM
>> - Check aw is within [32,64]
>> - Use hw_compat_8_2
>>
>>
>> Eric Auger (3):
>>   virtio-iommu: Add an option to define the input range width
>>   virtio-iommu: Trace domain range limits as unsigned int
>>   hw: Set virtio-iommu aw-bits default value on pc_q35 and arm virt
>>
>>  include/hw/virtio/virtio-iommu.h | 1 +
>>  hw/arm/virt.c                    | 6 ++++++
>>  hw/core/machine.c                | 5 ++++-
>>  hw/i386/pc.c                     | 6 ++++++
>>  hw/virtio/virtio-iommu.c         | 7 ++++++-
>>  hw/virtio/trace-events           | 2 +-
>>  6 files changed, 24 insertions(+), 3 deletions(-)
>>
>> -- 
>> 2.41.0
Michael S. Tsirkin Feb. 13, 2024, 3:41 p.m. UTC | #5
On Tue, Feb 13, 2024 at 11:38:06AM +0100, Eric Auger wrote:
> Hi Michael,
> On 2/13/24 11:35, Michael S. Tsirkin wrote:
> > On Thu, Feb 08, 2024 at 11:10:16AM +0100, Eric Auger wrote:
> >> In [1] and [2] we attempted to fix a case where a VFIO-PCI device
> >> protected with a virtio-iommu is assigned to an x86 guest. On x86
> >> the physical IOMMU may have an address width (gaw) of 39 or 48 bits
> >> whereas the virtio-iommu exposes a 64b input address space by default.
> >> Hence the guest may try to use the full 64b space and DMA MAP
> >> failures may be encountered. To work around this issue we endeavoured
> >> to pass usable host IOVA regions (excluding the out of range space) from
> >> VFIO to the virtio-iommu device so that the virtio-iommu driver can
> >> query those latter during the probe request and let the guest iommu
> >> kernel subsystem carve them out. 
> >>
> >> However if there are several devices in the same iommu group,
> >> only the reserved regions of the first one are taken into
> >> account by the iommu subsystem of the guest. This generally
> >> works on baremetal because devices are not going to
> >> expose different reserved regions. However in our case, this
> >> may prevent from taking into account the host iommu geometry.
> >>
> >> So the simplest solution to this problem looks to introduce an
> >> input address width option, aw-bits, which matches what is
> >> done on the intel-iommu. By default, from now on it is set
> >> to 39 bits with pc_q35 and 48 with arm virt. This replaces the
> >> previous default value of 64b. So we need to introduce a compat
> >> for machines older than 9.0 to behave similarly. We use
> >> hw_compat_8_2 to acheive that goal.
> >>
> >> Outstanding series [2] remains useful to let resv regions beeing
> >> communicated on time before the probe request.
> >>
> >> [1] [PATCH v4 00/12] VIRTIO-IOMMU/VFIO: Don't assume 64b IOVA space
> >>     https://lore.kernel.org/all/20231019134651.842175-1-eric.auger@redhat.com/
> >>     - This is merged -
> >>
> >> [2] [RFC 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices
> >>     https://lore.kernel.org/all/20240117080414.316890-1-eric.auger@redhat.com/
> >>     - This is pending for review on the ML -
> >>
> >> This series can be found at:
> >> https://github.com/eauger/qemu/tree/virtio-iommu-aw-bits-v3
> >> previous
> >> https://github.com/eauger/qemu/tree/virtio-iommu-aw-bits-v2
> >>
> >> Applied on top of [3]
> >> [PATCH v2] virtio-iommu: Use qemu_real_host_page_mask as default page_size_mask
> >> https://lore.kernel.org/all/20240117132039.332273-1-eric.auger@redhat.com/
> > So, I applied this without that patch until we agree whether there are
> > compat issues in that one. Seems to work without or did I miss anything?
> To me it works without compat by checking the input stream version and
> if <= 3 apply the previous default.
> 
> maybe I miss something but I tested mig between v2 and v3 and that worked
> 
> Eric


Can you rebase without that other change?
ATM it fails make check for me:

▶  44/462 ERROR:../tests/qtest/virtio-iommu-test.c:37:pci_config: assertion failed (input_range_end == UINT64_MAX): (0x7fffffffff == 0xffffffffffffffff) ERROR         


https://gitlab.com/mstredhat/qemu/-/jobs/6162397307



> >
> >
> >> History:
> >> v2 -> v3:
> >> - Collected Zhenzhong and Cédric's R-b + Yanghang's T-b
> >> - use &error_abort instead of NULL error handle
> >>   on object_property_get_uint() call (Cédric)
> >> - use VTD_HOST_AW_39BIT (Cédric)
> >>
> >> v1 -> v2
> >> - Limit aw to 48b on ARM
> >> - Check aw is within [32,64]
> >> - Use hw_compat_8_2
> >>
> >>
> >> Eric Auger (3):
> >>   virtio-iommu: Add an option to define the input range width
> >>   virtio-iommu: Trace domain range limits as unsigned int
> >>   hw: Set virtio-iommu aw-bits default value on pc_q35 and arm virt
> >>
> >>  include/hw/virtio/virtio-iommu.h | 1 +
> >>  hw/arm/virt.c                    | 6 ++++++
> >>  hw/core/machine.c                | 5 ++++-
> >>  hw/i386/pc.c                     | 6 ++++++
> >>  hw/virtio/virtio-iommu.c         | 7 ++++++-
> >>  hw/virtio/trace-events           | 2 +-
> >>  6 files changed, 24 insertions(+), 3 deletions(-)
> >>
> >> -- 
> >> 2.41.0