[0/2] s390x/pci: relax I/O address translation requirement

Message ID 20241209192927.107503-1-mjrosato@linux.ibm.com (mailing list archive)

Message

Matthew Rosato Dec. 9, 2024, 7:29 p.m. UTC
This series introduces the concept of the relaxed translation requirement
for s390x guests in order to allow bypass of the guest IOMMU for more
efficient PCI passthrough.

With this series, QEMU can indicate to the guest that an IOMMU is not
strictly required for a zPCI device.  This subsequently allows a
Linux guest to use iommu.passthrough=1 and bypass its guest IOMMU for
PCI devices.

When this occurs, QEMU will note the behavior via an intercepted MPCIFC
instruction and will fill the host IOMMU with mappings of the entire
guest address space in response.

There is a kernel series [1] that adds the relevant behavior needed to
exploit this new feature from within an s390x Linux guest.

[1]: https://lore.kernel.org/linux-s390/20241209192403.107090-1-mjrosato@linux.ibm.com/

Matthew Rosato (2):
  s390x/pci: add support for guests that request direct mapping
  s390x/pci: indicate QEMU supports relaxed translation for passthrough

 hw/s390x/s390-pci-bus.c         | 23 ++++++++++++++++++
 hw/s390x/s390-pci-inst.c        | 42 +++++++++++++++++++++++++++++++--
 hw/s390x/s390-pci-vfio.c        |  4 +++-
 include/hw/s390x/s390-pci-bus.h |  2 ++
 include/hw/s390x/s390-pci-clp.h |  1 +
 5 files changed, 69 insertions(+), 3 deletions(-)
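
The direct-mapping idea described in the cover letter can be illustrated
with a small standalone model.  This is only a sketch and not the code
from the series: GuestRamRange, HostIommuMapping, direct_map_guest() and
the iova_base parameter are hypothetical names invented for illustration,
and the constant-offset layout is an assumption (pass 0 for a pure
identity mapping).  It shows one host IOMMU mapping being created per
guest RAM range, so that device DMA no longer depends on guest IOMMU
tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one contiguous range of guest RAM. */
typedef struct {
    uint64_t gpa;   /* guest physical start address */
    uint64_t size;  /* length in bytes */
} GuestRamRange;

/* Hypothetical record of one host IOMMU mapping. */
typedef struct {
    uint64_t iova;  /* address the device uses for DMA */
    uint64_t gpa;   /* guest physical address it resolves to */
    uint64_t size;  /* length in bytes */
} HostIommuMapping;

/*
 * Map every guest RAM range at a fixed offset (iova_base) so that the
 * whole guest address space is reachable by the device up front.
 * Returns the number of mappings written, or -1 if the output array is
 * too small or an address would wrap around 64 bits.
 */
static int direct_map_guest(const GuestRamRange *ram, size_t n_ram,
                            uint64_t iova_base,
                            HostIommuMapping *out, size_t n_out)
{
    assert(ram != NULL && out != NULL);

    if (n_ram > n_out) {
        return -1;
    }
    for (size_t i = 0; i < n_ram; i++) {
        /* Reject ranges whose IOVA would wrap around 64 bits. */
        if (ram[i].gpa > UINT64_MAX - iova_base) {
            return -1;
        }
        out[i].iova = iova_base + ram[i].gpa;
        out[i].gpa = ram[i].gpa;
        out[i].size = ram[i].size;
    }
    return (int)n_ram;
}
```

In this model the cost of the feature is visible directly: every RAM
range is mapped in advance, in exchange for not having to track guest
IOMMU updates afterwards.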

Comments

Thomas Huth Dec. 12, 2024, 9:10 a.m. UTC | #1
On 09/12/2024 20.29, Matthew Rosato wrote:
> This series introduces the concept of the relaxed translation requirement
> for s390x guests in order to allow bypass of the guest IOMMU for more
> efficient PCI passthrough.
> 
> With this series, QEMU can indicate to the guest that an IOMMU is not
> strictly required for a zPCI device.  This subsequently allows a
> Linux guest to use iommu.passthrough=1 and bypass its guest IOMMU for
> PCI devices.
> 
> When this occurs, QEMU will note the behavior via an intercepted MPCIFC
> instruction and will fill the host IOMMU with mappings of the entire
> guest address space in response.
> 
> There is a kernel series [1] that adds the relevant behavior needed to
> exploit this new feature from within an s390x Linux guest.
> 
> [1]: https://lore.kernel.org/linux-s390/20241209192403.107090-1-mjrosato@linux.ibm.com/
> 
> Matthew Rosato (2):
>    s390x/pci: add support for guests that request direct mapping
>    s390x/pci: indicate QEMU supports relaxed translation for passthrough

  Hi again!

One more thought: This is a guest-visible feature, isn't it? So do we also 
need some migration handling for this? For example, what happens if you 
start a guest that is aware of this feature on a host that has a QEMU with 
this feature, and then try to live-migrate the guest to a QEMU that does not 
have this feature? I guess the guest will crash? It would be better to fail 
the migration instead. At least we should disable the feature in older 
machine types and only allow it for the latest one.

  Thomas
Matthew Rosato Dec. 12, 2024, 2:42 p.m. UTC | #2
On 12/12/24 4:10 AM, Thomas Huth wrote:
>  Hi again!
> 
> One more thought: This is a guest-visible feature, isn't it? So do we also need some migration handling for this? For example, what happens if you start a guest that is aware of this feature on a host that has a QEMU with this feature, and then try to live-migrate the guest to a QEMU that does not have this feature? I guess the guest will crash? It would be better to fail the migration instead. At least we should disable the feature in older machine types and only allow it for the latest one.

zPCI devices are currently marked as unmigratable in s390_pci_device_vmstate so it's not a reproducible issue yet.

Re: disabling the feature for older machines, OK -- shall I fence it similarly to what we did for interpret/forwarding-assist, with a new device property that defaults to off on older machines? ("relax-translation"? Alternative suggestions welcome.)
Cédric Le Goater Dec. 13, 2024, 9:07 a.m. UTC | #3
On 12/12/24 15:42, Matthew Rosato wrote:
> zPCI devices are currently marked as unmigratable in s390_pci_device_vmstate so it's not a reproducible issue yet.
> 
> Re: disabling the feature for older machines, OK -- shall I fence it similarly to what we did for interpret/forwarding-assist, with a new device property that defaults to off on older machines? ("relax-translation"? Alternative suggestions welcome.)

Looks good to me.

Thanks,

C.
Thomas Huth Dec. 13, 2024, 9:24 a.m. UTC | #4
On 12/12/2024 15.42, Matthew Rosato wrote:
> zPCI devices are currently marked as unmigratable in s390_pci_device_vmstate so it's not a reproducible issue yet.

Ah, right, I forgot about that migration blocker, so we should be fine, indeed!

> Re: disabling the feature for older machines, OK -- shall I fence it similarly to what we did for interpret/forwarding-assist, with a new device property that defaults to off on older machines? ("relax-translation"? Alternative suggestions welcome.)

Sounds reasonable!

  Thomas
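
The fencing approach agreed on in this thread (a device property that
defaults to off on older machine types) can be sketched as a standalone
model.  This is an illustration under stated assumptions, not QEMU code:
the machine names "machine-old" and "machine-new", the MachineCompat
table and both helper functions are hypothetical, and only the property
name "relax-translation" comes from the discussion above.  The sketch
shows a per-machine default that an explicit user setting can override.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-machine default for the "relax-translation" property. */
typedef struct {
    const char *machine;      /* machine type name (made up here) */
    bool relax_translation;   /* default value for that machine type */
} MachineCompat;

static const MachineCompat compat_table[] = {
    { "machine-old", false },  /* older machine type: feature fenced off */
    { "machine-new", true },   /* latest machine type: feature on */
};

/* Look up the default; unknown machine types conservatively get "off". */
static bool relax_translation_default(const char *machine)
{
    assert(machine != NULL);

    for (size_t i = 0; i < sizeof(compat_table) / sizeof(compat_table[0]); i++) {
        if (strcmp(compat_table[i].machine, machine) == 0) {
            return compat_table[i].relax_translation;
        }
    }
    return false;
}

/*
 * Effective value of the property: an explicit user setting wins,
 * otherwise the machine type default applies.  Pass NULL when the
 * user did not set the property.
 */
static bool relax_translation_enabled(const char *machine,
                                      const bool *user_setting)
{
    if (user_setting != NULL) {
        return *user_setting;
    }
    return relax_translation_default(machine);
}
```

With this shape, a guest started on an older machine type never sees the
feature unless the user opts in, which keeps the guest-visible interface
stable for a given machine type.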