diff mbox series

[v3,1/1] virtio: fix the condition for iommu_platform not supported

Message ID 20220201133915.3764972-1-pasic@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series [v3,1/1] virtio: fix the condition for iommu_platform not supported | expand

Commit Message

Halil Pasic Feb. 1, 2022, 1:39 p.m. UTC
The commit 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
unsupported") claims to fail the device hotplug when iommu_platform
is requested, but not supported by the (vhost) device. On the first
glance the condition for detecting that situation looks perfect, but
because a certain peculiarity of virtio_platform it ain't.

In fact the aforementioned commit introduces a regression. It breaks
virtio-fs support for Secure Execution, and most likely also for AMD SEV
or any other confidential guest scenario that relies encrypted guest
memory.  The same also applies to any other vhost device that does not
support _F_ACCESS_PLATFORM.

The peculiarity is that iommu_platform and _F_ACCESS_PLATFORM collates
"device can not access all of the guest RAM" and "iova != gpa, thus
device needs to translate iova".

Confidential guest technologies currently rely on the device/hypervisor
offering _F_ACCESS_PLATFORM, so that, after the feature has been
negotiated, the guest  grants access to the portions of memory the
device needs to see. So in for confidential guests, generally,
_F_ACCESS_PLATFORM is about the restricted access to memory, but not
about the addresses used being something else than guest physical
addresses.

This is the very reason for which commit f7ef7e6e3b ("vhost: correctly
turn on VIRTIO_F_IOMMU_PLATFORM") for, which fences _F_ACCESS_PLATFORM
form the vhost device that does not need it, because on the vhost
interface it only means "I/O address translation is needed".

This patch takes inspiration from f7ef7e6e3b ("vhost: correctly turn on
VIRTIO_F_IOMMU_PLATFORM"), and uses the same condition for detecting the
situation when _F_ACCESS_PLATFORM is requested, but no I/O translation
by the device, and thus no device capability is needed. In this
situation claiming that the device does not support iommu_plattform=on
is counter-productive. So let us stop doing that!

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Jakob Naucke <Jakob.Naucke@ibm.com>
Fixes: 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
unsupported")
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-stable@nongnu.org

---

v2->v3:
* Caught a bug: I tired to check if vdev has the feature
   ACCESS_PLATFORM after we have forced it. Moved the check
   to a better place
v1->v2:
* Commit message tweaks. Most notably fixed commit SHA (Michael)

---
---
 hw/virtio/virtio-bus.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)


base-commit: 6621441db50d5bae7e34dbd04bf3c57a27a71b32

Comments

Daniel Henrique Barboza Feb. 1, 2022, 3:36 p.m. UTC | #1
On 2/1/22 10:39, Halil Pasic wrote:
> The commit 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
> unsupported") claims to fail the device hotplug when iommu_platform
> is requested, but not supported by the (vhost) device. On the first
> glance the condition for detecting that situation looks perfect, but
> because a certain peculiarity of virtio_platform it ain't.
> 
> In fact the aforementioned commit introduces a regression. It breaks
> virtio-fs support for Secure Execution, and most likely also for AMD SEV
> or any other confidential guest scenario that relies encrypted guest
> memory.  The same also applies to any other vhost device that does not
> support _F_ACCESS_PLATFORM.
> 
> The peculiarity is that iommu_platform and _F_ACCESS_PLATFORM collates
> "device can not access all of the guest RAM" and "iova != gpa, thus
> device needs to translate iova".
> 
> Confidential guest technologies currently rely on the device/hypervisor
> offering _F_ACCESS_PLATFORM, so that, after the feature has been
> negotiated, the guest  grants access to the portions of memory the
> device needs to see. So in for confidential guests, generally,
> _F_ACCESS_PLATFORM is about the restricted access to memory, but not
> about the addresses used being something else than guest physical
> addresses.
> 
> This is the very reason for which commit f7ef7e6e3b ("vhost: correctly
> turn on VIRTIO_F_IOMMU_PLATFORM") for, which fences _F_ACCESS_PLATFORM
> form the vhost device that does not need it, because on the vhost
> interface it only means "I/O address translation is needed".
> 
> This patch takes inspiration from f7ef7e6e3b ("vhost: correctly turn on
> VIRTIO_F_IOMMU_PLATFORM"), and uses the same condition for detecting the
> situation when _F_ACCESS_PLATFORM is requested, but no I/O translation
> by the device, and thus no device capability is needed. In this
> situation claiming that the device does not support iommu_plattform=on
> is counter-productive. So let us stop doing that!
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Reported-by: Jakob Naucke <Jakob.Naucke@ibm.com>
> Fixes: 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
> unsupported")
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: qemu-stable@nongnu.org
> 
> ---

This version solved the case I mentioned in the v2 about the false positive with
iommu_platform=on and and the vhost-user-fs-pci device. Now I get:


$ sudo ./qemu-system-ppc64 -machine pseries,accel=kvm (...)
-chardev socket,id=char0,path=/tmp/vhostqemu -device vhost-user-fs-pci,chardev=char0,tag=myfs,iommu_platform=on
qemu-system-ppc64: -device vhost-user-fs-pci,chardev=char0,tag=myfs,iommu_platform=on: iommu_platform=true is not supported by the device


Which is the expected result.

I didn't find the opportunity to test this patch with the PowerPC secure VM tech (papr-pef) but I
don't see the point of delaying this fix because of that.


Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>


One small suggestion down below in case a v4 is needed:

> 
> v2->v3:
> * Caught a bug: I tired to check if vdev has the feature
>     ACCESS_PLATFORM after we have forced it. Moved the check
>     to a better place
> v1->v2:
> * Commit message tweaks. Most notably fixed commit SHA (Michael)
> 
> ---
> ---
>   hw/virtio/virtio-bus.c | 11 ++++++-----
>   1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index d23db98c56..34f5a0a664 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -48,6 +48,7 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>       VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
>       VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
>       bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> +    bool vdev_has_iommu = false;
>       Error *local_err = NULL;
>   
>       DPRINTF("%s: plug device.\n", qbus->name);
> @@ -69,11 +70,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>           return;
>       }
>   
> -    if (has_iommu && !virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> -        error_setg(errp, "iommu_platform=true is not supported by the device");
> -        return;
> -    }
> -
>       if (klass->device_plugged != NULL) {
>           klass->device_plugged(qbus->parent, &local_err);
>       }
> @@ -82,9 +78,14 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>           return;
>       }
>   
> +    vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>       if (klass->get_dma_as != NULL && has_iommu) {
>           virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>           vdev->dma_as = klass->get_dma_as(qbus->parent);
> +        if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
> +            error_setg(errp,
> +                       "iommu_platform=true is not supported by the device");
> +        }


>       } else {
>           vdev->dma_as = &address_space_memory;
>       }


I struggled to understand what this 'else' clause was doing and I assumed that it was
wrong. Searching through the ML I learned that this 'else' clause is intended to handle
legacy virtio devices that doesn't support the DMA API (introduced in 8607f5c3072caeebb)
and thus shouldn't set  VIRTIO_F_IOMMU_PLATFORM.


My suggestion, if a v4 is required for any other reason, is to add a small comment in this
'else' clause explaining that this is the legacy virtio devices condition and those devices
don't set F_IOMMU_PLATFORM. This would make the code easier to read for a virtio casual like
myself.



Thanks,


Daniel


> 
> base-commit: 6621441db50d5bae7e34dbd04bf3c57a27a71b32
Halil Pasic Feb. 1, 2022, 4:05 p.m. UTC | #2
On Tue,  1 Feb 2022 14:39:15 +0100
Halil Pasic <pasic@linux.ibm.com> wrote:

> The commit 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
> unsupported") claims to fail the device hotplug when iommu_platform

CC-ing Daniel with his new email address.
Cornelia Huck Feb. 1, 2022, 4:47 p.m. UTC | #3
On Tue, Feb 01 2022, Halil Pasic <pasic@linux.ibm.com> wrote:

> The commit 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
> unsupported") claims to fail the device hotplug when iommu_platform
> is requested, but not supported by the (vhost) device. On the first
> glance the condition for detecting that situation looks perfect, but
> because a certain peculiarity of virtio_platform it ain't.
>
> In fact the aforementioned commit introduces a regression. It breaks
> virtio-fs support for Secure Execution, and most likely also for AMD SEV
> or any other confidential guest scenario that relies encrypted guest
> memory.  The same also applies to any other vhost device that does not
> support _F_ACCESS_PLATFORM.
>
> The peculiarity is that iommu_platform and _F_ACCESS_PLATFORM collates
> "device can not access all of the guest RAM" and "iova != gpa, thus
> device needs to translate iova".
>
> Confidential guest technologies currently rely on the device/hypervisor
> offering _F_ACCESS_PLATFORM, so that, after the feature has been
> negotiated, the guest  grants access to the portions of memory the
> device needs to see. So in for confidential guests, generally,
> _F_ACCESS_PLATFORM is about the restricted access to memory, but not
> about the addresses used being something else than guest physical
> addresses.
>
> This is the very reason for which commit f7ef7e6e3b ("vhost: correctly
> turn on VIRTIO_F_IOMMU_PLATFORM") for, which fences _F_ACCESS_PLATFORM

s/for, which //

> form the vhost device that does not need it, because on the vhost

s/form/from/

> interface it only means "I/O address translation is needed".
>
> This patch takes inspiration from f7ef7e6e3b ("vhost: correctly turn on
> VIRTIO_F_IOMMU_PLATFORM"), and uses the same condition for detecting the
> situation when _F_ACCESS_PLATFORM is requested, but no I/O translation
> by the device, and thus no device capability is needed. In this
> situation claiming that the device does not support iommu_plattform=on
> is counter-productive. So let us stop doing that!
>
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Reported-by: Jakob Naucke <Jakob.Naucke@ibm.com>
> Fixes: 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
> unsupported")
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: qemu-stable@nongnu.org
>
> ---
>
> v2->v3:
> * Caught a bug: I tired to check if vdev has the feature
>    ACCESS_PLATFORM after we have forced it. Moved the check
>    to a better place
> v1->v2:
> * Commit message tweaks. Most notably fixed commit SHA (Michael)
>
> ---
> ---
>  hw/virtio/virtio-bus.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index d23db98c56..34f5a0a664 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -48,6 +48,7 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>      VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
>      VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
>      bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> +    bool vdev_has_iommu = false;

Isn't vdev_has_iommu set unconditionally before you try to use it?

>      Error *local_err = NULL;
>  
>      DPRINTF("%s: plug device.\n", qbus->name);
> @@ -69,11 +70,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>          return;
>      }
>  
> -    if (has_iommu && !virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> -        error_setg(errp, "iommu_platform=true is not supported by the device");
> -        return;
> -    }
> -
>      if (klass->device_plugged != NULL) {
>          klass->device_plugged(qbus->parent, &local_err);
>      }
> @@ -82,9 +78,14 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>          return;
>      }
>  
> +    vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>      if (klass->get_dma_as != NULL && has_iommu) {
>          virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>          vdev->dma_as = klass->get_dma_as(qbus->parent);
> +        if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
> +            error_setg(errp,
> +                       "iommu_platform=true is not supported by the device");
> +        }
>      } else {

I agree that a short comment would be nice here, but this is preexisting
code anyway...

>          vdev->dma_as = &address_space_memory;
>      }
>
> base-commit: 6621441db50d5bae7e34dbd04bf3c57a27a71b32

...so (with or without fixing the nits):

Acked-by: Cornelia Huck <cohuck@redhat.com>

(i.e. looks sane, but I didn't follow all the paths)
Michael S. Tsirkin Feb. 1, 2022, 4:52 p.m. UTC | #4
On Tue, Feb 01, 2022 at 05:47:24PM +0100, Cornelia Huck wrote:
> On Tue, Feb 01 2022, Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > The commit 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
> > unsupported") claims to fail the device hotplug when iommu_platform
> > is requested, but not supported by the (vhost) device. On the first
> > glance the condition for detecting that situation looks perfect, but
> > because a certain peculiarity of virtio_platform it ain't.
> >
> > In fact the aforementioned commit introduces a regression. It breaks
> > virtio-fs support for Secure Execution, and most likely also for AMD SEV
> > or any other confidential guest scenario that relies encrypted guest
> > memory.  The same also applies to any other vhost device that does not
> > support _F_ACCESS_PLATFORM.
> >
> > The peculiarity is that iommu_platform and _F_ACCESS_PLATFORM collates
> > "device can not access all of the guest RAM" and "iova != gpa, thus
> > device needs to translate iova".
> >
> > Confidential guest technologies currently rely on the device/hypervisor
> > offering _F_ACCESS_PLATFORM, so that, after the feature has been
> > negotiated, the guest  grants access to the portions of memory the
> > device needs to see. So in for confidential guests, generally,
> > _F_ACCESS_PLATFORM is about the restricted access to memory, but not
> > about the addresses used being something else than guest physical
> > addresses.
> >
> > This is the very reason for which commit f7ef7e6e3b ("vhost: correctly
> > turn on VIRTIO_F_IOMMU_PLATFORM") for, which fences _F_ACCESS_PLATFORM
> 
> s/for, which //
> 
> > form the vhost device that does not need it, because on the vhost
> 
> s/form/from/
> 
> > interface it only means "I/O address translation is needed".
> >
> > This patch takes inspiration from f7ef7e6e3b ("vhost: correctly turn on
> > VIRTIO_F_IOMMU_PLATFORM"), and uses the same condition for detecting the
> > situation when _F_ACCESS_PLATFORM is requested, but no I/O translation
> > by the device, and thus no device capability is needed. In this
> > situation claiming that the device does not support iommu_plattform=on
> > is counter-productive. So let us stop doing that!
> >
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > Reported-by: Jakob Naucke <Jakob.Naucke@ibm.com>
> > Fixes: 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
> > unsupported")
> > Cc: Kevin Wolf <kwolf@redhat.com>
> > Cc: qemu-stable@nongnu.org
> >
> > ---
> >
> > v2->v3:
> > * Caught a bug: I tired to check if vdev has the feature
> >    ACCESS_PLATFORM after we have forced it. Moved the check
> >    to a better place
> > v1->v2:
> > * Commit message tweaks. Most notably fixed commit SHA (Michael)
> >
> > ---
> > ---
> >  hw/virtio/virtio-bus.c | 11 ++++++-----
> >  1 file changed, 6 insertions(+), 5 deletions(-)
> >
> > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > index d23db98c56..34f5a0a664 100644
> > --- a/hw/virtio/virtio-bus.c
> > +++ b/hw/virtio/virtio-bus.c
> > @@ -48,6 +48,7 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> >      VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
> >      VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> >      bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > +    bool vdev_has_iommu = false;
> 
> Isn't vdev_has_iommu set unconditionally before you try to use it?

I'd like to know too.

> >      Error *local_err = NULL;
> >  
> >      DPRINTF("%s: plug device.\n", qbus->name);
> > @@ -69,11 +70,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> >          return;
> >      }
> >  
> > -    if (has_iommu && !virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> > -        error_setg(errp, "iommu_platform=true is not supported by the device");
> > -        return;
> > -    }
> > -
> >      if (klass->device_plugged != NULL) {
> >          klass->device_plugged(qbus->parent, &local_err);
> >      }
> > @@ -82,9 +78,14 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> >          return;
> >      }
> >  
> > +    vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> >      if (klass->get_dma_as != NULL && has_iommu) {
> >          virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >          vdev->dma_as = klass->get_dma_as(qbus->parent);
> > +        if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
> > +            error_setg(errp,
> > +                       "iommu_platform=true is not supported by the device");
> > +        }
> >      } else {
> 
> I agree that a short comment would be nice here, but this is preexisting
> code anyway...
> 
> >          vdev->dma_as = &address_space_memory;
> >      }
> >
> > base-commit: 6621441db50d5bae7e34dbd04bf3c57a27a71b32
> 
> ...so (with or without fixing the nits):
> 
> Acked-by: Cornelia Huck <cohuck@redhat.com>
> 
> (i.e. looks sane, but I didn't follow all the paths)
Halil Pasic Feb. 1, 2022, 5:50 p.m. UTC | #5
On Tue, 1 Feb 2022 11:52:06 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> > > +    bool vdev_has_iommu = false;  
> > 
> > Isn't vdev_has_iommu set unconditionally before you try to use it?  
> 
> I'd like to know too.

Yes it is. Was meant as a conservative thing. AFAIR in C stuff on stack
is not initialized to anything in particular so the idea was better
use false than garbage if someone made a mistake. But on the other
hand compilers can warn about that, and this defeats the compiler
warning. So uninitialized is indeed better.
Halil Pasic Feb. 1, 2022, 6:33 p.m. UTC | #6
On Tue, 1 Feb 2022 12:36:25 -0300
Daniel Henrique Barboza <danielhb@linux.ibm.com> wrote:

> > +    vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> >       if (klass->get_dma_as != NULL && has_iommu) {
> >           virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >           vdev->dma_as = klass->get_dma_as(qbus->parent);
> > +        if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
> > +            error_setg(errp,
> > +                       "iommu_platform=true is not supported by the device");
> > +        }  
> 
> 
> >       } else {
> >           vdev->dma_as = &address_space_memory;
> >       }  
> 
> 
> I struggled to understand what this 'else' clause was doing and I assumed that it was
> wrong. Searching through the ML I learned that this 'else' clause is intended to handle
> legacy virtio devices that doesn't support the DMA API (introduced in 8607f5c3072caeebb)
> and thus shouldn't set  VIRTIO_F_IOMMU_PLATFORM.
> 
> 
> My suggestion, if a v4 is required for any other reason, is to add a small comment in this
> 'else' clause explaining that this is the legacy virtio devices condition and those devices
> don't set F_IOMMU_PLATFORM. This would make the code easier to read for a virtio casual like
> myself.

I do not agree that this is about legacy virtio. In my understanding
virtio-ccw simply does not need translation because CCW devices use
guest physical addresses as per architecture. It may be considered
legacy stuff form PCI perspective, but I don't think it is legacy
in general.

And there is a good reason for virtio-ccw devices to use
F_IOMMU_PLATFORM (secure execution).

Other opinions?

Regards,
Halil
Daniel Henrique Barboza Feb. 1, 2022, 7:31 p.m. UTC | #7
On 2/1/22 15:33, Halil Pasic wrote:
> On Tue, 1 Feb 2022 12:36:25 -0300
> Daniel Henrique Barboza <danielhb@linux.ibm.com> wrote:
> 
>>> +    vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>>>        if (klass->get_dma_as != NULL && has_iommu) {
>>>            virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>>            vdev->dma_as = klass->get_dma_as(qbus->parent);
>>> +        if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
>>> +            error_setg(errp,
>>> +                       "iommu_platform=true is not supported by the device");
>>> +        }
>>
>>
>>>        } else {
>>>            vdev->dma_as = &address_space_memory;
>>>        }
>>
>>
>> I struggled to understand what this 'else' clause was doing and I assumed that it was
>> wrong. Searching through the ML I learned that this 'else' clause is intended to handle
>> legacy virtio devices that doesn't support the DMA API (introduced in 8607f5c3072caeebb)
>> and thus shouldn't set  VIRTIO_F_IOMMU_PLATFORM.
>>
>>
>> My suggestion, if a v4 is required for any other reason, is to add a small comment in this
>> 'else' clause explaining that this is the legacy virtio devices condition and those devices
>> don't set F_IOMMU_PLATFORM. This would make the code easier to read for a virtio casual like
>> myself.
> 
> I do not agree that this is about legacy virtio. In my understanding
> virtio-ccw simply does not need translation because CCW devices use
> guest physical addresses as per architecture. It may be considered
> legacy stuff form PCI perspective, but I don't think it is legacy
> in general.


I wasn't talking about virtio-ccw. I was talking about this piece of code:


     if (klass->get_dma_as != NULL && has_iommu) {
         virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
         vdev->dma_as = klass->get_dma_as(qbus->parent);
     } else {
         vdev->dma_as = &address_space_memory;
     }


I suggested something like this:



     if (klass->get_dma_as != NULL && has_iommu) {
         virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
         vdev->dma_as = klass->get_dma_as(qbus->parent);
     } else {
         /*
          * We don't force VIRTIO_F_IOMMU_PLATFORM for legacy devices, i.e.
          * devices that don't implement klass->get_dma_as, regardless of
          * 'has_iommu' setting.
          */
         vdev->dma_as = &address_space_memory;
     }


At least from my reading of commits 8607f5c3072 and 2943b53f682 this seems to be
the case. I spent some time thinking that this IF/ELSE was wrong because I wasn't
aware of this history.


Thanks,


Daniel





> 
> And there is a good reason for virtio-ccw devices to use
> F_IOMMU_PLATFORM (secure execution).
> 
> Other opinions?
> 
> Regards,
> Halil
Halil Pasic Feb. 2, 2022, 1:15 a.m. UTC | #8
On Tue, 1 Feb 2022 16:31:22 -0300
Daniel Henrique Barboza <danielhb413@gmail.com> wrote:

> On 2/1/22 15:33, Halil Pasic wrote:
> > On Tue, 1 Feb 2022 12:36:25 -0300
> > Daniel Henrique Barboza <danielhb@linux.ibm.com> wrote:
> >   
> >>> +    vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> >>>        if (klass->get_dma_as != NULL && has_iommu) {
> >>>            virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >>>            vdev->dma_as = klass->get_dma_as(qbus->parent);
> >>> +        if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
> >>> +            error_setg(errp,
> >>> +                       "iommu_platform=true is not supported by the device");
> >>> +        }  
> >>
> >>  
> >>>        } else {
> >>>            vdev->dma_as = &address_space_memory;
> >>>        }  
> >>
> >>
> >> I struggled to understand what this 'else' clause was doing and I assumed that it was
> >> wrong. Searching through the ML I learned that this 'else' clause is intended to handle
> >> legacy virtio devices that doesn't support the DMA API (introduced in 8607f5c3072caeebb)
> >> and thus shouldn't set  VIRTIO_F_IOMMU_PLATFORM.
> >>
> >>
> >> My suggestion, if a v4 is required for any other reason, is to add a small comment in this
> >> 'else' clause explaining that this is the legacy virtio devices condition and those devices
> >> don't set F_IOMMU_PLATFORM. This would make the code easier to read for a virtio casual like
> >> myself.  
> > 
> > I do not agree that this is about legacy virtio. In my understanding
> > virtio-ccw simply does not need translation because CCW devices use
> > guest physical addresses as per architecture. It may be considered
> > legacy stuff form PCI perspective, but I don't think it is legacy
> > in general.  
> 
> 
> I wasn't talking about virtio-ccw. I was talking about this piece of code:
> 
> 
>      if (klass->get_dma_as != NULL && has_iommu) {
>          virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>          vdev->dma_as = klass->get_dma_as(qbus->parent);
>      } else {
>          vdev->dma_as = &address_space_memory;
>      }
> 
> 
> I suggested something like this:
> 
> 
> 
>      if (klass->get_dma_as != NULL && has_iommu) {
>          virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>          vdev->dma_as = klass->get_dma_as(qbus->parent);
>      } else {
>          /*
>           * We don't force VIRTIO_F_IOMMU_PLATFORM for legacy devices, i.e.
>           * devices that don't implement klass->get_dma_as, regardless of
>           * 'has_iommu' setting.
>           */
>          vdev->dma_as = &address_space_memory;
>      }
> 
> 
> At least from my reading of commits 8607f5c3072 and 2943b53f682 this seems to be
> the case. I spent some time thinking that this IF/ELSE was wrong because I wasn't
> aware of this history.

With virtio-ccw we take the else branch because we don't implement
->get_dma_as(). I don't consider all the virtio-ccw to be legacy.

IMHO there are two ways to think about this: 
a) The commit that introduced this needs a fix which implemets
get_dma_as() for virtio-ccw in a way that it simply returns
address_space_memory.
b) The presence of ->get_dma_as() is not indicative of "legacy".

BTW in virtospeak "legacy" has a special meaning: pre-1.0 virtio. Do you
mean that legacy. And if I read the virtio-pci code correctly
->get_dma_as is set for legacy, transitional and modern devices alike.

IMHO the important thing to figure out is what impact that
virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
in the first branch (of the if-else) has. IMHO if one examines the
commits 8607f5c307 ("virtio: convert to use DMA api") and 2943b53f68
("virtio: force VIRTIO_F_IOMMU_PLATFORM") very carefully, one will
probably reach the conclusion that the objective of the latter, is
to prevent the guest form not negotiating the IOMMU_PLATFORM feature
(clearing it as part of the feature negotiation) and trying to use
the device without that feature. In other words, virtio features are
usually optional for the guest for the sake of compatibility, but
IOMMU_PLATFORM is not: for very good reasons. Neither the commit message
nor the patch does mention legacy anywhere. 

In my opinion not forcing the guest to negotiate IOMMU_PLATFORM when
->get_dma_as() is not set is at least unfortunate. Please observe, that
virtio-pci is not affected by this omission because for virtio-pci
devices ->get_dma_as != NULL always holds. And what is the deal for
devices that don't implement get_dma_as() (and don't need address
translation)? If iommu_platform=on is justified (no user error) then
the device does not have access to the entire guest memory. Which
means it more than likely needs cooperation form the guest (driver).
So detecting that the guest does not support IOMMU_PLATFORM and failing
gracefully via virtio_validate_features() instead of carrying on
in good faith and failing in ugly ways when the host attempts to access
guest memory to which it does not have access to. If we assume user
error, that is the host can access at least all the memory it needs
to access to make that device work, then it is probably still a
good idea to fail the device and thus help the user correct his
error.

IMHO the best course of action is
diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index 34f5a0a664..1d0eb16d1c 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -80,7 +80,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
 
     vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
     if (klass->get_dma_as != NULL && has_iommu) {
-        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
         vdev->dma_as = klass->get_dma_as(qbus->parent);
         if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
             error_setg(errp,
@@ -89,6 +88,7 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
     } else {
         vdev->dma_as = &address_space_memory;
     }
+    virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
 }

which would be a separate patch, as this is a separate issue. Jason,
Michael, Connie, what do you think?

Regards,
Halil
Michael S. Tsirkin Feb. 2, 2022, 7:06 a.m. UTC | #9
On Wed, Feb 02, 2022 at 02:15:47AM +0100, Halil Pasic wrote:
> On Tue, 1 Feb 2022 16:31:22 -0300
> Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
> 
> > On 2/1/22 15:33, Halil Pasic wrote:
> > > On Tue, 1 Feb 2022 12:36:25 -0300
> > > Daniel Henrique Barboza <danielhb@linux.ibm.com> wrote:
> > >   
> > >>> +    vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > >>>        if (klass->get_dma_as != NULL && has_iommu) {
> > >>>            virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > >>>            vdev->dma_as = klass->get_dma_as(qbus->parent);
> > >>> +        if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
> > >>> +            error_setg(errp,
> > >>> +                       "iommu_platform=true is not supported by the device");
> > >>> +        }  
> > >>
> > >>  
> > >>>        } else {
> > >>>            vdev->dma_as = &address_space_memory;
> > >>>        }  
> > >>
> > >>
> > >> I struggled to understand what this 'else' clause was doing and I assumed that it was
> > >> wrong. Searching through the ML I learned that this 'else' clause is intended to handle
> > >> legacy virtio devices that doesn't support the DMA API (introduced in 8607f5c3072caeebb)
> > >> and thus shouldn't set  VIRTIO_F_IOMMU_PLATFORM.
> > >>
> > >>
> > >> My suggestion, if a v4 is required for any other reason, is to add a small comment in this
> > >> 'else' clause explaining that this is the legacy virtio devices condition and those devices
> > >> don't set F_IOMMU_PLATFORM. This would make the code easier to read for a virtio casual like
> > >> myself.  
> > > 
> > > I do not agree that this is about legacy virtio. In my understanding
> > > virtio-ccw simply does not need translation because CCW devices use
> > > guest physical addresses as per architecture. It may be considered
> > > legacy stuff form PCI perspective, but I don't think it is legacy
> > > in general.  
> > 
> > 
> > I wasn't talking about virtio-ccw. I was talking about this piece of code:
> > 
> > 
> >      if (klass->get_dma_as != NULL && has_iommu) {
> >          virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >          vdev->dma_as = klass->get_dma_as(qbus->parent);
> >      } else {
> >          vdev->dma_as = &address_space_memory;
> >      }
> > 
> > 
> > I suggested something like this:
> > 
> > 
> > 
> >      if (klass->get_dma_as != NULL && has_iommu) {
> >          virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >          vdev->dma_as = klass->get_dma_as(qbus->parent);
> >      } else {
> >          /*
> >           * We don't force VIRTIO_F_IOMMU_PLATFORM for legacy devices, i.e.
> >           * devices that don't implement klass->get_dma_as, regardless of
> >           * 'has_iommu' setting.
> >           */
> >          vdev->dma_as = &address_space_memory;
> >      }
> > 
> > 
> > At least from my reading of commits 8607f5c3072 and 2943b53f682 this seems to be
> > the case. I spent some time thinking that this IF/ELSE was wrong because I wasn't
> > aware of this history.
> 
> With virtio-ccw we take the else branch because we don't implement
> ->get_dma_as(). I don't consider all the virtio-ccw to be legacy.
> 
> IMHO there are two ways to think about this: 
> a) The commit that introduced this needs a fix which implemets
> get_dma_as() for virtio-ccw in a way that it simply returns
> address_space_memory.
> b) The presence of ->get_dma_as() is not indicative of "legacy".
> 
> BTW in virtospeak "legacy" has a special meaning: pre-1.0 virtio. Do you
> mean that legacy. And if I read the virtio-pci code correctly
> ->get_dma_as is set for legacy, transitional and modern devices alike.
> 
> IMHO the important thing to figure out is what impact that
> virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> in the first branch (of the if-else) has. IMHO if one examines the
> commits 8607f5c307 ("virtio: convert to use DMA api") and 2943b53f68
> ("virtio: force VIRTIO_F_IOMMU_PLATFORM") very carefully, one will
> probably reach the conclusion that the objective of the latter, is
> to prevent the guest form not negotiating the IOMMU_PLATFORM feature
> (clearing it as part of the feature negotiation) and trying to use
> the device without that feature. In other words, virtio features are
> usually optional for the guest for the sake of compatibility, but
> IOMMU_PLATFORM is not: for very good reasons. Neither the commit message
> nor the patch does mention legacy anywhere. 
> 
> In my opinion not forcing the guest to negotiate IOMMU_PLATFORM when
> ->get_dma_as() is not set is at least unfortunate. Please observe, that
> virtio-pci is not affected by this omission because for virtio-pci
> devices ->get_dma_as != NULL always holds. And what is the deal for
> devices that don't implement get_dma_as() (and don't need address
> translation)? If iommu_platform=on is justified (no user error) then
> the device does not have access to the entire guest memory. Which
> means it more than likely needs cooperation form the guest (driver).
> So detecting that the guest does not support IOMMU_PLATFORM and failing
> gracefully via virtio_validate_features() instead of carrying on
> in good faith and failing in ugly ways when the host attempts to access
> guest memory to which it does not have access to. If we assume user
> error, that is the host can access at least all the memory it needs
> to access to make that device work, then it is probably still a
> good idea to fail the device and thus help the user correct his
> error.
> 
> IMHO the best course of action is
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index 34f5a0a664..1d0eb16d1c 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -80,7 +80,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>  
>      vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>      if (klass->get_dma_as != NULL && has_iommu) {
> -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>          vdev->dma_as = klass->get_dma_as(qbus->parent);
>          if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
>              error_setg(errp,
> @@ -89,6 +88,7 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>      } else {
>          vdev->dma_as = &address_space_memory;
>      }
> +    virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>  }
> 
> which would be a separate patch, as this is a separate issue. Jason,
> Michael, Connie, what do you think?

Do you mean just force VIRTIO_F_IOMMU_PLATFORM for everyone?
Or am I misreading the patch?


> Regards,
> Halil
Halil Pasic Feb. 2, 2022, 1:16 p.m. UTC | #10
On Wed, 2 Feb 2022 02:06:12 -0500
"Michael S. Tsirkin" <mst@redhat.com> wrote:

[..]
> > In my opinion not forcing the guest to negotiate IOMMU_PLATFORM when  
> > ->get_dma_as() is not set is at least unfortunate. Please observe, that  
> > virtio-pci is not affected by this omission because for virtio-pci
> > devices ->get_dma_as != NULL always holds. And what is the deal for
> > devices that don't implement get_dma_as() (and don't need address
> > translation)? If iommu_platform=on is justified (no user error) then
> > the device does not have access to the entire guest memory. Which
> > means it more than likely needs cooperation form the guest (driver).
> > So detecting that the guest does not support IOMMU_PLATFORM and failing
> > gracefully via virtio_validate_features() instead of carrying on
> > in good faith and failing in ugly ways when the host attempts to access
> > guest memory to which it does not have access to. If we assume user
> > error, that is the host can access at least all the memory it needs
> > to access to make that device work, then it is probably still a
> > good idea to fail the device and thus help the user correct his
> > error.
> > 
> > IMHO the best course of action is
> > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > index 34f5a0a664..1d0eb16d1c 100644
> > --- a/hw/virtio/virtio-bus.c
> > +++ b/hw/virtio/virtio-bus.c
> > @@ -80,7 +80,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> >  
> >      vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> >      if (klass->get_dma_as != NULL && has_iommu) {
> > -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >          vdev->dma_as = klass->get_dma_as(qbus->parent);
> >          if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
> >              error_setg(errp,
> > @@ -89,6 +88,7 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> >      } else {
> >          vdev->dma_as = &address_space_memory;
> >      }
> > +    virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >  }
> > 
> > which would be a separate patch, as this is a separate issue. Jason,
> > Michael, Connie, what do you think?  
> 
> Do you mean just force VIRTIO_F_IOMMU_PLATFORM for everyone?
> Or am I misreading the patch?

Yes. Where force means: prevent the driver from setting FEATURES_OK
if it cleared VIRTIO_F_IOMMU_PLATFORM. I really don't see the case
where the device offering but the driver not accepting
VIRTIO_F_IOMMU_PLATFORM is good and useful.

Regards,
Halil


> 
> 
> > Regards,
> > Halil  
> 
>
Daniel Henrique Barboza Feb. 2, 2022, 1:24 p.m. UTC | #11
On 2/1/22 22:15, Halil Pasic wrote:
> On Tue, 1 Feb 2022 16:31:22 -0300
> Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
> 
>> On 2/1/22 15:33, Halil Pasic wrote:
>>> On Tue, 1 Feb 2022 12:36:25 -0300
>>> Daniel Henrique Barboza <danielhb@linux.ibm.com> wrote:
>>>    
>>>>> +    vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>>>>>         if (klass->get_dma_as != NULL && has_iommu) {
>>>>>             virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>>>>             vdev->dma_as = klass->get_dma_as(qbus->parent);
>>>>> +        if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
>>>>> +            error_setg(errp,
>>>>> +                       "iommu_platform=true is not supported by the device");
>>>>> +        }
>>>>
>>>>   
>>>>>         } else {
>>>>>             vdev->dma_as = &address_space_memory;
>>>>>         }
>>>>
>>>>
>>>> I struggled to understand what this 'else' clause was doing and I assumed that it was
>>>> wrong. Searching through the ML I learned that this 'else' clause is intended to handle
>>>> legacy virtio devices that doesn't support the DMA API (introduced in 8607f5c3072caeebb)
>>>> and thus shouldn't set  VIRTIO_F_IOMMU_PLATFORM.
>>>>
>>>>
>>>> My suggestion, if a v4 is required for any other reason, is to add a small comment in this
>>>> 'else' clause explaining that this is the legacy virtio devices condition and those devices
>>>> don't set F_IOMMU_PLATFORM. This would make the code easier to read for a virtio casual like
>>>> myself.
>>>
>>> I do not agree that this is about legacy virtio. In my understanding
>>> virtio-ccw simply does not need translation because CCW devices use
>>> guest physical addresses as per architecture. It may be considered
>>> legacy stuff form PCI perspective, but I don't think it is legacy
>>> in general.
>>
>>
>> I wasn't talking about virtio-ccw. I was talking about this piece of code:
>>
>>
>>       if (klass->get_dma_as != NULL && has_iommu) {
>>           virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>           vdev->dma_as = klass->get_dma_as(qbus->parent);
>>       } else {
>>           vdev->dma_as = &address_space_memory;
>>       }
>>
>>
>> I suggested something like this:
>>
>>
>>
>>       if (klass->get_dma_as != NULL && has_iommu) {
>>           virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>           vdev->dma_as = klass->get_dma_as(qbus->parent);
>>       } else {
>>           /*
>>            * We don't force VIRTIO_F_IOMMU_PLATFORM for legacy devices, i.e.
>>            * devices that don't implement klass->get_dma_as, regardless of
>>            * 'has_iommu' setting.
>>            */
>>           vdev->dma_as = &address_space_memory;
>>       }
>>
>>
>> At least from my reading of commits 8607f5c3072 and 2943b53f682 this seems to be
>> the case. I spent some time thinking that this IF/ELSE was wrong because I wasn't
>> aware of this history.
> 
> With virtio-ccw we take the else branch because we don't implement
> ->get_dma_as(). I don't consider all the virtio-ccw to be legacy.
> 
> IMHO there are two ways to think about this:
> a) The commit that introduced this needs a fix which implemets
> get_dma_as() for virtio-ccw in a way that it simply returns
> address_space_memory.
> b) The presence of ->get_dma_as() is not indicative of "legacy".
> 
> BTW in virtospeak "legacy" has a special meaning: pre-1.0 virtio. Do you
> mean that legacy. And if I read the virtio-pci code correctly
> ->get_dma_as is set for legacy, transitional and modern devices alike.


Oh ok. I'm not well versed into virtiospeak. My "legacy" comment was a poor choice of
word for the situation.

We can ignore the "legacy" bit. My idea/suggestion is to put a comment at that point
explaining the logic behind into not forcing VIRTIO_F_IOMMU_PLATFORM in devices that
doesn't implement ->get_dma_as().

I am assuming that this is an intended design that was introduced by 2943b53f682
("virtio: force VIRTIO_F_IOMMU_PLATFORM"), meaning that the implementation of the
->get_dma_as is being used as a parameter to force the feature in the device. And with
this code:


     if (klass->get_dma_as != NULL && has_iommu) {
         virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
         vdev->dma_as = klass->get_dma_as(qbus->parent);
     } else {
         vdev->dma_as = &address_space_memory;
     }

It is possible that we have 2 vdev devices where ->dma_as = &address_space_memory, but one
of them is sitting in a bus where "klass->get_dma_as(qbus->parent) = &address_space_memory",
and this device will have VIRTIO_F_IOMMU_PLATFORM forced onto it and the former won't.


If this is not an intended design I can only speculate how to fix it. Forcing VIRTIO_F_IOMMU_PLATFORM
in all devices, based only on has_iommu, can break stuff. Setting VIRTIO_F_IOMMU_PLATFORM only
if "vdev->dma_as != &address_space_memory" make some sense but I am fairly certain it will
break stuff the other way. Or perhaps the fix is something else entirely.




> 
> IMHO the important thing to figure out is what impact that
> virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> in the first branch (of the if-else) has. IMHO if one examines the
> commits 8607f5c307 ("virtio: convert to use DMA api") and 2943b53f68
> ("virtio: force VIRTIO_F_IOMMU_PLATFORM") very carefully, one will
> probably reach the conclusion that the objective of the latter, is
> to prevent the guest form not negotiating the IOMMU_PLATFORM feature
> (clearing it as part of the feature negotiation) and trying to use
> the device without that feature. In other words, virtio features are
> usually optional for the guest for the sake of compatibility, but
> IOMMU_PLATFORM is not: for very good reasons. Neither the commit message
> nor the patch does mention legacy anywhere.
> 
> In my opinion not forcing the guest to negotiate IOMMU_PLATFORM when
> ->get_dma_as() is not set is at least unfortunate. Please observe, that
> virtio-pci is not affected by this omission because for virtio-pci
> devices ->get_dma_as != NULL always holds. And what is the deal for
> devices that don't implement get_dma_as() (and don't need address
> translation)? If iommu_platform=on is justified (no user error) then
> the device does not have access to the entire guest memory. Which
> means it more than likely needs cooperation form the guest (driver).
> So detecting that the guest does not support IOMMU_PLATFORM and failing
> gracefully via virtio_validate_features() instead of carrying on
> in good faith and failing in ugly ways when the host attempts to access
> guest memory to which it does not have access to. If we assume user
> error, that is the host can access at least all the memory it needs
> to access to make that device work, then it is probably still a
> good idea to fail the device and thus help the user correct his
> error.

Yeah, this go back on what I've said about 2943b53f682 up there. There are assumptions
being made on the ->get_dma_as() existence that aren't clear.


> 
> IMHO the best course of action is
> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index 34f5a0a664..1d0eb16d1c 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -80,7 +80,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>   
>       vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>       if (klass->get_dma_as != NULL && has_iommu) {
> -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>           vdev->dma_as = klass->get_dma_as(qbus->parent);
>           if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
>               error_setg(errp,
> @@ -89,6 +88,7 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>       } else {
>           vdev->dma_as = &address_space_memory;
>       }
> +    virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>   }


I am fairly confident that forcing VIRTIO_F_IOMMU_PLATFORM all around, based on has_iommu
alone, will have consequences all around. This code has been around for almost 5 years and a
lot of stuff has been developed on top of it.

All that said, if this is the proper way of fixing it I'd say to do it now, document it properly
and fix the breakages as they come along. The alternative - hacking around and around a codebase
that might not be solid - is worse in the long run.


Thanks,


Daniel

> 
> which would be a separate patch, as this is a separate issue. Jason,
> Michael, Connie, what do you think?
> 
> Regards,
> Halil
Halil Pasic Feb. 2, 2022, 4:23 p.m. UTC | #12
On Wed, 2 Feb 2022 10:24:51 -0300
Daniel Henrique Barboza <danielhb413@gmail.com> wrote:

> On 2/1/22 22:15, Halil Pasic wrote:
> > On Tue, 1 Feb 2022 16:31:22 -0300
> > Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
> >   
> >> On 2/1/22 15:33, Halil Pasic wrote:  
> >>> On Tue, 1 Feb 2022 12:36:25 -0300
> >>> Daniel Henrique Barboza <danielhb@linux.ibm.com> wrote:
> >>>      
> >>>>> +    vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> >>>>>         if (klass->get_dma_as != NULL && has_iommu) {
> >>>>>             virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >>>>>             vdev->dma_as = klass->get_dma_as(qbus->parent);
> >>>>> +        if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
> >>>>> +            error_setg(errp,
> >>>>> +                       "iommu_platform=true is not supported by the device");
> >>>>> +        }  
> >>>>
> >>>>     
> >>>>>         } else {
> >>>>>             vdev->dma_as = &address_space_memory;
> >>>>>         }  
> >>>>
> >>>>
> >>>> I struggled to understand what this 'else' clause was doing and I assumed that it was
> >>>> wrong. Searching through the ML I learned that this 'else' clause is intended to handle
> >>>> legacy virtio devices that doesn't support the DMA API (introduced in 8607f5c3072caeebb)
> >>>> and thus shouldn't set  VIRTIO_F_IOMMU_PLATFORM.
> >>>>
> >>>>
> >>>> My suggestion, if a v4 is required for any other reason, is to add a small comment in this
> >>>> 'else' clause explaining that this is the legacy virtio devices condition and those devices
> >>>> don't set F_IOMMU_PLATFORM. This would make the code easier to read for a virtio casual like
> >>>> myself.  
> >>>
> >>> I do not agree that this is about legacy virtio. In my understanding
> >>> virtio-ccw simply does not need translation because CCW devices use
> >>> guest physical addresses as per architecture. It may be considered
> >>> legacy stuff form PCI perspective, but I don't think it is legacy
> >>> in general.  
> >>
> >>
> >> I wasn't talking about virtio-ccw. I was talking about this piece of code:
> >>
> >>
> >>       if (klass->get_dma_as != NULL && has_iommu) {
> >>           virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >>           vdev->dma_as = klass->get_dma_as(qbus->parent);
> >>       } else {
> >>           vdev->dma_as = &address_space_memory;
> >>       }
> >>
> >>
> >> I suggested something like this:
> >>
> >>
> >>
> >>       if (klass->get_dma_as != NULL && has_iommu) {
> >>           virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >>           vdev->dma_as = klass->get_dma_as(qbus->parent);
> >>       } else {
> >>           /*
> >>            * We don't force VIRTIO_F_IOMMU_PLATFORM for legacy devices, i.e.
> >>            * devices that don't implement klass->get_dma_as, regardless of
> >>            * 'has_iommu' setting.
> >>            */
> >>           vdev->dma_as = &address_space_memory;
> >>       }
> >>
> >>
> >> At least from my reading of commits 8607f5c3072 and 2943b53f682 this seems to be
> >> the case. I spent some time thinking that this IF/ELSE was wrong because I wasn't
> >> aware of this history.  
> > 
> > With virtio-ccw we take the else branch because we don't implement  
> > ->get_dma_as(). I don't consider all the virtio-ccw to be legacy.  
> > 
> > IMHO there are two ways to think about this:
> > a) The commit that introduced this needs a fix which implemets
> > get_dma_as() for virtio-ccw in a way that it simply returns
> > address_space_memory.
> > b) The presence of ->get_dma_as() is not indicative of "legacy".
> > 
> > BTW in virtospeak "legacy" has a special meaning: pre-1.0 virtio. Do you
> > mean that legacy. And if I read the virtio-pci code correctly  
> > ->get_dma_as is set for legacy, transitional and modern devices alike.  
> 
> 
> Oh ok. I'm not well versed into virtiospeak. My "legacy" comment was a poor choice of
> word for the situation.
> 
> We can ignore the "legacy" bit. My idea/suggestion is to put a comment at that point
> explaining the logic behind into not forcing VIRTIO_F_IOMMU_PLATFORM in devices that
> doesn't implement ->get_dma_as().
> 
> I am assuming that this is an intended design that was introduced by 2943b53f682
> ("virtio: force VIRTIO_F_IOMMU_PLATFORM"), meaning that the implementation of the
> ->get_dma_as is being used as a parameter to force the feature in the device. And with  
> this code:
> 
> 
>      if (klass->get_dma_as != NULL && has_iommu) {
>          virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>          vdev->dma_as = klass->get_dma_as(qbus->parent);
>      } else {
>          vdev->dma_as = &address_space_memory;
>      }
> 
> It is possible that we have 2 vdev devices where ->dma_as = &address_space_memory, but one
> of them is sitting in a bus where "klass->get_dma_as(qbus->parent) = &address_space_memory",
> and this device will have VIRTIO_F_IOMMU_PLATFORM forced onto it and the former won't.
> 
> 
> If this is not an intended design I can only speculate how to fix it. Forcing VIRTIO_F_IOMMU_PLATFORM
> in all devices, based only on has_iommu, can break stuff. Setting VIRTIO_F_IOMMU_PLATFORM only
> if "vdev->dma_as != &address_space_memory" make some sense but I am fairly certain it will
> break stuff the other way. Or perhaps the fix is something else entirely.
> 
> 
> 
> 
> > 
> > IMHO the important thing to figure out is what impact that
> > virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > in the first branch (of the if-else) has. IMHO if one examines the
> > commits 8607f5c307 ("virtio: convert to use DMA api") and 2943b53f68
> > ("virtio: force VIRTIO_F_IOMMU_PLATFORM") very carefully, one will
> > probably reach the conclusion that the objective of the latter, is
> > to prevent the guest form not negotiating the IOMMU_PLATFORM feature
> > (clearing it as part of the feature negotiation) and trying to use
> > the device without that feature. In other words, virtio features are
> > usually optional for the guest for the sake of compatibility, but
> > IOMMU_PLATFORM is not: for very good reasons. Neither the commit message
> > nor the patch does mention legacy anywhere.
> > 
> > In my opinion not forcing the guest to negotiate IOMMU_PLATFORM when  
> > ->get_dma_as() is not set is at least unfortunate. Please observe, that  
> > virtio-pci is not affected by this omission because for virtio-pci
> > devices ->get_dma_as != NULL always holds. And what is the deal for
> > devices that don't implement get_dma_as() (and don't need address
> > translation)? If iommu_platform=on is justified (no user error) then
> > the device does not have access to the entire guest memory. Which
> > means it more than likely needs cooperation form the guest (driver).
> > So detecting that the guest does not support IOMMU_PLATFORM and failing
> > gracefully via virtio_validate_features() instead of carrying on
> > in good faith and failing in ugly ways when the host attempts to access
> > guest memory to which it does not have access to. If we assume user
> > error, that is the host can access at least all the memory it needs
> > to access to make that device work, then it is probably still a
> > good idea to fail the device and thus help the user correct his
> > error.  
> 
> Yeah, this go back on what I've said about 2943b53f682 up there. There are assumptions
> being made on the ->get_dma_as() existence that aren't clear.
> 

I agree. The commit message does not explain.

> 
> > 
> > IMHO the best course of action is
> > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > index 34f5a0a664..1d0eb16d1c 100644
> > --- a/hw/virtio/virtio-bus.c
> > +++ b/hw/virtio/virtio-bus.c
> > @@ -80,7 +80,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> >   
> >       vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> >       if (klass->get_dma_as != NULL && has_iommu) {
> > -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >           vdev->dma_as = klass->get_dma_as(qbus->parent);
> >           if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
> >               error_setg(errp,
> > @@ -89,6 +88,7 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> >       } else {
> >           vdev->dma_as = &address_space_memory;
> >       }
> > +    virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >   }  
> 
> 
> I am fairly confident that forcing VIRTIO_F_IOMMU_PLATFORM all around, based on has_iommu

Yes I should have made that conditional on has_iommu. It was very late
when I finished that email.

> alone, will have consequences all around. This code has been around for almost 5 years and a
> lot of stuff has been developed on top of it.
> 

Do you have any particular examples in mind?

> All that said, if this is the proper way of fixing it I'd say to do it now, document it properly
> and fix the breakages as they come along. The alternative - hacking around and around a codebase
> that might not be solid - is worse in the long run.

IMHO this is a good discussion you triggered. But I see it out of scope
for the bug I'm trying to fix.

I can post a proper patch for "IOMMU_PLATFORM is non-negotiable for
all guests" and we can have proper review and discussion on that. But
I would like the bug I'm working on here fixed first. There are
people that want to use virtiofs with confidential guests, and
we should really make sure they can.

Regards,
Halil
Daniel Henrique Barboza Feb. 2, 2022, 4:27 p.m. UTC | #13
On 2/2/22 13:23, Halil Pasic wrote:
> On Wed, 2 Feb 2022 10:24:51 -0300
> Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
> 
>> On 2/1/22 22:15, Halil Pasic wrote:
>>> On Tue, 1 Feb 2022 16:31:22 -0300
>>> Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
>>>    
>>>> On 2/1/22 15:33, Halil Pasic wrote:
>>>>> On Tue, 1 Feb 2022 12:36:25 -0300
>>>>> Daniel Henrique Barboza <danielhb@linux.ibm.com> wrote:
>>>>>       
>>>>>>> +    vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>>>>>>>          if (klass->get_dma_as != NULL && has_iommu) {
>>>>>>>              virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>>>>>>              vdev->dma_as = klass->get_dma_as(qbus->parent);
>>>>>>> +        if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
>>>>>>> +            error_setg(errp,
>>>>>>> +                       "iommu_platform=true is not supported by the device");
>>>>>>> +        }
>>>>>>
>>>>>>      
>>>>>>>          } else {
>>>>>>>              vdev->dma_as = &address_space_memory;
>>>>>>>          }
>>>>>>
>>>>>>
>>>>>> I struggled to understand what this 'else' clause was doing and I assumed that it was
>>>>>> wrong. Searching through the ML I learned that this 'else' clause is intended to handle
>>>>>> legacy virtio devices that doesn't support the DMA API (introduced in 8607f5c3072caeebb)
>>>>>> and thus shouldn't set  VIRTIO_F_IOMMU_PLATFORM.
>>>>>>
>>>>>>
>>>>>> My suggestion, if a v4 is required for any other reason, is to add a small comment in this
>>>>>> 'else' clause explaining that this is the legacy virtio devices condition and those devices
>>>>>> don't set F_IOMMU_PLATFORM. This would make the code easier to read for a virtio casual like
>>>>>> myself.
>>>>>
>>>>> I do not agree that this is about legacy virtio. In my understanding
>>>>> virtio-ccw simply does not need translation because CCW devices use
>>>>> guest physical addresses as per architecture. It may be considered
>>>>> legacy stuff form PCI perspective, but I don't think it is legacy
>>>>> in general.
>>>>
>>>>
>>>> I wasn't talking about virtio-ccw. I was talking about this piece of code:
>>>>
>>>>
>>>>        if (klass->get_dma_as != NULL && has_iommu) {
>>>>            virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>>>            vdev->dma_as = klass->get_dma_as(qbus->parent);
>>>>        } else {
>>>>            vdev->dma_as = &address_space_memory;
>>>>        }
>>>>
>>>>
>>>> I suggested something like this:
>>>>
>>>>
>>>>
>>>>        if (klass->get_dma_as != NULL && has_iommu) {
>>>>            virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>>>            vdev->dma_as = klass->get_dma_as(qbus->parent);
>>>>        } else {
>>>>            /*
>>>>             * We don't force VIRTIO_F_IOMMU_PLATFORM for legacy devices, i.e.
>>>>             * devices that don't implement klass->get_dma_as, regardless of
>>>>             * 'has_iommu' setting.
>>>>             */
>>>>            vdev->dma_as = &address_space_memory;
>>>>        }
>>>>
>>>>
>>>> At least from my reading of commits 8607f5c3072 and 2943b53f682 this seems to be
>>>> the case. I spent some time thinking that this IF/ELSE was wrong because I wasn't
>>>> aware of this history.
>>>
>>> With virtio-ccw we take the else branch because we don't implement
>>> ->get_dma_as(). I don't consider all the virtio-ccw to be legacy.
>>>
>>> IMHO there are two ways to think about this:
>>> a) The commit that introduced this needs a fix which implemets
>>> get_dma_as() for virtio-ccw in a way that it simply returns
>>> address_space_memory.
>>> b) The presence of ->get_dma_as() is not indicative of "legacy".
>>>
>>> BTW in virtospeak "legacy" has a special meaning: pre-1.0 virtio. Do you
>>> mean that legacy. And if I read the virtio-pci code correctly
>>> ->get_dma_as is set for legacy, transitional and modern devices alike.
>>
>>
>> Oh ok. I'm not well versed into virtiospeak. My "legacy" comment was a poor choice of
>> word for the situation.
>>
>> We can ignore the "legacy" bit. My idea/suggestion is to put a comment at that point
>> explaining the logic behind into not forcing VIRTIO_F_IOMMU_PLATFORM in devices that
>> doesn't implement ->get_dma_as().
>>
>> I am assuming that this is an intended design that was introduced by 2943b53f682
>> ("virtio: force VIRTIO_F_IOMMU_PLATFORM"), meaning that the implementation of the
>> ->get_dma_as is being used as a parameter to force the feature in the device. And with
>> this code:
>>
>>
>>       if (klass->get_dma_as != NULL && has_iommu) {
>>           virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>           vdev->dma_as = klass->get_dma_as(qbus->parent);
>>       } else {
>>           vdev->dma_as = &address_space_memory;
>>       }
>>
>> It is possible that we have 2 vdev devices where ->dma_as = &address_space_memory, but one
>> of them is sitting in a bus where "klass->get_dma_as(qbus->parent) = &address_space_memory",
>> and this device will have VIRTIO_F_IOMMU_PLATFORM forced onto it and the former won't.
>>
>>
>> If this is not an intended design I can only speculate how to fix it. Forcing VIRTIO_F_IOMMU_PLATFORM
>> in all devices, based only on has_iommu, can break stuff. Setting VIRTIO_F_IOMMU_PLATFORM only
>> if "vdev->dma_as != &address_space_memory" make some sense but I am fairly certain it will
>> break stuff the other way. Or perhaps the fix is something else entirely.
>>
>>
>>
>>
>>>
>>> IMHO the important thing to figure out is what impact that
>>> virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>> in the first branch (of the if-else) has. IMHO if one examines the
>>> commits 8607f5c307 ("virtio: convert to use DMA api") and 2943b53f68
>>> ("virtio: force VIRTIO_F_IOMMU_PLATFORM") very carefully, one will
>>> probably reach the conclusion that the objective of the latter, is
>>> to prevent the guest form not negotiating the IOMMU_PLATFORM feature
>>> (clearing it as part of the feature negotiation) and trying to use
>>> the device without that feature. In other words, virtio features are
>>> usually optional for the guest for the sake of compatibility, but
>>> IOMMU_PLATFORM is not: for very good reasons. Neither the commit message
>>> nor the patch does mention legacy anywhere.
>>>
>>> In my opinion not forcing the guest to negotiate IOMMU_PLATFORM when
>>> ->get_dma_as() is not set is at least unfortunate. Please observe, that
>>> virtio-pci is not affected by this omission because for virtio-pci
>>> devices ->get_dma_as != NULL always holds. And what is the deal for
>>> devices that don't implement get_dma_as() (and don't need address
>>> translation)? If iommu_platform=on is justified (no user error) then
>>> the device does not have access to the entire guest memory. Which
>>> means it more than likely needs cooperation form the guest (driver).
>>> So detecting that the guest does not support IOMMU_PLATFORM and failing
>>> gracefully via virtio_validate_features() instead of carrying on
>>> in good faith and failing in ugly ways when the host attempts to access
>>> guest memory to which it does not have access to. If we assume user
>>> error, that is the host can access at least all the memory it needs
>>> to access to make that device work, then it is probably still a
>>> good idea to fail the device and thus help the user correct his
>>> error.
>>
>> Yeah, this go back on what I've said about 2943b53f682 up there. There are assumptions
>> being made on the ->get_dma_as() existence that aren't clear.
>>
> 
> I agree. The commit message does not explain.
> 
>>
>>>
>>> IMHO the best course of action is
>>> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
>>> index 34f5a0a664..1d0eb16d1c 100644
>>> --- a/hw/virtio/virtio-bus.c
>>> +++ b/hw/virtio/virtio-bus.c
>>> @@ -80,7 +80,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>>>    
>>>        vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
>>>        if (klass->get_dma_as != NULL && has_iommu) {
>>> -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>>            vdev->dma_as = klass->get_dma_as(qbus->parent);
>>>            if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
>>>                error_setg(errp,
>>> @@ -89,6 +88,7 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
>>>        } else {
>>>            vdev->dma_as = &address_space_memory;
>>>        }
>>> +    virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
>>>    }
>>
>>
>> I am fairly confident that forcing VIRTIO_F_IOMMU_PLATFORM all around, based on has_iommu
> 
> Yes I should have made that conditional on has_iommu. It was very late
> when I finished that email.
> 
>> alone, will have consequences all around. This code has been around for almost 5 years and a
>> lot of stuff has been developed on top of it.
>>
> 
> Do you have any particular examples in mind?
> 
>> All that said, if this is the proper way of fixing it I'd say to do it now, document it properly
>> and fix the breakages as they come along. The alternative - hacking around and around a codebase
>> that might not be solid - is worse in the long run.
> 
> IMHO this is a good discussion you triggered. But I see it out of scope
> for the bug I'm trying to fix.

Agree. I'll re-ack the patch given that I did it from another email that isn't
in QEMU devel. All this discussion is pertinent to a separated work.


Thanks,


Daniel


> 
> I can post a proper patch for "IOMMU_PLATFORM is non-negotiable for
> all guests" and we can have proper review and discussion on that. But
> I would like the bug I'm working on here fixed first. There are
> people that want to use virtiofs with confidential guests, and
> we should really make sure they can.
> 
> Regards,
> Halil
Michael S. Tsirkin Feb. 2, 2022, 4:50 p.m. UTC | #14
On Wed, Feb 02, 2022 at 05:23:53PM +0100, Halil Pasic wrote:
> On Wed, 2 Feb 2022 10:24:51 -0300
> Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
> 
> > On 2/1/22 22:15, Halil Pasic wrote:
> > > On Tue, 1 Feb 2022 16:31:22 -0300
> > > Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
> > >   
> > >> On 2/1/22 15:33, Halil Pasic wrote:  
> > >>> On Tue, 1 Feb 2022 12:36:25 -0300
> > >>> Daniel Henrique Barboza <danielhb@linux.ibm.com> wrote:
> > >>>      
> > >>>>> +    vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > >>>>>         if (klass->get_dma_as != NULL && has_iommu) {
> > >>>>>             virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > >>>>>             vdev->dma_as = klass->get_dma_as(qbus->parent);
> > >>>>> +        if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
> > >>>>> +            error_setg(errp,
> > >>>>> +                       "iommu_platform=true is not supported by the device");
> > >>>>> +        }  
> > >>>>
> > >>>>     
> > >>>>>         } else {
> > >>>>>             vdev->dma_as = &address_space_memory;
> > >>>>>         }  
> > >>>>
> > >>>>
> > >>>> I struggled to understand what this 'else' clause was doing and I assumed that it was
> > >>>> wrong. Searching through the ML I learned that this 'else' clause is intended to handle
> > >>>> legacy virtio devices that doesn't support the DMA API (introduced in 8607f5c3072caeebb)
> > >>>> and thus shouldn't set  VIRTIO_F_IOMMU_PLATFORM.
> > >>>>
> > >>>>
> > >>>> My suggestion, if a v4 is required for any other reason, is to add a small comment in this
> > >>>> 'else' clause explaining that this is the legacy virtio devices condition and those devices
> > >>>> don't set F_IOMMU_PLATFORM. This would make the code easier to read for a virtio casual like
> > >>>> myself.  
> > >>>
> > >>> I do not agree that this is about legacy virtio. In my understanding
> > >>> virtio-ccw simply does not need translation because CCW devices use
> > >>> guest physical addresses as per architecture. It may be considered
> > >>> legacy stuff form PCI perspective, but I don't think it is legacy
> > >>> in general.  
> > >>
> > >>
> > >> I wasn't talking about virtio-ccw. I was talking about this piece of code:
> > >>
> > >>
> > >>       if (klass->get_dma_as != NULL && has_iommu) {
> > >>           virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > >>           vdev->dma_as = klass->get_dma_as(qbus->parent);
> > >>       } else {
> > >>           vdev->dma_as = &address_space_memory;
> > >>       }
> > >>
> > >>
> > >> I suggested something like this:
> > >>
> > >>
> > >>
> > >>       if (klass->get_dma_as != NULL && has_iommu) {
> > >>           virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > >>           vdev->dma_as = klass->get_dma_as(qbus->parent);
> > >>       } else {
> > >>           /*
> > >>            * We don't force VIRTIO_F_IOMMU_PLATFORM for legacy devices, i.e.
> > >>            * devices that don't implement klass->get_dma_as, regardless of
> > >>            * 'has_iommu' setting.
> > >>            */
> > >>           vdev->dma_as = &address_space_memory;
> > >>       }
> > >>
> > >>
> > >> At least from my reading of commits 8607f5c3072 and 2943b53f682 this seems to be
> > >> the case. I spent some time thinking that this IF/ELSE was wrong because I wasn't
> > >> aware of this history.  
> > > 
> > > With virtio-ccw we take the else branch because we don't implement  
> > > ->get_dma_as(). I don't consider all the virtio-ccw to be legacy.  
> > > 
> > > IMHO there are two ways to think about this:
> > > a) The commit that introduced this needs a fix which implemets
> > > get_dma_as() for virtio-ccw in a way that it simply returns
> > > address_space_memory.
> > > b) The presence of ->get_dma_as() is not indicative of "legacy".
> > > 
> > > BTW in virtospeak "legacy" has a special meaning: pre-1.0 virtio. Do you
> > > mean that legacy. And if I read the virtio-pci code correctly  
> > > ->get_dma_as is set for legacy, transitional and modern devices alike.  
> > 
> > 
> > Oh ok. I'm not well versed into virtiospeak. My "legacy" comment was a poor choice of
> > word for the situation.
> > 
> > We can ignore the "legacy" bit. My idea/suggestion is to put a comment at that point
> > explaining the logic behind into not forcing VIRTIO_F_IOMMU_PLATFORM in devices that
> > doesn't implement ->get_dma_as().
> > 
> > I am assuming that this is an intended design that was introduced by 2943b53f682
> > ("virtio: force VIRTIO_F_IOMMU_PLATFORM"), meaning that the implementation of the
> > ->get_dma_as is being used as a parameter to force the feature in the device. And with  
> > this code:
> > 
> > 
> >      if (klass->get_dma_as != NULL && has_iommu) {
> >          virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> >          vdev->dma_as = klass->get_dma_as(qbus->parent);
> >      } else {
> >          vdev->dma_as = &address_space_memory;
> >      }
> > 
> > It is possible that we have 2 vdev devices where ->dma_as = &address_space_memory, but one
> > of them is sitting in a bus where "klass->get_dma_as(qbus->parent) = &address_space_memory",
> > and this device will have VIRTIO_F_IOMMU_PLATFORM forced onto it and the former won't.
> > 
> > 
> > If this is not an intended design I can only speculate how to fix it. Forcing VIRTIO_F_IOMMU_PLATFORM
> > in all devices, based only on has_iommu, can break stuff. Setting VIRTIO_F_IOMMU_PLATFORM only
> > if "vdev->dma_as != &address_space_memory" make some sense but I am fairly certain it will
> > break stuff the other way. Or perhaps the fix is something else entirely.
> > 
> > 
> > 
> > 
> > > 
> > > IMHO the important thing to figure out is what impact that
> > > virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > > in the first branch (of the if-else) has. IMHO if one examines the
> > > commits 8607f5c307 ("virtio: convert to use DMA api") and 2943b53f68
> > > ("virtio: force VIRTIO_F_IOMMU_PLATFORM") very carefully, one will
> > > probably reach the conclusion that the objective of the latter, is
> > > to prevent the guest form not negotiating the IOMMU_PLATFORM feature
> > > (clearing it as part of the feature negotiation) and trying to use
> > > the device without that feature. In other words, virtio features are
> > > usually optional for the guest for the sake of compatibility, but
> > > IOMMU_PLATFORM is not: for very good reasons. Neither the commit message
> > > nor the patch does mention legacy anywhere.
> > > 
> > > In my opinion not forcing the guest to negotiate IOMMU_PLATFORM when  
> > > ->get_dma_as() is not set is at least unfortunate. Please observe, that  
> > > virtio-pci is not affected by this omission because for virtio-pci
> > > devices ->get_dma_as != NULL always holds. And what is the deal for
> > > devices that don't implement get_dma_as() (and don't need address
> > > translation)? If iommu_platform=on is justified (no user error) then
> > > the device does not have access to the entire guest memory. Which
> > > means it more than likely needs cooperation form the guest (driver).
> > > So detecting that the guest does not support IOMMU_PLATFORM and failing
> > > gracefully via virtio_validate_features() instead of carrying on
> > > in good faith and failing in ugly ways when the host attempts to access
> > > guest memory to which it does not have access to. If we assume user
> > > error, that is the host can access at least all the memory it needs
> > > to access to make that device work, then it is probably still a
> > > good idea to fail the device and thus help the user correct his
> > > error.  
> > 
> > Yeah, this go back on what I've said about 2943b53f682 up there. There are assumptions
> > being made on the ->get_dma_as() existence that aren't clear.
> > 
> 
> I agree. The commit message does not explain.
> 
> > 
> > > 
> > > IMHO the best course of action is
> > > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > > index 34f5a0a664..1d0eb16d1c 100644
> > > --- a/hw/virtio/virtio-bus.c
> > > +++ b/hw/virtio/virtio-bus.c
> > > @@ -80,7 +80,6 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > >   
> > >       vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > >       if (klass->get_dma_as != NULL && has_iommu) {
> > > -        virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > >           vdev->dma_as = klass->get_dma_as(qbus->parent);
> > >           if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
> > >               error_setg(errp,
> > > @@ -89,6 +88,7 @@ void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
> > >       } else {
> > >           vdev->dma_as = &address_space_memory;
> > >       }
> > > +    virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
> > >   }  
> > 
> > 
> > I am fairly confident that forcing VIRTIO_F_IOMMU_PLATFORM all around, based on has_iommu
> 
> Yes I should have made that conditional on has_iommu. It was very late
> when I finished that email.
> 
> > alone, will have consequences all around. This code has been around for almost 5 years and a
> > lot of stuff has been developed on top of it.
> > 
> 
> Do you have any particular examples in mind?
> 
> > All that said, if this is the proper way of fixing it I'd say to do it now, document it properly
> > and fix the breakages as they come along. The alternative - hacking around and around a codebase
> > that might not be solid - is worse in the long run.
> 
> IMHO this is a good discussion you triggered. But I see it out of scope
> for the bug I'm trying to fix.
> 
> I can post a proper patch for "IOMMU_PLATFORM is non-negotiable for
> all guests" and we can have proper review and discussion on that. But
> I would like the bug I'm working on here fixed first. There are
> people that want to use virtiofs with confidential guests, and
> we should really make sure they can.
> 
> Regards,
> Halil

I think I second that.
diff mbox series

Patch

diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index d23db98c56..34f5a0a664 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -48,6 +48,7 @@  void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
     VirtioBusClass *klass = VIRTIO_BUS_GET_CLASS(bus);
     VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
     bool has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
+    bool vdev_has_iommu = false;
     Error *local_err = NULL;
 
     DPRINTF("%s: plug device.\n", qbus->name);
@@ -69,11 +70,6 @@  void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
         return;
     }
 
-    if (has_iommu && !virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
-        error_setg(errp, "iommu_platform=true is not supported by the device");
-        return;
-    }
-
     if (klass->device_plugged != NULL) {
         klass->device_plugged(qbus->parent, &local_err);
     }
@@ -82,9 +78,14 @@  void virtio_bus_device_plugged(VirtIODevice *vdev, Error **errp)
         return;
     }
 
+    vdev_has_iommu = virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
     if (klass->get_dma_as != NULL && has_iommu) {
         virtio_add_feature(&vdev->host_features, VIRTIO_F_IOMMU_PLATFORM);
         vdev->dma_as = klass->get_dma_as(qbus->parent);
+        if (!vdev_has_iommu && vdev->dma_as != &address_space_memory) {
+            error_setg(errp,
+                       "iommu_platform=true is not supported by the device");
+        }
     } else {
         vdev->dma_as = &address_space_memory;
     }