[00/21] Control VQ support in vDPA

Message ID 20201216064818.48239-1-jasowang@redhat.com (mailing list archive)

Message

Jason Wang Dec. 16, 2020, 6:47 a.m. UTC
Hi All:

This series tries to add support for the control virtqueue in vDPA.

The control virtqueue is used by networking devices to accept various
commands from the driver. It is required to support multiqueue and
other configurations.

When used by the vhost-vDPA bus driver for a VM, the control virtqueue
should be shadowed via the userspace VMM (Qemu) instead of being
assigned directly to the guest. This is because Qemu needs to know the
device state in order to start and stop the device correctly (e.g. for
live migration).

This requires isolating the memory mapping for the control virtqueue
presented by vhost-vDPA to prevent the guest from accessing it directly.

To achieve this, vDPA introduces two new abstractions:

- address space: identified through an address space id (ASID), with a
                 set of memory mappings maintained for it
- virtqueue group: the minimal set of virtqueues that must share an
                 address space
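
The address space abstraction above can be pictured with a minimal
userspace sketch. This is only an illustration, not code from the
series: struct addr_space, as_map() and as_translate() are hypothetical
names, and the fixed-size table stands in for a real IOTLB. The point
is that the same IOVA can resolve differently (or not at all)
depending on which ASID it is looked up in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MAPS 8

/* One IOVA -> backing address translation inside an address space. */
struct as_mapping {
    uint64_t iova;
    uint64_t size;
    uint64_t backing;
};

/* Hypothetical address space: an ASID plus its own mapping table. */
struct addr_space {
    uint32_t asid;
    size_t nmaps;
    struct as_mapping maps[MAX_MAPS];
};

/* Add a mapping; returns 0 on success, -1 if the table is full. */
static int as_map(struct addr_space *as, uint64_t iova, uint64_t size,
                  uint64_t backing)
{
    assert(as != NULL);
    if (as->nmaps == MAX_MAPS)
        return -1;
    as->maps[as->nmaps++] = (struct as_mapping){ iova, size, backing };
    return 0;
}

/* Translate an IOVA within one address space; -1 means no mapping. */
static int as_translate(const struct addr_space *as, uint64_t iova,
                        uint64_t *out)
{
    assert(as != NULL && out != NULL);
    for (size_t i = 0; i < as->nmaps; i++) {
        const struct as_mapping *m = &as->maps[i];

        if (iova >= m->iova && iova - m->iova < m->size) {
            *out = m->backing + (iova - m->iova);
            return 0;
        }
    }
    return -1;
}
```

With one addr_space for the data virtqueues and another for the
control virtqueue, a buffer mapped only in the second one is simply
not reachable through the first.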

The device needs to advertise the following attributes to vDPA:

- the number of address spaces supported by the device
- the number of virtqueue groups supported by the device
- the mapping from a specific virtqueue to its virtqueue group

The mapping from virtqueues to virtqueue groups is fixed and defined
by the vDPA device driver. E.g.:

- A device that has hardware ASID support can simply advertise one
  virtqueue group per virtqueue.
- A device that does not have hardware ASID support can simply
  advertise a single virtqueue group that contains all virtqueues. Or,
  if it wants a software emulated control virtqueue, it can advertise
  two virtqueue groups: one for the cvq and another for the rest of
  the virtqueues.
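
The three layouts in the examples above can be sketched as a fixed
virtqueue-to-group function. This is an illustrative model only: the
enum and function names are hypothetical, and it assumes the control
virtqueue is the last virtqueue of the device.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layouts matching the examples in the text. */
enum group_layout {
    LAYOUT_PER_VQ,    /* hardware ASID support: one group per virtqueue */
    LAYOUT_SINGLE,    /* no ASID support: every virtqueue in group 0 */
    LAYOUT_CVQ_SPLIT, /* software cvq: data vqs in group 0, cvq in group 1 */
};

/* Number of virtqueue groups the device would advertise. */
static uint32_t num_groups(enum group_layout layout, uint32_t nvqs)
{
    switch (layout) {
    case LAYOUT_PER_VQ:
        return nvqs;
    case LAYOUT_CVQ_SPLIT:
        return 2;
    case LAYOUT_SINGLE:
    default:
        return 1;
    }
}

/* Fixed mapping from a virtqueue index to its group id. */
static uint32_t vq_group(enum group_layout layout, uint32_t nvqs,
                         uint32_t idx)
{
    assert(idx < nvqs);
    switch (layout) {
    case LAYOUT_PER_VQ:
        return idx;
    case LAYOUT_CVQ_SPLIT:
        /* Assumption: the cvq is the last virtqueue. */
        return idx == nvqs - 1 ? 1 : 0;
    case LAYOUT_SINGLE:
    default:
        return 0;
    }
}
```

For a device with two data virtqueues and a cvq (nvqs = 3), the split
layout puts indexes 0 and 1 in group 0 and index 2 in group 1.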

vDPA also allows changing the association between a virtqueue group
and an address space. So in the case of the control virtqueue, the
userspace VMM (Qemu) may use a dedicated address space for the control
virtqueue group to isolate the memory mapping.

vhost/vhost-vDPA is also extended to let userspace:

- query the number of virtqueue groups and address spaces supported by
  the device
- query the virtqueue group for a specific virtqueue
- associate a virtqueue group with an address space
- send ASID based IOTLB commands

This will help the userspace VMM (Qemu) to detect whether the control
vq can be supported and to isolate the memory mappings of the control
virtqueue from the others.
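
The detection logic a VMM might build on top of these queries can be
sketched as follows. This is a model under stated assumptions, not the
uAPI itself: struct vdpa_model and the helper names are hypothetical,
and the fields stand in for the values userspace would obtain through
the new vhost-vDPA queries. The cvq is treated as isolatable only if
the device has at least two address spaces and no other virtqueue
shares the cvq's group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VQS    8
#define MAX_GROUPS 8

/* Hypothetical snapshot of what userspace learned from the device. */
struct vdpa_model {
    uint32_t nvqs;                   /* number of virtqueues */
    uint32_t ngroups;                /* number of virtqueue groups */
    uint32_t nas;                    /* number of address spaces */
    uint32_t vq_group[MAX_VQS];      /* fixed vq -> group mapping */
    uint32_t group_asid[MAX_GROUPS]; /* current group -> ASID association */
};

/* The cvq can be isolated only if nothing else lives in its group. */
static bool cvq_can_be_isolated(const struct vdpa_model *d, uint32_t cvq)
{
    assert(d->nvqs <= MAX_VQS);
    if (d->nas < 2 || cvq >= d->nvqs)
        return false;
    for (uint32_t i = 0; i < d->nvqs; i++)
        if (i != cvq && d->vq_group[i] == d->vq_group[cvq])
            return false;
    return true;
}

/* Associate a group with an address space; -1 on invalid ids. */
static int set_group_asid(struct vdpa_model *d, uint32_t group,
                          uint32_t asid)
{
    assert(d->ngroups <= MAX_GROUPS);
    if (group >= d->ngroups || asid >= d->nas)
        return -1;
    d->group_asid[group] = asid;
    return 0;
}

/* Move the cvq's group into a dedicated address space if possible. */
static int isolate_cvq(struct vdpa_model *d, uint32_t cvq, uint32_t asid)
{
    if (!cvq_can_be_isolated(d, cvq))
        return -1;
    return set_group_asid(d, d->vq_group[cvq], asid);
}
```

A device that reports a single group or a single address space fails
the check, in which case the VMM cannot shadow the control vq this way.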

To demonstrate the usage, the vDPA simulator is extended to support
setting the MAC address via an emulated control virtqueue.
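
As a rough illustration of what an emulated control virtqueue has to
do for this command, here is a minimal sketch. It is not the simulator
code from the series: handle_ctrl() and struct ctrl_hdr are
hypothetical, and the buffers are plain arrays rather than virtqueue
descriptors. The class, command and ack values are the ones defined
for the network device control virtqueue in the virtio specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values as defined by the virtio network device specification. */
#define VIRTIO_NET_CTRL_MAC          1
#define VIRTIO_NET_CTRL_MAC_ADDR_SET 1
#define VIRTIO_NET_OK                0
#define VIRTIO_NET_ERR               1
#define ETH_ALEN                     6

/* Simplified control header: command class followed by command. */
struct ctrl_hdr {
    uint8_t cls;
    uint8_t cmd;
};

/*
 * Handle one control command. Only MAC_ADDR_SET is understood here;
 * everything else is rejected. Returns the ack byte for the driver.
 */
static uint8_t handle_ctrl(uint8_t mac[ETH_ALEN], const struct ctrl_hdr *hdr,
                           const uint8_t *data, size_t len)
{
    assert(mac != NULL && hdr != NULL);
    if (hdr->cls == VIRTIO_NET_CTRL_MAC &&
        hdr->cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET &&
        data != NULL && len == ETH_ALEN) {
        memcpy(mac, data, ETH_ALEN);
        return VIRTIO_NET_OK;
    }
    return VIRTIO_NET_ERR;
}
```

The device keeps its current MAC address whenever the command is
unknown or the payload has the wrong length.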

Please review.

Changes since RFC:

- tweak vhost uAPI documentation
- really switch to a device specific IOTLB in patch 4
- tweak the commit log
- fix the ASID in vhost being claimed to be 32 bits when it is
  actually 16 bits
- fix use after free when using ASID with IOTLB batching requests
- switch to use Stefano's patch for having separated iov
- remove unused "used_as" variable
- fix the iotlb/asid checking in vhost_vdpa_unmap()

Thanks

Jason Wang (20):
  vhost: move the backend feature bits to vhost_types.h
  virtio-vdpa: don't set callback if virtio doesn't need it
  vhost-vdpa: passing iotlb to IOMMU mapping helpers
  vhost-vdpa: switch to use vhost-vdpa specific IOTLB
  vdpa: add the missing comment for nvqs in struct vdpa_device
  vdpa: introduce virtqueue groups
  vdpa: multiple address spaces support
  vdpa: introduce config operations for associating ASID to a virtqueue
    group
  vhost_iotlb: split out IOTLB initialization
  vhost: support ASID in IOTLB API
  vhost-vdpa: introduce asid based IOTLB
  vhost-vdpa: introduce uAPI to get the number of virtqueue groups
  vhost-vdpa: introduce uAPI to get the number of address spaces
  vhost-vdpa: uAPI to get virtqueue group id
  vhost-vdpa: introduce uAPI to set group ASID
  vhost-vdpa: support ASID based IOTLB API
  vdpa_sim: advertise VIRTIO_NET_F_MTU
  vdpa_sim: factor out buffer completion logic
  vdpa_sim: filter destination mac address
  vdpasim: control virtqueue support

Stefano Garzarella (1):
  vdpa_sim: split vdpasim_virtqueue's iov field in out_iov and in_iov

 drivers/vdpa/ifcvf/ifcvf_main.c   |   9 +-
 drivers/vdpa/mlx5/net/mlx5_vnet.c |  11 +-
 drivers/vdpa/vdpa.c               |   8 +-
 drivers/vdpa/vdpa_sim/vdpa_sim.c  | 292 ++++++++++++++++++++++++------
 drivers/vhost/iotlb.c             |  23 ++-
 drivers/vhost/vdpa.c              | 246 ++++++++++++++++++++-----
 drivers/vhost/vhost.c             |  23 ++-
 drivers/vhost/vhost.h             |   4 +-
 drivers/virtio/virtio_vdpa.c      |   2 +-
 include/linux/vdpa.h              |  42 ++++-
 include/linux/vhost_iotlb.h       |   2 +
 include/uapi/linux/vhost.h        |  25 ++-
 include/uapi/linux/vhost_types.h  |  10 +-
 13 files changed, 561 insertions(+), 136 deletions(-)

Comments

Michael S. Tsirkin Dec. 16, 2020, 9:47 a.m. UTC | #1
On Wed, Dec 16, 2020 at 02:47:57PM +0800, Jason Wang wrote:
> Hi All:
> 
> This series tries to add support for the control virtqueue in vDPA.
> 
> The control virtqueue is used by networking devices to accept various
> commands from the driver. It is required to support multiqueue and
> other configurations.
> 
> When used by the vhost-vDPA bus driver for a VM, the control virtqueue
> should be shadowed via the userspace VMM (Qemu) instead of being
> assigned directly to the guest. This is because Qemu needs to know the
> device state in order to start and stop the device correctly (e.g. for
> live migration).
> 
> This requires isolating the memory mapping for the control virtqueue
> presented by vhost-vDPA to prevent the guest from accessing it directly.
> To achieve this, vDPA introduces two new abstractions:
> 
> - address space: identified through an address space id (ASID), with a
>                  set of memory mappings maintained for it
> - virtqueue group: the minimal set of virtqueues that must share an
>                  address space

How will this support the pretty common case where control vq
is programmed by the kernel through the PF, and others by the VFs?


I actually thought the way to support it is by exposing
something like an "inject buffers" API which sends data to a given VQ.
Maybe an ioctl, and maybe down the road uio ring can support batching
these ....


Jason Wang Dec. 17, 2020, 3:30 a.m. UTC | #2
On 2020/12/16 下午5:47, Michael S. Tsirkin wrote:
> On Wed, Dec 16, 2020 at 02:47:57PM +0800, Jason Wang wrote:
>> Hi All:
>>
>> This series tries to add support for the control virtqueue in vDPA.
>>
>> The control virtqueue is used by networking devices to accept various
>> commands from the driver. It is required to support multiqueue and
>> other configurations.
>>
>> When used by the vhost-vDPA bus driver for a VM, the control virtqueue
>> should be shadowed via the userspace VMM (Qemu) instead of being
>> assigned directly to the guest. This is because Qemu needs to know the
>> device state in order to start and stop the device correctly (e.g. for
>> live migration).
>>
>> This requires isolating the memory mapping for the control virtqueue
>> presented by vhost-vDPA to prevent the guest from accessing it directly.
>> To achieve this, vDPA introduces two new abstractions:
>>
>> - address space: identified through an address space id (ASID), with a
>>                   set of memory mappings maintained for it
>> - virtqueue group: the minimal set of virtqueues that must share an
>>                   address space
> How will this support the pretty common case where control vq
> is programmed by the kernel through the PF, and others by the VFs?


In this case, the VF parent needs to provide a software control vq,
decode the commands, and then send them to the VF.


>
>
> I actually thought the way to support it is by exposing
> something like an "inject buffers" API which sends data to a given VQ.
> Maybe an ioctl, and maybe down the road uio ring can support batching
> these ....


So the virtqueue allows requests to be processed asynchronously (e.g.
the driver may choose to use an interrupt for the control vq). This
means we need to support that at the uAPI level. And if we manage to
do that, it's just another type of virtqueue.

For virtio-vDPA, this also means extensions for queue processing,
which is a functional duplication. Using what is proposed in this
series, we don't need any changes to the kernel virtio drivers.

What's more important, this series could be used for future features
that require DMA isolation between virtqueues:

- report dirty pages via virtqueue
- sub function level device slicing

...

Thanks


Eli Cohen Dec. 17, 2020, 7:26 a.m. UTC | #3
On Wed, Dec 16, 2020 at 02:47:57PM +0800, Jason Wang wrote:

Hi Jason,
I saw the patchset and will start reviewing it starting Dec 27. I am out
of office next week.

Michael S. Tsirkin Dec. 17, 2020, 7:58 a.m. UTC | #4
On Thu, Dec 17, 2020 at 11:30:18AM +0800, Jason Wang wrote:
> 
> On 2020/12/16 下午5:47, Michael S. Tsirkin wrote:
> > On Wed, Dec 16, 2020 at 02:47:57PM +0800, Jason Wang wrote:
> > > Hi All:
> > > 
> > > This series tries to add support for the control virtqueue in vDPA.
> > > 
> > > The control virtqueue is used by networking devices to accept various
> > > commands from the driver. It is required to support multiqueue and
> > > other configurations.
> > > 
> > > When used by the vhost-vDPA bus driver for a VM, the control virtqueue
> > > should be shadowed via the userspace VMM (Qemu) instead of being
> > > assigned directly to the guest. This is because Qemu needs to know the
> > > device state in order to start and stop the device correctly (e.g. for
> > > live migration).
> > > 
> > > This requires isolating the memory mapping for the control virtqueue
> > > presented by vhost-vDPA to prevent the guest from accessing it directly.
> > > To achieve this, vDPA introduces two new abstractions:
> > > 
> > > - address space: identified through an address space id (ASID), with a
> > >                   set of memory mappings maintained for it
> > > - virtqueue group: the minimal set of virtqueues that must share an
> > >                   address space
> > How will this support the pretty common case where control vq
> > is programmed by the kernel through the PF, and others by the VFs?
> 
> 
> In this case, the VF parent needs to provide a software control vq,
> decode the commands, and then send them to the VF.


But how does that tie to the address space infrastructure?



> 
> > 
> > 
> > I actually thought the way to support it is by exposing
> > something like an "inject buffers" API which sends data to a given VQ.
> > Maybe an ioctl, and maybe down the road uio ring can support batching
> > these ....
> 
> 
> So the virtqueue allows requests to be processed asynchronously (e.g.
> the driver may choose to use an interrupt for the control vq). This
> means we need to support that at the uAPI level.

I don't think we need to make it async, just a regular ioctl will do.
In fact no guest uses the asynchronous property.


> And if we manage to do that, it's just another
> type of virtqueue.
> 
> For virtio-vDPA, this also means extensions for queue processing,
> which is a functional duplication.

I don't see why, just send it to the actual control vq :)

> Using what is proposed in this series, we don't
> need any changes to the kernel virtio drivers.
> 
> What's more important, this series could be used for future features that
> require DMA isolation between virtqueues:
> 
> - report dirty pages via virtqueue
> - sub function level device slicing


I agree these are nice to have, but I am not sure basic control vq must
be tied to that.

Jason Wang Dec. 17, 2020, 9:02 a.m. UTC | #5
On 2020/12/17 下午3:58, Michael S. Tsirkin wrote:
> On Thu, Dec 17, 2020 at 11:30:18AM +0800, Jason Wang wrote:
>> On 2020/12/16 下午5:47, Michael S. Tsirkin wrote:
>>> On Wed, Dec 16, 2020 at 02:47:57PM +0800, Jason Wang wrote:
>>>> Hi All:
>>>>
>>>> This series tries to add support for the control virtqueue in vDPA.
>>>>
>>>> The control virtqueue is used by networking devices to accept various
>>>> commands from the driver. It is required to support multiqueue and
>>>> other configurations.
>>>>
>>>> When used by the vhost-vDPA bus driver for a VM, the control virtqueue
>>>> should be shadowed via the userspace VMM (Qemu) instead of being
>>>> assigned directly to the guest. This is because Qemu needs to know the
>>>> device state in order to start and stop the device correctly (e.g. for
>>>> live migration).
>>>>
>>>> This requires isolating the memory mapping for the control virtqueue
>>>> presented by vhost-vDPA to prevent the guest from accessing it directly.
>>>> To achieve this, vDPA introduces two new abstractions:
>>>>
>>>> - address space: identified through an address space id (ASID), with a
>>>>                    set of memory mappings maintained for it
>>>> - virtqueue group: the minimal set of virtqueues that must share an
>>>>                    address space
>>> How will this support the pretty common case where control vq
>>> is programmed by the kernel through the PF, and others by the VFs?
>>
>> In this case, the VF parent needs to provide a software control vq,
>> decode the commands, and then send them to the VF.
>
> But how does that tie to the address space infrastructure?


In this case, an address space is not a must. But the idea is to make
the control vq work for all types of hardware:

1) the control virtqueue is implemented via VF/PF communication
2) the control virtqueue is implemented by the VF but not through DMA
3) the control virtqueue is implemented by VF DMA; it could be either a
hardware control virtqueue or another type of DMA

The address space is a must for 3) to work and can work for both 1) and 2).


>
>
>
>>>
>>> I actually thought the way to support it is by exposing
>>> something like an "inject buffers" API which sends data to a given VQ.
>>> Maybe an ioctl, and maybe down the road uio ring can support batching
>>> these ....
>>
>> So the virtqueue allows requests to be processed asynchronously (e.g.
>> the driver may choose to use an interrupt for the control vq). This
>> means we need to support that at the uAPI level.
> I don't think we need to make it async, just a regular ioctl will do.
> In fact no guest uses the asynchronous property.


It is not forbidden by the spec, so we need to support it. E.g. we
cannot assume the driver doesn't assign an interrupt to the cvq.


>
>
>> And if we manage to do that, it's just another
>> type of virtqueue.
>>
>> For virtio-vDPA, this also means extensions for queue processing,
>> which is a functional duplication.
> I don't see why, just send it to the actual control vq :)


But in the case you've pointed out, there's no hardware control vq in fact.


>
>> Using what proposed in this series, we don't
>> need any changes for kernel virtio drivers.
>>
>> What's more important, this series could be used for future features that
>> requires DMA isolation between virtqueues:
>>
>> - report dirty pages via virtqueue
>> - sub function level device slicing
>
> I agree these are nice to have, but I am not sure basic control vq must
> be tied to that.


If the control virtqueue is implemented via DMA through VF, it looks 
like a must.

Thanks


Michael S. Tsirkin Dec. 17, 2020, 10:28 p.m. UTC | #6
On Thu, Dec 17, 2020 at 05:02:49PM +0800, Jason Wang wrote:
> 
> On 2020/12/17 3:58 PM, Michael S. Tsirkin wrote:
> > On Thu, Dec 17, 2020 at 11:30:18AM +0800, Jason Wang wrote:
> > > On 2020/12/16 5:47 PM, Michael S. Tsirkin wrote:
> > > > On Wed, Dec 16, 2020 at 02:47:57PM +0800, Jason Wang wrote:
> > > > > Hi All:
> > > > > 
> > > > > This series tries to add the support for control virtqueue in vDPA.
> > > > > 
> > > > > Control virtqueue is used by networking device for accepting various
> > > > > commands from the driver. It's a must to support multiqueue and other
> > > > > configurations.
> > > > > 
> > > > > When used by vhost-vDPA bus driver for VM, the control virtqueue
> > > > > should be shadowed via userspace VMM (Qemu) instead of being assigned
> > > > > directly to Guest. This is because Qemu needs to know the device state
> > > > > in order to start and stop device correctly (e.g for Live Migration).
> > > > > 
> > > > > This requires isolating the memory mapping for the control virtqueue
> > > > > presented by vhost-vDPA to prevent the guest from accessing it directly.
> > > > > To achieve this, vDPA introduces two new abstractions:
> > > > > 
> > > > > - address space: identified through address space id (ASID) and a set
> > > > >                    of memory mappings is maintained
> > > > > - virtqueue group: the minimal set of virtqueues that must share an
> > > > >                    address space
> > > > How will this support the pretty common case where control vq
> > > > is programmed by the kernel through the PF, and others by the VFs?
> > > 
> > > In this case, the VF parent need to provide a software control vq and decode
> > > the command then send them to VF.
> > 
> > But how does that tie to the address space infrastructure?
> 
> 
> In this case, address space is not a must.

That's ok, problem is I don't see how address space is going
to work in this case at all.

There's no address space there that userspace/guest can control.


> But the idea is to make the control
> vq work for all types of hardware:
> 
> 1) control virtqueue is implemented via VF/PF communication
> 2) control virtqueue is implemented by VF but not through DMA
> 3) control virtqueue is implemented by VF DMA, it could be either a hardware
> control virtqueue or other type of DMA
> 
> The address space is a must for 3) to work and can work for both 1) and 2).
Jason Wang Dec. 18, 2020, 2:56 a.m. UTC | #7
On 2020/12/18 6:28 AM, Michael S. Tsirkin wrote:
> On Thu, Dec 17, 2020 at 05:02:49PM +0800, Jason Wang wrote:
>> On 2020/12/17 3:58 PM, Michael S. Tsirkin wrote:
>>> On Thu, Dec 17, 2020 at 11:30:18AM +0800, Jason Wang wrote:
>>>> On 2020/12/16 5:47 PM, Michael S. Tsirkin wrote:
>>>>> On Wed, Dec 16, 2020 at 02:47:57PM +0800, Jason Wang wrote:
>>>>>> Hi All:
>>>>>>
>>>>>> This series tries to add the support for control virtqueue in vDPA.
>>>>>>
>>>>>> Control virtqueue is used by networking device for accepting various
>>>>>> commands from the driver. It's a must to support multiqueue and other
>>>>>> configurations.
>>>>>>
>>>>>> When used by vhost-vDPA bus driver for VM, the control virtqueue
>>>>>> should be shadowed via userspace VMM (Qemu) instead of being assigned
>>>>>> directly to Guest. This is because Qemu needs to know the device state
>>>>>> in order to start and stop device correctly (e.g for Live Migration).
>>>>>>
>>>>>> This requires isolating the memory mapping for the control virtqueue
>>>>>> presented by vhost-vDPA to prevent the guest from accessing it directly.
>>>>>> To achieve this, vDPA introduces two new abstractions:
>>>>>>
>>>>>> - address space: identified through address space id (ASID) and a set
>>>>>>                     of memory mappings is maintained
>>>>>> - virtqueue group: the minimal set of virtqueues that must share an
>>>>>>                     address space
>>>>> How will this support the pretty common case where control vq
>>>>> is programmed by the kernel through the PF, and others by the VFs?
>>>> In this case, the VF parent need to provide a software control vq and decode
>>>> the command then send them to VF.
>>> But how does that tie to the address space infrastructure?
>> In this case, address space is not a must.
> That's ok, problem is I don't see how address space is going
> to work in this case at all.
>
> There's no address space there that userspace/guest can control.
>

The virtqueue group is mandated by the parent, but the association between a
virtqueue group and an address space is under the control of userspace (Qemu).

A simple but common case is:

1) The device advertises two virtqueue groups: group 0 contains RX and TX,
group 1 contains the CVQ.
2) The device advertises two address spaces.

Then, for vhost-vDPA used by a VM:

1) Associate group 0 with AS 0 and group 1 with AS 1 (via the vhost-vDPA
VHOST_VDPA_SET_GROUP_ASID ioctl)
2) Publish the guest memory mapping via IOTLB ASID 0
3) Publish the control virtqueue mapping via IOTLB ASID 1

The DMA is then totally isolated.

For vhost-vDPA used by DPDK, or for virtio-vDPA:

1) Associate both group 0 and group 1 with AS 0

since we don't need DMA isolation in this case.

To let the guest control this, we would need to extend the virtio spec to
support those concepts.
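
The association steps above can be illustrated with a small userspace toy
model. This is only a sketch under stated assumptions: every toy_* name is
hypothetical, nothing here is the kernel or vhost-vDPA API, and the real
VHOST_VDPA_SET_GROUP_ASID ioctl and IOTLB messages are only imitated by
plain function calls on an in-memory table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NGROUPS  2	/* group 0: RX/TX, group 1: CVQ (as in the example) */
#define TOY_NAS      2	/* address spaces the toy device advertises */
#define TOY_MAX_MAPS 8	/* arbitrary limit for this sketch */

/* One IOVA range published into an address space (hypothetical type). */
struct toy_map {
	uint64_t iova, size, addr;
};

/* Toy device: a group->ASID table plus one mapping table per ASID. */
struct toy_vdpa {
	uint32_t group_asid[TOY_NGROUPS];
	struct toy_map maps[TOY_NAS][TOY_MAX_MAPS];
	size_t nmaps[TOY_NAS];
};

/* Imitates the effect of VHOST_VDPA_SET_GROUP_ASID; returns 0 or -1. */
static int toy_set_group_asid(struct toy_vdpa *d, uint32_t group, uint32_t asid)
{
	if (group >= TOY_NGROUPS || asid >= TOY_NAS)
		return -1;
	d->group_asid[group] = asid;
	return 0;
}

/* Imitates an ASID-tagged IOTLB update: add a mapping to one address space. */
static int toy_iotlb_update(struct toy_vdpa *d, uint32_t asid,
			    uint64_t iova, uint64_t size, uint64_t addr)
{
	if (asid >= TOY_NAS || size == 0 || d->nmaps[asid] == TOY_MAX_MAPS)
		return -1;
	d->maps[asid][d->nmaps[asid]++] = (struct toy_map){ iova, size, addr };
	return 0;
}

/* Translate a DMA issued on behalf of a virtqueue group; 0 on success. */
static int toy_translate(const struct toy_vdpa *d, uint32_t group,
			 uint64_t iova, uint64_t *addr)
{
	uint32_t asid;
	size_t i;

	if (group >= TOY_NGROUPS)
		return -1;
	asid = d->group_asid[group];
	assert(asid < TOY_NAS);
	for (i = 0; i < d->nmaps[asid]; i++) {
		const struct toy_map *m = &d->maps[asid][i];

		if (iova >= m->iova && iova - m->iova < m->size) {
			*addr = m->addr + (iova - m->iova);
			return 0;
		}
	}
	return -1;	/* not mapped in this group's address space */
}
```

In this model, a VMM-like caller would bind group 0 to ASID 0 and group 1
to ASID 1 and publish separate mappings into each, so that the same IOVA
resolves differently per group. A DPDK-like caller would bind both groups
to ASID 0 and publish a single set of mappings.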

Thanks
Gautam Dawar Feb. 24, 2022, 9:22 p.m. UTC | #8
Hi All:

This series tries to add support for the control virtqueue in vDPA.

The control virtqueue is used by networking devices to accept various
commands from the driver. It is required for multiqueue and other
configurations.

When used by the vhost-vDPA bus driver for a VM, the control virtqueue
should be shadowed via the userspace VMM (Qemu) instead of being assigned
directly to the guest. This is because Qemu needs to know the device state
in order to start and stop the device correctly (e.g. for live migration).

This requires isolating the memory mapping for the control virtqueue
presented by vhost-vDPA to prevent the guest from accessing it directly.

To achieve this, vDPA introduces two new abstractions:

- address space: identified through an address space id (ASID), with a
                 set of memory mappings maintained for it
- virtqueue group: the minimal set of virtqueues that must share an
                 address space

The device needs to advertise the following attributes to vDPA:

- the number of address spaces supported by the device
- the number of virtqueue groups supported by the device
- the mapping from a specific virtqueue to its virtqueue group

The mapping from virtqueues to virtqueue groups is fixed and defined
by the vDPA device driver. E.g.:

- A device that has hardware ASID support can simply advertise a
  per-virtqueue virtqueue group.
- A device that does not have hardware ASID support can simply
  advertise a single virtqueue group that contains all virtqueues.
  Or, if it wants a software-emulated control virtqueue, it can
  advertise two virtqueue groups: one for the cvq and another for
  the rest of the virtqueues.
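
The three layouts listed above can be written down as a fixed
virtqueue-to-group function. The following is only a sketch: the toy_*
names are hypothetical, and placing the control virtqueue at the last
index (nvqs - 1) is an assumption made here for illustration, not
something this series mandates.

```c
#include <assert.h>
#include <stdint.h>

enum toy_layout {
	TOY_PER_VQ_GROUP,	/* hardware ASID support: one group per virtqueue */
	TOY_SINGLE_GROUP,	/* no ASID support: every virtqueue in group 0 */
	TOY_CVQ_GROUP,		/* software cvq: data vqs in group 0, cvq in group 1 */
};

/* Number of virtqueue groups the device would advertise. */
static uint32_t toy_ngroups(enum toy_layout layout, uint32_t nvqs)
{
	switch (layout) {
	case TOY_PER_VQ_GROUP:
		return nvqs;
	case TOY_CVQ_GROUP:
		return 2;
	default:
		return 1;
	}
}

/* Fixed mapping from a virtqueue index to its group. */
static uint32_t toy_vq_group(enum toy_layout layout, uint32_t nvqs, uint32_t vq)
{
	assert(nvqs >= 1 && vq < nvqs);
	switch (layout) {
	case TOY_PER_VQ_GROUP:
		return vq;
	case TOY_CVQ_GROUP:
		/* Assumption: the cvq is the last virtqueue. */
		return vq == nvqs - 1 ? 1 : 0;
	default:
		return 0;
	}
}
```

Because the mapping is a pure function of the layout and the index, it
cannot change at runtime, which matches the statement that only the
group-to-address-space association is adjustable.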

vDPA also allows changing the association between a virtqueue group and
an address space. So in the case of the control virtqueue, the userspace
VMM (Qemu) may use a dedicated address space for the control virtqueue
group to isolate the memory mapping.

vhost/vhost-vDPA is also extended so that userspace can:

- query the number of virtqueue groups and address spaces supported by
  the device
- query the virtqueue group for a specific virtqueue
- associate a virtqueue group with an address space
- send ASID-based IOTLB commands

This helps the userspace VMM (Qemu) detect whether the control vq
can be supported and isolate the memory mappings of the control virtqueue
from the others.
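
The detection step mentioned above could look roughly like the check
below, applied to values the VMM has already queried. This is a sketch
only: toy_cvq_isolatable() and its parameters are hypothetical names, and
the rule it encodes (at least two address spaces, at least two groups, and
no data virtqueue sharing the cvq's group) is one reading of the
description above, not Qemu's actual logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * vq_group[i] is the group reported for virtqueue i, cvq is the index of
 * the control virtqueue, ngroups and nas are the advertised counts.
 */
static bool toy_cvq_isolatable(const uint32_t *vq_group, size_t nvqs,
			       size_t cvq, uint32_t ngroups, uint32_t nas)
{
	size_t i;

	assert(cvq < nvqs);
	if (nas < 2 || ngroups < 2)
		return false;	/* only one address space or one group */
	for (i = 0; i < nvqs; i++)
		if (i != cvq && vq_group[i] == vq_group[cvq])
			return false;	/* a data vq shares the cvq's group */
	return true;
}
```

If the check fails, a VMM in this model could not give the control
virtqueue a private address space and would have to treat the device as
one without an isolatable control vq.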

To demonstrate the usage, the vDPA simulator is extended to support
setting the MAC address via an emulated control virtqueue.

Please review.

Changes since v1:

- Rebased the v1 patch series on vhost branch of MST vhost git repo
  git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/log/?h=vhost
- Updated to accommodate the vdpa_sim change from the monolithic kernel
  module used by the v1 patch series to the current modularized,
  class-based (net, block) approach.
- Added new attributes (ngroups and nas) to "vdpasim_dev_attr" and
  propagated them from vdpa_sim_net to vdpa_sim
- Widened the data-type for "asid" member of vhost_msg_v2 to __u32
  to accommodate PASID
- Fixed the buildbot warnings
- Resolved all checkpatch.pl errors and warnings
- Tested both the control path and the datapath with a Xilinx Smartnic
  SN1000 series device, using a QEMU that implements the shadow virtqueue
  and support for VQ groups and ASID, available at:
  github.com/eugpermar/qemu/releases/tag/vdpa_sw_live_migration.d%2Fasid_groups-v1.d%2F00
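
On the widening of "asid" noted in the list above: a PCIe PASID is up to
20 bits wide, so it does not fit in a 16-bit field. The sketch below only
illustrates that arithmetic. The toy_* helpers are hypothetical and are
not part of vhost_msg_v2 or any kernel header.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PASID_BITS 20u	/* PCIe PASID width */
#define TOY_PASID_MAX  ((UINT32_C(1) << TOY_PASID_BITS) - 1)

/* What a 16-bit field would keep of a wider identifier. */
static uint16_t toy_store_u16(uint32_t id)
{
	return (uint16_t)(id & 0xFFFFu);
}

/* A 32-bit field holds any PASID-sized identifier unchanged. */
static uint32_t toy_store_u32(uint32_t id)
{
	assert(id <= TOY_PASID_MAX);
	return id;
}
```

For example, the identifier 0x12345 would be cut down to 0x2345 by the
16-bit helper but is preserved by the 32-bit one.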

Changes since RFC:

- tweak vhost uAPI documentation
- actually switch to the device-specific IOTLB in patch 4
- tweak the commit log
- fix ASID in vhost being claimed to be 32 bits while it is actually
  16 bits
- fix use after free when using ASID with IOTLB batching requests
- switch to use Stefano's patch for having separated iov
- remove unused "used_as" variable
- fix the iotlb/asid checking in vhost_vdpa_unmap()

Thanks

Gautam Dawar (19):
  vhost: move the backend feature bits to vhost_types.h
  virtio-vdpa: don't set callback if virtio doesn't need it
  vhost-vdpa: passing iotlb to IOMMU mapping helpers
  vhost-vdpa: switch to use vhost-vdpa specific IOTLB
  vdpa: introduce virtqueue groups
  vdpa: multiple address spaces support
  vdpa: introduce config operations for associating ASID to a virtqueue
    group
  vhost_iotlb: split out IOTLB initialization
  vhost: support ASID in IOTLB API
  vhost-vdpa: introduce asid based IOTLB
  vhost-vdpa: introduce uAPI to get the number of virtqueue groups
  vhost-vdpa: introduce uAPI to get the number of address spaces
  vhost-vdpa: uAPI to get virtqueue group id
  vhost-vdpa: introduce uAPI to set group ASID
  vhost-vdpa: support ASID based IOTLB API
  vdpa_sim: advertise VIRTIO_NET_F_MTU
  vdpa_sim: factor out buffer completion logic
  vdpa_sim: filter destination mac address
  vdpasim: control virtqueue support

 drivers/vdpa/ifcvf/ifcvf_main.c      |   8 +-
 drivers/vdpa/mlx5/net/mlx5_vnet.c    |  11 +-
 drivers/vdpa/vdpa.c                  |   5 +
 drivers/vdpa/vdpa_sim/vdpa_sim.c     | 100 ++++++++--
 drivers/vdpa/vdpa_sim/vdpa_sim.h     |   3 +
 drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 169 +++++++++++++----
 drivers/vhost/iotlb.c                |  23 ++-
 drivers/vhost/vdpa.c                 | 272 +++++++++++++++++++++------
 drivers/vhost/vhost.c                |  23 ++-
 drivers/vhost/vhost.h                |   4 +-
 drivers/virtio/virtio_vdpa.c         |   2 +-
 include/linux/vdpa.h                 |  46 ++++-
 include/linux/vhost_iotlb.h          |   2 +
 include/uapi/linux/vhost.h           |  25 ++-
 include/uapi/linux/vhost_types.h     |  11 +-
 15 files changed, 566 insertions(+), 138 deletions(-)
Jason Wang Feb. 28, 2022, 8:17 a.m. UTC | #9
On 2022/2/25 5:22 AM, Gautam Dawar wrote:
> Changes since v1:
>
> - Rebased the v1 patch series on vhost branch of MST vhost git repo
>    git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/log/?h=vhost
> - Updates to accommodate vdpa_sim changes from monolithic module in
>    kernel used v1 patch series to current modularized class (net, block)
>    based approach.
> - Added new attributes (ngroups and nas) to "vdpasim_dev_attr" and
>    propagated them from vdpa_sim_net to vdpa_sim
> - Widened the data-type for "asid" member of vhost_msg_v2 to __u32
>    to accommodate PASID


This is great. Then the semantics exactly match the PASID proposal here [1].


> - Fixed the buildbot warnings
> - Resolved all checkpatch.pl errors and warnings
> - Tested both control and datapath with Xilinx Smartnic SN1000 series
>    device using QEMU implementing the Shadow virtqueue and support for
>    VQ groups and ASID available at:
>    github.com/eugpermar/qemu/releases/tag/vdpa_sw_live_migration.d%2F
>    asid_groups-v1.d%2F00


On top of this, we may extend the netlink protocol to report the mapping
from each virtqueue to its group.


Thanks

[1] 
https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08077.html


Gautam Dawar Feb. 28, 2022, 10:56 a.m. UTC | #10
On 2022/2/25 5:22 AM, Gautam Dawar wrote:
> Changes since v1:
>
> - Rebased the v1 patch series on vhost branch of MST vhost git repo
>    git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/log/?h=vhost
> - Updates to accommodate vdpa_sim changes from monolithic module in
>    kernel used v1 patch series to current modularized class (net, block)
>    based approach.
> - Added new attributes (ngroups and nas) to "vdpasim_dev_attr" and
>    propagated them from vdpa_sim_net to vdpa_sim
> - Widened the data-type for "asid" member of vhost_msg_v2 to __u32
>    to accommodate PASID


This is great. Then the semantics exactly match the PASID proposal here [1].


> - Fixed the buildbot warnings
> - Resolved all checkpatch.pl errors and warnings
> - Tested both control and datapath with Xilinx Smartnic SN1000 series
>    device using QEMU implementing the Shadow virtqueue and support for
>    VQ groups and ASID available at:
>    github.com/eugpermar/qemu/releases/tag/vdpa_sw_live_migration.d%2Fasid_groups-v1.d%2F00


On top of this, we may extend the netlink protocol to report the mapping from each virtqueue to its group.
[GD>>] Yes, I've already discussed this with Eugenio. For testing purposes, I added the mapping in the Xilinx netdriver "sfc".

Thanks

[1] https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg08077.html


>
> Changes since RFC:
>
> - tweak vhost uAPI documentation
> - actually switch to a device-specific IOTLB in patch 4
> - tweak the commit log
> - fix the ASID in vhost being claimed to be 32 bits while actually
>    being 16 bits
> - fix use after free when using ASID with IOTLB batching requests
> - switch to Stefano's patch for having a separate iov
> - remove unused "used_as" variable
> - fix the iotlb/asid checking in vhost_vdpa_unmap()
>
> Thanks
>
> Gautam Dawar (19):
>    vhost: move the backend feature bits to vhost_types.h
>    virtio-vdpa: don't set callback if virtio doesn't need it
>    vhost-vdpa: passing iotlb to IOMMU mapping helpers
>    vhost-vdpa: switch to use vhost-vdpa specific IOTLB
>    vdpa: introduce virtqueue groups
>    vdpa: multiple address spaces support
>    vdpa: introduce config operations for associating ASID to a virtqueue
>      group
>    vhost_iotlb: split out IOTLB initialization
>    vhost: support ASID in IOTLB API
>    vhost-vdpa: introduce asid based IOTLB
>    vhost-vdpa: introduce uAPI to get the number of virtqueue groups
>    vhost-vdpa: introduce uAPI to get the number of address spaces
>    vhost-vdpa: uAPI to get virtqueue group id
>    vhost-vdpa: introduce uAPI to set group ASID
>    vhost-vdpa: support ASID based IOTLB API
>    vdpa_sim: advertise VIRTIO_NET_F_MTU
>    vdpa_sim: factor out buffer completion logic
>    vdpa_sim: filter destination mac address
>    vdpasim: control virtqueue support
>
>   drivers/vdpa/ifcvf/ifcvf_main.c      |   8 +-
>   drivers/vdpa/mlx5/net/mlx5_vnet.c    |  11 +-
>   drivers/vdpa/vdpa.c                  |   5 +
>   drivers/vdpa/vdpa_sim/vdpa_sim.c     | 100 ++++++++--
>   drivers/vdpa/vdpa_sim/vdpa_sim.h     |   3 +
>   drivers/vdpa/vdpa_sim/vdpa_sim_net.c | 169 +++++++++++++----
>   drivers/vhost/iotlb.c                |  23 ++-
>   drivers/vhost/vdpa.c                 | 272 +++++++++++++++++++++------
>   drivers/vhost/vhost.c                |  23 ++-
>   drivers/vhost/vhost.h                |   4 +-
>   drivers/virtio/virtio_vdpa.c         |   2 +-
>   include/linux/vdpa.h                 |  46 ++++-
>   include/linux/vhost_iotlb.h          |   2 +
>   include/uapi/linux/vhost.h           |  25 ++-
>   include/uapi/linux/vhost_types.h     |  11 +-
>   15 files changed, 566 insertions(+), 138 deletions(-)
>