[RFC,0/1] virtio-net: add support for SR-IOV emulation

Message ID 1689731808-3009-1-git-send-email-yui.washidu@gmail.com (mailing list archive)

Message

Yui Washizu July 19, 2023, 1:56 a.m. UTC
This patch series is the first step towards enabling
hardware offloading of the L2 packet switching feature of virtio-net devices to the host machine.
We expect this hardware offloading to enable
the use of high-performance networks in virtual infrastructures,
such as container infrastructures running on VMs.

To enable L2 packet switching by SR-IOV VFs, we are considering the following:
- making the guest recognize virtio-net devices as SR-IOV PF devices
  (achieved with this patch series)
- allowing virtio-net devices to connect SR-IOV VFs to backend networks,
  leaving the L2 packet switching feature to a management layer such as libvirt
  - This makes hardware offloading of L2 packet switching possible.
    For example, when using vDPA devices, it allows the guest
    to utilize the embedded switch of the host's SR-IOV NIC.

This patch series aims to enable SR-IOV emulation on virtio-net devices.
With this series, the guest can identify the virtio-net device as an SR-IOV PF device.
The newly added property 'sriov_max_vfs' enables the SR-IOV feature
on the virtio-net device.
Currently, the properties of a VF created from the guest cannot be specified;
they are set to their default values.
In the future, we plan to allow users to set these properties.

qemu-system-x86_64 --device virtio-net,sriov_max_vfs=<num>
# when 'sriov_max_vfs' is present, the SR-IOV feature is automatically enabled
# <num> is the maximum number of VFs that can be created on the guest

Example commands to create VFs in virtio-net device from the guest:

guest% readlink -f /sys/class/net/eth1/device
 /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/virtio1
guest% echo "2" > /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/sriov_numvfs
guest% ip link show
 eth0: ....
 eth1: ....
 eth2: .... #virtual VF created
 eth3: .... #virtual VF created
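As a side note, the sysfs steps above can be wrapped in a small helper. The kernel rejects changing sriov_numvfs directly from one nonzero value to another, so a robust script resets it to 0 first. A minimal sketch (the PF directory path is only the example from above and depends on your PCI topology):

```shell
# Sketch: set the VF count of a PF through sysfs.
# The kernel requires sriov_numvfs to be 0 before a new nonzero count is written.
set_numvfs() {
    local pf_dir="$1" count="$2"
    if [ "$(cat "$pf_dir/sriov_numvfs")" -ne 0 ] && [ "$count" -ne 0 ]; then
        echo 0 > "$pf_dir/sriov_numvfs"
    fi
    echo "$count" > "$pf_dir/sriov_numvfs"
}

# e.g. set_numvfs /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0 2
```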
 
Please note that this patch series by itself does not enable communication between a VF and the PF or other VFs.

Yui Washizu (1):
  virtio-pci: add SR-IOV capability

 hw/pci/msix.c                  |  8 +++--
 hw/pci/pci.c                   |  4 +++
 hw/virtio/virtio-pci.c         | 62 ++++++++++++++++++++++++++++++----
 include/hw/virtio/virtio-pci.h |  1 +
 4 files changed, 66 insertions(+), 9 deletions(-)

Comments

Jason Wang July 20, 2023, 2:20 a.m. UTC | #1
On Wed, Jul 19, 2023 at 9:59 AM Yui Washizu <yui.washidu@gmail.com> wrote:
>
> This patch series is the first step towards enabling
> hardware offloading of the L2 packet switching feature on virtio-net device to host machine.
> We are considering that this hardware offloading enables
> the use of high-performance networks in virtual infrastructures,
> such as container infrastructures on VMs.
>
> To enable L2 packet switching by SR-IOV VFs, we are considering the following:
> - making the guest recognize virtio-net devices as SR-IOV PF devices
>   (archived with this patch series)
> - allowing virtio-net devices to connect SR-IOV VFs to the backend networks,
>   leaving the L2 packet switching feature to the management layer like libvirt

Could you please show the qemu command line you want to propose here?

>   - This makes hardware offloading of L2 packet switching possible.
>     For example, when using vDPA devices, it allows the guest
>     to utilize SR-IOV NIC embedded switch of hosts.

This would be interesting.

Thanks

>
> This patch series aims to enable SR-IOV emulation on virtio-net devices.
> With this series, the guest can identify the virtio-net device as an SR-IOV PF device.
> The newly added property 'sriov_max_vfs' allows us to enable the SR-IOV feature
> on the virtio-net device.
> Currently, we are unable to specify the properties of a VF created from the guest.
> The properties are set to their default values.
> In the future, we plan to allow users to set the properties.
>
> qemu-system-x86_64 --device virtio-net,sriov_max_vfs=<num>
> # when 'sriov_max_vfs' is present, the SR-IOV feature will be automatically enabled
> # <num> means the max number of VF on guest
>
> Example commands to create VFs in virtio-net device from the guest:
>
> guest% readlink -f /sys/class/net/eth1/device
>  /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/virtio1
> guest% echo "2" > /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/sriov_numvfs
> guest% ip link show
>  eth0: ....
>  eth1: ....
>  eth2: .... #virtual VF created
>  eth3: .... #virtual VF created
>
> Please note that communication between VF and PF/VF is not possible by this patch series itself.
>
> Yui Washizu (1):
>   virtio-pci: add SR-IOV capability
>
>  hw/pci/msix.c                  |  8 +++--
>  hw/pci/pci.c                   |  4 +++
>  hw/virtio/virtio-pci.c         | 62 ++++++++++++++++++++++++++++++----
>  include/hw/virtio/virtio-pci.h |  1 +
>  4 files changed, 66 insertions(+), 9 deletions(-)
>
> --
> 2.39.3
>
Yui Washizu July 24, 2023, 2:32 a.m. UTC | #2
On 2023/07/20 11:20, Jason Wang wrote:
> On Wed, Jul 19, 2023 at 9:59 AM Yui Washizu <yui.washidu@gmail.com> wrote:
>> This patch series is the first step towards enabling
>> hardware offloading of the L2 packet switching feature on virtio-net device to host machine.
>> We are considering that this hardware offloading enables
>> the use of high-performance networks in virtual infrastructures,
>> such as container infrastructures on VMs.
>>
>> To enable L2 packet switching by SR-IOV VFs, we are considering the following:
>> - making the guest recognize virtio-net devices as SR-IOV PF devices
>>    (archived with this patch series)
>> - allowing virtio-net devices to connect SR-IOV VFs to the backend networks,
>>    leaving the L2 packet switching feature to the management layer like libvirt
> Could you please show the qemu command line you want to propose here?


I am considering how to specify the properties of VFs in order to connect
SR-IOV VFs to the backend networks.

For example:

qemu-system-x86_64 -device pcie-root-port,port=8,chassis=8,id=pci.8,bus=pcie.0,multifunction=on
                   -netdev tap,id=hostnet0,vhost=on
                   -netdev tap,id=vfnet1,vhost=on # backend network for SR-IOV VF 1
                   -netdev tap,id=vfnet2,vhost=on # backend network for SR-IOV VF 2
                   -device virtio-net-pci,netdev=hostnet0,sriov_max_vfs=2,sriov_netdev=vfnet1:vfnet2,...

In this example, we can specify multiple backend networks for the VFs
by adding "sriov_netdev" and separating them with ":".
Additionally, when passing properties like "rx_queue_size" to VFs,
we can introduce new properties, such as "sriov_rx_queue_size_per_vfs",
to ensure that the same value is passed to all VFs.

I'm still considering how to specify this, so please share any comments
you may have.
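For illustration, the colon-separated value could be split like other list-valued options; a hypothetical shell sketch of how a parser or management script might split it (the "sriov_netdev" property is only proposed in this thread, not an existing qemu option):

```shell
# Hypothetical: split the proposed sriov_netdev value ("vfnet1:vfnet2")
# into one netdev id per VF, printed one per line.
split_sriov_netdev() {
    local IFS=':'
    # word splitting on ':' is intentional here
    set -- $1
    printf '%s\n' "$@"
}

split_sriov_netdev "vfnet1:vfnet2"
# vfnet1
# vfnet2
```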


>>    - This makes hardware offloading of L2 packet switching possible.
>>      For example, when using vDPA devices, it allows the guest
>>      to utilize SR-IOV NIC embedded switch of hosts.
> This would be interesting.
>
> Thanks
>
>> This patch series aims to enable SR-IOV emulation on virtio-net devices.
>> With this series, the guest can identify the virtio-net device as an SR-IOV PF device.
>> The newly added property 'sriov_max_vfs' allows us to enable the SR-IOV feature
>> on the virtio-net device.
>> Currently, we are unable to specify the properties of a VF created from the guest.
>> The properties are set to their default values.
>> In the future, we plan to allow users to set the properties.
>>
>> qemu-system-x86_64 --device virtio-net,sriov_max_vfs=<num>
>> # when 'sriov_max_vfs' is present, the SR-IOV feature will be automatically enabled
>> # <num> means the max number of VF on guest
>>
>> Example commands to create VFs in virtio-net device from the guest:
>>
>> guest% readlink -f /sys/class/net/eth1/device
>>   /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/virtio1
>> guest% echo "2" > /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/sriov_numvfs
>> guest% ip link show
>>   eth0: ....
>>   eth1: ....
>>   eth2: .... #virtual VF created
>>   eth3: .... #virtual VF created
>>
>> Please note that communication between VF and PF/VF is not possible by this patch series itself.
>>
>> Yui Washizu (1):
>>    virtio-pci: add SR-IOV capability
>>
>>   hw/pci/msix.c                  |  8 +++--
>>   hw/pci/pci.c                   |  4 +++
>>   hw/virtio/virtio-pci.c         | 62 ++++++++++++++++++++++++++++++----
>>   include/hw/virtio/virtio-pci.h |  1 +
>>   4 files changed, 66 insertions(+), 9 deletions(-)
>>
>> --
>> 2.39.3
>>
Jason Wang July 24, 2023, 6:58 a.m. UTC | #3
On Mon, Jul 24, 2023 at 10:32 AM Yui Washizu <yui.washidu@gmail.com> wrote:
>
>
> On 2023/07/20 11:20, Jason Wang wrote:
> > On Wed, Jul 19, 2023 at 9:59 AM Yui Washizu <yui.washidu@gmail.com> wrote:
> >> This patch series is the first step towards enabling
> >> hardware offloading of the L2 packet switching feature on virtio-net device to host machine.
> >> We are considering that this hardware offloading enables
> >> the use of high-performance networks in virtual infrastructures,
> >> such as container infrastructures on VMs.
> >>
> >> To enable L2 packet switching by SR-IOV VFs, we are considering the following:
> >> - making the guest recognize virtio-net devices as SR-IOV PF devices
> >>    (archived with this patch series)
> >> - allowing virtio-net devices to connect SR-IOV VFs to the backend networks,
> >>    leaving the L2 packet switching feature to the management layer like libvirt
> > Could you please show the qemu command line you want to propose here?
>
>
> I am considering how to specify the properties of VFs to connect SR-IOV
> VFs to the backend networks.
>
>
> For example:
>
>
> qemu-system-x86_64 -device
> pcie-root-port,port=8,chassis=8,id=pci.8,bus=pcie.0,multifunction=on
>                     -netdev tap,id=hostnet0,vhost=on
>                     -netdev tap,id=vfnet1,vhost=on # backend network for
> SR-IOV VF 1
>                     -netdev tap,id=vfnet2,vhost=on # backend network for
> SR-IOV VF 2
>                     -device
> virtio-net-pci,netdev=hostnet0,sriov_max_vfs=2,sriov_netdev=vfnet1:vfnet2,...
>
>
> In this example, we can specify multiple backend networks to the VFs
> by adding "sriov_netdev" and separating them with ":".

This seems to be what is in my mind as well; more below.

> Additionally, when passing properties like "rx_queue_size" to VFs, we
> can utilize new properties,
> such as "sriov_rx_queue_size_per_vfs," to ensure that the same value is
> passed to all VFs.

Or we can introduce a new device, like:

-netdev tap,id=hn0 \
-device virtio-net-pci,netdev=hn0,id=vnet_pf \
-netdev tap,netdev=hn1 \
-device virtio-net-pci-vf,netdev=hn1,id=vf0,pf=vnet_pf,rx_queue_size=XYZ ... \

This allows us to reuse the code for configuring VF parameters. But
note that rx_queue_size doesn't make much sense for vhost-vDPA, as
qemu can perform nothing more than a simple sanity check.
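A management layer could generate arguments in this proposed syntax mechanically. A hypothetical sketch (both the "virtio-net-pci-vf" device type and its "pf=" property are proposals in this thread, not existing qemu options):

```shell
# Hypothetical: emit one -netdev/-device pair per VF for the syntax above.
gen_vf_args() {
    local pf_id="$1" num_vfs="$2" i=0
    while [ "$i" -lt "$num_vfs" ]; do
        # "--" stops printf from treating the leading "-" as an option
        printf -- '-netdev tap,id=hn_vf%d ' "$i"
        printf -- '-device virtio-net-pci-vf,netdev=hn_vf%d,id=vf%d,pf=%s ' "$i" "$i" "$pf_id"
        i=$((i + 1))
    done
}

gen_vf_args vnet_pf 2
```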

Thanks

>
> I'm still considering about how to specify it, so please give me any
> comments if you have any.
>
>
> >>    - This makes hardware offloading of L2 packet switching possible.
> >>      For example, when using vDPA devices, it allows the guest
> >>      to utilize SR-IOV NIC embedded switch of hosts.
> > This would be interesting.
> >
> > Thanks
> >
> >> This patch series aims to enable SR-IOV emulation on virtio-net devices.
> >> With this series, the guest can identify the virtio-net device as an SR-IOV PF device.
> >> The newly added property 'sriov_max_vfs' allows us to enable the SR-IOV feature
> >> on the virtio-net device.
> >> Currently, we are unable to specify the properties of a VF created from the guest.
> >> The properties are set to their default values.
> >> In the future, we plan to allow users to set the properties.
> >>
> >> qemu-system-x86_64 --device virtio-net,sriov_max_vfs=<num>
> >> # when 'sriov_max_vfs' is present, the SR-IOV feature will be automatically enabled
> >> # <num> means the max number of VF on guest
> >>
> >> Example commands to create VFs in virtio-net device from the guest:
> >>
> >> guest% readlink -f /sys/class/net/eth1/device
> >>   /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/virtio1
> >> guest% echo "2" > /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/sriov_numvfs
> >> guest% ip link show
> >>   eth0: ....
> >>   eth1: ....
> >>   eth2: .... #virtual VF created
> >>   eth3: .... #virtual VF created
> >>
> >> Please note that communication between VF and PF/VF is not possible by this patch series itself.
> >>
> >> Yui Washizu (1):
> >>    virtio-pci: add SR-IOV capability
> >>
> >>   hw/pci/msix.c                  |  8 +++--
> >>   hw/pci/pci.c                   |  4 +++
> >>   hw/virtio/virtio-pci.c         | 62 ++++++++++++++++++++++++++++++----
> >>   include/hw/virtio/virtio-pci.h |  1 +
> >>   4 files changed, 66 insertions(+), 9 deletions(-)
> >>
> >> --
> >> 2.39.3
> >>
>
Yui Washizu July 28, 2023, 7:35 a.m. UTC | #4
On 2023/07/24 15:58, Jason Wang wrote:
> On Mon, Jul 24, 2023 at 10:32 AM Yui Washizu <yui.washidu@gmail.com> wrote:
>>
>> On 2023/07/20 11:20, Jason Wang wrote:
>>> On Wed, Jul 19, 2023 at 9:59 AM Yui Washizu <yui.washidu@gmail.com> wrote:
>>>> This patch series is the first step towards enabling
>>>> hardware offloading of the L2 packet switching feature on virtio-net device to host machine.
>>>> We are considering that this hardware offloading enables
>>>> the use of high-performance networks in virtual infrastructures,
>>>> such as container infrastructures on VMs.
>>>>
>>>> To enable L2 packet switching by SR-IOV VFs, we are considering the following:
>>>> - making the guest recognize virtio-net devices as SR-IOV PF devices
>>>>     (archived with this patch series)
>>>> - allowing virtio-net devices to connect SR-IOV VFs to the backend networks,
>>>>     leaving the L2 packet switching feature to the management layer like libvirt
>>> Could you please show the qemu command line you want to propose here?
>>
>> I am considering how to specify the properties of VFs to connect SR-IOV
>> VFs to the backend networks.
>>
>>
>> For example:
>>
>>
>> qemu-system-x86_64 -device
>> pcie-root-port,port=8,chassis=8,id=pci.8,bus=pcie.0,multifunction=on
>>                      -netdev tap,id=hostnet0,vhost=on
>>                      -netdev tap,id=vfnet1,vhost=on # backend network for
>> SR-IOV VF 1
>>                      -netdev tap,id=vfnet2,vhost=on # backend network for
>> SR-IOV VF 2
>>                      -device
>> virtio-net-pci,netdev=hostnet0,sriov_max_vfs=2,sriov_netdev=vfnet1:vfnet2,...
>>
>>
>> In this example, we can specify multiple backend networks to the VFs
>> by adding "sriov_netdev" and separating them with ":".
> This seems what is in my mind as well, more below
>
>> Additionally, when passing properties like "rx_queue_size" to VFs, we
>> can utilize new properties,
>> such as "sriov_rx_queue_size_per_vfs," to ensure that the same value is
>> passed to all VFs.
> Or we can introduce new device like:
>
> -netdev tap,id=hn0 \
> -device virtio-net-pci,netdev=hn0,id=vnet_pf \
> -netdev tap,netdev=hn1 \
> -device virtio-net-pci-vf,netdev=hn1,id=vf0,pf=vnet_pf,rx_queue_size=XYZ ... \
>
> This allows us to reuse the codes for configuring vf parameters.


Yes, I was also considering this method as one of the options.

I was concerned that the implementation might differ from that of other
devices specified with the "-device" option,
because VF devices are created when the guest creates VFs from the PF,
not when the guest starts,
so the implementation might become complicated.
That is why I dropped the idea at the time.

However, I think it is a cleaner way of specifying VFs than the one I
suggested.

I will look further into whether the implementation is really difficult
or not.

Thank you.


> But
> note that rx_queue_size doesn't make too much sense to vhost-vDPA, as
> qemu can perform nothing more than a simple sanity test.
>
> Thanks
>
>> I'm still considering about how to specify it, so please give me any
>> comments if you have any.
>>
>>
>>>>     - This makes hardware offloading of L2 packet switching possible.
>>>>       For example, when using vDPA devices, it allows the guest
>>>>       to utilize SR-IOV NIC embedded switch of hosts.
>>> This would be interesting.
>>>
>>> Thanks
>>>
>>>> This patch series aims to enable SR-IOV emulation on virtio-net devices.
>>>> With this series, the guest can identify the virtio-net device as an SR-IOV PF device.
>>>> The newly added property 'sriov_max_vfs' allows us to enable the SR-IOV feature
>>>> on the virtio-net device.
>>>> Currently, we are unable to specify the properties of a VF created from the guest.
>>>> The properties are set to their default values.
>>>> In the future, we plan to allow users to set the properties.
>>>>
>>>> qemu-system-x86_64 --device virtio-net,sriov_max_vfs=<num>
>>>> # when 'sriov_max_vfs' is present, the SR-IOV feature will be automatically enabled
>>>> # <num> means the max number of VF on guest
>>>>
>>>> Example commands to create VFs in virtio-net device from the guest:
>>>>
>>>> guest% readlink -f /sys/class/net/eth1/device
>>>>    /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/virtio1
>>>> guest% echo "2" > /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/sriov_numvfs
>>>> guest% ip link show
>>>>    eth0: ....
>>>>    eth1: ....
>>>>    eth2: .... #virtual VF created
>>>>    eth3: .... #virtual VF created
>>>>
>>>> Please note that communication between VF and PF/VF is not possible by this patch series itself.
>>>>
>>>> Yui Washizu (1):
>>>>     virtio-pci: add SR-IOV capability
>>>>
>>>>    hw/pci/msix.c                  |  8 +++--
>>>>    hw/pci/pci.c                   |  4 +++
>>>>    hw/virtio/virtio-pci.c         | 62 ++++++++++++++++++++++++++++++----
>>>>    include/hw/virtio/virtio-pci.h |  1 +
>>>>    4 files changed, 66 insertions(+), 9 deletions(-)
>>>>
>>>> --
>>>> 2.39.3
>>>>
Yui Washizu Aug. 30, 2023, 5:28 a.m. UTC | #5
On 2023/07/24 15:58, Jason Wang wrote:
> On Mon, Jul 24, 2023 at 10:32 AM Yui Washizu <yui.washidu@gmail.com> wrote:
>>
>> On 2023/07/20 11:20, Jason Wang wrote:
>>> On Wed, Jul 19, 2023 at 9:59 AM Yui Washizu <yui.washidu@gmail.com> wrote:
>>>> This patch series is the first step towards enabling
>>>> hardware offloading of the L2 packet switching feature on virtio-net device to host machine.
>>>> We are considering that this hardware offloading enables
>>>> the use of high-performance networks in virtual infrastructures,
>>>> such as container infrastructures on VMs.
>>>>
>>>> To enable L2 packet switching by SR-IOV VFs, we are considering the following:
>>>> - making the guest recognize virtio-net devices as SR-IOV PF devices
>>>>     (archived with this patch series)
>>>> - allowing virtio-net devices to connect SR-IOV VFs to the backend networks,
>>>>     leaving the L2 packet switching feature to the management layer like libvirt
>>> Could you please show the qemu command line you want to propose here?
>>
>> I am considering how to specify the properties of VFs to connect SR-IOV
>> VFs to the backend networks.
>>
>>
>> For example:
>>
>>
>> qemu-system-x86_64 -device
>> pcie-root-port,port=8,chassis=8,id=pci.8,bus=pcie.0,multifunction=on
>>                      -netdev tap,id=hostnet0,vhost=on
>>                      -netdev tap,id=vfnet1,vhost=on # backend network for
>> SR-IOV VF 1
>>                      -netdev tap,id=vfnet2,vhost=on # backend network for
>> SR-IOV VF 2
>>                      -device
>> virtio-net-pci,netdev=hostnet0,sriov_max_vfs=2,sriov_netdev=vfnet1:vfnet2,...
>>
>>
>> In this example, we can specify multiple backend networks to the VFs
>> by adding "sriov_netdev" and separating them with ":".
> This seems what is in my mind as well, more below
>
>> Additionally, when passing properties like "rx_queue_size" to VFs, we
>> can utilize new properties,
>> such as "sriov_rx_queue_size_per_vfs," to ensure that the same value is
>> passed to all VFs.
> Or we can introduce new device like:
>
> -netdev tap,id=hn0 \
> -device virtio-net-pci,netdev=hn0,id=vnet_pf \
> -netdev tap,netdev=hn1 \
> -device virtio-net-pci-vf,netdev=hn1,id=vf0,pf=vnet_pf,rx_queue_size=XYZ ... \
>
> This allows us to reuse the codes for configuring vf parameters. But
> note that rx_queue_size doesn't make too much sense to vhost-vDPA, as
> qemu can perform nothing more than a simple sanity test.
>
> Thanks


Thanks for proposing this new way.

I have considered how to implement it.

Since the virtio-net-pci-vf device should show up in the guest
only when the guest OS creates a VF,
the guest must not be able to see the VF device on the PCI bus when qemu starts.
However, it is hard to realize this without overcomplicating
the relevant code, given the current qemu implementation.
This is because qdev_device_add_from_qdict(),
the function called for devices specified
with the "-device" option of the qemu startup command,
always creates devices via qdev_new() and qdev_realize().
It might be possible to change this
so that qdev_new()/qdev_realize() are not triggered for virtio-net-pci-vf
devices,
but it seems we would need to special-case the device in very generic code
such as qdev_device_add_from_qdict(), qdev_device_add(),
device_init_func(), or their callers.

Given my current ideas,
it seems like this patch could become complex.
Would you have any suggestions
for achieving this in a simpler way?



>> I'm still considering about how to specify it, so please give me any
>> comments if you have any.
>>
>>
>>>>     - This makes hardware offloading of L2 packet switching possible.
>>>>       For example, when using vDPA devices, it allows the guest
>>>>       to utilize SR-IOV NIC embedded switch of hosts.
>>> This would be interesting.
>>>
>>> Thanks
>>>
>>>> This patch series aims to enable SR-IOV emulation on virtio-net devices.
>>>> With this series, the guest can identify the virtio-net device as an SR-IOV PF device.
>>>> The newly added property 'sriov_max_vfs' allows us to enable the SR-IOV feature
>>>> on the virtio-net device.
>>>> Currently, we are unable to specify the properties of a VF created from the guest.
>>>> The properties are set to their default values.
>>>> In the future, we plan to allow users to set the properties.
>>>>
>>>> qemu-system-x86_64 --device virtio-net,sriov_max_vfs=<num>
>>>> # when 'sriov_max_vfs' is present, the SR-IOV feature will be automatically enabled
>>>> # <num> means the max number of VF on guest
>>>>
>>>> Example commands to create VFs in virtio-net device from the guest:
>>>>
>>>> guest% readlink -f /sys/class/net/eth1/device
>>>>    /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/virtio1
>>>> guest% echo "2" > /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/sriov_numvfs
>>>> guest% ip link show
>>>>    eth0: ....
>>>>    eth1: ....
>>>>    eth2: .... #virtual VF created
>>>>    eth3: .... #virtual VF created
>>>>
>>>> Please note that communication between VF and PF/VF is not possible by this patch series itself.
>>>>
>>>> Yui Washizu (1):
>>>>     virtio-pci: add SR-IOV capability
>>>>
>>>>    hw/pci/msix.c                  |  8 +++--
>>>>    hw/pci/pci.c                   |  4 +++
>>>>    hw/virtio/virtio-pci.c         | 62 ++++++++++++++++++++++++++++++----
>>>>    include/hw/virtio/virtio-pci.h |  1 +
>>>>    4 files changed, 66 insertions(+), 9 deletions(-)
>>>>
>>>> --
>>>> 2.39.3
>>>>
Yui Washizu Sept. 6, 2023, 5:09 a.m. UTC | #6
Hi Jason,


On 2023/08/30 14:28, Yui Washizu wrote:
>
> On 2023/07/24 15:58, Jason Wang wrote:
>> On Mon, Jul 24, 2023 at 10:32 AM Yui Washizu <yui.washidu@gmail.com> 
>> wrote:
>>>
>>> On 2023/07/20 11:20, Jason Wang wrote:
>>>> On Wed, Jul 19, 2023 at 9:59 AM Yui Washizu <yui.washidu@gmail.com> 
>>>> wrote:
>>>>> This patch series is the first step towards enabling
>>>>> hardware offloading of the L2 packet switching feature on 
>>>>> virtio-net device to host machine.
>>>>> We are considering that this hardware offloading enables
>>>>> the use of high-performance networks in virtual infrastructures,
>>>>> such as container infrastructures on VMs.
>>>>>
>>>>> To enable L2 packet switching by SR-IOV VFs, we are considering 
>>>>> the following:
>>>>> - making the guest recognize virtio-net devices as SR-IOV PF devices
>>>>>     (archived with this patch series)
>>>>> - allowing virtio-net devices to connect SR-IOV VFs to the backend 
>>>>> networks,
>>>>>     leaving the L2 packet switching feature to the management 
>>>>> layer like libvirt
>>>> Could you please show the qemu command line you want to propose here?
>>>
>>> I am considering how to specify the properties of VFs to connect SR-IOV
>>> VFs to the backend networks.
>>>
>>>
>>> For example:
>>>
>>>
>>> qemu-system-x86_64 -device
>>> pcie-root-port,port=8,chassis=8,id=pci.8,bus=pcie.0,multifunction=on
>>>                      -netdev tap,id=hostnet0,vhost=on
>>>                      -netdev tap,id=vfnet1,vhost=on # backend 
>>> network for
>>> SR-IOV VF 1
>>>                      -netdev tap,id=vfnet2,vhost=on # backend 
>>> network for
>>> SR-IOV VF 2
>>>                      -device
>>> virtio-net-pci,netdev=hostnet0,sriov_max_vfs=2,sriov_netdev=vfnet1:vfnet2,... 
>>>
>>>
>>>
>>> In this example, we can specify multiple backend networks to the VFs
>>> by adding "sriov_netdev" and separating them with ":".
>> This seems what is in my mind as well, more below
>>
>>> Additionally, when passing properties like "rx_queue_size" to VFs, we
>>> can utilize new properties,
>>> such as "sriov_rx_queue_size_per_vfs," to ensure that the same value is
>>> passed to all VFs.
>> Or we can introduce new device like:
>>
>> -netdev tap,id=hn0 \
>> -device virtio-net-pci,netdev=hn0,id=vnet_pf \
>> -netdev tap,netdev=hn1 \
>> -device 
>> virtio-net-pci-vf,netdev=hn1,id=vf0,pf=vnet_pf,rx_queue_size=XYZ ... \
>>
>> This allows us to reuse the codes for configuring vf parameters. But
>> note that rx_queue_size doesn't make too much sense to vhost-vDPA, as
>> qemu can perform nothing more than a simple sanity test.
>>
>> Thanks
>
>
> Thanks for proposing this new way.
>
> I have considered how to implement this.
>
>
> As virtio-net-pci-vf device should show up
>
> on the guest only when the guest OS creates a VF,
>
> the guest must not be able to see the VF device on PCI bus when qemu 
> starts.
>
> However, it's hard to realize this without overcomplicating
>
> relevant code due to current qemu implementation.
>
> It's because qdev_device_add_from_qdict,
>
> a function which is called when devices are specified
>
> with "-device" option of qemu startup command,
>
> always create devices by qdev_new and qdev_realize.
>
> It might be possible that we fix it
>
> so that qdev_new/qdev_realize aren't triggered for virtio-net-pci-vf 
> devices,
>
> but It seems that we need to special case the device in very generic code
>
> like qdev_device_add_from_qdict(), qdev_device_add(),
>
> device_init_func() or their caller function.
>
>
> Given my current ideas,
>
> it seems like this PATCH could become complex.
>
> Woule you have any suggestions
>
> for achieving this in more simple way possible ?
>
>

I was wondering if you could give me some feedback.
Best regards.


>
>>> I'm still considering about how to specify it, so please give me any
>>> comments if you have any.
>>>
>>>
>>>>>     - This makes hardware offloading of L2 packet switching possible.
>>>>>       For example, when using vDPA devices, it allows the guest
>>>>>       to utilize SR-IOV NIC embedded switch of hosts.
>>>> This would be interesting.
>>>>
>>>> Thanks
>>>>
>>>>> This patch series aims to enable SR-IOV emulation on virtio-net 
>>>>> devices.
>>>>> With this series, the guest can identify the virtio-net device as 
>>>>> an SR-IOV PF device.
>>>>> The newly added property 'sriov_max_vfs' allows us to enable the 
>>>>> SR-IOV feature
>>>>> on the virtio-net device.
>>>>> Currently, we are unable to specify the properties of a VF created 
>>>>> from the guest.
>>>>> The properties are set to their default values.
>>>>> In the future, we plan to allow users to set the properties.
>>>>>
>>>>> qemu-system-x86_64 --device virtio-net,sriov_max_vfs=<num>
>>>>> # when 'sriov_max_vfs' is present, the SR-IOV feature will be 
>>>>> automatically enabled
>>>>> # <num> means the max number of VF on guest
>>>>>
>>>>> Example commands to create VFs in virtio-net device from the guest:
>>>>>
>>>>> guest% readlink -f /sys/class/net/eth1/device
>>>>> /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/virtio1
>>>>> guest% echo "2" > 
>>>>> /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/sriov_numvfs
>>>>> guest% ip link show
>>>>>    eth0: ....
>>>>>    eth1: ....
>>>>>    eth2: .... #virtual VF created
>>>>>    eth3: .... #virtual VF created
>>>>>
>>>>> Please note that communication between VF and PF/VF is not 
>>>>> possible by this patch series itself.
>>>>>
>>>>> Yui Washizu (1):
>>>>>     virtio-pci: add SR-IOV capability
>>>>>
>>>>>    hw/pci/msix.c                  |  8 +++--
>>>>>    hw/pci/pci.c                   |  4 +++
>>>>>    hw/virtio/virtio-pci.c         | 62 
>>>>> ++++++++++++++++++++++++++++++----
>>>>>    include/hw/virtio/virtio-pci.h |  1 +
>>>>>    4 files changed, 66 insertions(+), 9 deletions(-)
>>>>>
>>>>> -- 
>>>>> 2.39.3
>>>>>
Jason Wang Sept. 15, 2023, 7:01 a.m. UTC | #7
On Wed, Sep 6, 2023 at 1:11 PM Yui Washizu <yui.washidu@gmail.com> wrote:
>
>
> Hi Jason,
>
>
> On 2023/08/30 14:28, Yui Washizu wrote:
> >
> > On 2023/07/24 15:58, Jason Wang wrote:
> >> On Mon, Jul 24, 2023 at 10:32 AM Yui Washizu <yui.washidu@gmail.com>
> >> wrote:
> >>>
> >>> On 2023/07/20 11:20, Jason Wang wrote:
> >>>> On Wed, Jul 19, 2023 at 9:59 AM Yui Washizu <yui.washidu@gmail.com>
> >>>> wrote:
> >>>>> This patch series is the first step towards enabling
> >>>>> hardware offloading of the L2 packet switching feature on
> >>>>> virtio-net device to host machine.
> >>>>> We are considering that this hardware offloading enables
> >>>>> the use of high-performance networks in virtual infrastructures,
> >>>>> such as container infrastructures on VMs.
> >>>>>
> >>>>> To enable L2 packet switching by SR-IOV VFs, we are considering
> >>>>> the following:
> >>>>> - making the guest recognize virtio-net devices as SR-IOV PF devices
> >>>>>     (archived with this patch series)
> >>>>> - allowing virtio-net devices to connect SR-IOV VFs to the backend
> >>>>> networks,
> >>>>>     leaving the L2 packet switching feature to the management
> >>>>> layer like libvirt
> >>>> Could you please show the qemu command line you want to propose here?
> >>>
> >>> I am considering how to specify the properties of VFs to connect SR-IOV
> >>> VFs to the backend networks.
> >>>
> >>>
> >>> For example:
> >>>
> >>>
> >>> qemu-system-x86_64 -device
> >>> pcie-root-port,port=8,chassis=8,id=pci.8,bus=pcie.0,multifunction=on
> >>>                      -netdev tap,id=hostnet0,vhost=on
> >>>                      -netdev tap,id=vfnet1,vhost=on # backend
> >>> network for
> >>> SR-IOV VF 1
> >>>                      -netdev tap,id=vfnet2,vhost=on # backend
> >>> network for
> >>> SR-IOV VF 2
> >>>                      -device
> >>> virtio-net-pci,netdev=hostnet0,sriov_max_vfs=2,sriov_netdev=vfnet1:vfnet2,...
> >>>
> >>>
> >>>
> >>> In this example, we can specify multiple backend networks to the VFs
> >>> by adding "sriov_netdev" and separating them with ":".
> >> This seems what is in my mind as well, more below
> >>
> >>> Additionally, when passing properties like "rx_queue_size" to VFs, we
> >>> can utilize new properties,
> >>> such as "sriov_rx_queue_size_per_vfs," to ensure that the same value is
> >>> passed to all VFs.
> >> Or we can introduce new device like:
> >>
> >> -netdev tap,id=hn0 \
> >> -device virtio-net-pci,netdev=hn0,id=vnet_pf \
> >> -netdev tap,netdev=hn1 \
> >> -device
> >> virtio-net-pci-vf,netdev=hn1,id=vf0,pf=vnet_pf,rx_queue_size=XYZ ... \
> >>
> >> This allows us to reuse the codes for configuring vf parameters. But
> >> note that rx_queue_size doesn't make too much sense to vhost-vDPA, as
> >> qemu can perform nothing more than a simple sanity test.
> >>
> >> Thanks
> >
> >
> > Thanks for proposing this new way.
> >
> > I have considered how to implement this.
> >
> >
> > As virtio-net-pci-vf device should show up
> >
> > on the guest only when the guest OS creates a VF,
> >
> > the guest must not be able to see the VF device on PCI bus when qemu
> > starts.
> >
> > However, it's hard to realize this without overcomplicating
> >
> > relevant code due to current qemu implementation.
> >
> > It's because qdev_device_add_from_qdict,
> >
> > a function which is called when devices are specified
> >
> > with "-device" option of qemu startup command,
> >
> > always create devices by qdev_new and qdev_realize.
> >
> > It might be possible that we fix it
> >
> > so that qdev_new/qdev_realize aren't triggered for virtio-net-pci-vf
> > devices,
> >
> > but It seems that we need to special case the device in very generic code
> >
> > like qdev_device_add_from_qdict(), qdev_device_add(),
> >
> > device_init_func() or their caller function.
> >
> >
> > Given my current ideas,
> >
> > it seems like this PATCH could become complex.
> >
> > Woule you have any suggestions
> >
> > for achieving this in more simple way possible ?
> >
> >
>
> I was wondering if you could give me some feedback.
> Best regard.

Sorry for the late reply. I think we can start from the easier way you
proposed and see how it goes.

Thanks

>
>
> >
> >>> I'm still considering about how to specify it, so please give me any
> >>> comments if you have any.
> >>>
> >>>
> >>>>>     - This makes hardware offloading of L2 packet switching possible.
> >>>>>       For example, when using vDPA devices, it allows the guest
> >>>>>       to utilize SR-IOV NIC embedded switch of hosts.
> >>>> This would be interesting.
> >>>>
> >>>> Thanks
> >>>>
> >>>>> This patch series aims to enable SR-IOV emulation on virtio-net
> >>>>> devices.
> >>>>> With this series, the guest can identify the virtio-net device as
> >>>>> an SR-IOV PF device.
> >>>>> The newly added property 'sriov_max_vfs' allows us to enable the
> >>>>> SR-IOV feature
> >>>>> on the virtio-net device.
> >>>>> Currently, we are unable to specify the properties of a VF created
> >>>>> from the guest.
> >>>>> The properties are set to their default values.
> >>>>> In the future, we plan to allow users to set the properties.
> >>>>>
> >>>>> qemu-system-x86_64 --device virtio-net,sriov_max_vfs=<num>
> >>>>> # when 'sriov_max_vfs' is present, the SR-IOV feature will be
> >>>>> automatically enabled
> >>>>> # <num> means the max number of VF on guest
> >>>>>
> >>>>> Example commands to create VFs in virtio-net device from the guest:
> >>>>>
> >>>>> guest% readlink -f /sys/class/net/eth1/device
> >>>>> /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/virtio1
> >>>>> guest% echo "2" >
> >>>>> /sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/sriov_numvfs
> >>>>> guest% ip link show
> >>>>>    eth0: ....
> >>>>>    eth1: ....
> >>>>>    eth2: .... #virtual VF created
> >>>>>    eth3: .... #virtual VF created
> >>>>>
> >>>>> Please note that communication between VF and PF/VF is not
> >>>>> possible by this patch series itself.
> >>>>>
> >>>>> Yui Washizu (1):
> >>>>>     virtio-pci: add SR-IOV capability
> >>>>>
> >>>>>    hw/pci/msix.c                  |  8 +++--
> >>>>>    hw/pci/pci.c                   |  4 +++
> >>>>>    hw/virtio/virtio-pci.c         | 62
> >>>>> ++++++++++++++++++++++++++++++----
> >>>>>    include/hw/virtio/virtio-pci.h |  1 +
> >>>>>    4 files changed, 66 insertions(+), 9 deletions(-)
> >>>>>
> >>>>> --
> >>>>> 2.39.3
> >>>>>
>