Message ID | 20240428-sriov-v4-0-ac8ac6212982@daynix.com (mailing list archive) |
---|---|
Series | virtio-net: add support for SR-IOV emulation |
On 2024/04/28 18:05, Akihiko Odaki wrote:
> Based-on: <20240315-reuse-v9-0-67aa69af4d53@daynix.com>
> ("[PATCH for 9.1 v9 00/11] hw/pci: SR-IOV related fixes and improvements")
>
> [...]
>
> User Interface
> --------------
>
> A user can configure a SR-IOV capable virtio-net device by adding
> virtio-net-pci functions to a bus. Below is a command line example:
>   -netdev user,id=n -netdev user,id=o
>   -netdev user,id=p -netdev user,id=q
>   -device pcie-root-port,id=b
>   -device virtio-net-pci,bus=b,addr=0x0.0x3,netdev=q,sriov-pf=f
>   -device virtio-net-pci,bus=b,addr=0x0.0x2,netdev=p,sriov-pf=f
>   -device virtio-net-pci,bus=b,addr=0x0.0x1,netdev=o,sriov-pf=f
>   -device virtio-net-pci,bus=b,addr=0x0.0x0,netdev=n,id=f
>
> The VFs specify the paired PF with "sriov-pf" property. The PF must be
> added after all VFs. It is user's responsibility to ensure that VFs have
> function numbers larger than that of the PF, and the function numbers
> have a consistent stride.

I tried to start a VM with more than 8 VFs allocated using your patch,
but the following error occurred and qemu didn't work:
    VF function number overflows.

I think the cause of this error is that virtio-net-pci PFs don't have ARI.
(pcie_ari_init is not added to virtio-net-pci when PFs are initialized.)
I think it is possible to add it later,
but how about adding pcie_ari_init?

As a trial,
adding pcie_ari_init to virtio_pci_realize enabled the creation of more
than 8 VFs.
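For readers following the thread, the overflow reported above can be illustrated with a little arithmetic. Without ARI a PCI function number is 3 bits wide (0 to 7); with ARI it is 8 bits wide (0 to 255). The SR-IOV layout places the n-th VF (counted from 0) at PF function number + First VF Offset + n * VF Stride. The C sketch below applies that formula under the assumption that all VFs must stay on the PF's device number, which is what the error message suggests. The names `vf_function_number` and `vfs_fit` are hypothetical and are not taken from QEMU.

```c
#include <assert.h>
#include <stdbool.h>

/* Function numbers are 3 bits wide without ARI and 8 bits wide with ARI. */
enum { FN_LIMIT_NO_ARI = 8, FN_LIMIT_ARI = 256 };

/*
 * Function number of the n-th VF (n counted from 0):
 * PF function number + First VF Offset + n * VF Stride.
 */
static unsigned vf_function_number(unsigned pf_fn, unsigned first_offset,
                                   unsigned stride, unsigned n)
{
    return pf_fn + first_offset + n * stride;
}

/*
 * Returns true when every VF function number stays below the limit,
 * assuming (hypothetically) that VFs may not spill over to another
 * device number.
 */
static bool vfs_fit(unsigned pf_fn, unsigned first_offset, unsigned stride,
                    unsigned num_vfs, bool ari)
{
    unsigned limit = ari ? FN_LIMIT_ARI : FN_LIMIT_NO_ARI;

    if (num_vfs == 0) {
        return true;
    }
    /* With a non-negative stride the last VF has the largest number. */
    return vf_function_number(pf_fn, first_offset, stride, num_vfs - 1) < limit;
}
```

With the PF at function 0, an offset of 1 and a stride of 1, `vfs_fit(0, 1, 1, 7, false)` holds while `vfs_fit(0, 1, 1, 8, false)` does not, so under these assumptions at most 7 VFs fit next to the PF without ARI.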
On 2024/05/16 11:00, Yui Washizu wrote:
> On 2024/04/28 18:05, Akihiko Odaki wrote:
>> [...]
>> The VFs specify the paired PF with "sriov-pf" property. The PF must be
>> added after all VFs. It is user's responsibility to ensure that VFs have
>> function numbers larger than that of the PF, and the function numbers
>> have a consistent stride.
>
> I tried to start a VM with more than 8 VFs allocated using your patch,
> but the following error occurred and qemu didn't work:
>     VF function number overflows.
>
> I think the cause of this error is that virtio-net-pci PFs don't have ARI.
> (pcie_ari_init is not added to virtio-net-pci when PFs are initialized.)
> I think it is possible to add it later,
> but how about adding pcie_ari_init?
>
> As a trial,
> adding pcie_ari_init to virtio_pci_realize enabled the creation of more
> than 8 VFs.

I have just looked into that possibility, but adding pcie_ari_init to
virtio_pci_realize has some implications. Unconditionally calling
pcie_ari_init will break the existing configuration of virtio-pci
devices, so we need to implement some logic to detect when ARI is
needed. Preferably such logic should be implemented in the common PCI
infrastructure instead of in virtio-pci so that other PCI multifunction
devices can benefit from it.

While I don't think implementing this will be too complicated, I need
to ensure that such a feature is really needed before doing so.
On 2024/07/15 14:15, Akihiko Odaki wrote:
> On 2024/05/16 11:00, Yui Washizu wrote:
>> [...]
>> As a trial,
>> adding pcie_ari_init to virtio_pci_realize enabled the creation of
>> more than 8 VFs.
>
> I have just looked into that possibility, but adding pcie_ari_init to
> virtio_pci_realize has some implications. Unconditionally calling
> pcie_ari_init will break the existing configuration of virtio-pci
> devices so we need to implement some logic to detect when ARI is
> needed. Preferably such logic should be implemented in the common PCI
> infrastructure instead of implementing it in virtio-pci so that other
> PCI multifunction devices can benefit from it.
>
> While I don't think implementing this will be too complicated, I need
> to ensure that such a feature is really needed before doing so.

OK.
I want to use this emulation for offloading the virtual network
in an environment where there are many containers in VMs.
So, I consider that the feature is needed.
I think that 7 VFs are too few.
I'll keep thinking about the feature's necessity.

I'll add other comments to RFC v5 patch.

Regards,

Yui Washizu
On 2024/07/31 18:34, Yui Washizu wrote:
> On 2024/07/15 14:15, Akihiko Odaki wrote:
>> [...]
>> While I don't think implementing this will be too complicated, I need
>> to ensure that such a feature is really needed before doing so.
>
> OK.
> I want to use this emulation for offloading the virtual network
> in an environment where there are many containers in VMs.
> So, I consider that the feature is needed.
> I think that 7 VFs are too few.
> I'll keep thinking about the feature's necessity.

I understand there could be many containers in VMs, but will a single
device deal with them? If the virtio-net VFs are backed by the vDPA
capability of one physical device, it will not have more VFs than that
device provides. The VMs must have several PFs individually paired with
VFs to accommodate more containers on one VM.

I don't know much about vDPA-capable devices, but as a reference, igb only
has 8 VFs.

> I'll add other comments to RFC v5 patch.

The RFC tag is already dropped.

Regards,
Akihiko Odaki
On Thu, Aug 01, 2024 at 02:37:55PM +0900, Akihiko Odaki wrote:
> I don't know much about vDPA-capable device, but as a reference, igb only
> has 8 VFs.

Modern vDPA-capable devices have many more than 8 VFs; 8 is a very low
number.
Based-on: <20240315-reuse-v9-0-67aa69af4d53@daynix.com>
("[PATCH for 9.1 v9 00/11] hw/pci: SR-IOV related fixes and improvements")

Introduction
------------

This series is based on the RFC series submitted by Yui Washizu[1].
See also [2] for the context.

This series enables SR-IOV emulation for virtio-net. It is useful
to test SR-IOV support on the guest, or to expose several vDPA devices
in a VM. vDPA devices can also provide an L2 switching feature for
offloading, though allowing the guest to configure such a feature is
out of scope.

The PF side code resides in virtio-pci. The VF side code resides in
the PCI common infrastructure, but it is restricted to work only for
virtio-net-pci because of lack of validation.

User Interface
--------------

A user can configure a SR-IOV capable virtio-net device by adding
virtio-net-pci functions to a bus. Below is a command line example:
  -netdev user,id=n -netdev user,id=o
  -netdev user,id=p -netdev user,id=q
  -device pcie-root-port,id=b
  -device virtio-net-pci,bus=b,addr=0x0.0x3,netdev=q,sriov-pf=f
  -device virtio-net-pci,bus=b,addr=0x0.0x2,netdev=p,sriov-pf=f
  -device virtio-net-pci,bus=b,addr=0x0.0x1,netdev=o,sriov-pf=f
  -device virtio-net-pci,bus=b,addr=0x0.0x0,netdev=n,id=f

The VFs specify the paired PF with the "sriov-pf" property. The PF must be
added after all VFs. It is the user's responsibility to ensure that VFs have
function numbers larger than that of the PF, and that the function numbers
have a consistent stride.

Keeping VF instances
--------------------

A problem with SR-IOV emulation is that it needs to hotplug the VFs as
the guest requests. Previously, this behavior was implemented by
realizing and unrealizing VFs at runtime. However, this strategy does
not work well for the proposed virtio-net emulation; in this proposal,
device options passed in the command line must be maintained as VFs
are hotplugged, but they are consumed when the machine starts and not
available after that, which makes realizing VFs at runtime impossible.

As a strategy alternative to runtime realization/unrealization, this
series proposes to reuse the code to power down PCI Express devices.
When a PCI Express device is powered down, it will be hidden from the
guest but will be kept realized. This effectively implements the
behavior we need for the SR-IOV emulation.

Summary
-------

Patch 1 disables ROM BAR, which virtio-net-pci enables by default, for
VFs.
Patch 2 makes zero stride valid for 1 VF configuration.
Patches 3 and 4 add validations.
Patch 5 adds user-created SR-IOV VF infrastructure.
Patch 6 makes virtio-pci work as SR-IOV PF for user-created VFs.
Patch 7 allows the user to create SR-IOV VFs with virtio-net-pci.

[1] https://patchew.org/QEMU/1689731808-3009-1-git-send-email-yui.washidu@gmail.com/
[2] https://lore.kernel.org/all/5d46f455-f530-4e5e-9ae7-13a2297d4bc5@daynix.com/

Co-developed-by: Yui Washizu <yui.washidu@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
Changes in v4:
- Added patch "hw/pci: Fix SR-IOV VF number calculation" to fix division
  by zero reported by Yui Washizu.
- Rebased.
- Link to v3: https://lore.kernel.org/r/20240305-sriov-v3-0-abdb75770372@daynix.com

Changes in v3:
- Rebased.
- Link to v2: https://lore.kernel.org/r/20231210-sriov-v2-0-b959e8a6dfaf@daynix.com

Changes in v2:
- Changed to keep VF instances.
- Link to v1: https://lore.kernel.org/r/20231202-sriov-v1-0-32b3570f7bd6@daynix.com

---
Akihiko Odaki (7):
      hw/pci: Do not add ROM BAR for SR-IOV VF
      hw/pci: Fix SR-IOV VF number calculation
      pcie_sriov: Ensure PF and VF are mutually exclusive
      pcie_sriov: Check PCI Express for SR-IOV PF
      pcie_sriov: Allow user to create SR-IOV device
      virtio-pci: Implement SR-IOV PF
      virtio-net: Implement SR-IOV VF

 include/hw/pci/pci_device.h |   6 +-
 include/hw/pci/pcie_sriov.h |  19 +++
 hw/pci/pci.c                |  76 +++++++----
 hw/pci/pcie_sriov.c         | 298 +++++++++++++++++++++++++++++++++++---------
 hw/virtio/virtio-net-pci.c  |   1 +
 hw/virtio/virtio-pci.c      |   7 ++
 6 files changed, 323 insertions(+), 84 deletions(-)
---
base-commit: 2ac5458086ab61282f30c2f8bdf2ae9a0a06a75d
change-id: 20231202-sriov-9402fb262be8

Best regards,
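The layout rule stated in the User Interface section (VF function numbers above the PF's, with a consistent stride, and a zero stride accepted when there is only one VF) can be sketched in C as follows. This is only an illustration of the rule as described in the cover letter, not the code in this series; `struct vf_layout` and `derive_vf_layout` are hypothetical names, and the sketch assumes the VF function numbers are passed in ascending order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of where the VFs sit relative to the PF. */
struct vf_layout {
    unsigned first_offset; /* distance from the PF to the first VF */
    unsigned stride;       /* distance between consecutive VFs; 0 for one VF */
};

/*
 * Derive the layout from the PF function number and the VF function
 * numbers sorted in ascending order. Returns false when the first VF
 * does not sit above the PF or when the spacing between VFs is not
 * constant. *out may be partially written even on failure.
 */
static bool derive_vf_layout(unsigned pf_fn, const unsigned *vf_fns,
                             size_t num_vfs, struct vf_layout *out)
{
    if (num_vfs == 0 || vf_fns[0] <= pf_fn) {
        return false;
    }

    out->first_offset = vf_fns[0] - pf_fn;
    /* A single VF has no neighbour, so its stride is taken as zero. */
    out->stride = num_vfs > 1 ? vf_fns[1] - vf_fns[0] : 0;

    for (size_t i = 1; i < num_vfs; i++) {
        if (vf_fns[i] <= vf_fns[i - 1] ||
            vf_fns[i] - vf_fns[i - 1] != out->stride) {
            return false;
        }
    }
    return true;
}
```

For the command line example in the cover letter (PF at function 0, VFs at functions 1, 2 and 3), this sketch yields a first offset of 1 and a stride of 1, whereas VFs at functions 1, 2 and 4 would be rejected for an inconsistent stride.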