diff mbox series

[RFC,3/3] vfio/pci: Expose PCIe PASID capability to userspace

Message ID 20230926093121.18676-4-yi.l.liu@intel.com (mailing list archive)
State New
Headers show
Series vfio-pci support pasid attach/detach | expand

Commit Message

Yi Liu Sept. 26, 2023, 9:31 a.m. UTC
This exposes PCIe PASID capability to userspace and where to emulate this
capability if wants to further expose it to VM.

And this only exposes PASID capability for devices which has PCIe PASID
extended struture in its configuration space. While for VFs, userspace
is still unable to see this capability as SR-IOV spec forbides VF to
implement PASID capability extended structure. It is a TODO in future.
Related discussion can be found in below links:

https://lore.kernel.org/kvm/20200407095801.648b1371@w520.home/
https://lore.kernel.org/kvm/BL1PR11MB5271A60035EF591A5BE8AC878C01A@BL1PR11MB5271.namprd11.prod.outlook.com/

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/vfio/pci/vfio_pci_config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Tian, Kevin Sept. 27, 2023, 8:07 a.m. UTC | #1
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Tuesday, September 26, 2023 5:31 PM
> 
> This exposes PCIe PASID capability to userspace and where to emulate this
> capability if wants to further expose it to VM.
> 
> And this only exposes PASID capability for devices which has PCIe PASID
> extended struture in its configuration space. While for VFs, userspace
> is still unable to see this capability as SR-IOV spec forbides VF to
> implement PASID capability extended structure. It is a TODO in future.
> Related discussion can be found in below links:
> 
> https://lore.kernel.org/kvm/20200407095801.648b1371@w520.home/
> https://lore.kernel.org/kvm/BL1PR11MB5271A60035EF591A5BE8AC878C01A
> @BL1PR11MB5271.namprd11.prod.outlook.com/
> 

Yes, we need a decision for VF case.

If the consensus is to continue exposing the PASID capability in vfio-pci
config space by developing a kernel quirk mechanism to find offset for
VF, then this patch for PF is orthogonal to that VF work and can go as it is.

But if the decision is to have a device feature for the user to enumerate
the vPASID capability and let the VMM take care of finding the vPASID
cap offset, then better we start doing that for PF too since it's not good
to have two enumeration interfaces for PF/VF respectively.

My preference is via device feature given Qemu already includes lots of
quirks for vfio-pci devices. Another reason is that when supporting vPASID
with SIOV there are some arch constraints which the driver needs to
report to the user to follow (e.g.  don't assign ENQCMD-capable sibling
vdev's to a same guest, etc.). A device feature interface can better
encapsulate everything related to vPASID in one place.

Thanks
Kevin
Alex Williamson Sept. 27, 2023, 6:52 p.m. UTC | #2
On Wed, 27 Sep 2023 08:07:54 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> > From: Liu, Yi L <yi.l.liu@intel.com>
> > Sent: Tuesday, September 26, 2023 5:31 PM
> > 
> > This exposes PCIe PASID capability to userspace and where to emulate this
> > capability if wants to further expose it to VM.
> > 
> > And this only exposes PASID capability for devices which has PCIe PASID
> > extended struture in its configuration space. While for VFs, userspace
> > is still unable to see this capability as SR-IOV spec forbides VF to
> > implement PASID capability extended structure. It is a TODO in future.
> > Related discussion can be found in below links:
> > 
> > https://lore.kernel.org/kvm/20200407095801.648b1371@w520.home/
> > https://lore.kernel.org/kvm/BL1PR11MB5271A60035EF591A5BE8AC878C01A
> > @BL1PR11MB5271.namprd11.prod.outlook.com/
> >   
> 
> Yes, we need a decision for VF case.
> 
> If the consensus is to continue exposing the PASID capability in vfio-pci
> config space by developing a kernel quirk mechanism to find offset for
> VF, then this patch for PF is orthogonal to that VF work and can go as it is.
> 
> But if the decision is to have a device feature for the user to enumerate
> the vPASID capability and let the VMM take care of finding the vPASID
> cap offset, then better we start doing that for PF too since it's not good
> to have two enumeration interfaces for PF/VF respectively.

Note also that QEMU implements a lazy algorithm for exposing
capabilities, the default is to expose them, so we need to consider
existing VMs seeing a new read-only PASID capability on an assigned PF.

That might support an alternate means to expose the capability.

> My preference is via device feature given Qemu already includes lots of
> quirks for vfio-pci devices. Another reason is that when supporting vPASID
> with SIOV there are some arch constraints which the driver needs to
> report to the user to follow (e.g.  don't assign ENQCMD-capable sibling
> vdev's to a same guest, etc.).

?!

> A device feature interface can better
> encapsulate everything related to vPASID in one place.

Sorry if I don't remember, have you posted a proposal for the device
feature interface?  Thanks,

Alex
Tian, Kevin Sept. 28, 2023, 1:46 a.m. UTC | #3
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Thursday, September 28, 2023 2:53 AM
> 
> On Wed, 27 Sep 2023 08:07:54 +0000
> "Tian, Kevin" <kevin.tian@intel.com> wrote:
> 
> > > From: Liu, Yi L <yi.l.liu@intel.com>
> > > Sent: Tuesday, September 26, 2023 5:31 PM
> > >
> > > This exposes PCIe PASID capability to userspace and where to emulate
> this
> > > capability if wants to further expose it to VM.
> > >
> > > And this only exposes PASID capability for devices which has PCIe PASID
> > > extended struture in its configuration space. While for VFs, userspace
> > > is still unable to see this capability as SR-IOV spec forbides VF to
> > > implement PASID capability extended structure. It is a TODO in future.
> > > Related discussion can be found in below links:
> > >
> > > https://lore.kernel.org/kvm/20200407095801.648b1371@w520.home/
> > >
> https://lore.kernel.org/kvm/BL1PR11MB5271A60035EF591A5BE8AC878C01A
> > > @BL1PR11MB5271.namprd11.prod.outlook.com/
> > >
> >
> > Yes, we need a decision for VF case.
> >
> > If the consensus is to continue exposing the PASID capability in vfio-pci
> > config space by developing a kernel quirk mechanism to find offset for
> > VF, then this patch for PF is orthogonal to that VF work and can go as it is.
> >
> > But if the decision is to have a device feature for the user to enumerate
> > the vPASID capability and let the VMM take care of finding the vPASID
> > cap offset, then better we start doing that for PF too since it's not good
> > to have two enumeration interfaces for PF/VF respectively.
> 
> Note also that QEMU implements a lazy algorithm for exposing
> capabilities, the default is to expose them, so we need to consider
> existing VMs seeing a new read-only PASID capability on an assigned PF.
> 
> That might support an alternate means to expose the capability.

Yep. that's also a valid point.

> 
> > My preference is via device feature given Qemu already includes lots of
> > quirks for vfio-pci devices. Another reason is that when supporting vPASID
> > with SIOV there are some arch constraints which the driver needs to
> > report to the user to follow (e.g.  don't assign ENQCMD-capable sibling
> > vdev's to a same guest, etc.).
> 
> ?!

Sorry that I didn't plan to elaborate that tricky constraint before we show
the overall SIOV/vPASID implementation. Explaining it requires lots of
context and here just want to mention the potential requirement in case
we need more proofs to go this direction. 
diff mbox series

Patch

diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index 7e2e62ab0869..dfae5ad5ebc0 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -95,7 +95,7 @@  static const u16 pci_ext_cap_length[PCI_EXT_CAP_ID_MAX + 1] = {
 	[PCI_EXT_CAP_ID_LTR]	=	PCI_EXT_CAP_LTR_SIZEOF,
 	[PCI_EXT_CAP_ID_SECPCI]	=	0,	/* not yet */
 	[PCI_EXT_CAP_ID_PMUX]	=	0,	/* not yet */
-	[PCI_EXT_CAP_ID_PASID]	=	0,	/* not yet */
+	[PCI_EXT_CAP_ID_PASID]	=	PCI_EXT_CAP_PASID_SIZEOF,
 	[PCI_EXT_CAP_ID_DVSEC]	=	0xFF,
 };