diff mbox series

[09/11] PCI: add matching checks for driver_override binding

Message ID 20210603160809.15845-10-mgurtovoy@nvidia.com (mailing list archive)
State New, archived
Headers show
Series Introduce vfio-pci-core subsystem | expand

Commit Message

Max Gurtovoy June 3, 2021, 4:08 p.m. UTC
Allowing any driver in the system to be unconditionally bound to any
PCI HW is dangerous. Connecting a driver to a physical HW device it was
never intended to operate may trigger exploitable kernel bugs, or worse.
It also allows userspace to load and run kernel code that otherwise
would never be runnable on the system.

driver_override was designed to make it easier to load vfio_pci, so
focus it on that single use case. driver_override will only work on
drivers that specifically opt into this feature and the driver now has
the opportunity to provide a proper match table that indicates what HW
it can properly support. vfio-pci continues to support everything.

This also causes the modalias tables to be populated with dedicated
match table and userspace now becomes aware that vfio-pci can be loaded
against any PCI device using driver_override.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 Documentation/ABI/testing/sysfs-bus-pci |  6 +++++-
 drivers/pci/pci-driver.c                | 22 ++++++++++++----------
 drivers/vfio/pci/vfio_pci.c             |  9 ++++++++-
 3 files changed, 25 insertions(+), 12 deletions(-)

Comments

Alex Williamson June 8, 2021, 9:26 p.m. UTC | #1
On Thu, 3 Jun 2021 19:08:07 +0300
Max Gurtovoy <mgurtovoy@nvidia.com> wrote:

> Allowing any driver in the system to be unconditionally bound to any
> PCI HW is dangerous. Connecting a driver to a physical HW device it was
> never intended to operate may trigger exploitable kernel bugs, or worse.
> It also allows userspace to load and run kernel code that otherwise
> would never be runnable on the system.

This is just another way that an admin can do bad things, with the
intention that they know what they're doing and if not they get to
keep the pieces.  There's also still the new_id scheme for binding the
wrong drivers to devices, so the hole this claims to be addressing is
still fully present.

> driver_override was designed to make it easier to load vfio_pci, so

Actually driver_override was designed to resolve the non-deterministic
behavior of new_id, which allows inserting dynamic match entries.  The
problem is those match entries match any device that might come along
during the time window when userspace is trying to bind a specific
device to a specific driver.  driver_override flipped the problem to
match a device to a driver rather than vice versa.  Other bus types
have since adopted driver_override interfaces as well.

> focus it on that single use case. driver_override will only work on

It's used for other use cases across numerous bus types now.  For
instance, how can I user driver_override to bind pci-stub to a device
after this?  driverctl(8) uses driver_override to perform arbitrary
driver overrides, this breaks all but the vfio-pci use case.

> drivers that specifically opt into this feature and the driver now has
> the opportunity to provide a proper match table that indicates what HW
> it can properly support. vfio-pci continues to support everything.

In doing so, this also breaks the new_id method for vfio-pci.  Sorry,
with so many userspace regressions, crippling the driver_override
interface with an assumption of such a narrow focus, creating a vfio
specific match flag, I don't see where this can go.  Thanks,

Alex
Jason Gunthorpe June 8, 2021, 10:45 p.m. UTC | #2
On Tue, Jun 08, 2021 at 03:26:43PM -0600, Alex Williamson wrote:
> > drivers that specifically opt into this feature and the driver now has
> > the opportunity to provide a proper match table that indicates what HW
> > it can properly support. vfio-pci continues to support everything.
> 
> In doing so, this also breaks the new_id method for vfio-pci.  

Does it? How? The driver_override flag is per match entry not for the
entire device so new_id added things will work the same as before as
their new match entry's flags will be zero.

> Sorry, with so many userspace regressions, crippling the
> driver_override interface with an assumption of such a narrow focus,
> creating a vfio specific match flag, I don't see where this can go.
> Thanks,

On the other hand it overcomes all the objections from the last go
round: how userspace figures out which driver to use with
driver_override and integrating the universal driver into the scheme.

pci_stub could be delt with by marking it for driver_override like
vfio_pci.

But driverctl as a general tool working with any module is not really
addressable.

Is the only issue the blocking of the arbitary binding? That is not a
critical peice of this, IIRC

Jason
Alex Williamson June 9, 2021, 1:27 a.m. UTC | #3
On Tue, 8 Jun 2021 19:45:17 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Jun 08, 2021 at 03:26:43PM -0600, Alex Williamson wrote:
> > > drivers that specifically opt into this feature and the driver now has
> > > the opportunity to provide a proper match table that indicates what HW
> > > it can properly support. vfio-pci continues to support everything.  
> > 
> > In doing so, this also breaks the new_id method for vfio-pci.    
> 
> Does it? How? The driver_override flag is per match entry not for the
> entire device so new_id added things will work the same as before as
> their new match entry's flags will be zero.

Hmm, that might have been a testing issue; combining driverctl with
manual new_id testing might have left a driver_override in place.
 
> > Sorry, with so many userspace regressions, crippling the
> > driver_override interface with an assumption of such a narrow focus,
> > creating a vfio specific match flag, I don't see where this can go.
> > Thanks,  
> 
> On the other hand it overcomes all the objections from the last go
> round: how userspace figures out which driver to use with
> driver_override and integrating the universal driver into the scheme.
> 
> pci_stub could be delt with by marking it for driver_override like
> vfio_pci.

By marking it a "vfio driver override"? :-\

> But driverctl as a general tool working with any module is not really
> addressable.
> 
> Is the only issue the blocking of the arbitary binding? That is not a
> critical peice of this, IIRC

We can't break userspace, which means new_id and driver_override need
to work as they do now.  There are scads of driver binding scripts in
the wild, for vfio-pci and other drivers.  We can't assume such a
narrow scope.  Thanks,

Alex
Max Gurtovoy June 9, 2021, 9:26 a.m. UTC | #4
On 6/9/2021 4:27 AM, Alex Williamson wrote:
> On Tue, 8 Jun 2021 19:45:17 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
>
>> On Tue, Jun 08, 2021 at 03:26:43PM -0600, Alex Williamson wrote:
>>>> drivers that specifically opt into this feature and the driver now has
>>>> the opportunity to provide a proper match table that indicates what HW
>>>> it can properly support. vfio-pci continues to support everything.
>>> In doing so, this also breaks the new_id method for vfio-pci.
>> Does it? How? The driver_override flag is per match entry not for the
>> entire device so new_id added things will work the same as before as
>> their new match entry's flags will be zero.
> Hmm, that might have been a testing issue; combining driverctl with
> manual new_id testing might have left a driver_override in place.
>   
>>> Sorry, with so many userspace regressions, crippling the
>>> driver_override interface with an assumption of such a narrow focus,
>>> creating a vfio specific match flag, I don't see where this can go.
>>> Thanks,
>> On the other hand it overcomes all the objections from the last go
>> round: how userspace figures out which driver to use with
>> driver_override and integrating the universal driver into the scheme.
>>
>> pci_stub could be delt with by marking it for driver_override like
>> vfio_pci.
> By marking it a "vfio driver override"? :-\

Of course not. We'll mark it as "stub driver override".

>
>> But driverctl as a general tool working with any module is not really
>> addressable.
>>
>> Is the only issue the blocking of the arbitary binding? That is not a
>> critical peice of this, IIRC
> We can't break userspace, which means new_id and driver_override need
> to work as they do now.  There are scads of driver binding scripts in
> the wild, for vfio-pci and other drivers.  We can't assume such a
> narrow scope.  Thanks,
>
> Alex
>
Max Gurtovoy June 13, 2021, 8:19 a.m. UTC | #5
On 6/9/2021 4:27 AM, Alex Williamson wrote:
> On Tue, 8 Jun 2021 19:45:17 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
>
>> On Tue, Jun 08, 2021 at 03:26:43PM -0600, Alex Williamson wrote:
>>>> drivers that specifically opt into this feature and the driver now has
>>>> the opportunity to provide a proper match table that indicates what HW
>>>> it can properly support. vfio-pci continues to support everything.
>>> In doing so, this also breaks the new_id method for vfio-pci.
>> Does it? How? The driver_override flag is per match entry not for the
>> entire device so new_id added things will work the same as before as
>> their new match entry's flags will be zero.
> Hmm, that might have been a testing issue; combining driverctl with
> manual new_id testing might have left a driver_override in place.
>   
>>> Sorry, with so many userspace regressions, crippling the
>>> driver_override interface with an assumption of such a narrow focus,
>>> creating a vfio specific match flag, I don't see where this can go.
>>> Thanks,
>> On the other hand it overcomes all the objections from the last go
>> round: how userspace figures out which driver to use with
>> driver_override and integrating the universal driver into the scheme.
>>
>> pci_stub could be delt with by marking it for driver_override like
>> vfio_pci.
> By marking it a "vfio driver override"? :-\
>
>> But driverctl as a general tool working with any module is not really
>> addressable.
>>
>> Is the only issue the blocking of the arbitary binding? That is not a
>> critical peice of this, IIRC
> We can't break userspace, which means new_id and driver_override need
> to work as they do now.  There are scads of driver binding scripts in
> the wild, for vfio-pci and other drivers.  We can't assume such a
> narrow scope.  Thanks,

what about the following code ?

@@ -152,12 +152,28 @@ static const struct pci_device_id 
*pci_match_device(struct pci_driver *drv,
         }
         spin_unlock(&drv->dynids.lock);

-       if (!found_id)
-               found_id = pci_match_id(drv->id_table, dev);
+       if (found_id)
+               return found_id;

-       /* driver_override will always match, send a dummy id */
-       if (!found_id && dev->driver_override)
+       found_id = pci_match_id(drv->id_table, dev);
+       if (found_id) {
+               /*
+                * if we found id in the static table, we must fulfill the
+                * matching flags (i.e. if PCI_ID_F_DRIVER_OVERRIDE flag is
+                * set, driver_override should be provided).
+                */
+               bool is_driver_override =
+                       (found_id->flags & PCI_ID_F_DRIVER_OVERRIDE) != 0;
+               if ((is_driver_override && !dev->driver_override) ||
+                   (dev->driver_override && !is_driver_override))
+                       return NULL;
+       } else if (dev->driver_override) {
+               /*
+                * if we didn't find suitable id in the static table,
+                * driver_override will still , send a dummy id
+                */
                 found_id = &pci_device_id_any;
+       }

         return found_id;
  }


dynamic ids (new_id) works as before.

Old driver_override works as before.

For "new" driver_override we must fulfill the new rules.

>
> Alex
>
Christoph Hellwig June 14, 2021, 5:40 a.m. UTC | #6
On Sun, Jun 13, 2021 at 11:19:46AM +0300, Max Gurtovoy wrote:
> what about the following code ?
> 
> @@ -152,12 +152,28 @@ static const struct pci_device_id
> *pci_match_device(struct pci_driver *drv,
> ?????????????? }
> ?????????????? spin_unlock(&drv->dynids.lock);
> 
> -???????????? if (!found_id)
> -???????????????????????????? found_id = pci_match_id(drv->id_table, dev);
> +???????????? if (found_id)
> +???????????????????????????? return found_id;

Something is broken in your mailer because this does not look like code
at all.
Max Gurtovoy June 14, 2021, 8:18 a.m. UTC | #7
On 6/14/2021 8:40 AM, Christoph Hellwig wrote:
> On Sun, Jun 13, 2021 at 11:19:46AM +0300, Max Gurtovoy wrote:
>> what about the following code ?
>>
>> @@ -152,12 +152,28 @@ static const struct pci_device_id
>> *pci_match_device(struct pci_driver *drv,
>> ?????????????? }
>> ?????????????? spin_unlock(&drv->dynids.lock);
>>
>> -???????????? if (!found_id)
>> -???????????????????????????? found_id = pci_match_id(drv->id_table, dev);
>> +???????????? if (found_id)
>> +???????????????????????????? return found_id;
> Something is broken in your mailer because this does not look like code
> at all.

Sorry, see the attached patches.
From 4a6050f0ce50f0cdef4c06d7cabf0260a14a8615 Mon Sep 17 00:00:00 2001
From: Max Gurtovoy <mgurtovoy@nvidia.com>
Date: Thu, 13 May 2021 13:53:19 +0300
Subject: [PATCH 1/2] PCI: add flags field to pci_device_id structure

This field will be used to allow pci modules to set some specific data
that can be used for matching, aliases and other hints. Add example for
"driver_override" pci drivers that will set a special prefix in the
modules.alias table. In the future, this flag will enforce
"driver_override" to work on drivers that specifically opt into this
feature. The udev utility will not try to load these drivers
automatically in order to bind to new discovered devices. This will be
because the modalias tables populated by those drivers will be different
from "regular" pci modalias tables. Userspace utilities, such as
libvirt, will need to adjust and bind devices according to new matching
mechanism with taking "driver_override" enforcement into consideration.
Add vfio and stub to "driver_override" PCI family.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 Documentation/PCI/pci.rst         |  1 +
 include/linux/mod_devicetable.h   | 11 +++++++++++
 include/linux/pci.h               | 27 +++++++++++++++++++++++++++
 scripts/mod/devicetable-offsets.c |  1 +
 scripts/mod/file2alias.c          | 10 ++++++++--
 5 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
index 814b40f8360b..0855657daf93 100644
--- a/Documentation/PCI/pci.rst
+++ b/Documentation/PCI/pci.rst
@@ -103,6 +103,7 @@ need pass only as many optional fields as necessary:
   - subvendor and subdevice fields default to PCI_ANY_ID (FFFFFFFF)
   - class and classmask fields default to 0
   - driver_data defaults to 0UL.
+  - flags field defaults to 0.
 
 Note that driver_data must match the value used by any of the pci_device_id
 entries defined in the driver. This makes the driver_data field mandatory
diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
index 7d45b5f989b0..51b852a2d32b 100644
--- a/include/linux/mod_devicetable.h
+++ b/include/linux/mod_devicetable.h
@@ -16,6 +16,15 @@ typedef unsigned long kernel_ulong_t;
 
 #define PCI_ANY_ID (~0)
 
+
+enum pci_id_flags {
+	PCI_ID_F_VFIO_DRIVER_OVERRIDE	= 1 << 0,
+	PCI_ID_F_STUB_DRIVER_OVERRIDE	= 1 << 1,
+};
+
+#define PCI_ID_F_DRIVER_OVERRIDE \
+	(PCI_ID_F_VFIO_DRIVER_OVERRIDE | PCI_ID_F_STUB_DRIVER_OVERRIDE)
+
 /**
  * struct pci_device_id - PCI device ID structure
  * @vendor:		Vendor ID to match (or PCI_ANY_ID)
@@ -34,12 +43,14 @@ typedef unsigned long kernel_ulong_t;
  *			Best practice is to use driver_data as an index
  *			into a static list of equivalent device types,
  *			instead of using it as a pointer.
+ * @flags:		PCI flags of the driver. Bitmap of pci_id_flags enum.
  */
 struct pci_device_id {
 	__u32 vendor, device;		/* Vendor and device ID or PCI_ANY_ID*/
 	__u32 subvendor, subdevice;	/* Subsystem ID's or PCI_ANY_ID */
 	__u32 class, class_mask;	/* (class,subclass,prog-if) triplet */
 	kernel_ulong_t driver_data;	/* Data private to the driver */
+	__u32 flags;
 };
 
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c20211e59a57..4af761552068 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -898,6 +898,33 @@ struct pci_driver {
 	.vendor = (vend), .device = (dev), \
 	.subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID
 
+/**
+ * PCI_DEVICE_FLAGS - macro used to describe a PCI device with specific flags.
+ * @vend: the 16 bit PCI Vendor ID
+ * @dev: the 16 bit PCI Device ID
+ * @fl: PCI Device flags as a bitmap of pci_id_flags enum
+ *
+ * This macro is used to create a struct pci_device_id that matches a
+ * specific device. The subvendor and subdevice fields will be set to
+ * PCI_ANY_ID.
+ */
+#define PCI_DEVICE_FLAGS(vend, dev, fl) \
+	.vendor = (vend), .device = (dev), .subvendor = PCI_ANY_ID, \
+	.subdevice = PCI_ANY_ID, .flags = (fl)
+
+/**
+ * PCI_DRIVER_OVERRIDE_DEVICE_VFIO - macro used to describe a VFIO
+ *                                   "driver_override" PCI device.
+ * @vend: the 16 bit PCI Vendor ID
+ * @dev: the 16 bit PCI Device ID
+ *
+ * This macro is used to create a struct pci_device_id that matches a
+ * specific device. The subvendor and subdevice fields will be set to
+ * PCI_ANY_ID and the flags will be set to PCI_ID_F_VFIO_DRIVER_OVERRIDE.
+ */
+#define PCI_DRIVER_OVERRIDE_DEVICE_VFIO(vend, dev) \
+	PCI_DEVICE_FLAGS(vend, dev, PCI_ID_F_VFIO_DRIVER_OVERRIDE)
+
 /**
  * PCI_DEVICE_SUB - macro used to describe a specific PCI device with subsystem
  * @vend: the 16 bit PCI Vendor ID
diff --git a/scripts/mod/devicetable-offsets.c b/scripts/mod/devicetable-offsets.c
index 9bb6c7edccc4..b927c36b8333 100644
--- a/scripts/mod/devicetable-offsets.c
+++ b/scripts/mod/devicetable-offsets.c
@@ -42,6 +42,7 @@ int main(void)
 	DEVID_FIELD(pci_device_id, subdevice);
 	DEVID_FIELD(pci_device_id, class);
 	DEVID_FIELD(pci_device_id, class_mask);
+	DEVID_FIELD(pci_device_id, flags);
 
 	DEVID(ccw_device_id);
 	DEVID_FIELD(ccw_device_id, match_flags);
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
index 7c97fa8e36bc..6863cc312055 100644
--- a/scripts/mod/file2alias.c
+++ b/scripts/mod/file2alias.c
@@ -426,7 +426,7 @@ static int do_ieee1394_entry(const char *filename,
 	return 1;
 }
 
-/* Looks like: pci:vNdNsvNsdNbcNscNiN. */
+/* Looks like: pci:vNdNsvNsdNbcNscNiN or <prefix>_pci:vNdNsvNsdNbcNscNiN. */
 static int do_pci_entry(const char *filename,
 			void *symval, char *alias)
 {
@@ -440,8 +440,14 @@ static int do_pci_entry(const char *filename,
 	DEF_FIELD(symval, pci_device_id, subdevice);
 	DEF_FIELD(symval, pci_device_id, class);
 	DEF_FIELD(symval, pci_device_id, class_mask);
+	DEF_FIELD(symval, pci_device_id, flags);
 
-	strcpy(alias, "pci:");
+	if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
+		strcpy(alias, "vfio_pci:");
+	else if (flags & PCI_ID_F_STUB_DRIVER_OVERRIDE)
+		strcpy(alias, "stub_pci:");
+	else
+		strcpy(alias, "pci:");
 	ADD(alias, "v", vendor != PCI_ANY_ID, vendor);
 	ADD(alias, "d", device != PCI_ANY_ID, device);
 	ADD(alias, "sv", subvendor != PCI_ANY_ID, subvendor);
Christoph Hellwig June 14, 2021, 3:27 p.m. UTC | #8
On Mon, Jun 14, 2021 at 11:18:32AM +0300, Max Gurtovoy wrote:
>   *			into a static list of equivalent device types,
>   *			instead of using it as a pointer.
> + * @flags:		PCI flags of the driver. Bitmap of pci_id_flags enum.
>   */
>  struct pci_device_id {
>  	__u32 vendor, device;		/* Vendor and device ID or PCI_ANY_ID*/
>  	__u32 subvendor, subdevice;	/* Subsystem ID's or PCI_ANY_ID */
>  	__u32 class, class_mask;	/* (class,subclass,prog-if) triplet */
>  	kernel_ulong_t driver_data;	/* Data private to the driver */
> +	__u32 flags;
>  };

Isn't struct pci_device_id a userspace ABI due to MODULE_DEVICE_TABLE()?
Jason Gunthorpe June 14, 2021, 4:01 p.m. UTC | #9
On Mon, Jun 14, 2021 at 04:27:33PM +0100, Christoph Hellwig wrote:
> On Mon, Jun 14, 2021 at 11:18:32AM +0300, Max Gurtovoy wrote:
> >   *			into a static list of equivalent device types,
> >   *			instead of using it as a pointer.
> > + * @flags:		PCI flags of the driver. Bitmap of pci_id_flags enum.
> >   */
> >  struct pci_device_id {
> >  	__u32 vendor, device;		/* Vendor and device ID or PCI_ANY_ID*/
> >  	__u32 subvendor, subdevice;	/* Subsystem ID's or PCI_ANY_ID */
> >  	__u32 class, class_mask;	/* (class,subclass,prog-if) triplet */
> >  	kernel_ulong_t driver_data;	/* Data private to the driver */
> > +	__u32 flags;
> >  };
> 
> Isn't struct pci_device_id a userspace ABI due to MODULE_DEVICE_TABLE()?

Not that I can find, it isn't under include/uapi and the way to find
this information is by looking for symbols starting with "__mod_"

Debian Code Search says the only place with '"__mod_"' is in
file2alias.c at least

Do you know of something? If yes this file should be moved

Jason
Christoph Hellwig June 14, 2021, 4:15 p.m. UTC | #10
On Mon, Jun 14, 2021 at 01:01:25PM -0300, Jason Gunthorpe wrote:
> > Isn't struct pci_device_id a userspace ABI due to MODULE_DEVICE_TABLE()?
> 
> Not that I can find, it isn't under include/uapi and the way to find
> this information is by looking for symbols starting with "__mod_"
> 
> Debian Code Search says the only place with '"__mod_"' is in
> file2alias.c at least
> 
> Do you know of something? If yes this file should be moved

Seems lke file2alias.c is indeed the only consumer.  So it is a
userspace ABI, but ony to a file included in the kernel tree.
Jason Gunthorpe June 14, 2021, 4:33 p.m. UTC | #11
On Mon, Jun 14, 2021 at 05:15:22PM +0100, Christoph Hellwig wrote:
> On Mon, Jun 14, 2021 at 01:01:25PM -0300, Jason Gunthorpe wrote:
> > > Isn't struct pci_device_id a userspace ABI due to MODULE_DEVICE_TABLE()?
> > 
> > Not that I can find, it isn't under include/uapi and the way to find
> > this information is by looking for symbols starting with "__mod_"
> > 
> > Debian Code Search says the only place with '"__mod_"' is in
> > file2alias.c at least
> > 
> > Do you know of something? If yes this file should be moved
> 
> Seems lke file2alias.c is indeed the only consumer.  So it is a
> userspace ABI, but ony to a file included in the kernel tree.

As I understand it, things are tighter than that, it is only an API
between different parts of kbuild - so it is OK to change
it. module.alias is the uAPI this data gets marshaled into.

Jason
Alex Williamson June 14, 2021, 6:42 p.m. UTC | #12
On Sun, 13 Jun 2021 11:19:46 +0300
Max Gurtovoy <mgurtovoy@nvidia.com> wrote:

> On 6/9/2021 4:27 AM, Alex Williamson wrote:
> > On Tue, 8 Jun 2021 19:45:17 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >  
> >> On Tue, Jun 08, 2021 at 03:26:43PM -0600, Alex Williamson wrote:  
> >>>> drivers that specifically opt into this feature and the driver now has
> >>>> the opportunity to provide a proper match table that indicates what HW
> >>>> it can properly support. vfio-pci continues to support everything.  
> >>> In doing so, this also breaks the new_id method for vfio-pci.  
> >> Does it? How? The driver_override flag is per match entry not for the
> >> entire device so new_id added things will work the same as before as
> >> their new match entry's flags will be zero.  
> > Hmm, that might have been a testing issue; combining driverctl with
> > manual new_id testing might have left a driver_override in place.
> >     
> >>> Sorry, with so many userspace regressions, crippling the
> >>> driver_override interface with an assumption of such a narrow focus,
> >>> creating a vfio specific match flag, I don't see where this can go.
> >>> Thanks,  
> >> On the other hand it overcomes all the objections from the last go
> >> round: how userspace figures out which driver to use with
> >> driver_override and integrating the universal driver into the scheme.
> >>
> >> pci_stub could be delt with by marking it for driver_override like
> >> vfio_pci.  
> > By marking it a "vfio driver override"? :-\
> >  
> >> But driverctl as a general tool working with any module is not really
> >> addressable.
> >>
> >> Is the only issue the blocking of the arbitary binding? That is not a
> >> critical peice of this, IIRC  
> > We can't break userspace, which means new_id and driver_override need
> > to work as they do now.  There are scads of driver binding scripts in
> > the wild, for vfio-pci and other drivers.  We can't assume such a
> > narrow scope.  Thanks,  
> 
> what about the following code ?
> 
> @@ -152,12 +152,28 @@ static const struct pci_device_id 
> *pci_match_device(struct pci_driver *drv,
>          }
>          spin_unlock(&drv->dynids.lock);
> 
> -       if (!found_id)
> -               found_id = pci_match_id(drv->id_table, dev);
> +       if (found_id)
> +               return found_id;

a) A dynamic ID match always works regardless of driver override...

> 
> -       /* driver_override will always match, send a dummy id */
> -       if (!found_id && dev->driver_override)
> +       found_id = pci_match_id(drv->id_table, dev);
> +       if (found_id) {
> +               /*
> +                * if we found id in the static table, we must fulfill the
> +                * matching flags (i.e. if PCI_ID_F_DRIVER_OVERRIDE flag is
> +                * set, driver_override should be provided).
> +                */
> +               bool is_driver_override =
> +                       (found_id->flags & PCI_ID_F_DRIVER_OVERRIDE) != 0;
> +               if ((is_driver_override && !dev->driver_override) ||

b) A static ID match fails if the driver provides an override flag and
the device does not have an override set, or... 

> +                   (dev->driver_override && !is_driver_override))

c) The device has an override set and the driver does not support the
override flag.

> +                       return NULL;
> +       } else if (dev->driver_override) {
> +               /*
> +                * if we didn't find suitable id in the static table,
> +                * driver_override will still , send a dummy id
> +                */
>                  found_id = &pci_device_id_any;
> +       }
> 
>          return found_id;
>   }
> 
> 
> dynamic ids (new_id) works as before.
> 
> Old driver_override works as before.

This is deceptively complicated, but no, I don't believe it does.  By
my understanding of c) an "old" driver can no longer use
driver_override for binding a known device.  It seems that if we have a
static ID match, then we cannot have a driver_override set for the
device in such a case.  This is a userspace regression.

> For "new" driver_override we must fulfill the new rules.

For override'able drivers, the static table is almost useless other
than using it for modules.alias support and potentially to provide
driver_data.  As above, I find this all pretty confusing and I'd advise
trying to write a concise set of rules outlining the behavior of
driver_override vs dynamic IDs vs static IDs vs "override'able" driver
flags.  I tried, I can't, it's convoluted and full of exceptions.
Thanks,

Alex
Max Gurtovoy June 14, 2021, 11:12 p.m. UTC | #13
On 6/14/2021 9:42 PM, Alex Williamson wrote:
> On Sun, 13 Jun 2021 11:19:46 +0300
> Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>
>> On 6/9/2021 4:27 AM, Alex Williamson wrote:
>>> On Tue, 8 Jun 2021 19:45:17 -0300
>>> Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>   
>>>> On Tue, Jun 08, 2021 at 03:26:43PM -0600, Alex Williamson wrote:
>>>>>> drivers that specifically opt into this feature and the driver now has
>>>>>> the opportunity to provide a proper match table that indicates what HW
>>>>>> it can properly support. vfio-pci continues to support everything.
>>>>> In doing so, this also breaks the new_id method for vfio-pci.
>>>> Does it? How? The driver_override flag is per match entry not for the
>>>> entire device so new_id added things will work the same as before as
>>>> their new match entry's flags will be zero.
>>> Hmm, that might have been a testing issue; combining driverctl with
>>> manual new_id testing might have left a driver_override in place.
>>>      
>>>>> Sorry, with so many userspace regressions, crippling the
>>>>> driver_override interface with an assumption of such a narrow focus,
>>>>> creating a vfio specific match flag, I don't see where this can go.
>>>>> Thanks,
>>>> On the other hand it overcomes all the objections from the last go
>>>> round: how userspace figures out which driver to use with
>>>> driver_override and integrating the universal driver into the scheme.
>>>>
>>>> pci_stub could be delt with by marking it for driver_override like
>>>> vfio_pci.
>>> By marking it a "vfio driver override"? :-\
>>>   
>>>> But driverctl as a general tool working with any module is not really
>>>> addressable.
>>>>
>>>> Is the only issue the blocking of the arbitary binding? That is not a
>>>> critical peice of this, IIRC
>>> We can't break userspace, which means new_id and driver_override need
>>> to work as they do now.  There are scads of driver binding scripts in
>>> the wild, for vfio-pci and other drivers.  We can't assume such a
>>> narrow scope.  Thanks,
>> what about the following code ?
>>
>> @@ -152,12 +152,28 @@ static const struct pci_device_id
>> *pci_match_device(struct pci_driver *drv,
>>           }
>>           spin_unlock(&drv->dynids.lock);
>>
>> -       if (!found_id)
>> -               found_id = pci_match_id(drv->id_table, dev);
>> +       if (found_id)
>> +               return found_id;
> a) A dynamic ID match always works regardless of driver override...
>
>> -       /* driver_override will always match, send a dummy id */
>> -       if (!found_id && dev->driver_override)
>> +       found_id = pci_match_id(drv->id_table, dev);
>> +       if (found_id) {
>> +               /*
>> +                * if we found id in the static table, we must fulfill the
>> +                * matching flags (i.e. if PCI_ID_F_DRIVER_OVERRIDE flag is
>> +                * set, driver_override should be provided).
>> +                */
>> +               bool is_driver_override =
>> +                       (found_id->flags & PCI_ID_F_DRIVER_OVERRIDE) != 0;
>> +               if ((is_driver_override && !dev->driver_override) ||
> b) A static ID match fails if the driver provides an override flag and
> the device does not have an override set, or...
>
>> +                   (dev->driver_override && !is_driver_override))
> c) The device has an override set and the driver does not support the
> override flag.
>
>> +                       return NULL;
>> +       } else if (dev->driver_override) {
>> +               /*
>> +                * if we didn't find suitable id in the static table,
>> +                * driver_override will still , send a dummy id
>> +                */
>>                   found_id = &pci_device_id_any;
>> +       }
>>
>>           return found_id;
>>    }
>>
>>
>> dynamic ids (new_id) works as before.
>>
>> Old driver_override works as before.
> This is deceptively complicated, but no, I don't believe it does.  By
> my understanding of c) an "old" driver can no longer use
> driver_override for binding a known device.  It seems that if we have a
> static ID match, then we cannot have a driver_override set for the
> device in such a case.  This is a userspace regression.

If I'll remove condition c) everyone will be happy ?

I really would like to end this ongoing discussion and finally have a 
clear idea of what we want.

By clear I mean C code.

If we'll continue raising ideas we'll never reach our goal. And my goal 
is the next merge window.

>
>> For "new" driver_override we must fulfill the new rules.
> For override'able drivers, the static table is almost useless other
> than using it for modules.alias support and potentially to provide
> driver_data.  As above, I find this all pretty confusing and I'd advise
> trying to write a concise set of rules outlining the behavior of
> driver_override vs dynamic IDs vs static IDs vs "override'able" driver
> flags.  I tried, I can't, it's convoluted and full of exceptions.
> Thanks,
>
> Alex
>
Alex Williamson June 15, 2021, 3 p.m. UTC | #14
On Tue, 15 Jun 2021 02:12:15 +0300
Max Gurtovoy <mgurtovoy@nvidia.com> wrote:

> On 6/14/2021 9:42 PM, Alex Williamson wrote:
> > On Sun, 13 Jun 2021 11:19:46 +0300
> > Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >  
> >> On 6/9/2021 4:27 AM, Alex Williamson wrote:  
> >>> On Tue, 8 Jun 2021 19:45:17 -0300
> >>> Jason Gunthorpe <jgg@nvidia.com> wrote:
> >>>     
> >>>> On Tue, Jun 08, 2021 at 03:26:43PM -0600, Alex Williamson wrote:  
> >>>>>> drivers that specifically opt into this feature and the driver now has
> >>>>>> the opportunity to provide a proper match table that indicates what HW
> >>>>>> it can properly support. vfio-pci continues to support everything.  
> >>>>> In doing so, this also breaks the new_id method for vfio-pci.  
> >>>> Does it? How? The driver_override flag is per match entry not for the
> >>>> entire device so new_id added things will work the same as before as
> >>>> their new match entry's flags will be zero.  
> >>> Hmm, that might have been a testing issue; combining driverctl with
> >>> manual new_id testing might have left a driver_override in place.
> >>>        
> >>>>> Sorry, with so many userspace regressions, crippling the
> >>>>> driver_override interface with an assumption of such a narrow focus,
> >>>>> creating a vfio specific match flag, I don't see where this can go.
> >>>>> Thanks,  
> >>>> On the other hand it overcomes all the objections from the last go
> >>>> round: how userspace figures out which driver to use with
> >>>> driver_override and integrating the universal driver into the scheme.
> >>>>
> >>>> pci_stub could be delt with by marking it for driver_override like
> >>>> vfio_pci.  
> >>> By marking it a "vfio driver override"? :-\
> >>>     
> >>>> But driverctl as a general tool working with any module is not really
> >>>> addressable.
> >>>>
> >>>> Is the only issue the blocking of the arbitary binding? That is not a
> >>>> critical peice of this, IIRC  
> >>> We can't break userspace, which means new_id and driver_override need
> >>> to work as they do now.  There are scads of driver binding scripts in
> >>> the wild, for vfio-pci and other drivers.  We can't assume such a
> >>> narrow scope.  Thanks,  
> >> what about the following code ?
> >>
> >> @@ -152,12 +152,28 @@ static const struct pci_device_id
> >> *pci_match_device(struct pci_driver *drv,
> >>           }
> >>           spin_unlock(&drv->dynids.lock);
> >>
> >> -       if (!found_id)
> >> -               found_id = pci_match_id(drv->id_table, dev);
> >> +       if (found_id)
> >> +               return found_id;  
> > a) A dynamic ID match always works regardless of driver override...
> >  
> >> -       /* driver_override will always match, send a dummy id */
> >> -       if (!found_id && dev->driver_override)
> >> +       found_id = pci_match_id(drv->id_table, dev);
> >> +       if (found_id) {
> >> +               /*
> >> +                * if we found id in the static table, we must fulfill the
> >> +                * matching flags (i.e. if PCI_ID_F_DRIVER_OVERRIDE flag is
> >> +                * set, driver_override should be provided).
> >> +                */
> >> +               bool is_driver_override =
> >> +                       (found_id->flags & PCI_ID_F_DRIVER_OVERRIDE) != 0;
> >> +               if ((is_driver_override && !dev->driver_override) ||  
> > b) A static ID match fails if the driver provides an override flag and
> > the device does not have an override set, or...
> >  
> >> +                   (dev->driver_override && !is_driver_override))  
> > c) The device has an override set and the driver does not support the
> > override flag.
> >  
> >> +                       return NULL;
> >> +       } else if (dev->driver_override) {
> >> +               /*
> >> +                * if we didn't find suitable id in the static table,
> >> +                * driver_override will still , send a dummy id
> >> +                */
> >>                   found_id = &pci_device_id_any;
> >> +       }
> >>
> >>           return found_id;
> >>    }
> >>
> >>
> >> dynamic ids (new_id) works as before.
> >>
> >> Old driver_override works as before.  
> > This is deceptively complicated, but no, I don't believe it does.  By
> > my understanding of c) an "old" driver can no longer use
> > driver_override for binding a known device.  It seems that if we have a
> > static ID match, then we cannot have a driver_override set for the
> > device in such a case.  This is a userspace regression.  
> 
> If I'll remove condition c) everyone will be happy ?
> 
> I really would like to end this ongoing discussion and finally have a 
> clear idea of what we want.
> 
> By clear I mean C code.
> 
> If we'll continue raising ideas we'll never reach our goal. And my goal 
> is the next merge window.

Bjorn would ultimately need to make the call on that, I don't see an
obvious regression if c) is dropped.  pci-stub and pci-pf-stub should
be included in the proposal so we can better understand how creating a
"vfio" override in PCI-core plays out for other override types.  Also I
don't think dynamic IDs should be handled uniquely, new_id_store()
should gain support for flags and userspace should be able to add new
dynamic ID with override-only matches to the table.  Thanks,

Alex
Jason Gunthorpe June 15, 2021, 3:04 p.m. UTC | #15
On Tue, Jun 15, 2021 at 09:00:29AM -0600, Alex Williamson wrote:

> "vfio" override in PCI-core plays out for other override types.  Also I
> don't think dynamic IDs should be handled uniquely, new_id_store()
> should gain support for flags and userspace should be able to add new
> dynamic ID with override-only matches to the table.  Thanks,

Why? Once all the enforcement is stripped out the only purpose of the
new flag is to signal a different prepration of modules.alias - which
won't happen for the new_id path anyhow

Jason
Alex Williamson June 15, 2021, 4:20 p.m. UTC | #16
On Tue, 15 Jun 2021 12:04:58 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Jun 15, 2021 at 09:00:29AM -0600, Alex Williamson wrote:
> 
> > "vfio" override in PCI-core plays out for other override types.  Also I
> > don't think dynamic IDs should be handled uniquely, new_id_store()
> > should gain support for flags and userspace should be able to add new
> > dynamic ID with override-only matches to the table.  Thanks,  
> 
> Why? Once all the enforcement is stripped out the only purpose of the
> new flag is to signal a different prepration of modules.alias - which
> won't happen for the new_id path anyhow

Because new_id allows the admin to insert a new pci_device_id which has
been extended to include a flags field and intentionally handling
dynamic IDs differently from static IDs seems like generally a bad
thing.  For example, maybe the admin wants to specify nouveau as only
an override match for all 10de: class vga devices.  Thanks,

Alex
Jason Gunthorpe June 15, 2021, 8:42 p.m. UTC | #17
On Tue, Jun 15, 2021 at 10:20:49AM -0600, Alex Williamson wrote:
> On Tue, 15 Jun 2021 12:04:58 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Tue, Jun 15, 2021 at 09:00:29AM -0600, Alex Williamson wrote:
> > 
> > > "vfio" override in PCI-core plays out for other override types.  Also I
> > > don't think dynamic IDs should be handled uniquely, new_id_store()
> > > should gain support for flags and userspace should be able to add new
> > > dynamic ID with override-only matches to the table.  Thanks,  
> > 
> > Why? Once all the enforcement is stripped out the only purpose of the
> > new flag is to signal a different prepration of modules.alias - which
> > won't happen for the new_id path anyhow
> 
> Because new_id allows the admin to insert a new pci_device_id which has
> been extended to include a flags field and intentionally handling
> dynamic IDs differently from static IDs seems like generally a bad
> thing.  

I'd agree with you if there was a functional difference at runtime,
but since that was all removed, I don't think we should touch new_id.

This ends up effectively being only a kbuild related patch that
changes how modules.alias is built.

Jason
Alex Williamson June 15, 2021, 9:59 p.m. UTC | #18
On Tue, 15 Jun 2021 17:42:16 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Jun 15, 2021 at 10:20:49AM -0600, Alex Williamson wrote:
> > On Tue, 15 Jun 2021 12:04:58 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Tue, Jun 15, 2021 at 09:00:29AM -0600, Alex Williamson wrote:
> > >   
> > > > "vfio" override in PCI-core plays out for other override types.  Also I
> > > > don't think dynamic IDs should be handled uniquely, new_id_store()
> > > > should gain support for flags and userspace should be able to add new
> > > > dynamic ID with override-only matches to the table.  Thanks,    
> > > 
> > > Why? Once all the enforcement is stripped out the only purpose of the
> > > new flag is to signal a different prepration of modules.alias - which
> > > won't happen for the new_id path anyhow  
> > 
> > Because new_id allows the admin to insert a new pci_device_id which has
> > been extended to include a flags field and intentionally handling
> > dynamic IDs differently from static IDs seems like generally a bad
> > thing.    
> 
> I'd agree with you if there was a functional difference at runtime,
> but since that was all removed, I don't think we should touch new_id.
> 
> This ends up effectively being only a kbuild related patch that
> changes how modules.alias is built.

But it wasn't all removed.  The proposal had:

 a) Short circuit the dynamic ID match
 b) Fail a driver-override-only match without a driver_override
 c) Fail a non-driver-override-only match with a driver_override

Max is only proposing removing c).

b) alone is a functional, runtime difference.  As per my previous
example, think about using it as effectively an anti-match.  Userspace
can create dynamic entries that will be matched before static entries
that can demote a driver to not be able to bind to a device
automatically.  That's functional and useful, I can't think of any
other way we have to effectively remove a static match entry from a
driver.  Thanks,

Alex
Jason Gunthorpe June 15, 2021, 11 p.m. UTC | #19
On Tue, Jun 15, 2021 at 03:59:00PM -0600, Alex Williamson wrote:
> On Tue, 15 Jun 2021 17:42:16 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Tue, Jun 15, 2021 at 10:20:49AM -0600, Alex Williamson wrote:
> > > On Tue, 15 Jun 2021 12:04:58 -0300
> > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >   
> > > > On Tue, Jun 15, 2021 at 09:00:29AM -0600, Alex Williamson wrote:
> > > >   
> > > > > "vfio" override in PCI-core plays out for other override types.  Also I
> > > > > don't think dynamic IDs should be handled uniquely, new_id_store()
> > > > > should gain support for flags and userspace should be able to add new
> > > > > dynamic ID with override-only matches to the table.  Thanks,    
> > > > 
> > > > Why? Once all the enforcement is stripped out the only purpose of the
> > > > new flag is to signal a different prepration of modules.alias - which
> > > > won't happen for the new_id path anyhow  
> > > 
> > > Because new_id allows the admin to insert a new pci_device_id which has
> > > been extended to include a flags field and intentionally handling
> > > dynamic IDs differently from static IDs seems like generally a bad
> > > thing.    
> > 
> > I'd agree with you if there was a functional difference at runtime,
> > but since that was all removed, I don't think we should touch new_id.
> > 
> > This ends up effectively being only a kbuild related patch that
> > changes how modules.alias is built.
> 
> But it wasn't all removed.  The proposal had:
> 
>  a) Short circuit the dynamic ID match
>  b) Fail a driver-override-only match without a driver_override
>  c) Fail a non-driver-override-only match with a driver_override
> 
> Max is only proposing removing c).
> 
> b) alone is a functional, runtime difference.

I would state b) differently:

b) Ignore the driver-override-only match entries in the ID table.

As if we look at new_id, I can't think of any reason for userspace to
add an entry to the ID table and then tell the kernel to ignore
it. If you want the kernel to ignore it then just don't add it in the
first place.

Do you have some other scenario in mind?

Jason
Alex Williamson June 15, 2021, 11:22 p.m. UTC | #20
On Tue, 15 Jun 2021 20:00:17 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Jun 15, 2021 at 03:59:00PM -0600, Alex Williamson wrote:
> > On Tue, 15 Jun 2021 17:42:16 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Tue, Jun 15, 2021 at 10:20:49AM -0600, Alex Williamson wrote:  
> > > > On Tue, 15 Jun 2021 12:04:58 -0300
> > > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > >     
> > > > > On Tue, Jun 15, 2021 at 09:00:29AM -0600, Alex Williamson wrote:
> > > > >     
> > > > > > "vfio" override in PCI-core plays out for other override types.  Also I
> > > > > > don't think dynamic IDs should be handled uniquely, new_id_store()
> > > > > > should gain support for flags and userspace should be able to add new
> > > > > > dynamic ID with override-only matches to the table.  Thanks,      
> > > > > 
> > > > > Why? Once all the enforcement is stripped out the only purpose of the
> > > > > new flag is to signal a different prepration of modules.alias - which
> > > > > won't happen for the new_id path anyhow    
> > > > 
> > > > Because new_id allows the admin to insert a new pci_device_id which has
> > > > been extended to include a flags field and intentionally handling
> > > > dynamic IDs differently from static IDs seems like generally a bad
> > > > thing.      
> > > 
> > > I'd agree with you if there was a functional difference at runtime,
> > > but since that was all removed, I don't think we should touch new_id.
> > > 
> > > This ends up effectively being only a kbuild related patch that
> > > changes how modules.alias is built.  
> > 
> > But it wasn't all removed.  The proposal had:
> > 
> >  a) Short circuit the dynamic ID match
> >  b) Fail a driver-override-only match without a driver_override
> >  c) Fail a non-driver-override-only match with a driver_override
> > 
> > Max is only proposing removing c).
> > 
> > b) alone is a functional, runtime difference.  
> 
> I would state b) differently:
> 
> b) Ignore the driver-override-only match entries in the ID table.

No, pci_match_device() returns NULL if a match is found that is marked
driver-override-only and a driver_override is not specified.  That's
the same as no match at all.  We don't then go on to search past that
match in the table, we fail to bind the driver.  That's effectively an
anti-match when there's no driver_override on the device.

> As if we look at new_id, I can't think of any reason for userspace to
> add an entry to the ID table and then tell the kernel to ignore
> it. If you want the kernel to ignore it then just don't add it in the
> first place.
> 
> Do you have some other scenario in mind?

Sure, what if I have two different GPUs in my system, one works fine
with the FOSS driver, the other requires a 3rd party driver.  I don't
want to blacklist the FOSS driver, but I don't want it to claim the
other GPU.  I can create an anti-match that effectively removes one GPU
from the FOSS driver unless it's bound with a driver_override.

I understand that's not your intended use case, but I think this allows
that and justifies handling a dynamic ID the same as a static ID.
Adding a field to pci_device_id, which is otherwise able to be fully
specified via new_id, except for this field, feels like a bug.  Thanks,

Alex
Jason Gunthorpe June 15, 2021, 11:32 p.m. UTC | #21
On Tue, Jun 15, 2021 at 05:22:42PM -0600, Alex Williamson wrote:

> > > b) alone is a functional, runtime difference.  
> > 
> > I would state b) differently:
> > 
> > b) Ignore the driver-override-only match entries in the ID table.
> 
> No, pci_match_device() returns NULL if a match is found that is marked
> driver-override-only and a driver_override is not specified.  That's
> the same as no match at all.  We don't then go on to search past that
> match in the table, we fail to bind the driver.  That's effectively an
> anti-match when there's no driver_override on the device.

anti-match isn't the intention. The deployment will have match tables
where all entires are either flags=0 or are driver-override-only.

I would say that mixed match tables make driver-override-only into an
anti-match is actually a minor bug in the patch.

The series isn't about adding some new anti-match scheme.

> I understand that's not your intended use case, but I think this allows
> that and justifies handling a dynamic ID the same as a static ID.
> Adding a field to pci_device_id, which is otherwise able to be fully
> specified via new_id, except for this field, feels like a bug.  Thanks,

Okay, I see what you are saying clearly now.

Your example usage seems legit to me, but I really don't want to
entangle it with this series. It is a seperate idea, it can go as a
seperate work that uses the new flags and an updated new_id and
related parts by someone who wants it.

I hope you'll understand that having NVIDIA Mellanox persue what you
describe above is just not going to work..

Jason
Alex Williamson June 16, 2021, 12:22 a.m. UTC | #22
On Tue, 15 Jun 2021 20:32:57 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Jun 15, 2021 at 05:22:42PM -0600, Alex Williamson wrote:
> 
> > > > b) alone is a functional, runtime difference.    
> > > 
> > > I would state b) differently:
> > > 
> > > b) Ignore the driver-override-only match entries in the ID table.  
> > 
> > No, pci_match_device() returns NULL if a match is found that is marked
> > driver-override-only and a driver_override is not specified.  That's
> > the same as no match at all.  We don't then go on to search past that
> > match in the table, we fail to bind the driver.  That's effectively an
> > anti-match when there's no driver_override on the device.  
> 
> anti-match isn't the intention. The deployment will have match tables
> where all entires are either flags=0 or are driver-override-only.

I'd expect pci-pf-stub to have one of each, an any-id with
override-only flag and the one device ID currently in the table with
no flag.

> I would say that mixed match tables make driver-override-only into an
> anti-match is actually a minor bug in the patch.
> 
> The series isn't about adding some new anti-match scheme.
> 
> > I understand that's not your intended use case, but I think this allows
> > that and justifies handling a dynamic ID the same as a static ID.
> > Adding a field to pci_device_id, which is otherwise able to be fully
> > specified via new_id, except for this field, feels like a bug.  Thanks,  
> 
> Okay, I see what you are saying clearly now.
> 
> Your example usage seems legit to me, but I really don't want to
> entangle it with this series. It is a seperate idea, it can go as a
> seperate work that uses the new flags and an updated new_id and
> related parts by someone who wants it.
> 
> I hope you'll understand that having NVIDIA Mellanox persue what you
> describe above is just not going to work..

I understand that use case might hit a nerve, but I don't particularly
see why handling static and dynamic IDs consistently wrt to this new
flags field is controversial.  Thanks,

Alex
Jason Gunthorpe June 16, 2021, 12:34 a.m. UTC | #23
On Tue, Jun 15, 2021 at 06:22:45PM -0600, Alex Williamson wrote:
> On Tue, 15 Jun 2021 20:32:57 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Tue, Jun 15, 2021 at 05:22:42PM -0600, Alex Williamson wrote:
> > 
> > > > > b) alone is a functional, runtime difference.    
> > > > 
> > > > I would state b) differently:
> > > > 
> > > > b) Ignore the driver-override-only match entries in the ID table.  
> > > 
> > > No, pci_match_device() returns NULL if a match is found that is marked
> > > driver-override-only and a driver_override is not specified.  That's
> > > the same as no match at all.  We don't then go on to search past that
> > > match in the table, we fail to bind the driver.  That's effectively an
> > > anti-match when there's no driver_override on the device.  
> > 
> > anti-match isn't the intention. The deployment will have match tables
> > where all entires are either flags=0 or are driver-override-only.
> 
> I'd expect pci-pf-stub to have one of each, an any-id with
> override-only flag and the one device ID currently in the table with
> no flag.

Oh Hum. Actually I think this shows the anti-match behavior is
actually a bug.. :(

For something like pci_pf_stub_whitelist, if we add a
driver_override-only using the PCI any id then it effectively disables
new_id completely because the match search will alway find the
driver_override match first and stop searching. There is no chance to
see things new_id adds.

We have to fix this patch so flags isn't an anti-match to make it work
without user regression.

Jason
Max Gurtovoy June 16, 2021, 11:28 p.m. UTC | #24
On 6/16/2021 3:34 AM, Jason Gunthorpe wrote:
> On Tue, Jun 15, 2021 at 06:22:45PM -0600, Alex Williamson wrote:
>> On Tue, 15 Jun 2021 20:32:57 -0300
>> Jason Gunthorpe <jgg@nvidia.com> wrote:
>>
>>> On Tue, Jun 15, 2021 at 05:22:42PM -0600, Alex Williamson wrote:
>>>
>>>>>> b) alone is a functional, runtime difference.
>>>>> I would state b) differently:
>>>>>
>>>>> b) Ignore the driver-override-only match entries in the ID table.
>>>> No, pci_match_device() returns NULL if a match is found that is marked
>>>> driver-override-only and a driver_override is not specified.  That's
>>>> the same as no match at all.  We don't then go on to search past that
>>>> match in the table, we fail to bind the driver.  That's effectively an
>>>> anti-match when there's no driver_override on the device.
>>> anti-match isn't the intention. The deployment will have match tables
>>> where all entires are either flags=0 or are driver-override-only.
>> I'd expect pci-pf-stub to have one of each, an any-id with
>> override-only flag and the one device ID currently in the table with
>> no flag.
> Oh Hum. Actually I think this shows the anti-match behavior is
> actually a bug.. :(
>
> For something like pci_pf_stub_whitelist, if we add a
> driver_override-only using the PCI any id then it effectively disables
> new_id completely because the match search will alway find the
> driver_override match first and stop searching. There is no chance to
> see things new_id adds.

Actually the dynamic table is the first table the driver search. So 
new_id works exactly the same AFAIU.

But you're right for static mixed table (I assumed that this will never 
happen I guess).

If we put the any_id_override id before the non_override AMAZON device 
entry in the pci-pf-stub we'll fail with the matching to the AMAZON device.

What about the bellow untested addition (also remove condition c):

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 296de7bc9dc9..2d46f6cd96f7 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -136,7 +136,7 @@ static const struct pci_device_id 
*pci_match_device(struct pci_driver *drv,
                                                     struct pci_dev *dev)
  {
         struct pci_dynid *dynid;
-       const struct pci_device_id *found_id = NULL;
+       const struct pci_device_id *found_id = NULL, *ids;

         /* When driver_override is set, only bind to the matching driver */
         if (dev->driver_override && strcmp(dev->driver_override, 
drv->name))
@@ -155,8 +155,8 @@ static const struct pci_device_id 
*pci_match_device(struct pci_driver *drv,
         if (found_id)
                 return found_id;

-       found_id = pci_match_id(drv->id_table, dev);
-       if (found_id) {
+       ids = drv->id_table;
+       while ((found_id = pci_match_id(ids, dev))) {
                 /*
                  * if we found id in the static table, we must fulfill the
                  * matching flags (i.e. if PCI_ID_F_DRIVER_OVERRIDE flag is
@@ -164,17 +164,19 @@ static const struct pci_device_id 
*pci_match_device(struct pci_driver *drv,
                  */
                 bool is_driver_override =
                         (found_id->flags & PCI_ID_F_DRIVER_OVERRIDE) != 0;
-               if ((is_driver_override && !dev->driver_override) ||
-                   (dev->driver_override && !is_driver_override))
-                       return NULL;
-       } else if (dev->driver_override) {
-               /*
-                * if we didn't find suitable id in the static table,
-                * driver_override will still , send a dummy id
-                */
-               found_id = &pci_device_id_any;
+               if (is_driver_override && !dev->driver_override)
+                       ids = found_id++; /* continue searching */
+               else
+                       break;
         }

+       /*
+        * if no static match, driver_override will always match, send a 
dummy
+        * id.
+        */
+       if (!found_id && dev->driver_override)
+               found_id = &pci_device_id_any;
+
         return found_id;
  }

diff --git a/drivers/pci/pci-pf-stub.c b/drivers/pci/pci-pf-stub.c
index 45855a5e9fca..49544ba9a7af 100644
--- a/drivers/pci/pci-pf-stub.c
+++ b/drivers/pci/pci-pf-stub.c
@@ -19,6 +19,7 @@
   */
  static const struct pci_device_id pci_pf_stub_whitelist[] = {
         { PCI_VDEVICE(AMAZON, 0x0053) },
+       { PCI_DEVICE_FLAGS(PCI_ANY_ID, PCI_ANY_ID, 
PCI_ID_F_STUB_DRIVER_OVERRIDE) }, /* match all by default (override) */
         /* required last entry */
         { 0 }
  };


>
> We have to fix this patch so flags isn't an anti-match to make it work
> without user regression.
>
> Jason
Jason Gunthorpe June 16, 2021, 11:33 p.m. UTC | #25
On Thu, Jun 17, 2021 at 02:28:36AM +0300, Max Gurtovoy wrote:
> 
> On 6/16/2021 3:34 AM, Jason Gunthorpe wrote:
> > On Tue, Jun 15, 2021 at 06:22:45PM -0600, Alex Williamson wrote:
> > > On Tue, 15 Jun 2021 20:32:57 -0300
> > > Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > 
> > > > On Tue, Jun 15, 2021 at 05:22:42PM -0600, Alex Williamson wrote:
> > > > 
> > > > > > > b) alone is a functional, runtime difference.
> > > > > > I would state b) differently:
> > > > > > 
> > > > > > b) Ignore the driver-override-only match entries in the ID table.
> > > > > No, pci_match_device() returns NULL if a match is found that is marked
> > > > > driver-override-only and a driver_override is not specified.  That's
> > > > > the same as no match at all.  We don't then go on to search past that
> > > > > match in the table, we fail to bind the driver.  That's effectively an
> > > > > anti-match when there's no driver_override on the device.
> > > > anti-match isn't the intention. The deployment will have match tables
> > > > where all entires are either flags=0 or are driver-override-only.
> > > I'd expect pci-pf-stub to have one of each, an any-id with
> > > override-only flag and the one device ID currently in the table with
> > > no flag.
> > Oh Hum. Actually I think this shows the anti-match behavior is
> > actually a bug.. :(
> > 
> > For something like pci_pf_stub_whitelist, if we add a
> > driver_override-only using the PCI any id then it effectively disables
> > new_id completely because the match search will alway find the
> > driver_override match first and stop searching. There is no chance to
> > see things new_id adds.
> 
> Actually the dynamic table is the first table the driver search. So new_id
> works exactly the same AFAIU.

Oh, even better, so it isn't really an issue

> But you're right for static mixed table (I assumed that this will never
> happen I guess).

Me too, we could organize the driver-overrides to be last
 
> -       found_id = pci_match_id(drv->id_table, dev);
> -       if (found_id) {
> +       ids = drv->id_table;
> +       while ((found_id = pci_match_id(ids, dev))) {

Yeah, keep searching makes logical sense to me

> diff --git a/drivers/pci/pci-pf-stub.c b/drivers/pci/pci-pf-stub.c
> index 45855a5e9fca..49544ba9a7af 100644
> +++ b/drivers/pci/pci-pf-stub.c
> @@ -19,6 +19,7 @@
>   */
>  static const struct pci_device_id pci_pf_stub_whitelist[] = {
>         { PCI_VDEVICE(AMAZON, 0x0053) },
> +       { PCI_DEVICE_FLAGS(PCI_ANY_ID, PCI_ANY_ID,
> PCI_ID_F_STUB_DRIVER_OVERRIDE) }, /* match all by default (override) */
>         /* required last entry */
>         { 0 }

And we don't really want this change any more right? No reason to put
pci_stub in the module.alias file?

Thanks,
Jason
Max Gurtovoy June 16, 2021, 11:42 p.m. UTC | #26
On 6/17/2021 2:33 AM, Jason Gunthorpe wrote:
> On Thu, Jun 17, 2021 at 02:28:36AM +0300, Max Gurtovoy wrote:
>> On 6/16/2021 3:34 AM, Jason Gunthorpe wrote:
>>> On Tue, Jun 15, 2021 at 06:22:45PM -0600, Alex Williamson wrote:
>>>> On Tue, 15 Jun 2021 20:32:57 -0300
>>>> Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>>
>>>>> On Tue, Jun 15, 2021 at 05:22:42PM -0600, Alex Williamson wrote:
>>>>>
>>>>>>>> b) alone is a functional, runtime difference.
>>>>>>> I would state b) differently:
>>>>>>>
>>>>>>> b) Ignore the driver-override-only match entries in the ID table.
>>>>>> No, pci_match_device() returns NULL if a match is found that is marked
>>>>>> driver-override-only and a driver_override is not specified.  That's
>>>>>> the same as no match at all.  We don't then go on to search past that
>>>>>> match in the table, we fail to bind the driver.  That's effectively an
>>>>>> anti-match when there's no driver_override on the device.
>>>>> anti-match isn't the intention. The deployment will have match tables
>>>>> where all entires are either flags=0 or are driver-override-only.
>>>> I'd expect pci-pf-stub to have one of each, an any-id with
>>>> override-only flag and the one device ID currently in the table with
>>>> no flag.
>>> Oh Hum. Actually I think this shows the anti-match behavior is
>>> actually a bug.. :(
>>>
>>> For something like pci_pf_stub_whitelist, if we add a
>>> driver_override-only using the PCI any id then it effectively disables
>>> new_id completely because the match search will alway find the
>>> driver_override match first and stop searching. There is no chance to
>>> see things new_id adds.
>> Actually the dynamic table is the first table the driver search. So new_id
>> works exactly the same AFAIU.
> Oh, even better, so it isn't really an issue
>
>> But you're right for static mixed table (I assumed that this will never
>> happen I guess).
> Me too, we could organize the driver-overrides to be last

Yes we could, but in 2 years from now I'll forget this rule :)

And others may not be aware of it.

>   
>> -       found_id = pci_match_id(drv->id_table, dev);
>> -       if (found_id) {
>> +       ids = drv->id_table;
>> +       while ((found_id = pci_match_id(ids, dev))) {
> Yeah, keep searching makes logical sense to me
>
>> diff --git a/drivers/pci/pci-pf-stub.c b/drivers/pci/pci-pf-stub.c
>> index 45855a5e9fca..49544ba9a7af 100644
>> +++ b/drivers/pci/pci-pf-stub.c
>> @@ -19,6 +19,7 @@
>>    */
>>   static const struct pci_device_id pci_pf_stub_whitelist[] = {
>>          { PCI_VDEVICE(AMAZON, 0x0053) },
>> +       { PCI_DEVICE_FLAGS(PCI_ANY_ID, PCI_ANY_ID,
>> PCI_ID_F_STUB_DRIVER_OVERRIDE) }, /* match all by default (override) */
>>          /* required last entry */
>>          { 0 }
> And we don't really want this change any more right? No reason to put
> pci_stub in the module.alias file?

I actually did it in the patches I attached earlier.

It will look like:

stub_pci:v*d*sv*sd*bc*sc*i*

pci:v00001D0Fd00000053sv*sd*bc*sc*i*

I think it's good practice to avoid matching automatically and auto 
loading any_id_override and vfio_override drivers in general.

Do you see a reason not adding this alias for stub drivers but adding it 
to vfio_pci drivers ?

>
> Thanks,
> Jason
Jason Gunthorpe June 16, 2021, 11:44 p.m. UTC | #27
On Thu, Jun 17, 2021 at 02:42:46AM +0300, Max Gurtovoy wrote:

> Do you see a reason not adding this alias for stub drivers but
> adding it to vfio_pci drivers ?

It creates uABI without a userspace user and that is strongly
discouraged. The 'stub_pci:' prefix becomes fixed ABI.

Jason
Max Gurtovoy June 16, 2021, 11:51 p.m. UTC | #28
On 6/17/2021 2:44 AM, Jason Gunthorpe wrote:
> On Thu, Jun 17, 2021 at 02:42:46AM +0300, Max Gurtovoy wrote:
>
>> Do you see a reason not adding this alias for stub drivers but
>> adding it to vfio_pci drivers ?
> It creates uABI without a userspace user and that is strongly
> discouraged. The 'stub_pci:' prefix becomes fixed ABI.
so is it better to have "pci:v*d*sv*sd*bc*sc*i*" for stub drivers ? or 
not adding alias at all if stub flag is set ?
>
> Jason
Jason Gunthorpe June 16, 2021, 11:56 p.m. UTC | #29
On Thu, Jun 17, 2021 at 02:51:09AM +0300, Max Gurtovoy wrote:
> 
> On 6/17/2021 2:44 AM, Jason Gunthorpe wrote:
> > On Thu, Jun 17, 2021 at 02:42:46AM +0300, Max Gurtovoy wrote:
> > 
> > > Do you see a reason not adding this alias for stub drivers but
> > > adding it to vfio_pci drivers ?
> > It creates uABI without a userspace user and that is strongly
> > discouraged. The 'stub_pci:' prefix becomes fixed ABI.
> so is it better to have "pci:v*d*sv*sd*bc*sc*i*" for stub drivers ?

No, we don't want to convey any new information about stub drivers to
userspace.

> or not adding alias at all if stub flag is set ?

Yes, just don't change it at all, IMHO.

Jason
Max Gurtovoy June 20, 2021, 2:46 p.m. UTC | #30
On 6/17/2021 2:56 AM, Jason Gunthorpe wrote:
> On Thu, Jun 17, 2021 at 02:51:09AM +0300, Max Gurtovoy wrote:
>> On 6/17/2021 2:44 AM, Jason Gunthorpe wrote:
>>> On Thu, Jun 17, 2021 at 02:42:46AM +0300, Max Gurtovoy wrote:
>>>
>>>> Do you see a reason not adding this alias for stub drivers but
>>>> adding it to vfio_pci drivers ?
>>> It creates uABI without a userspace user and that is strongly
>>> discouraged. The 'stub_pci:' prefix becomes fixed ABI.
>> so is it better to have "pci:v*d*sv*sd*bc*sc*i*" for stub drivers ?
> No, we don't want to convey any new information about stub drivers to
> userspace.
>
>> or not adding alias at all if stub flag is set ?
> Yes, just don't change it at all, IMHO.

Ok.

I've attached 2 patches that I would like to agree on before we'll send 
the V5.

They include the pci-pf-stub additions and keep searching for matching 
static ids as we discussed.

Max.

>
> Jason
From 67be1e3a6525c73fa582e8f5aa703b982d6d8114 Mon Sep 17 00:00:00 2001
From: Max Gurtovoy <mgurtovoy@nvidia.com>
Date: Thu, 13 May 2021 13:53:19 +0300
Subject: [PATCH 1/2] PCI: add flags field to pci_device_id structure

This field will be used to allow pci modules to set some specific data
that can be used for matching, aliases and other hints. Add example for
"driver_override" pci drivers that may want to set a special prefix in
the modules.alias table. In the future, this flag will enforce
"driver_override" to work on drivers that specifically opt into this
feature. In this case, the udev utility will not try to load these
drivers automatically in order to bind to new discovered devices. This
will be because the modalias tables populated by those drivers will be
different from "regular" pci modalias tables. Userspace utilities, such
as libvirt for vfio devices, will need to adjust and bind devices
according to new matching mechanism with taking "driver_override"
enforcement into consideration. Add vfio and stub to "driver_override"
PCI family and create new alias for "driver_override" vfio devices.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 Documentation/PCI/pci.rst         |  1 +
 include/linux/mod_devicetable.h   | 11 +++++++++++
 include/linux/pci.h               | 27 +++++++++++++++++++++++++++
 scripts/mod/devicetable-offsets.c |  1 +
 scripts/mod/file2alias.c          | 10 ++++++++--
 5 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
index 814b40f8360b..0855657daf93 100644
--- a/Documentation/PCI/pci.rst
+++ b/Documentation/PCI/pci.rst
@@ -103,6 +103,7 @@ need pass only as many optional fields as necessary:
   - subvendor and subdevice fields default to PCI_ANY_ID (FFFFFFFF)
   - class and classmask fields default to 0
   - driver_data defaults to 0UL.
+  - flags field defaults to 0.
 
 Note that driver_data must match the value used by any of the pci_device_id
 entries defined in the driver. This makes the driver_data field mandatory
diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
index 7d45b5f989b0..51b852a2d32b 100644
--- a/include/linux/mod_devicetable.h
+++ b/include/linux/mod_devicetable.h
@@ -16,6 +16,15 @@ typedef unsigned long kernel_ulong_t;
 
 #define PCI_ANY_ID (~0)
 
+
+enum pci_id_flags {
+	PCI_ID_F_VFIO_DRIVER_OVERRIDE	= 1 << 0,
+	PCI_ID_F_STUB_DRIVER_OVERRIDE	= 1 << 1,
+};
+
+#define PCI_ID_F_DRIVER_OVERRIDE \
+	(PCI_ID_F_VFIO_DRIVER_OVERRIDE | PCI_ID_F_STUB_DRIVER_OVERRIDE)
+
 /**
  * struct pci_device_id - PCI device ID structure
  * @vendor:		Vendor ID to match (or PCI_ANY_ID)
@@ -34,12 +43,14 @@ typedef unsigned long kernel_ulong_t;
  *			Best practice is to use driver_data as an index
  *			into a static list of equivalent device types,
  *			instead of using it as a pointer.
+ * @flags:		PCI flags of the driver. Bitmap of pci_id_flags enum.
  */
 struct pci_device_id {
 	__u32 vendor, device;		/* Vendor and device ID or PCI_ANY_ID*/
 	__u32 subvendor, subdevice;	/* Subsystem ID's or PCI_ANY_ID */
 	__u32 class, class_mask;	/* (class,subclass,prog-if) triplet */
 	kernel_ulong_t driver_data;	/* Data private to the driver */
+	__u32 flags;
 };
 
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c20211e59a57..4af761552068 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -898,6 +898,33 @@ struct pci_driver {
 	.vendor = (vend), .device = (dev), \
 	.subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID
 
+/**
+ * PCI_DEVICE_FLAGS - macro used to describe a PCI device with specific flags.
+ * @vend: the 16 bit PCI Vendor ID
+ * @dev: the 16 bit PCI Device ID
+ * @fl: PCI Device flags as a bitmap of pci_id_flags enum
+ *
+ * This macro is used to create a struct pci_device_id that matches a
+ * specific device. The subvendor and subdevice fields will be set to
+ * PCI_ANY_ID.
+ */
+#define PCI_DEVICE_FLAGS(vend, dev, fl) \
+	.vendor = (vend), .device = (dev), .subvendor = PCI_ANY_ID, \
+	.subdevice = PCI_ANY_ID, .flags = (fl)
+
+/**
+ * PCI_DRIVER_OVERRIDE_DEVICE_VFIO - macro used to describe a VFIO
+ *                                   "driver_override" PCI device.
+ * @vend: the 16 bit PCI Vendor ID
+ * @dev: the 16 bit PCI Device ID
+ *
+ * This macro is used to create a struct pci_device_id that matches a
+ * specific device. The subvendor and subdevice fields will be set to
+ * PCI_ANY_ID and the flags will be set to PCI_ID_F_VFIO_DRIVER_OVERRIDE.
+ */
+#define PCI_DRIVER_OVERRIDE_DEVICE_VFIO(vend, dev) \
+	PCI_DEVICE_FLAGS(vend, dev, PCI_ID_F_VFIO_DRIVER_OVERRIDE)
+
 /**
  * PCI_DEVICE_SUB - macro used to describe a specific PCI device with subsystem
  * @vend: the 16 bit PCI Vendor ID
diff --git a/scripts/mod/devicetable-offsets.c b/scripts/mod/devicetable-offsets.c
index 9bb6c7edccc4..b927c36b8333 100644
--- a/scripts/mod/devicetable-offsets.c
+++ b/scripts/mod/devicetable-offsets.c
@@ -42,6 +42,7 @@ int main(void)
 	DEVID_FIELD(pci_device_id, subdevice);
 	DEVID_FIELD(pci_device_id, class);
 	DEVID_FIELD(pci_device_id, class_mask);
+	DEVID_FIELD(pci_device_id, flags);
 
 	DEVID(ccw_device_id);
 	DEVID_FIELD(ccw_device_id, match_flags);
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
index 7c97fa8e36bc..2b2b7d875416 100644
--- a/scripts/mod/file2alias.c
+++ b/scripts/mod/file2alias.c
@@ -426,7 +426,7 @@ static int do_ieee1394_entry(const char *filename,
 	return 1;
 }
 
-/* Looks like: pci:vNdNsvNsdNbcNscNiN. */
+/* Looks like: pci:vNdNsvNsdNbcNscNiN or <prefix>_pci:vNdNsvNsdNbcNscNiN. */
 static int do_pci_entry(const char *filename,
 			void *symval, char *alias)
 {
@@ -440,8 +440,14 @@ static int do_pci_entry(const char *filename,
 	DEF_FIELD(symval, pci_device_id, subdevice);
 	DEF_FIELD(symval, pci_device_id, class);
 	DEF_FIELD(symval, pci_device_id, class_mask);
+	DEF_FIELD(symval, pci_device_id, flags);
 
-	strcpy(alias, "pci:");
+	if (!flags)
+		strcpy(alias, "pci:");
+	else if (flags & PCI_ID_F_VFIO_DRIVER_OVERRIDE)
+		strcpy(alias, "vfio_pci:");
+	else
+		return 0;
 	ADD(alias, "v", vendor != PCI_ANY_ID, vendor);
 	ADD(alias, "d", device != PCI_ANY_ID, device);
 	ADD(alias, "sv", subvendor != PCI_ANY_ID, subvendor);
diff mbox series

Patch

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index ef00fada2efb..6d78970b1c69 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -289,7 +289,11 @@  Description:
 		will not bind to any driver.  This also allows devices to
 		opt-out of driver binding using a driver_override name such as
 		"none".  Only a single driver may be specified in the override,
-		there is no support for parsing delimiters.
+		there is no support for parsing delimiters.  The binding to the
+		device is allowed if and only if the matching driver has
+		enabled the driver_override alias in the ID table (by setting
+		a suitable bit in the "flags" field of pci_device_id
+		structure).
 
 What:		/sys/bus/pci/devices/.../numa_node
 Date:		Oct 2014
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index ec44a79e951a..e4037db817da 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -115,13 +115,6 @@  const struct pci_device_id *pci_match_id(const struct pci_device_id *ids,
 }
 EXPORT_SYMBOL(pci_match_id);
 
-static const struct pci_device_id pci_device_id_any = {
-	.vendor = PCI_ANY_ID,
-	.device = PCI_ANY_ID,
-	.subvendor = PCI_ANY_ID,
-	.subdevice = PCI_ANY_ID,
-};
-
 /**
  * pci_match_device - See if a device matches a driver's list of IDs
  * @drv: the PCI driver to match against
@@ -137,6 +130,7 @@  static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
 {
 	struct pci_dynid *dynid;
 	const struct pci_device_id *found_id = NULL;
+	bool is_driver_override;
 
 	/* When driver_override is set, only bind to the matching driver */
 	if (dev->driver_override && strcmp(dev->driver_override, drv->name))
@@ -155,9 +149,17 @@  static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
 	if (!found_id)
 		found_id = pci_match_id(drv->id_table, dev);
 
-	/* driver_override will always match, send a dummy id */
-	if (!found_id && dev->driver_override)
-		found_id = &pci_device_id_any;
+	/*
+	 * if and only if PCI_ID_F_DRIVER_OVERRIDE flag is set, driver_override
+	 * should be provided.
+	 */
+	if (found_id) {
+		is_driver_override =
+			(found_id->flags & PCI_ID_F_DRIVER_OVERRIDE) != 0;
+		if ((is_driver_override && !dev->driver_override) ||
+		    (dev->driver_override && !is_driver_override))
+			return NULL;
+	}
 
 	return found_id;
 }
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 850ea3a94e28..9beb4b841945 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -166,9 +166,16 @@  static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
 	return vfio_pci_core_sriov_configure(pdev, nr_virtfn);
 }
 
+static const struct pci_device_id vfio_pci_table[] = {
+	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_ANY_ID, PCI_ANY_ID) }, /* match all by default */
+	{ 0, }
+};
+
+MODULE_DEVICE_TABLE(pci, vfio_pci_table);
+
 static struct pci_driver vfio_pci_driver = {
 	.name			= "vfio-pci",
-	.id_table		= NULL, /* only dynamic ids */
+	.id_table		= vfio_pci_table,
 	.probe			= vfio_pci_probe,
 	.remove			= vfio_pci_remove,
 	.sriov_configure	= vfio_pci_sriov_configure,