diff mbox

[v3] PCI: rework new_id interface for known vendor/device values

Message ID jpgbnwkplbo.fsf@nelium.bos.redhat.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Bandan Das April 2, 2014, 1:32 a.m. UTC
While using the new_id interface, the user can unintentionally feed
incorrect values if the driver static table has a matching entry.
This is possible since only the device and vendor fields are
mandatory and the rest are optional. As a result, store_new_id
will fill in default values that are then passed on to the driver
and can have unintended consequences.

As an example, consider the ixgbe driver and the 82599EB network card :
echo "8086 10fb" > /sys/bus/pci/drivers/ixgbe/new_id

This will pass a driver_data value of 0 to the driver whereas
the index 0 in ixgbe actually points to a different set of card
operations.

This change returns an error if the user attempts to add a dynid for
a vendor/device combination for which a static entry already exists.
However, if the user intentionally wants a different set of values,
she must provide all the 7 fields and that will be accepted.

In KVM/device assignment scenario, the user might want 
to bind a device back to the host driver by writing to new_id
and trip on a possible null pointer dereference.

Signed-off-by: Bandan Das <bsd@redhat.com>
---
v3:
relocate pdev decl
v2:
1. Return error if there is a matching static entry
and change commit message to reflect this behavior
3. Fill in a pdev and call pci_match_id instead of creating
a new matching function
4. Change commit message to reflect that libvirt does not
depend on this behavior

 drivers/pci/pci-driver.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

Comments

Alex Williamson April 2, 2014, 1:41 a.m. UTC | #1
On Tue, 2014-04-01 at 21:32 -0400, Bandan Das wrote:
> While using the new_id interface, the user can unintentionally feed
> incorrect values if the driver static table has a matching entry.
> This is possible since only the device and vendor fields are
> mandatory and the rest are optional. As a result, store_new_id
> will fill in default values that are then passed on to the driver
> and can have unintended consequences.
> 
> As an example, consider the ixgbe driver and the 82599EB network card :
> echo "8086 10fb" > /sys/bus/pci/drivers/ixgbe/new_id
> 
> This will pass a driver_data value of 0 to the driver whereas
> the index 0 in ixgbe actually points to a different set of card
> operations.
> 
> This change returns an error if the user attempts to add a dynid for
> a vendor/device combination for which a static entry already exists.
> However, if the user intentionally wants a different set of values,
> she must provide all the 7 fields and that will be accepted.
> 
> In KVM/device assignment scenario, the user might want 
> to bind a device back to the host driver by writing to new_id
> and trip on a possible null pointer dereference.
> 
> Signed-off-by: Bandan Das <bsd@redhat.com>

Looks ok to me

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>

> ---
> v3:
> relocate pdev decl
> v2:
> 1. Return error if there is a matching static entry
> and change commit message to reflect this behavior
> 3. Fill in a pdev and call pci_match_id instead of creating
> a new matching function
> 4. Change commit message to reflect that libvirt does not
> depend on this behavior
> 
>  drivers/pci/pci-driver.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 25f0bc6..a65a014 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -107,7 +107,7 @@ store_new_id(struct device_driver *driver, const char *buf, size_t count)
>  		subdevice=PCI_ANY_ID, class=0, class_mask=0;
>  	unsigned long driver_data=0;
>  	int fields=0;
> -	int retval;
> +	int retval = 0;
>  
>  	fields = sscanf(buf, "%x %x %x %x %x %x %lx",
>  			&vendor, &device, &subvendor, &subdevice,
> @@ -115,6 +115,26 @@ store_new_id(struct device_driver *driver, const char *buf, size_t count)
>  	if (fields < 2)
>  		return -EINVAL;
>  
> +	if (fields != 7) {
> +		struct pci_dev *pdev = kzalloc(sizeof(*pdev), GFP_KERNEL);
> +		if (!pdev)
> +			return -ENOMEM;
> +
> +		pdev->vendor = vendor;
> +		pdev->device = device;
> +		pdev->subsystem_vendor = subvendor;
> +		pdev->subsystem_device = subdevice;
> +		pdev->class = class;
> +
> +		if (pci_match_id(pdrv->id_table, pdev))
> +			retval = -EEXIST;
> +
> +		kfree(pdev);
> +
> +		if (retval)
> +			return retval;
> +	}
> +
>  	/* Only accept driver_data values that match an existing id_table
>  	   entry */
>  	if (ids) {



--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas April 24, 2014, 10:45 p.m. UTC | #2
On Tue, Apr 01, 2014 at 09:32:59PM -0400, Bandan Das wrote:
> 
> While using the new_id interface, the user can unintentionally feed
> incorrect values if the driver static table has a matching entry.
> This is possible since only the device and vendor fields are
> mandatory and the rest are optional. As a result, store_new_id
> will fill in default values that are then passed on to the driver
> and can have unintended consequences.
> 
> As an example, consider the ixgbe driver and the 82599EB network card :
> echo "8086 10fb" > /sys/bus/pci/drivers/ixgbe/new_id
> 
> This will pass a driver_data value of 0 to the driver whereas
> the index 0 in ixgbe actually points to a different set of card
> operations.
> 
> This change returns an error if the user attempts to add a dynid for
> a vendor/device combination for which a static entry already exists.
> However, if the user intentionally wants a different set of values,
> she must provide all the 7 fields and that will be accepted.
> 
> In KVM/device assignment scenario, the user might want 
> to bind a device back to the host driver by writing to new_id
> and trip on a possible null pointer dereference.

I don't understand this last KVM comment.  If this patch fixes a null
pointer dereference, it must be because we return -EEXIST instead of
calling the driver's probe method.

Can you outline the sequence of events and the drivers involved?  Did we
start with a device that was claimed by vfio, and now we're trying to get
ixgbe to claim it by writing to /sys/bus/pci/drivers/ixgbe/new_id?  If so,
does that mean the user has to know what driver_data value to supply?

I know you didn't add the new_id mechanism, and this patch makes it safer
than it was before, but I'm uneasy about it in general.  Most drivers do
not validate the driver_data value.  They assume it came out of the
id_table supplied by the driver and is therefore trustworthy.  But new_id
is a loophole that allows a user (hopefully only root) to pass arbitrary
junk to the driver.

I wonder if the device assignment machinery should be more integrated into
the PCI core instead of trying to be "just another driver."  It seems like
we're doing a lot of work to try to get the driver binding mechanism to do
what we need for device assignment.

Bjorn

> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
> v3:
> relocate pdev decl
> v2:
> 1. Return error if there is a matching static entry
> and change commit message to reflect this behavior
> 3. Fill in a pdev and call pci_match_id instead of creating
> a new matching function
> 4. Change commit message to reflect that libvirt does not
> depend on this behavior
> 
>  drivers/pci/pci-driver.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 25f0bc6..a65a014 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -107,7 +107,7 @@ store_new_id(struct device_driver *driver, const char *buf, size_t count)
>  		subdevice=PCI_ANY_ID, class=0, class_mask=0;
>  	unsigned long driver_data=0;
>  	int fields=0;
> -	int retval;
> +	int retval = 0;
>  
>  	fields = sscanf(buf, "%x %x %x %x %x %x %lx",
>  			&vendor, &device, &subvendor, &subdevice,
> @@ -115,6 +115,26 @@ store_new_id(struct device_driver *driver, const char *buf, size_t count)
>  	if (fields < 2)
>  		return -EINVAL;
>  
> +	if (fields != 7) {
> +		struct pci_dev *pdev = kzalloc(sizeof(*pdev), GFP_KERNEL);
> +		if (!pdev)
> +			return -ENOMEM;
> +
> +		pdev->vendor = vendor;
> +		pdev->device = device;
> +		pdev->subsystem_vendor = subvendor;
> +		pdev->subsystem_device = subdevice;
> +		pdev->class = class;
> +
> +		if (pci_match_id(pdrv->id_table, pdev))
> +			retval = -EEXIST;
> +
> +		kfree(pdev);
> +
> +		if (retval)
> +			return retval;
> +	}
> +
>  	/* Only accept driver_data values that match an existing id_table
>  	   entry */
>  	if (ids) {
> -- 
> 1.8.3.1
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Williamson April 25, 2014, 5:39 p.m. UTC | #3
On Thu, 2014-04-24 at 16:45 -0600, Bjorn Helgaas wrote:
> On Tue, Apr 01, 2014 at 09:32:59PM -0400, Bandan Das wrote:
> > 
> > While using the new_id interface, the user can unintentionally feed
> > incorrect values if the driver static table has a matching entry.
> > This is possible since only the device and vendor fields are
> > mandatory and the rest are optional. As a result, store_new_id
> > will fill in default values that are then passed on to the driver
> > and can have unintended consequences.
> > 
> > As an example, consider the ixgbe driver and the 82599EB network card :
> > echo "8086 10fb" > /sys/bus/pci/drivers/ixgbe/new_id
> > 
> > This will pass a driver_data value of 0 to the driver whereas
> > the index 0 in ixgbe actually points to a different set of card
> > operations.
> > 
> > This change returns an error if the user attempts to add a dynid for
> > a vendor/device combination for which a static entry already exists.
> > However, if the user intentionally wants a different set of values,
> > she must provide all the 7 fields and that will be accepted.
> > 
> > In KVM/device assignment scenario, the user might want 
> > to bind a device back to the host driver by writing to new_id
> > and trip on a possible null pointer dereference.
> 
> I don't understand this last KVM comment.  If this patch fixes a null
> pointer dereference, it must be because we return -EEXIST instead of
> calling the driver's probe method.

Right, the NULL pointer dereference is because drivers implicitly trust
the driver_data field supplied to their probe function.  This patch
prevents the user from supplying a "shorthand" vendor/device new_id that
would conflict with an existing static ID by returning -EEXIST on the
new_id update.  This is not really a KVM problem, but prevention of a
user error for the new_id interface; there is no reason for the user to
add a new_id that duplicates an existing ID unless they want to modify
the extended fields.

> Can you outline the sequence of events and the drivers involved?  Did we
> start with a device that was claimed by vfio, and now we're trying to get
> ixgbe to claim it by writing to /sys/bus/pci/drivers/ixgbe/new_id?  If so,
> does that mean the user has to know what driver_data value to supply?

I believe the driver is ixgbe, the device starts out bound to ixgbe, the
user adds the vendor/device IDs to the new_id of a different driver
(which could be pci-stub or vfio-pci or even some alternate host drive
for the device).  They then finish with the alternate driver, use
remove_id, and attempt to rebind back to ixgbe by writing vendor/device
to ixgbe new_id.  This is clearly wrong, the driver already handles this
device and the user should have used drivers_probe or even the ixgbe
bind interface.  However, as it works now, ixgbe now has a new dynamic
"shorthand" match for the device and since dynamic IDs are matched
before static IDs, the device_data from that match (NULL) is passed to
the driver probe() function.  Chaos follows since the driver implicitly
trusts that device_data as something provided by the driver. 

> I know you didn't add the new_id mechanism, and this patch makes it safer
> than it was before, but I'm uneasy about it in general.  Most drivers do
> not validate the driver_data value.  They assume it came out of the
> id_table supplied by the driver and is therefore trustworthy.  But new_id
> is a loophole that allows a user (hopefully only root) to pass arbitrary
> junk to the driver.

The sysfs files are only accessible to root by default.  Your uneasiness
seems to be the new_id mechanism in general.  It is a gap that drivers
implicitly trust a field that can be supplied by the user.  I believe
there's a test in the code somewhere that verifies that device_data at
least matches an existing device_data as a small sanity check.  This
patch closes another gap by disallowing new_ids that are not fully
specified to supersede an existing entry. 

> I wonder if the device assignment machinery should be more integrated into
> the PCI core instead of trying to be "just another driver."  It seems like
> we're doing a lot of work to try to get the driver binding mechanism to do
> what we need for device assignment.

This problem is only tangentially related to device assignment, any PCI
driver can hit this.  Maybe in practice the reason for touching these
files is often device assignment, but this interface pre-dates KVM.  Do
you have suggestions how device assignment could be more integrated to
PCI core?  Note that vfio is intentionally device agnostic and support
for assignment of platform devices using vfio is being actively
developed.  We do have a new binding mechanism awaiting review that
tries to avoid some of the faults with the new_id/remove_id interface.
In this case the user would not need to add a new_id and would use
drivers_probe on both sides of the attach/re-attach.  This is not a
replacement for Bandan's patch, but you can find it on the list here:

              Subject:
[PATCH] PCI: Introduce new device
binding path using
pci_dev.driver_override
               Date:
Fri, 04 Apr 2014 14:19:20 -0600

Thanks,
Alex 

> > Signed-off-by: Bandan Das <bsd@redhat.com>
> > ---
> > v3:
> > relocate pdev decl
> > v2:
> > 1. Return error if there is a matching static entry
> > and change commit message to reflect this behavior
> > 3. Fill in a pdev and call pci_match_id instead of creating
> > a new matching function
> > 4. Change commit message to reflect that libvirt does not
> > depend on this behavior
> > 
> >  drivers/pci/pci-driver.c | 22 +++++++++++++++++++++-
> >  1 file changed, 21 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index 25f0bc6..a65a014 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -107,7 +107,7 @@ store_new_id(struct device_driver *driver, const char *buf, size_t count)
> >  		subdevice=PCI_ANY_ID, class=0, class_mask=0;
> >  	unsigned long driver_data=0;
> >  	int fields=0;
> > -	int retval;
> > +	int retval = 0;
> >  
> >  	fields = sscanf(buf, "%x %x %x %x %x %x %lx",
> >  			&vendor, &device, &subvendor, &subdevice,
> > @@ -115,6 +115,26 @@ store_new_id(struct device_driver *driver, const char *buf, size_t count)
> >  	if (fields < 2)
> >  		return -EINVAL;
> >  
> > +	if (fields != 7) {
> > +		struct pci_dev *pdev = kzalloc(sizeof(*pdev), GFP_KERNEL);
> > +		if (!pdev)
> > +			return -ENOMEM;
> > +
> > +		pdev->vendor = vendor;
> > +		pdev->device = device;
> > +		pdev->subsystem_vendor = subvendor;
> > +		pdev->subsystem_device = subdevice;
> > +		pdev->class = class;
> > +
> > +		if (pci_match_id(pdrv->id_table, pdev))
> > +			retval = -EEXIST;
> > +
> > +		kfree(pdev);
> > +
> > +		if (retval)
> > +			return retval;
> > +	}
> > +
> >  	/* Only accept driver_data values that match an existing id_table
> >  	   entry */
> >  	if (ids) {
> > -- 
> > 1.8.3.1
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bandan Das April 25, 2014, 5:51 p.m. UTC | #4
Bjorn Helgaas <bhelgaas@google.com> writes:

> On Tue, Apr 01, 2014 at 09:32:59PM -0400, Bandan Das wrote:
>> 
>> While using the new_id interface, the user can unintentionally feed
>> incorrect values if the driver static table has a matching entry.
>> This is possible since only the device and vendor fields are
>> mandatory and the rest are optional. As a result, store_new_id
>> will fill in default values that are then passed on to the driver
>> and can have unintended consequences.
>> 
>> As an example, consider the ixgbe driver and the 82599EB network card :
>> echo "8086 10fb" > /sys/bus/pci/drivers/ixgbe/new_id
>> 
>> This will pass a driver_data value of 0 to the driver whereas
>> the index 0 in ixgbe actually points to a different set of card
>> operations.
>> 
>> This change returns an error if the user attempts to add a dynid for
>> a vendor/device combination for which a static entry already exists.
>> However, if the user intentionally wants a different set of values,
>> she must provide all the 7 fields and that will be accepted.
>> 
>> In KVM/device assignment scenario, the user might want 
>> to bind a device back to the host driver by writing to new_id
>> and trip on a possible null pointer dereference.
>
> I don't understand this last KVM comment.  If this patch fixes a null
> pointer dereference, it must be because we return -EEXIST instead of
> calling the driver's probe method.

A null pointer dereference in the ixgbe driver's struct ixgbe_info
that points to operations for a card model. In this case, when the user 
uses the new_id interface (without specifying driver_data), it defaults
to 0. So, ixgbe_info points to ixgbe_82598_info with mac_ops set to 
mac_ops_82598 while the card in question is a 82599.

> Can you outline the sequence of events and the drivers involved?  Did we

Something like this is enough to trigger this -
echo "b:f:d" > /sys/bus/.../driver/unbind
echo "b:f:d" > /sys/bus/pci/drives/ixgbe/new_id
echo 16 > /sys/bus/pci/devices/b:f:d/sriov_numvfs

> start with a device that was claimed by vfio, and now we're trying to get
> ixgbe to claim it by writing to /sys/bus/pci/drivers/ixgbe/new_id?  If so,
> does that mean the user has to know what driver_data value to supply?

Yes, but isn't it better than defaulting to 0 ?

> I know you didn't add the new_id mechanism, and this patch makes it safer
> than it was before, but I'm uneasy about it in general.  Most drivers do
> not validate the driver_data value.  They assume it came out of the
> id_table supplied by the driver and is therefore trustworthy.  But new_id
> is a loophole that allows a user (hopefully only root) to pass arbitrary
> junk to the driver.

I think this is what this patch does. If the user intends to, let her
pass arbitrary junk, let's not assume values on behalf of the user.

> I wonder if the device assignment machinery should be more integrated into
> the PCI core instead of trying to be "just another driver."  It seems like
> we're doing a lot of work to try to get the driver binding mechanism to do
> what we need for device assignment.

Agreed, the example I mentioned above is something likely to be
attempted by someone doing device assignment. But I still think
that if the user wants to use new_id, she (or the driver) provides
the value of driver_data. Why should pci assume 0 on behalf of the
user ?

Another option could be that if we do want to keep the driver_data
field optional, maybe the default is -1 (rather than 0). That way,
drivers can fail probe or do something else if they have 
use of it's value. 

> Bjorn
>
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> ---
>> v3:
>> relocate pdev decl
>> v2:
>> 1. Return error if there is a matching static entry
>> and change commit message to reflect this behavior
>> 3. Fill in a pdev and call pci_match_id instead of creating
>> a new matching function
>> 4. Change commit message to reflect that libvirt does not
>> depend on this behavior
>> 
>>  drivers/pci/pci-driver.c | 22 +++++++++++++++++++++-
>>  1 file changed, 21 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index 25f0bc6..a65a014 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -107,7 +107,7 @@ store_new_id(struct device_driver *driver, const char *buf, size_t count)
>>  		subdevice=PCI_ANY_ID, class=0, class_mask=0;
>>  	unsigned long driver_data=0;
>>  	int fields=0;
>> -	int retval;
>> +	int retval = 0;
>>  
>>  	fields = sscanf(buf, "%x %x %x %x %x %x %lx",
>>  			&vendor, &device, &subvendor, &subdevice,
>> @@ -115,6 +115,26 @@ store_new_id(struct device_driver *driver, const char *buf, size_t count)
>>  	if (fields < 2)
>>  		return -EINVAL;
>>  
>> +	if (fields != 7) {
>> +		struct pci_dev *pdev = kzalloc(sizeof(*pdev), GFP_KERNEL);
>> +		if (!pdev)
>> +			return -ENOMEM;
>> +
>> +		pdev->vendor = vendor;
>> +		pdev->device = device;
>> +		pdev->subsystem_vendor = subvendor;
>> +		pdev->subsystem_device = subdevice;
>> +		pdev->class = class;
>> +
>> +		if (pci_match_id(pdrv->id_table, pdev))
>> +			retval = -EEXIST;
>> +
>> +		kfree(pdev);
>> +
>> +		if (retval)
>> +			return retval;
>> +	}
>> +
>>  	/* Only accept driver_data values that match an existing id_table
>>  	   entry */
>>  	if (ids) {
>> -- 
>> 1.8.3.1
>> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas April 28, 2014, 11:37 p.m. UTC | #5
On Fri, Apr 25, 2014 at 11:39:36AM -0600, Alex Williamson wrote:
> On Thu, 2014-04-24 at 16:45 -0600, Bjorn Helgaas wrote:
> > On Tue, Apr 01, 2014 at 09:32:59PM -0400, Bandan Das wrote:
> > > 
> > > While using the new_id interface, the user can unintentionally feed
> > > incorrect values if the driver static table has a matching entry.
> > > This is possible since only the device and vendor fields are
> > > mandatory and the rest are optional. As a result, store_new_id
> > > will fill in default values that are then passed on to the driver
> > > and can have unintended consequences.
> > > 
> > > As an example, consider the ixgbe driver and the 82599EB network card :
> > > echo "8086 10fb" > /sys/bus/pci/drivers/ixgbe/new_id
> > > 
> > > This will pass a driver_data value of 0 to the driver whereas
> > > the index 0 in ixgbe actually points to a different set of card
> > > operations.
> > > 
> > > This change returns an error if the user attempts to add a dynid for
> > > a vendor/device combination for which a static entry already exists.
> > > However, if the user intentionally wants a different set of values,
> > > she must provide all the 7 fields and that will be accepted.
> > > 
> > > In KVM/device assignment scenario, the user might want 
> > > to bind a device back to the host driver by writing to new_id
> > > and trip on a possible null pointer dereference.
> > 
> > I don't understand this last KVM comment.  If this patch fixes a null
> > pointer dereference, it must be because we return -EEXIST instead of
> > calling the driver's probe method.
> 
> Right, the NULL pointer dereference is because drivers implicitly trust
> the driver_data field supplied to their probe function.  This patch
> prevents the user from supplying a "shorthand" vendor/device new_id that
> would conflict with an existing static ID by returning -EEXIST on the
> new_id update.  This is not really a KVM problem, but prevention of a
> user error for the new_id interface; there is no reason for the user to
> add a new_id that duplicates an existing ID unless they want to modify
> the extended fields.

Yep, this all makes sense.  I think I'll just drop the KVM paragraph
from the changelog because the problem isn't KVM-specific.

> > I know you didn't add the new_id mechanism, and this patch makes it safer
> > than it was before, but I'm uneasy about it in general.  Most drivers do
> > not validate the driver_data value.  They assume it came out of the
> > id_table supplied by the driver and is therefore trustworthy.  But new_id
> > is a loophole that allows a user (hopefully only root) to pass arbitrary
> > junk to the driver.
> 
> The sysfs files are only accessible to root by default.  Your uneasiness
> seems to be the new_id mechanism in general.  It is a gap that drivers
> implicitly trust a field that can be supplied by the user.  I believe
> there's a test in the code somewhere that verifies that device_data at
> least matches an existing device_data as a small sanity check.  This
> patch closes another gap by disallowing new_ids that are not fully
> specified to supersede an existing entry. 

Yep, exactly.  This definitely makes it better than it was before.

> > I wonder if the device assignment machinery should be more integrated into
> > the PCI core instead of trying to be "just another driver."  It seems like
> > we're doing a lot of work to try to get the driver binding mechanism to do
> > what we need for device assignment.
> 
> This problem is only tangentially related to device assignment, any PCI
> driver can hit this.  Maybe in practice the reason for touching these
> files is often device assignment, but this interface pre-dates KVM.  Do
> you have suggestions how device assignment could be more integrated to
> PCI core?  Note that vfio is intentionally device agnostic and support
> for assignment of platform devices using vfio is being actively
> developed.

I don't have a suggestion, but I think using the driver model for this
feels a bit like putting a square peg in a round hole.  A device-
agnostic driver is sort of a strange concept to begin with, and the
machinations to tweak the driver/device binding order and pass a
device back and forth between host drivers and vfio seem sort of
artificial.  It feels like we're using the binding mechanism in a way
it wasn't designed for, and it's not surprising that there are issues.

> We do have a new binding mechanism awaiting review that
> tries to avoid some of the faults with the new_id/remove_id interface.
> In this case the user would not need to add a new_id and would use
> drivers_probe on both sides of the attach/re-attach.

I saw that go by, but haven't looked in detail.  I'm hoping Greg pays
attention to those, since he's more of an overall driver model guy
than I am.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas April 28, 2014, 11:39 p.m. UTC | #6
On Fri, Apr 25, 2014 at 01:51:50PM -0400, Bandan Das wrote:
> Bjorn Helgaas <bhelgaas@google.com> writes:
> 
> > On Tue, Apr 01, 2014 at 09:32:59PM -0400, Bandan Das wrote:
> >> 
> >> While using the new_id interface, the user can unintentionally feed
> >> incorrect values if the driver static table has a matching entry.
> >> This is possible since only the device and vendor fields are
> >> mandatory and the rest are optional. As a result, store_new_id
> >> will fill in default values that are then passed on to the driver
> >> and can have unintended consequences.
> >> 
> >> As an example, consider the ixgbe driver and the 82599EB network card :
> >> echo "8086 10fb" > /sys/bus/pci/drivers/ixgbe/new_id
> >> 
> >> This will pass a driver_data value of 0 to the driver whereas
> >> the index 0 in ixgbe actually points to a different set of card
> >> operations.
> >> 
> >> This change returns an error if the user attempts to add a dynid for
> >> a vendor/device combination for which a static entry already exists.
> >> However, if the user intentionally wants a different set of values,
> >> she must provide all the 7 fields and that will be accepted.
> >> 
> >> In KVM/device assignment scenario, the user might want 
> >> to bind a device back to the host driver by writing to new_id
> >> and trip on a possible null pointer dereference.
> >
> > I don't understand this last KVM comment.  If this patch fixes a null
> > pointer dereference, it must be because we return -EEXIST instead of
> > calling the driver's probe method.
> 
> A null pointer dereference in the ixgbe driver's struct ixgbe_info
> that points to operations for a card model. In this case, when the user 
> uses the new_id interface (without specifying driver_data), it defaults
> to 0. So, ixgbe_info points to ixgbe_82598_info with mac_ops set to 
> mac_ops_82598 while the card in question is a 82599.
> 
> > Can you outline the sequence of events and the drivers involved?  Did we
> 
> Something like this is enough to trigger this -
> echo "b:f:d" > /sys/bus/.../driver/unbind
> echo "b:f:d" > /sys/bus/pci/drives/ixgbe/new_id
> echo 16 > /sys/bus/pci/devices/b:f:d/sriov_numvfs
> 
> > start with a device that was claimed by vfio, and now we're trying to get
> > ixgbe to claim it by writing to /sys/bus/pci/drivers/ixgbe/new_id?  If so,
> > does that mean the user has to know what driver_data value to supply?
> 
> Yes, but isn't it better than defaulting to 0 ?
> 
> > I know you didn't add the new_id mechanism, and this patch makes it safer
> > than it was before, but I'm uneasy about it in general.  Most drivers do
> > not validate the driver_data value.  They assume it came out of the
> > id_table supplied by the driver and is therefore trustworthy.  But new_id
> > is a loophole that allows a user (hopefully only root) to pass arbitrary
> > junk to the driver.
> 
> I think this is what this patch does. If the user intends to, let her
> pass arbitrary junk, let's not assume values on behalf of the user.

Yep, I agree, I was just trying to figure out if there was something
specific to KVM here.  But I don't think there is.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas April 29, 2014, 11:41 p.m. UTC | #7
On Tue, Apr 01, 2014 at 09:32:59PM -0400, Bandan Das wrote:
> 
> While using the new_id interface, the user can unintentionally feed
> incorrect values if the driver static table has a matching entry.
> This is possible since only the device and vendor fields are
> mandatory and the rest are optional. As a result, store_new_id
> will fill in default values that are then passed on to the driver
> and can have unintended consequences.
> 
> As an example, consider the ixgbe driver and the 82599EB network card :
> echo "8086 10fb" > /sys/bus/pci/drivers/ixgbe/new_id
> 
> This will pass a driver_data value of 0 to the driver whereas
> the index 0 in ixgbe actually points to a different set of card
> operations.
> 
> This change returns an error if the user attempts to add a dynid for
> a vendor/device combination for which a static entry already exists.
> However, if the user intentionally wants a different set of values,
> she must provide all the 7 fields and that will be accepted.
> 
> In KVM/device assignment scenario, the user might want 
> to bind a device back to the host driver by writing to new_id
> and trip on a possible null pointer dereference.
> 
> Signed-off-by: Bandan Das <bsd@redhat.com>

Applied with Alex's ack to pci/misc for v3.16, thanks!

> ---
> v3:
> relocate pdev decl
> v2:
> 1. Return error if there is a matching static entry
> and change commit message to reflect this behavior
> 3. Fill in a pdev and call pci_match_id instead of creating
> a new matching function
> 4. Change commit message to reflect that libvirt does not
> depend on this behavior
> 
>  drivers/pci/pci-driver.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 25f0bc6..a65a014 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -107,7 +107,7 @@ store_new_id(struct device_driver *driver, const char *buf, size_t count)
>  		subdevice=PCI_ANY_ID, class=0, class_mask=0;
>  	unsigned long driver_data=0;
>  	int fields=0;
> -	int retval;
> +	int retval = 0;
>  
>  	fields = sscanf(buf, "%x %x %x %x %x %x %lx",
>  			&vendor, &device, &subvendor, &subdevice,
> @@ -115,6 +115,26 @@ store_new_id(struct device_driver *driver, const char *buf, size_t count)
>  	if (fields < 2)
>  		return -EINVAL;
>  
> +	if (fields != 7) {
> +		struct pci_dev *pdev = kzalloc(sizeof(*pdev), GFP_KERNEL);
> +		if (!pdev)
> +			return -ENOMEM;
> +
> +		pdev->vendor = vendor;
> +		pdev->device = device;
> +		pdev->subsystem_vendor = subvendor;
> +		pdev->subsystem_device = subdevice;
> +		pdev->class = class;
> +
> +		if (pci_match_id(pdrv->id_table, pdev))
> +			retval = -EEXIST;
> +
> +		kfree(pdev);
> +
> +		if (retval)
> +			return retval;
> +	}
> +
>  	/* Only accept driver_data values that match an existing id_table
>  	   entry */
>  	if (ids) {
> -- 
> 1.8.3.1
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Konstantin Khlebnikov June 15, 2014, 3:05 p.m. UTC | #8
On Wed, Apr 30, 2014 at 3:41 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Tue, Apr 01, 2014 at 09:32:59PM -0400, Bandan Das wrote:
>>
>> While using the new_id interface, the user can unintentionally feed
>> incorrect values if the driver static table has a matching entry.
>> This is possible since only the device and vendor fields are
>> mandatory and the rest are optional. As a result, store_new_id
>> will fill in default values that are then passed on to the driver
>> and can have unintended consequences.
>>
>> As an example, consider the ixgbe driver and the 82599EB network card :
>> echo "8086 10fb" > /sys/bus/pci/drivers/ixgbe/new_id
>>
>> This will pass a driver_data value of 0 to the driver whereas
>> the index 0 in ixgbe actually points to a different set of card
>> operations.
>>
>> This change returns an error if the user attempts to add a dynid for
>> a vendor/device combination for which a static entry already exists.
>> However, if the user intentionally wants a different set of values,
>> she must provide all the 7 fields and that will be accepted.
>>
>> In KVM/device assignment scenario, the user might want
>> to bind a device back to the host driver by writing to new_id
>> and trip on a possible null pointer dereference.

It's not directly related to this patch, but I have a problem with
this interface.

iwlwifi.ko stores internal pointer in private_data field. So, it's
hard to guess right one for adding new alias.
Currently I'm using CONFIG_KALLSYMS_ALL=y, but it seems so ugly.

I was thinking about adding new interface: "clone_id" (or "dup_id")
which would make a copy of existing entry but with different product:vendor id.
And I mostly have a patch for that.

Or it's would be better to fix that driver (replace opaque pointer
with some stable indexes)?


>>
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>
> Applied with Alex's ack to pci/misc for v3.16, thanks!
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 25f0bc6..a65a014 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -107,7 +107,7 @@  store_new_id(struct device_driver *driver, const char *buf, size_t count)
 		subdevice=PCI_ANY_ID, class=0, class_mask=0;
 	unsigned long driver_data=0;
 	int fields=0;
-	int retval;
+	int retval = 0;
 
 	fields = sscanf(buf, "%x %x %x %x %x %x %lx",
 			&vendor, &device, &subvendor, &subdevice,
@@ -115,6 +115,26 @@  store_new_id(struct device_driver *driver, const char *buf, size_t count)
 	if (fields < 2)
 		return -EINVAL;
 
+	if (fields != 7) {
+		struct pci_dev *pdev = kzalloc(sizeof(*pdev), GFP_KERNEL);
+		if (!pdev)
+			return -ENOMEM;
+
+		pdev->vendor = vendor;
+		pdev->device = device;
+		pdev->subsystem_vendor = subvendor;
+		pdev->subsystem_device = subdevice;
+		pdev->class = class;
+
+		if (pci_match_id(pdrv->id_table, pdev))
+			retval = -EEXIST;
+
+		kfree(pdev);
+
+		if (retval)
+			return retval;
+	}
+
 	/* Only accept driver_data values that match an existing id_table
 	   entry */
 	if (ids) {