mbox series

[RFC,0/6] pci: Rework is_added race fix and address bridge enable races

Message ID 20180817044902.31420-1-benh@kernel.crashing.org (mailing list archive)
Headers show
Series pci: Rework is_added race fix and address bridge enable races | expand

Message

Benjamin Herrenschmidt Aug. 17, 2018, 4:48 a.m. UTC
This is really two series but since they conflict a bit separately
here they are in one:

First we undo the mess of those atomic priv_flags. The atomicity
doesn't provide any security since there's no locking against
the other state pertaining to those flags, it only protects the
flags themselves.

The is_added mess is fixed much more simply by moving the assignment
of the flag to before we start the driver. This is in line with the
rest of the PCI code: until bound to the device model, we are essentially
assuming a single threaded environment. is_added is a flag that is
logically owned by that part of the PCI code, and thus should be set
and cleared within that "safer" environment.

This removes the horrid relative includes in the powerpc code as well.

The second part aims at fixing the enable/disable/set_master races,
and does so by providing a framework for future device state locking
issues.

It introduces a pci_dev->state_mutex which is used at a lower level
than the device_lock (the device lock isn't suitable, as explained
in the cset comments) and uses it to protect enablement and set_master.

Comments

Benjamin Herrenschmidt Aug. 17, 2018, 4:57 a.m. UTC | #1
s/device_add/device_attach .. ugh.

On Fri, 2018-08-17 at 14:48 +1000, Benjamin Herrenschmidt wrote:
> This re-fixes the bug reported by Hari Vyas <hari.vyas@broadcom.com>
> after my revert of his commit but in a much simpler way.
> 
> The main issues is that is_added was being set after the driver
> got bound and started, and thus setting it could race with other
> changes to struct pci_dev.
> 
> This fixes it by setting the flag first, which also has the
> advantage of matching the fact that we are clearing it *after*
> unbinding in the remove path, thus the flag is now symtetric
> and always set while the driver code is running.
> 
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>  drivers/pci/bus.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 35b7fc87eac5..48ae63673aa8 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -321,16 +321,16 @@ void pci_bus_add_device(struct pci_dev *dev)
>  	pci_proc_attach_device(dev);
>  	pci_bridge_d3_update(dev);
>  
> +	dev->is_added = 1;
>  	dev->match_driver = true;
>  	retval = device_attach(&dev->dev);
>  	if (retval < 0 && retval != -EPROBE_DEFER) {
> +		dev->is_added = 0;
>  		pci_warn(dev, "device attach failed (%d)\n", retval);
>  		pci_proc_detach_device(dev);
>  		pci_remove_sysfs_dev_files(dev);
>  		return;
>  	}
> -
> -	dev->is_added = 1;
>  }
>  EXPORT_SYMBOL_GPL(pci_bus_add_device);
>
Benjamin Herrenschmidt Aug. 17, 2018, 5:03 a.m. UTC | #2
On Fri, 2018-08-17 at 14:48 +1000, Benjamin Herrenschmidt wrote:
> The second part aims at fixing the enable/disable/set_master races,
> and does so by providing a framework for future device state locking
> issues.
> 
> It introduces a pci_dev->state_mutex which is used at a lower level
> than the device_lock (the device lock isn't suitable, as explained
> in the cset comments) and uses it to protect enablement and set_master.

As discussed in the series, I'm not using the device_lock because I
don't like it creeping out of drivers/base too much unless you
explicitly try to lock against concurrent add/remove.

That being said, if we decided we prefer using it to solve the
enable/disable race, then we have to be careful of a few things:

 - Driver callbacks hold it, so we can't take it from within
pci_enable_device(), pci_set_master() etc... themselves. We'll have to
assume the caller has it

 - The above means auditing callers of these various APIs that might be
calling them from outside of a driver callback and add the necessary
lock

 - We need to take great care about the possibility of the parent
device(s) lock being held. It can happen for USB for example. Now we
don't have USB->PCI adapters that I know of that uses the PCI stack in
Linux but generally assuming your parent lock isn't held is risky. So I
would be a bit weary of walking up the bridge chain and taking it
unconditionally.

Alternatively, we could hijack an existing global lock, for example
mvoe the bridge_mutex outside of ACPI and use it for the upwards bridge
walk, and ignore the other possible races with enable/disable.

Cheers,
Ben.