Message ID | 20180817044902.31420-1-benh@kernel.crashing.org (mailing list archive) |
---|---|
Headers | show |
Series | pci: Rework is_added race fix and address bridge enable races | expand |
s/device_add/device_attach .. ugh. On Fri, 2018-08-17 at 14:48 +1000, Benjamin Herrenschmidt wrote: > This re-fixes the bug reported by Hari Vyas <hari.vyas@broadcom.com> > after my revert of his commit but in a much simpler way. > > The main issues is that is_added was being set after the driver > got bound and started, and thus setting it could race with other > changes to struct pci_dev. > > This fixes it by setting the flag first, which also has the > advantage of matching the fact that we are clearing it *after* > unbinding in the remove path, thus the flag is now symtetric > and always set while the driver code is running. > > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> > --- > drivers/pci/bus.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c > index 35b7fc87eac5..48ae63673aa8 100644 > --- a/drivers/pci/bus.c > +++ b/drivers/pci/bus.c > @@ -321,16 +321,16 @@ void pci_bus_add_device(struct pci_dev *dev) > pci_proc_attach_device(dev); > pci_bridge_d3_update(dev); > > + dev->is_added = 1; > dev->match_driver = true; > retval = device_attach(&dev->dev); > if (retval < 0 && retval != -EPROBE_DEFER) { > + dev->is_added = 0; > pci_warn(dev, "device attach failed (%d)\n", retval); > pci_proc_detach_device(dev); > pci_remove_sysfs_dev_files(dev); > return; > } > - > - dev->is_added = 1; > } > EXPORT_SYMBOL_GPL(pci_bus_add_device); >
On Fri, 2018-08-17 at 14:48 +1000, Benjamin Herrenschmidt wrote: > The second part aims at fixing the enable/disable/set_master races, > and does so by providing a framework for future device state locking > issues. > > It introduces a pci_dev->state_mutex which is used at a lower level > than the device_lock (the device lock isn't suitable, as explained > in the cset comments) and uses it to protect enablement and set_master. As discussed in the series, I'm not using the device_lock because I don't like it creeping out of drivers/base too much unless you explicitly try to lock against concurrent add/remove. That being said, if we decided we prefer using it to solve the enable/disable race, then we have to be careful of a few things: - Driver callbacks hold it, so we can't take it from within pci_enable_device(), pci_set_master() etc... themselves. We'll have to assume the caller has it - The above means auditing callers of these various APIs that might be calling them from outside of a driver callback and add the necessary lock - We need to take great care about the possibility of the parent device(s) lock being held. It can happen for USB for example. Now we don't have USB->PCI adapters that I know of that uses the PCI stack in Linux but generally assuming your parent lock isn't held is risky. So I would be a bit weary of walking up the bridge chain and taking it unconditionally. Alternatively, we could hijack an existing global lock, for example mvoe the bridge_mutex outside of ACPI and use it for the upwards bridge walk, and ignore the other possible races with enable/disable. Cheers, Ben.