Message ID | 20220404062539.1710-1-jianjay.zhou@huawei.com (mailing list archive) |
---|---|
State | Accepted |
Commit | a91ee0e9fca9d7501286cfbced9b30a33e52740a |
Headers | show |
Series | pci: avoid dead lock between device reset and sriov disable | expand |
On 2022/4/4 14:25, Jay Zhou wrote: > Call trace of PF SRIOV disable: > sriov_numvfs_store > device_lock <----------------- (1) get the device lock > ->sriov_configure # e.g. vfio_pci_sriov_configure > sriov_disable > pci_cfg_access_lock <--- (4) wait dev->block_cfg_access to be 0 > > Call trace of PF reset: > reset_store > pci_reset_function > pci_dev_lock > pci_cfg_access_lock <----- (2) set dev->block_cfg_access = 1 > device_lock <------------- (3) want to get the device lock > > These two oprations would wait for each other forever if the > code execution sequence is (1)(2)(3)(4). > > Let's get the device lock and then the config access lock in > pci_dev_lock(). > > Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com> The patch looks good to me, Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> I met the same problem and tried to fix it in the same way. It's worth to be merged if somebody meets the same problem again. https://lore.kernel.org/linux-pci/1583489997-17156-1-git-send-email-yangyicong@hisilicon.com/ > --- > drivers/pci/pci.c | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 9ecce435fb3f..61a6db1d21f6 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -5103,19 +5103,19 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe) > > void pci_dev_lock(struct pci_dev *dev) > { > - pci_cfg_access_lock(dev); > /* block PM suspend, driver probe, etc. */ > device_lock(&dev->dev); > + pci_cfg_access_lock(dev); > } > EXPORT_SYMBOL_GPL(pci_dev_lock); > > /* Return 1 on successful lock, 0 on contention */ > int pci_dev_trylock(struct pci_dev *dev) > { > - if (pci_cfg_access_trylock(dev)) { > - if (device_trylock(&dev->dev)) > + if (device_trylock(&dev->dev)) { > + if (pci_cfg_access_trylock(dev)) > return 1; > - pci_cfg_access_unlock(dev); > + device_unlock(&dev->dev); > } > > return 0; > @@ -5124,8 +5124,8 @@ EXPORT_SYMBOL_GPL(pci_dev_trylock); > > void pci_dev_unlock(struct pci_dev *dev) > { > - device_unlock(&dev->dev); > pci_cfg_access_unlock(dev); > + device_unlock(&dev->dev); > } > EXPORT_SYMBOL_GPL(pci_dev_unlock); > >
[+cc Don, Jason, Leon, just FYI] On Mon, Apr 04, 2022 at 02:25:39PM +0800, Jay Zhou wrote: > Call trace of PF SRIOV disable: > sriov_numvfs_store > device_lock <----------------- (1) get the device lock > ->sriov_configure # e.g. vfio_pci_sriov_configure > sriov_disable > pci_cfg_access_lock <--- (4) wait dev->block_cfg_access to be 0 > > Call trace of PF reset: > reset_store > pci_reset_function > pci_dev_lock > pci_cfg_access_lock <----- (2) set dev->block_cfg_access = 1 > device_lock <------------- (3) want to get the device lock > > These two oprations would wait for each other forever if the > code execution sequence is (1)(2)(3)(4). > > Let's get the device lock and then the config access lock in > pci_dev_lock(). > > Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com> Thanks, I (finally) applied Yicong's identical patch from a year ago to pci/virtualization for v5.19, with a commit log combined and adapted from both: PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store() The sysfs sriov_numvfs_store() path acquires the device lock before the config space access lock: sriov_numvfs_store device_lock # A (1) acquire device lock sriov_configure vfio_pci_sriov_configure # (for example) vfio_pci_core_sriov_configure pci_disable_sriov sriov_disable pci_cfg_access_lock pci_wait_cfg # B (4) wait for dev->block_cfg_access == 0 Previously, pci_dev_lock() acquired the config space access lock before the device lock: pci_dev_lock pci_cfg_access_lock dev->block_cfg_access = 1 # B (2) set dev->block_cfg_access = 1 device_lock # A (3) wait for device lock Any path that uses pci_dev_lock(), e.g., pci_reset_function(), may deadlock with sriov_numvfs_store() if the operations occur in the sequence (1) (2) (3) (4). Avoid the deadlock by reversing the order in pci_dev_lock() so it acquires the device lock before the config space access lock, the same as the sriov_numvfs_store() path. [bhelgaas: combined and adapted commit log from Jay Zhou's independent subsequent posting: https://lore.kernel.org/r/20220404062539.1710-1-jianjay.zhou@huawei.com] Link: https://lore.kernel.org/linux-pci/1583489997-17156-1-git-send-email-yangyicong@hisilicon.com/ Also-posted-by: Jay Zhou <jianjay.zhou@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> > --- > drivers/pci/pci.c | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 9ecce435fb3f..61a6db1d21f6 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -5103,19 +5103,19 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe) > > void pci_dev_lock(struct pci_dev *dev) > { > - pci_cfg_access_lock(dev); > /* block PM suspend, driver probe, etc. */ > device_lock(&dev->dev); > + pci_cfg_access_lock(dev); > } > EXPORT_SYMBOL_GPL(pci_dev_lock); > > /* Return 1 on successful lock, 0 on contention */ > int pci_dev_trylock(struct pci_dev *dev) > { > - if (pci_cfg_access_trylock(dev)) { > - if (device_trylock(&dev->dev)) > + if (device_trylock(&dev->dev)) { > + if (pci_cfg_access_trylock(dev)) > return 1; > - pci_cfg_access_unlock(dev); > + device_unlock(&dev->dev); > } > > return 0; > @@ -5124,8 +5124,8 @@ EXPORT_SYMBOL_GPL(pci_dev_trylock); > > void pci_dev_unlock(struct pci_dev *dev) > { > - device_unlock(&dev->dev); > pci_cfg_access_unlock(dev); > + device_unlock(&dev->dev); > } > EXPORT_SYMBOL_GPL(pci_dev_unlock); > > -- > 2.27.0 >
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 9ecce435fb3f..61a6db1d21f6 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5103,19 +5103,19 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe) void pci_dev_lock(struct pci_dev *dev) { - pci_cfg_access_lock(dev); /* block PM suspend, driver probe, etc. */ device_lock(&dev->dev); + pci_cfg_access_lock(dev); } EXPORT_SYMBOL_GPL(pci_dev_lock); /* Return 1 on successful lock, 0 on contention */ int pci_dev_trylock(struct pci_dev *dev) { - if (pci_cfg_access_trylock(dev)) { - if (device_trylock(&dev->dev)) + if (device_trylock(&dev->dev)) { + if (pci_cfg_access_trylock(dev)) return 1; - pci_cfg_access_unlock(dev); + device_unlock(&dev->dev); } return 0; @@ -5124,8 +5124,8 @@ EXPORT_SYMBOL_GPL(pci_dev_trylock); void pci_dev_unlock(struct pci_dev *dev) { - device_unlock(&dev->dev); pci_cfg_access_unlock(dev); + device_unlock(&dev->dev); } EXPORT_SYMBOL_GPL(pci_dev_unlock);
Call trace of PF SRIOV disable: sriov_numvfs_store device_lock <----------------- (1) get the device lock ->sriov_configure # e.g. vfio_pci_sriov_configure sriov_disable pci_cfg_access_lock <--- (4) wait dev->block_cfg_access to be 0 Call trace of PF reset: reset_store pci_reset_function pci_dev_lock pci_cfg_access_lock <----- (2) set dev->block_cfg_access = 1 device_lock <------------- (3) want to get the device lock These two oprations would wait for each other forever if the code execution sequence is (1)(2)(3)(4). Let's get the device lock and then the config access lock in pci_dev_lock(). Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com> --- drivers/pci/pci.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)