[1/2] PCIe hotplug interrupt and AER deadlock with reset_lock and device_lock

Message ID 20200615143250.438252-2-ian.may@canonical.com
State New
Delegated to: Bjorn Helgaas
Series
  • Hotplug interrupt and AER recovery deadlocks

Commit Message

Ian May June 15, 2020, 2:32 p.m. UTC
Currently, when a hotplug interrupt and AER recovery trigger simultaneously,
the following deadlock can occur:

        Hotplug                                AER
----------------------------       ---------------------------
device_lock(&dev->dev)             down_write(&ctrl->reset_lock)
down_read(&ctrl->reset_lock)       device_lock(&dev->dev)

This patch adds reset_lock and reset_unlock callbacks to hotplug_slot_ops.
This allows the down_write()/up_write() of the controller reset_lock to move
from pciehp_reset_slot() into pci_slot_reset(), so that the reset_lock is
acquired before the device lock.  Both the hotplug and AER paths then take
the reset_lock and the device lock in the same order.

Signed-off-by: Ian May <ian.may@canonical.com>
---
 drivers/pci/hotplug/pciehp.h      |  2 ++
 drivers/pci/hotplug/pciehp_core.c |  2 ++
 drivers/pci/hotplug/pciehp_hpc.c  | 21 ++++++++++++++++++---
 drivers/pci/pci.c                 |  6 ++++++
 include/linux/pci_hotplug.h       |  2 ++
 5 files changed, 30 insertions(+), 3 deletions(-)

Comments

Lukas Wunner June 15, 2020, 6:56 p.m. UTC | #1
On Mon, Jun 15, 2020 at 09:32:49AM -0500, Ian May wrote:
> Currently when a hotplug interrupt and AER recovery triggers simultaneously
> the following deadlock can occur.
> 
>         Hotplug				       AER
> ----------------------------       ---------------------------
> device_lock(&dev->dev)		   down_write(&ctrl->reset_lock)
> down_read(&ctrl->reset_lock)       device_lock(&dev->dev)
> 
> This patch adds a reset_lock and reset_unlock hotplug_slot_op.
> This would allow the controller reset_lock/reset_unlock to be moved
> from pciehp_reset_slot to pci_slot_reset function allowing the controller
> reset_lock to be acquired before the device lock allowing for both hotplug
> and AER to grab the reset_lock and device lock in proper order.

I posted a patch to address such issues on Mar 31; I just haven't
gotten around to respinning it with a proper commit message:

https://lore.kernel.org/linux-pci/20200331130139.46oxbade6rcbaicb@wunner.de/

I've solved it by moving the reset lock into struct pci_slot.
I think that's simpler than adding two callbacks.

Do you think the AER deadlock could be fixed based on my approach?

Thanks,

Lukas
Ian May June 16, 2020, 5:13 p.m. UTC | #2
Hi Lukas,

Thanks for the quick reply!  I like your solution and have confirmed it
solves the first deadlock we see between the Hotplug interrupt and AER
recovery.

Thanks,
Ian

On 6/15/20 1:56 PM, Lukas Wunner wrote:
> On Mon, Jun 15, 2020 at 09:32:49AM -0500, Ian May wrote:
>> Currently when a hotplug interrupt and AER recovery triggers simultaneously
>> the following deadlock can occur.
>>
>>         Hotplug				       AER
>> ----------------------------       ---------------------------
>> device_lock(&dev->dev)		   down_write(&ctrl->reset_lock)
>> down_read(&ctrl->reset_lock)       device_lock(&dev->dev)
>>
>> This patch adds a reset_lock and reset_unlock hotplug_slot_op.
>> This would allow the controller reset_lock/reset_unlock to be moved
>> from pciehp_reset_slot to pci_slot_reset function allowing the controller
>> reset_lock to be acquired before the device lock allowing for both hotplug
>> and AER to grab the reset_lock and device lock in proper order.
> I've posted a patch to address such issues on Mar 31, just haven't
> gotten around to respin it with a proper commit message:
>
> https://lore.kernel.org/linux-pci/20200331130139.46oxbade6rcbaicb@wunner.de/
>
> I've solved it by moving the reset lock into struct pci_slot.
> I think that's simpler than adding two callbacks.
>
> Do you think the AER deadlock could be fixed based on my approach?
>
> Thanks,
>
> Lukas
Bjorn Helgaas July 14, 2020, 11:58 p.m. UTC | #3
On Mon, Jun 15, 2020 at 08:56:50PM +0200, Lukas Wunner wrote:
> On Mon, Jun 15, 2020 at 09:32:49AM -0500, Ian May wrote:
> > Currently when a hotplug interrupt and AER recovery triggers simultaneously
> > the following deadlock can occur.
> > 
> >         Hotplug				       AER
> > ----------------------------       ---------------------------
> > device_lock(&dev->dev)		   down_write(&ctrl->reset_lock)
> > down_read(&ctrl->reset_lock)       device_lock(&dev->dev)
> > 
> > This patch adds a reset_lock and reset_unlock hotplug_slot_op.
> > This would allow the controller reset_lock/reset_unlock to be moved
> > from pciehp_reset_slot to pci_slot_reset function allowing the controller
> > reset_lock to be acquired before the device lock allowing for both hotplug
> > and AER to grab the reset_lock and device lock in proper order.
> 
> I've posted a patch to address such issues on Mar 31, just haven't
> gotten around to respin it with a proper commit message:
> 
> https://lore.kernel.org/linux-pci/20200331130139.46oxbade6rcbaicb@wunner.de/
> 
> I've solved it by moving the reset lock into struct pci_slot.
> I think that's simpler than adding two callbacks.
> 
> Do you think the AER deadlock could be fixed based on my approach?

How should we proceed on this?
Lukas Wunner July 17, 2020, 5:20 a.m. UTC | #4
On Tue, Jun 16, 2020 at 12:13:23PM -0500, Ian May wrote:
> Thanks for the quick reply! I like your solution and have confirmed it
> solves the first deadlock we see between the Hotplug interrupt and AER
> recovery.

Thank you for the confirmation (and sorry for the delay).  I'm cooking
up a proper patch right now.

One question regarding your patch [2/2]:  If, instead of this patch,
you change pci_bus_error_reset() to call "device_lock(bridge)" rather
than "mutex_lock(&pci_slot_mutex)", do you still see deadlocks?

Taking the pci_slot_mutex in pci_bus_error_reset() was actually the
right thing to do because it holds the driver of the hotplug port
in place.  (The hotplug port above the bus being reset.)  Without
that, dereferencing slot->hotplug in pci_slot_reset() wouldn't be
safe.  My fear is that acquiring the device_lock() of the bridge
leading to the bus being reset may cause other deadlocks, in particular
in cascaded topologies such as Thunderbolt, which I suspect may be
what you're dealing with.

Thanks,

Lukas
Lukas Wunner Aug. 1, 2020, 4:14 p.m. UTC | #5
[cc += Keith]

On Fri, Jul 17, 2020 at 09:02:22AM -0500, Ian May wrote:
> I do now have a "better" patch that I was going to submit to the list
> where I converted the pci_slot_mutex to a rw_semaphore.  Do you see
> any potential problems with changing the lock type?  I attached the
> patch if you are interested in checking it over.

The question is, if pci_slot_mutex is an rw_semaphore, can it happen
that pciehp acquires it for writing, provoking a deadlock like this:

        Hotplug                                AER
        ----------------------------       ---------------------------
      1 down_read(&ctrl->reset_lock)
                                         2 down_read(&pci_slot_mutex)
      3 down_write(&pci_slot_mutex)
                                         4 down_write(&ctrl->reset_lock)
        ** DEADLOCK **

I think this can happen if the device inserted into the hotplug slot
contains a PCIe switch which itself has hotplug ports.  That's the
case with Thunderbolt:  Every Thunderbolt device contains a PCIe
switch with hotplug ports to extend the Thunderbolt chain.  E.g.
the PCIe hierarchy looks like this for a Thunderbolt host controller
with a chain of two devices:

Root - Upstream - Downstream - Upstream - Downstream - Upstream - Downstream

(host ...)                     (1st device ...)        (2nd device ...)

When a Thunderbolt device is attached, pci_slot_mutex would be taken
for writing in pci_create_slot():

pciehp_configure_device()
  pci_scan_slot()
    pci_scan_single_device()
      pci_scan_device()
        pci_setup_device()
          pci_dev_assign_slot() # acquire pci_slot_mutex for reading
      pci_device_add() # match_driver = false; device_add()
  pci_bus_add_devices()
    pci_bus_add_device() # match_driver = true; device_attach()
      device_attach()
        __device_attach()
          __device_attach_driver()
            driver_probe_device()
              pcie_portdrv_probe()
                pcie_port_device_register()
                  pcie_device_init()
                    device_register()
                      device_add()
                        bus_probe_device()
                          device_initial_probe()
                            __device_attach()
                              __device_attach_driver()
                                driver_probe_device()
                                  pciehp_probe()
                                    init_slot()
                                      pci_hp_initialize()
                                        pci_create_slot()
                                          down_write(pci_slot_mutex)

(You may want to double-check that I got this right.)

In principle, Keith did the right thing to acquire pci_slot_mutex in
pci_bus_error_reset() for accessing the bus->slots list.

I need to think some more to come up with a solution for this particular
deadlock.  Maybe using a klist and traversing it with klist_iter_init()
(holds a ref on each slot, allowing concurrent list access) or something
along those lines...

Thanks,

Lukas

Patch

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index e28b4fffd84d..1e98604cec83 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -183,6 +183,8 @@  void pciehp_release_ctrl(struct controller *ctrl);
 
 int pciehp_sysfs_enable_slot(struct hotplug_slot *hotplug_slot);
 int pciehp_sysfs_disable_slot(struct hotplug_slot *hotplug_slot);
+int pciehp_reset_lock(struct hotplug_slot *hotplug_slot);
+int pciehp_reset_unlock(struct hotplug_slot *hotplug_slot);
 int pciehp_reset_slot(struct hotplug_slot *hotplug_slot, int probe);
 int pciehp_get_attention_status(struct hotplug_slot *hotplug_slot, u8 *status);
 int pciehp_set_raw_indicator_status(struct hotplug_slot *h_slot, u8 status);
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index 4c032d75c874..a8da3e6a59b8 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -63,6 +63,8 @@  static int init_slot(struct controller *ctrl)
 	ops->get_power_status = get_power_status;
 	ops->get_adapter_status = get_adapter_status;
 	ops->reset_slot = pciehp_reset_slot;
+	ops->reset_lock = pciehp_reset_lock;
+	ops->reset_unlock = pciehp_reset_unlock;
 	if (MRL_SENS(ctrl))
 		ops->get_latch_status = get_latch_status;
 	if (ATTN_LED(ctrl)) {
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 81b23f07719e..185ec9d1b0d0 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -798,6 +798,24 @@  void pcie_disable_interrupt(struct controller *ctrl)
 	pcie_write_cmd(ctrl, 0, mask);
 }
 
+int pciehp_reset_lock(struct hotplug_slot *hotplug_slot)
+{
+	struct controller *ctrl = to_ctrl(hotplug_slot);
+
+	down_write(&ctrl->reset_lock);
+
+	return 0;
+}
+
+int pciehp_reset_unlock(struct hotplug_slot *hotplug_slot)
+{
+	struct controller *ctrl = to_ctrl(hotplug_slot);
+
+	up_write(&ctrl->reset_lock);
+
+	return 0;
+}
+
 /*
  * pciehp has a 1:1 bus:slot relationship so we ultimately want a secondary
  * bus reset of the bridge, but at the same time we want to ensure that it is
@@ -816,8 +834,6 @@  int pciehp_reset_slot(struct hotplug_slot *hotplug_slot, int probe)
 	if (probe)
 		return 0;
 
-	down_write(&ctrl->reset_lock);
-
 	if (!ATTN_BUTTN(ctrl)) {
 		ctrl_mask |= PCI_EXP_SLTCTL_PDCE;
 		stat_mask |= PCI_EXP_SLTSTA_PDC;
@@ -836,7 +852,6 @@  int pciehp_reset_slot(struct hotplug_slot *hotplug_slot, int probe)
 	ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
 		 pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL, ctrl_mask);
 
-	up_write(&ctrl->reset_lock);
 	return rc;
 }
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index da21293f1111..e32c5a1a706e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5228,14 +5228,20 @@  static int pci_slot_reset(struct pci_slot *slot, int probe)
 		return -ENOTTY;
 
 	if (!probe)
+	{
+		slot->hotplug->ops->reset_lock(slot->hotplug);
 		pci_slot_lock(slot);
+	}
 
 	might_sleep();
 
 	rc = pci_reset_hotplug_slot(slot->hotplug, probe);
 
 	if (!probe)
+	{
 		pci_slot_unlock(slot);
+		slot->hotplug->ops->reset_unlock(slot->hotplug);
+	}
 
 	return rc;
 }
diff --git a/include/linux/pci_hotplug.h b/include/linux/pci_hotplug.h
index f694eb2ca978..fce5ad979346 100644
--- a/include/linux/pci_hotplug.h
+++ b/include/linux/pci_hotplug.h
@@ -45,6 +45,8 @@  struct hotplug_slot_ops {
 	int (*get_latch_status)		(struct hotplug_slot *slot, u8 *value);
 	int (*get_adapter_status)	(struct hotplug_slot *slot, u8 *value);
 	int (*reset_slot)		(struct hotplug_slot *slot, int probe);
+	int (*reset_lock)               (struct hotplug_slot *slot);
+	int (*reset_unlock)             (struct hotplug_slot *slot);
 };
 
 /**