diff mbox series

[v7,3/5] PCI: pciehp: check and wait port status out of DPC before handling DLLSC and PDC

Message ID 20201003075514.32935-4-haifeng.zhao@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Bjorn Helgaas
Headers show
Series Fix DPC hotplug race and enhance error handling | expand

Commit Message

Zhao, Haifeng Oct. 3, 2020, 7:55 a.m. UTC
When root port has DPC capability and it is enabled, then triggered by
errors, DPC DLLSC and PDC etc interrupts will be sent to DPC driver, pciehp
drivers almost at the same time.
That will cause following messed and confused errors handling/recovery/removal
/plugin procedure.

1. Port and device are in error recovery reseting initiated by DPC hardware,
   pciehp driver treats them as device is doing hot-remove or hot-plugin the
   same time.

2. While DPC handler calling device driver->err_handler callback(
   error_detected/resume etc), but the slot may be powered off by

   pciehp
   -> remove_board()
      -> pciehp_power_off_slot().

3. While DPC handler -> pci_do_recovery is doing different action to detect
   error and recover based on device->error_state, pciehp driver could change
   it on the fly by:

   pciehp_unconfigure_device()
   ->pci_walk_bus()
     -> pci_dev_set_disconnected()

4. While DPC handler is calling device driver err_handler callback to detect
   error and recover, pciehp driver could is doing device unbind and release
   its driver.

   ...

While NON-FATAL/FATAL errors happen while hotplug is(is not)doing, result
is not determinate.

So we need some kind of synchronization between pciehp DLLSC/PDC handling
and DPC driver error recover handling.  we need a determinate result
of DPC error containment, link is recovered, link isn't recovered, device
is still there, device is removed, then do pciehp hot-remove and hot-plugin
procudure, don't mix them together.

Per our test on ICS platform, DPC error containment and software handler will
take 10ms up to 50ms till clean the DPC triggered status. it is quick enough
for pciehp compared with its 1000ms waiting to ignore DLLSC/PDC after doing
power off.

With this patch, the handling flow of DPC containment and hotplug is
partly ordered and serialized, let hardware DPC do the controller reset
etc recovery action first, then DPC driver handling the call-back from
device drivers, clear the DPC status, at the end, pciehp handle the DLLSC
and PDC etc.

After tens of PCIe Gen4 NVMe SSD brute force hot-remove and hot-plugin with
any time internval between the two actions, also stressed with the DPC
injection test. system recovered to normal working state from NON-FATAL/FATAL
errors as expected. hotplug works well without any random undeterminate errors
or malfunction.

Brute DPC error injection script:

for i in {0..100}
do
        setpci -s 64:02.0 0x196.w=000a
        setpci -s 65:00.0 0x04.w=0544
        mount /dev/nvme0n1p1 /root/nvme
        sleep 1
done

Signed-off-by: Ethan Zhao <haifeng.zhao@intel.com>
Tested-by: Wen Jin <wen.jin@intel.com>
Tested-by: Shanshan Zhang <ShanshanX.Zhang@intel.com>
---
Changes:
 v2: revise doc according to Andy's suggestion.
 v3: no change.
 v4: no change.
 v5: no change.
 v6: moved to [3/5] from [2/5] and re-wrote description.
 v7: no change.

 drivers/pci/hotplug/pciehp_hpc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Lukas Wunner Oct. 4, 2020, 7:13 p.m. UTC | #1
On Sat, Oct 03, 2020 at 03:55:12AM -0400, Ethan Zhao wrote:
> When root port has DPC capability and it is enabled, then triggered by
> errors, DPC DLLSC and PDC etc interrupts will be sent to DPC driver, pciehp
> drivers almost at the same time.

Do the DLLSC and PDC events occur as a result of handling the error
or do they occur independently?

If the latter, I don't see how we can tell whether the card in the
slot is still the same.

If the former, holding the hotplug slot's reset_lock and doing something
along the lines of pciehp_reset_slot() (or calling it directly) might
solve the race.

Thanks,

Lukas
Ethan Zhao Oct. 7, 2020, 7:48 a.m. UTC | #2
Lukas,

On Mon, Oct 5, 2020 at 3:13 AM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Sat, Oct 03, 2020 at 03:55:12AM -0400, Ethan Zhao wrote:
> > When root port has DPC capability and it is enabled, then triggered by
> > errors, DPC DLLSC and PDC etc interrupts will be sent to DPC driver, pciehp
> > drivers almost at the same time.
>
> Do the DLLSC and PDC events occur as a result of handling the error
> or do they occur independently?
They could happen independently if links were recovered then the card
was removed.
They could happen as a result of handling the errors the same time.

So don't assume DLLSC and PDC all occur at the same time.

>
> If the latter, I don't see how we can tell whether the card in the
> slot is still the same.
If PDC happens, the card in the slot might not be the same.  so
hot-removal /hot -plugin handling follows the PDC event.
>
> If the former, holding the hotplug slot's reset_lock and doing something
> along the lines of pciehp_reset_slot() (or calling it directly) might
> solve the race.

DPC reset is done by hardware, only AER calls pciehp_reset_slot() as recovery
handling initiated by software.

Thanks,
Ethan
>
> Thanks,
>
> Lukas
diff mbox series

Patch

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 53433b37e181..6f271160f18d 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -710,8 +710,10 @@  static irqreturn_t pciehp_ist(int irq, void *dev_id)
 	down_read(&ctrl->reset_lock);
 	if (events & DISABLE_SLOT)
 		pciehp_handle_disable_request(ctrl);
-	else if (events & (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC))
+	else if (events & (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC)) {
+		pci_wait_port_outdpc(pdev);
 		pciehp_handle_presence_or_link_change(ctrl, events);
+	}
 	up_read(&ctrl->reset_lock);
 
 	ret = IRQ_HANDLED;