
PCI/portdrv: do not disable device on remove()

Message ID 1527011883-21320-1-git-send-email-okaya@codeaurora.org (mailing list archive)
State Not Applicable, archived
Delegated to: Andy Gross

Commit Message

Sinan Kaya May 22, 2018, 5:58 p.m. UTC
Commit cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during
shutdown") was added to the kernel to shut down pending PCIe port
service interrupts during reboot so that a newly started kexec kernel
would not observe pending interrupts.

pcie_port_device_remove() disables the root port and switches by
calling pci_disable_device() after all PCIe service drivers are shut down.

pci_disable_device() has a much wider impact than the port service itself:
it prevents all inbound transactions from reaching the system and affects
all PCI traffic behind the bridge.

The issue is that pcie_port_device_remove() does not coordinate with the
rest of the PCI device drivers in the system before clearing the bus
master bit.

This has been found to cause crashes on HP DL360 Gen9 machines during
reboot. Besides, kexec is already clearing the bus master bit in
pci_device_shutdown() after all PCI drivers are removed.

Just remove the extra clear here.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199779
Fixes: cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during shutdown")
Cc: stable@vger.kernel.org
Reported-by: Ryan Finnie <ryan@finnie.org>
---
 drivers/pci/pcie/portdrv_core.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Ryan Finnie May 22, 2018, 7:55 p.m. UTC | #1
On 05/22/2018 10:58 AM, Sinan Kaya wrote:
> Commit cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during
> shutdown") was added to the kernel to shut down pending PCIe port
> service interrupts during reboot so that a newly started kexec kernel
> would not observe pending interrupts.
> 
> pcie_port_device_remove() disables the root port and switches by
> calling pci_disable_device() after all PCIe service drivers are shut down.
> 
> pci_disable_device() has a much wider impact than the port service itself:
> it prevents all inbound transactions from reaching the system and affects
> all PCI traffic behind the bridge.
> 
> The issue is that pcie_port_device_remove() does not coordinate with the
> rest of the PCI device drivers in the system before clearing the bus
> master bit.
> 
> This has been found to cause crashes on HP DL360 Gen9 machines during
> reboot. Besides, kexec is already clearing the bus master bit in
> pci_device_shutdown() after all PCI drivers are removed.

For the avoidance of doubt, this problem has been observed on both DL360
Gen9 and DL380 Gen9, in both EFI and legacy modes.  I suspect all Gen9
models with the P89 firmware base are affected.

> Just remove the extra clear here.

Thank you!  Fix tested on:

- DL360 Gen9
- DL380 Gen9
- DL380 Gen10 (confirmed no regression)
- DL380 G7 (confirmed no regression)

> Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=199779
> Fixes: cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during shutdown")
> Cc: stable@vger.kernel.org
> Reported-by: Ryan Finnie <ryan@finnie.org>

Tested-by: Ryan Finnie <ryan@finnie.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Wunner May 23, 2018, 2:24 a.m. UTC | #2
On Tue, May 22, 2018 at 01:58:00PM -0400, Sinan Kaya wrote:
> --- a/drivers/pci/pcie/portdrv_core.c
> +++ b/drivers/pci/pcie/portdrv_core.c
> @@ -409,7 +409,6 @@ void pcie_port_device_remove(struct pci_dev *dev)
>  {
>  	device_for_each_child(&dev->dev, NULL, remove_iter);
>  	pci_free_irq_vectors(dev);
> -	pci_disable_device(dev);
>  }

Shutdown aside, pci_disable_device() is also not called in the ->remove
path with this patch, right?  Seems wrong.  E.g. when unbinding the driver
from the root port device, or when unplugging a port (happens all the time
with Thunderbolt).

Thanks,

Lukas
Sinan Kaya May 23, 2018, 2:40 a.m. UTC | #3
On 5/22/2018 10:24 PM, Lukas Wunner wrote:
> On Tue, May 22, 2018 at 01:58:00PM -0400, Sinan Kaya wrote:
>> --- a/drivers/pci/pcie/portdrv_core.c
>> +++ b/drivers/pci/pcie/portdrv_core.c
>> @@ -409,7 +409,6 @@ void pcie_port_device_remove(struct pci_dev *dev)
>>  {
>>  	device_for_each_child(&dev->dev, NULL, remove_iter);
>>  	pci_free_irq_vectors(dev);
>> -	pci_disable_device(dev);
>>  }
> 
> Shutdown aside, pci_disable_device() is also not called in the ->remove
> path with this patch, right?  Seems wrong.  E.g. when unbinding the driver
> from the root port device, or when unplugging a port (happens all the time
> with Thunderbolt).

Agreed. I'll spin another version where I skip disable on shutdown path only.

> 
> Thanks,
> 
> Lukas
>

Patch

diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index c9c0663..d22a95d 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -409,7 +409,6 @@  void pcie_port_device_remove(struct pci_dev *dev)
 {
 	device_for_each_child(&dev->dev, NULL, remove_iter);
 	pci_free_irq_vectors(dev);
-	pci_disable_device(dev);
 }
 
 /**