diff mbox series

[v2] PCI: mvebu: Dispose INTx IRQs before to removing INTx domain

Message ID 20240711132544.9048-1-kabel@kernel.org (mailing list archive)
State Superseded
Delegated to: Krzysztof Wilczyński
Headers show
Series [v2] PCI: mvebu: Dispose INTx IRQs before to removing INTx domain | expand

Commit Message

Marek Behún July 11, 2024, 1:25 p.m. UTC
From: Pali Rohár <pali@kernel.org>

The documentation for the irq_domain_remove() function says that all
mappings within the IRQ domain must be disposed before the domain is
removed.

Currently, the INTx IRQs are not disposed in pci-mvebu driver .remove()
method, which causes the kernel to crash when unloading the driver and
then reading /sys/kernel/debug/irq/irqs/<num> or /proc/interrupts.

Unmapping of the IRQs at this point of the .remove() method is safe,
since the PCIe bus is already unregistered, and all its devices are
unbound from their drivers and removed. If there was indeed any
remaining use of PCIe resources, then it would mean that PCIe hotplug
code is broken, and we have bigger problems.

Fixes: ec075262648f ("PCI: mvebu: Implement support for legacy INTx interrupts")
Reported-by: Hajo Noerenberg <hajo-linux-bugzilla@noerenberg.de>
Signed-off-by: Pali Rohár <pali@kernel.org>
Reviewed-by: Marek Behún <kabel@kernel.org>
[ Marek: refactored a little, added more explanation to commit message ]
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
Changes since v1:
- added explanation into commit message about why this is safe to do,
  as suggested by Andy. The explanation originally comes from Pali:
    https://lore.kernel.org/linux-arm-kernel/20220809133911.hqi7eyskcq2sojia@pali/
---
 drivers/pci/controller/pci-mvebu.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Andrew Lunn July 11, 2024, 2:03 p.m. UTC | #1
On Thu, Jul 11, 2024 at 03:25:44PM +0200, Marek Behún wrote:
> From: Pali Rohár <pali@kernel.org>
> 
> The documentation for the irq_domain_remove() function says that all
> mappings within the IRQ domain must be disposed before the domain is
> removed.
> 
> Currently, the INTx IRQs are not disposed in pci-mvebu driver .remove()
> method, which causes the kernel to crash when unloading the driver and
> then reading /sys/kernel/debug/irq/irqs/<num> or /proc/interrupts.
> 
> Unmapping of the IRQs at this point of the .remove() method is safe,
> since the PCIe bus is already unregistered, and all its devices are
> unbound from their drivers and removed. If there was indeed any
> remaining use of PCIe resources, then it would mean that PCIe hotplug
> code is broken, and we have bigger problems.
> 
> Fixes: ec075262648f ("PCI: mvebu: Implement support for legacy INTx interrupts")
> Reported-by: Hajo Noerenberg <hajo-linux-bugzilla@noerenberg.de>
> Signed-off-by: Pali Rohár <pali@kernel.org>
> Reviewed-by: Marek Behún <kabel@kernel.org>
> [ Marek: refactored a little, added more explanation to commit message ]
> Signed-off-by: Marek Behún <kabel@kernel.org>
> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew
Bjorn Helgaas July 12, 2024, 8:41 p.m. UTC | #2
[+cc Pali, seems like the author should be included,
Thomas, Marc since they actually know about IRQs, unlike me]

On Thu, Jul 11, 2024 at 03:25:44PM +0200, Marek Behún wrote:
> From: Pali Rohár <pali@kernel.org>
> 
> The documentation for the irq_domain_remove() function says that all
> mappings within the IRQ domain must be disposed before the domain is
> removed.
> 
> Currently, the INTx IRQs are not disposed in pci-mvebu driver .remove()
> method, which causes the kernel to crash when unloading the driver and
> then reading /sys/kernel/debug/irq/irqs/<num> or /proc/interrupts.
> 
> Unmapping of the IRQs at this point of the .remove() method is safe,
> since the PCIe bus is already unregistered, and all its devices are
> unbound from their drivers and removed. If there was indeed any
> remaining use of PCIe resources, then it would mean that PCIe hotplug
> code is broken, and we have bigger problems.
> 
> Fixes: ec075262648f ("PCI: mvebu: Implement support for legacy INTx interrupts")
> Reported-by: Hajo Noerenberg <hajo-linux-bugzilla@noerenberg.de>

Is there a URL for this report?

> Signed-off-by: Pali Rohár <pali@kernel.org>
> Reviewed-by: Marek Behún <kabel@kernel.org>
> [ Marek: refactored a little, added more explanation to commit message ]
> Signed-off-by: Marek Behún <kabel@kernel.org>
> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> ---
> Changes since v1:
> - added explanation into commit message about why this is safe to do,
>   as suggested by Andy. The explanation originally comes from Pali:
>     https://lore.kernel.org/linux-arm-kernel/20220809133911.hqi7eyskcq2sojia@pali/
> ---
>  drivers/pci/controller/pci-mvebu.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
> index 29fe09c99e7d..91a02b23aeb1 100644
> --- a/drivers/pci/controller/pci-mvebu.c
> +++ b/drivers/pci/controller/pci-mvebu.c
> @@ -1683,8 +1683,15 @@ static void mvebu_pcie_remove(struct platform_device *pdev)
>  			irq_set_chained_handler_and_data(irq, NULL, NULL);
>  
>  		/* Remove IRQ domains. */
> -		if (port->intx_irq_domain)
> +		if (port->intx_irq_domain) {
> +			for (int j = 0; j < PCI_NUM_INTX; j++) {
> +				int virq = irq_find_mapping(port->intx_irq_domain, j);
> +
> +				if (virq > 0)
> +					irq_dispose_mapping(virq);

I am not an IRQ expert, so all I can really do is compare this to
usage in other drivers.

There are 20+ drivers in drivers/pci/controller, and I don't see
irq_dispose_mapping() usage similar to this elsewhere.  Does that mean
most or all of the other drivers have a similar defect?

> +			}
>  			irq_domain_remove(port->intx_irq_domain);
> +		}
>  
>  		/* Free config space for emulated root bridge. */
>  		pci_bridge_emul_cleanup(&port->bridge);
> -- 
> 2.44.2
>
Thomas Gleixner July 13, 2024, 10:33 a.m. UTC | #3
On Fri, Jul 12 2024 at 15:41, Bjorn Helgaas wrote:
> On Thu, Jul 11, 2024 at 03:25:44PM +0200, Marek Behún wrote:
>>  		/* Remove IRQ domains. */
>> -		if (port->intx_irq_domain)
>> +		if (port->intx_irq_domain) {
>> +			for (int j = 0; j < PCI_NUM_INTX; j++) {
>> +				int virq = irq_find_mapping(port->intx_irq_domain, j);
>> +
>> +				if (virq > 0)
>> +					irq_dispose_mapping(virq);
>
> I am not an IRQ expert, so all I can really do is compare this to
> usage in other drivers.
>
> There are 20+ drivers in drivers/pci/controller, and I don't see
> irq_dispose_mapping() usage similar to this elsewhere.  Does that mean
> most or all of the other drivers have a similar defect?

Right.

But the real question is why is such a mapping not torn down by the
entity (device, bridge, whatever) which set it up in the first place?

Thanks,

        tglx
Bjorn Helgaas July 13, 2024, 7:32 p.m. UTC | #4
On Sat, Jul 13, 2024 at 12:33:29PM +0200, Thomas Gleixner wrote:
> On Fri, Jul 12 2024 at 15:41, Bjorn Helgaas wrote:
> > On Thu, Jul 11, 2024 at 03:25:44PM +0200, Marek Behún wrote:
> >>  		/* Remove IRQ domains. */
> >> -		if (port->intx_irq_domain)
> >> +		if (port->intx_irq_domain) {
> >> +			for (int j = 0; j < PCI_NUM_INTX; j++) {
> >> +				int virq = irq_find_mapping(port->intx_irq_domain, j);
> >> +
> >> +				if (virq > 0)
> >> +					irq_dispose_mapping(virq);
> >
> > I am not an IRQ expert, so all I can really do is compare this to
> > usage in other drivers.
> >
> > There are 20+ drivers in drivers/pci/controller, and I don't see
> > irq_dispose_mapping() usage similar to this elsewhere.  Does that mean
> > most or all of the other drivers have a similar defect?
> 
> Right.
> 
> But the real question is why is such a mapping not torn down by the
> entity (device, bridge, whatever) which set it up in the first place?

Marek/Pali, the commit log mentions a crash when unloading.  Do you
have a pointer to any details?  Maybe there's a driver there that we
can fix?

Bjorn
Thomas Gleixner July 13, 2024, 7:56 p.m. UTC | #5
On Sat, Jul 13 2024 at 14:32, Bjorn Helgaas wrote:
> On Sat, Jul 13, 2024 at 12:33:29PM +0200, Thomas Gleixner wrote:
>> > There are 20+ drivers in drivers/pci/controller, and I don't see
>> > irq_dispose_mapping() usage similar to this elsewhere.  Does that mean
>> > most or all of the other drivers have a similar defect?
>> 
>> Right.
>> 
>> But the real question is why is such a mapping not torn down by the
>> entity (device, bridge, whatever) which set it up in the first place?
>
> Marek/Pali, the commit log mentions a crash when unloading.  Do you
> have a pointer to any details?  Maybe there's a driver there that we
> can fix?

Pali already explained it in his reply to me, but for some stupid reason
(my fault hitting reply instead of reply-all) this did not make the list.

Let me replay the discussion:

>>> Pali:
>>> I'm not expert neither. But IIRC it is because endpoint drivers do not
>>> call irq_dispose_mapping at their own for shared IRQs. And this is how
>>> shared PCI INTA-D interrupts are implemented in the kernel. First
>>> endpoint driver (e.g. wifi card) request for shared interrupt and kernel
>>> automatically creates irq mapping if it does not exist. Second endpoint
>>> driver (e.g. second wifi card) request for shared interrupt and kernel
>>> just returns existing mapping. And when the first endpoint driver is
>>> releasing its own irq handler it do not dispose irq mapping because it
>>> may be shared with other endpoint drivers (in this case with second wifi
>>> card). Seems that the owner of these shared mappings is the PCI
>>> controller driver and it has to do dispose them on its own removal (when
>>> releasing the domain for shared PCI IRQs).

>> tglx:
>> Working around this in a particular PCI controller driver is just wrong
>> then. This wants a common cleanup function so all affected drivers which
>> remove the INTX irqdomain can be fixed up without copy & pasta of the
>> very same code. Something like the below and then change the
>> irq_domain_remove() call for the INTX domain to use that function.

There was some additional back and forth but the actual patch which can
be used for both INTX and other, e.g. error reporting, domains is below.

Sorry for taking this offlist unintenitonally.

The background of all this is that initially PCI[e] controllers were not
removable/modular. Later on the whole modularization effort created this
problem.

Thanks,

        tglx
---

--- a/drivers/pci/irq.c
+++ b/drivers/pci/irq.c
@@ -278,3 +278,19 @@ int __weak pcibios_alloc_irq(struct pci_
 void __weak pcibios_free_irq(struct pci_dev *dev)
 {
 }
+
+#ifdef CONFIG_IRQDOMAIN
+void pci_remove_irqdomain(struct irqdomain *domain, unsigned int nr_irqs)
+{
+	/*
+	 * Comment explaining the oddity of this.
+	 */
+	for (unsigned int i = 0; j < nr_irqs; i++) {
+		int virq = irq_find_mapping(domain, i);
+
+		if (virq > 0)
+			irq_dispose_mapping(virq);
+	}
+	irq_domain_remove(domain);
+}
+#endif
diff mbox series

Patch

diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
index 29fe09c99e7d..91a02b23aeb1 100644
--- a/drivers/pci/controller/pci-mvebu.c
+++ b/drivers/pci/controller/pci-mvebu.c
@@ -1683,8 +1683,15 @@  static void mvebu_pcie_remove(struct platform_device *pdev)
 			irq_set_chained_handler_and_data(irq, NULL, NULL);
 
 		/* Remove IRQ domains. */
-		if (port->intx_irq_domain)
+		if (port->intx_irq_domain) {
+			for (int j = 0; j < PCI_NUM_INTX; j++) {
+				int virq = irq_find_mapping(port->intx_irq_domain, j);
+
+				if (virq > 0)
+					irq_dispose_mapping(virq);
+			}
 			irq_domain_remove(port->intx_irq_domain);
+		}
 
 		/* Free config space for emulated root bridge. */
 		pci_bridge_emul_cleanup(&port->bridge);