diff mbox series

[v1] PCI: starfive: Fix kmemleak in StarFive PCIe driver's IRQ handling

Message ID 20250208140110.2389-1-linux.amoon@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Krzysztof Wilczyński
Headers show
Series [v1] PCI: starfive: Fix kmemleak in StarFive PCIe driver's IRQ handling | expand

Commit Message

Anand Moon Feb. 8, 2025, 2:01 p.m. UTC
kmemleak reported following debug log

$ sudo cat /sys/kernel/debug/kmemleak
unreferenced object 0xffffffd6c47c2600 (size 128):
  comm "kworker/u16:2", pid 38, jiffies 4294942263
  hex dump (first 32 bytes):
    cc 7c 5a 8d ff ff ff ff 40 b0 47 c8 d6 ff ff ff  .|Z.....@.G.....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc 4f07ff07):
    __create_object+0x2a/0xfc
    kmemleak_alloc+0x38/0x98
    __kmalloc_cache_noprof+0x296/0x444
    request_threaded_irq+0x168/0x284
    devm_request_threaded_irq+0xa8/0x13c
    plda_init_interrupts+0x46e/0x858
    plda_pcie_host_init+0x356/0x468
    starfive_pcie_probe+0x2f6/0x398
    platform_probe+0x106/0x150
    really_probe+0x30e/0x746
    __driver_probe_device+0x11c/0x2c2
    driver_probe_device+0x5e/0x316
    __device_attach_driver+0x296/0x3a4
    bus_for_each_drv+0x1d0/0x260
    __device_attach+0x1fa/0x2d6
    device_initial_probe+0x14/0x28
unreferenced object 0xffffffd6c47c2900 (size 128):
  comm "kworker/u16:2", pid 38, jiffies 4294942281

This patch addresses a kmemleak reported during StarFive PCIe driver
initialization. The leak was observed with kmemleak reporting
unreferenced objects related to IRQ handling. The backtrace pointed
to the `request_threaded_irq` and related functions within the
`plda_init_interrupts` path, indicating that some allocated memory
related to IRQs was not being properly freed.

The issue was that while the driver requested IRQs, it wasn't
correctly handling the lifecycle of the associated resources.
This patch introduces an event IRQ handler and the necessary
infrastructure to manage these IRQs, preventing the memory leak.

Fixes: 647690479660 ("PCI: microchip: Add request_event_irq() callback function")
Cc: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Anand Moon <linux.amoon@gmail.com>
---
 drivers/pci/controller/plda/pcie-starfive.c | 39 +++++++++++++++++++++
 1 file changed, 39 insertions(+)


base-commit: bb066fe812d6fb3a9d01c073d9f1e2fd5a63403b

Comments

Markus Elfring Feb. 8, 2025, 4:33 p.m. UTC | #1
> This patch addresses a kmemleak reported during StarFive PCIe driver> This patch introduces an event IRQ handler and the necessary
> infrastructure to manage these IRQs, preventing the memory leak.

See also:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.13#n94

Regards,
Markus
Anand Moon Feb. 9, 2025, 4:22 a.m. UTC | #2
Hi Markus,

On Sat, 8 Feb 2025 at 22:03, Markus Elfring <Markus.Elfring@web.de> wrote:
>
> …
> > This patch addresses a kmemleak reported during StarFive PCIe driver
> …
> > This patch introduces an event IRQ handler and the necessary
> > infrastructure to manage these IRQs, preventing the memory leak.
>
> See also:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.13#n94
>
Thanks I will update the commit message, one I get more feedback on
this code changes.
> Regards,
> Markus

Thanks
-Anand
Manivannan Sadhasivam Feb. 10, 2025, 5:44 p.m. UTC | #3
On Sat, Feb 08, 2025 at 07:31:08PM +0530, Anand Moon wrote:
> kmemleak reported following debug log
> 
> $ sudo cat /sys/kernel/debug/kmemleak
> unreferenced object 0xffffffd6c47c2600 (size 128):
>   comm "kworker/u16:2", pid 38, jiffies 4294942263
>   hex dump (first 32 bytes):
>     cc 7c 5a 8d ff ff ff ff 40 b0 47 c8 d6 ff ff ff  .|Z.....@.G.....
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace (crc 4f07ff07):
>     __create_object+0x2a/0xfc
>     kmemleak_alloc+0x38/0x98
>     __kmalloc_cache_noprof+0x296/0x444
>     request_threaded_irq+0x168/0x284
>     devm_request_threaded_irq+0xa8/0x13c
>     plda_init_interrupts+0x46e/0x858
>     plda_pcie_host_init+0x356/0x468
>     starfive_pcie_probe+0x2f6/0x398
>     platform_probe+0x106/0x150
>     really_probe+0x30e/0x746
>     __driver_probe_device+0x11c/0x2c2
>     driver_probe_device+0x5e/0x316
>     __device_attach_driver+0x296/0x3a4
>     bus_for_each_drv+0x1d0/0x260
>     __device_attach+0x1fa/0x2d6
>     device_initial_probe+0x14/0x28
> unreferenced object 0xffffffd6c47c2900 (size 128):
>   comm "kworker/u16:2", pid 38, jiffies 4294942281
> 
> This patch addresses a kmemleak reported during StarFive PCIe driver
> initialization. The leak was observed with kmemleak reporting
> unreferenced objects related to IRQ handling. The backtrace pointed
> to the `request_threaded_irq` and related functions within the
> `plda_init_interrupts` path, indicating that some allocated memory
> related to IRQs was not being properly freed.
> 
> The issue was that while the driver requested IRQs, it wasn't
> correctly handling the lifecycle of the associated resources.

What resources?

> This patch introduces an event IRQ handler and the necessary
> infrastructure to manage these IRQs, preventing the memory leak.
> 

These handles appear pointless to me. What purpose are they serving?

- Mani

> Fixes: 647690479660 ("PCI: microchip: Add request_event_irq() callback function")
> Cc: Minda Chen <minda.chen@starfivetech.com>
> Signed-off-by: Anand Moon <linux.amoon@gmail.com>
> ---
>  drivers/pci/controller/plda/pcie-starfive.c | 39 +++++++++++++++++++++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/drivers/pci/controller/plda/pcie-starfive.c b/drivers/pci/controller/plda/pcie-starfive.c
> index e73c1b7bc8ef..a57177a8be8a 100644
> --- a/drivers/pci/controller/plda/pcie-starfive.c
> +++ b/drivers/pci/controller/plda/pcie-starfive.c
> @@ -381,7 +381,46 @@ static const struct plda_pcie_host_ops sf_host_ops = {
>  	.host_deinit = starfive_pcie_host_deinit,
>  };
>  
> +/* IRQ handler for PCIe events */
> +static irqreturn_t starfive_event_handler(int irq, void *dev_id)
> +{
> +	struct plda_pcie_rp *port = dev_id;
> +	struct irq_data *data;
> +
> +	/* Retrieve IRQ data */
> +	data = irq_domain_get_irq_data(port->event_domain, irq);
> +
> +	if (data) {
> +		dev_err_ratelimited(port->dev, "Event IRQ %ld triggered\n",
> +				    data->hwirq);
> +	} else {
> +		dev_err_ratelimited(port->dev, "Invalid event IRQ %d\n", irq);
> +	}
> +
> +	return IRQ_HANDLED;
> +}
> +
> +/* Request an IRQ for a specific event */
> +static int starfive_request_event_irq(struct plda_pcie_rp *plda, int event_irq,
> +				      int event)
> +{
> +	int ret;
> +
> +	/* Request the IRQ and associate it with the handler */
> +	ret = devm_request_irq(plda->dev, event_irq, starfive_event_handler,
> +			       IRQF_SHARED, "starfivepcie", plda);
> +
> +	if (ret) {
> +		dev_err(plda->dev, "Failed to request IRQ %d: %d\n", event_irq,
> +			ret);
> +		return ret;
> +	}
> +
> +	return ret;
> +}
> +
>  static const struct plda_event stf_pcie_event = {
> +	.request_event_irq = starfive_request_event_irq,
>  	.intx_event = EVENT_PM_MSI_INT_INTX,
>  	.msi_event  = EVENT_PM_MSI_INT_MSI
>  };
> 
> base-commit: bb066fe812d6fb3a9d01c073d9f1e2fd5a63403b
> -- 
> 2.48.1
>
Anand Moon Feb. 10, 2025, 7:39 p.m. UTC | #4
Hi Manivannan

On Mon, 10 Feb 2025 at 23:14, Manivannan Sadhasivam
<manivannan.sadhasivam@linaro.org> wrote:
>
> On Sat, Feb 08, 2025 at 07:31:08PM +0530, Anand Moon wrote:
> > kmemleak reported following debug log
> >
> > $ sudo cat /sys/kernel/debug/kmemleak
> > unreferenced object 0xffffffd6c47c2600 (size 128):
> >   comm "kworker/u16:2", pid 38, jiffies 4294942263
> >   hex dump (first 32 bytes):
> >     cc 7c 5a 8d ff ff ff ff 40 b0 47 c8 d6 ff ff ff  .|Z.....@.G.....
> >     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >   backtrace (crc 4f07ff07):
> >     __create_object+0x2a/0xfc
> >     kmemleak_alloc+0x38/0x98
> >     __kmalloc_cache_noprof+0x296/0x444
> >     request_threaded_irq+0x168/0x284
> >     devm_request_threaded_irq+0xa8/0x13c
> >     plda_init_interrupts+0x46e/0x858
> >     plda_pcie_host_init+0x356/0x468
> >     starfive_pcie_probe+0x2f6/0x398
> >     platform_probe+0x106/0x150
> >     really_probe+0x30e/0x746
> >     __driver_probe_device+0x11c/0x2c2
> >     driver_probe_device+0x5e/0x316
> >     __device_attach_driver+0x296/0x3a4
> >     bus_for_each_drv+0x1d0/0x260
> >     __device_attach+0x1fa/0x2d6
> >     device_initial_probe+0x14/0x28
> > unreferenced object 0xffffffd6c47c2900 (size 128):
> >   comm "kworker/u16:2", pid 38, jiffies 4294942281
> >
> > This patch addresses a kmemleak reported during StarFive PCIe driver
> > initialization. The leak was observed with kmemleak reporting
> > unreferenced objects related to IRQ handling. The backtrace pointed
> > to the `request_threaded_irq` and related functions within the
> > `plda_init_interrupts` path, indicating that some allocated memory
> > related to IRQs was not being properly freed.
> >
> > The issue was that while the driver requested IRQs, it wasn't
> > correctly handling the lifecycle of the associated resources.
>
> What resources?
>
The Microchip PCIe host driver [1] handles  PCI, SEC, DEBUG, and LOCAL
hardware events
through a dedicated framework [2]. This framework allows the core driver [3]
to monitor and wait for these specific events.

[1]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-microchip-host.c#L90-L292
[2]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-microchip-host.c#L374-L499
[3]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-plda-host.c#L448-L466

StarFive PCIe driver currently lacks the necessary `request_event_irq`
implementation
to integrate with this event-handling mechanism, which prevents the core driver
from properly responding to these events on StarFive platforms.

static const struct plda_event mc_event = {
.  request_event_irq = mc_request_event_irq,
  .intx_event        = EVENT_LOCAL_PM_MSI_INT_INTX,
  .msi_event         = EVENT_LOCAL_PM_MSI_INT_MSI,
};

This patch implements dummy `request_event_irq` for the StarFive PCIe driver,
Since the core [3] driver is monitoring for these events

> > This patch introduces an event IRQ handler and the necessary
> > infrastructure to manage these IRQs, preventing the memory leak.
> >
>
> These handles appear pointless to me. What purpose are they serving?
>
Yes, you are correct, the core event monitoring framework [3] triggered a
kernel memory leak. This patch adds a dummy IRQ callback as a
placeholder for proper event handling, which can be implemented in a
future patch.

> - Mani
>
Thanks
-Anand

> > Fixes: 647690479660 ("PCI: microchip: Add request_event_irq() callback function")
> > Cc: Minda Chen <minda.chen@starfivetech.com>
> > Signed-off-by: Anand Moon <linux.amoon@gmail.com>
> > ---
> >  drivers/pci/controller/plda/pcie-starfive.c | 39 +++++++++++++++++++++
> >  1 file changed, 39 insertions(+)
> >
> > diff --git a/drivers/pci/controller/plda/pcie-starfive.c b/drivers/pci/controller/plda/pcie-starfive.c
> > index e73c1b7bc8ef..a57177a8be8a 100644
> > --- a/drivers/pci/controller/plda/pcie-starfive.c
> > +++ b/drivers/pci/controller/plda/pcie-starfive.c
> > @@ -381,7 +381,46 @@ static const struct plda_pcie_host_ops sf_host_ops = {
> >       .host_deinit = starfive_pcie_host_deinit,
> >  };
> >
> > +/* IRQ handler for PCIe events */
> > +static irqreturn_t starfive_event_handler(int irq, void *dev_id)
> > +{
> > +     struct plda_pcie_rp *port = dev_id;
> > +     struct irq_data *data;
> > +
> > +     /* Retrieve IRQ data */
> > +     data = irq_domain_get_irq_data(port->event_domain, irq);
> > +
> > +     if (data) {
> > +             dev_err_ratelimited(port->dev, "Event IRQ %ld triggered\n",
> > +                                 data->hwirq);
> > +     } else {
> > +             dev_err_ratelimited(port->dev, "Invalid event IRQ %d\n", irq);
> > +     }
> > +
> > +     return IRQ_HANDLED;
> > +}
> > +
> > +/* Request an IRQ for a specific event */
> > +static int starfive_request_event_irq(struct plda_pcie_rp *plda, int event_irq,
> > +                                   int event)
> > +{
> > +     int ret;
> > +
> > +     /* Request the IRQ and associate it with the handler */
> > +     ret = devm_request_irq(plda->dev, event_irq, starfive_event_handler,
> > +                            IRQF_SHARED, "starfivepcie", plda);
> > +
> > +     if (ret) {
> > +             dev_err(plda->dev, "Failed to request IRQ %d: %d\n", event_irq,
> > +                     ret);
> > +             return ret;
> > +     }
> > +
> > +     return ret;
> > +}
> > +
> >  static const struct plda_event stf_pcie_event = {
> > +     .request_event_irq = starfive_request_event_irq,
> >       .intx_event = EVENT_PM_MSI_INT_INTX,
> >       .msi_event  = EVENT_PM_MSI_INT_MSI
> >  };
> >
> > base-commit: bb066fe812d6fb3a9d01c073d9f1e2fd5a63403b
> > --
> > 2.48.1
> >
>
> --
> மணிவண்ணன் சதாசிவம்
Manivannan Sadhasivam Feb. 14, 2025, 6:09 a.m. UTC | #5
On Tue, Feb 11, 2025 at 01:09:04AM +0530, Anand Moon wrote:
> Hi Manivannan
> 
> On Mon, 10 Feb 2025 at 23:14, Manivannan Sadhasivam
> <manivannan.sadhasivam@linaro.org> wrote:
> >
> > On Sat, Feb 08, 2025 at 07:31:08PM +0530, Anand Moon wrote:
> > > kmemleak reported following debug log
> > >
> > > $ sudo cat /sys/kernel/debug/kmemleak
> > > unreferenced object 0xffffffd6c47c2600 (size 128):
> > >   comm "kworker/u16:2", pid 38, jiffies 4294942263
> > >   hex dump (first 32 bytes):
> > >     cc 7c 5a 8d ff ff ff ff 40 b0 47 c8 d6 ff ff ff  .|Z.....@.G.....
> > >     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > >   backtrace (crc 4f07ff07):
> > >     __create_object+0x2a/0xfc
> > >     kmemleak_alloc+0x38/0x98
> > >     __kmalloc_cache_noprof+0x296/0x444
> > >     request_threaded_irq+0x168/0x284
> > >     devm_request_threaded_irq+0xa8/0x13c
> > >     plda_init_interrupts+0x46e/0x858
> > >     plda_pcie_host_init+0x356/0x468
> > >     starfive_pcie_probe+0x2f6/0x398
> > >     platform_probe+0x106/0x150
> > >     really_probe+0x30e/0x746
> > >     __driver_probe_device+0x11c/0x2c2
> > >     driver_probe_device+0x5e/0x316
> > >     __device_attach_driver+0x296/0x3a4
> > >     bus_for_each_drv+0x1d0/0x260
> > >     __device_attach+0x1fa/0x2d6
> > >     device_initial_probe+0x14/0x28
> > > unreferenced object 0xffffffd6c47c2900 (size 128):
> > >   comm "kworker/u16:2", pid 38, jiffies 4294942281
> > >
> > > This patch addresses a kmemleak reported during StarFive PCIe driver
> > > initialization. The leak was observed with kmemleak reporting
> > > unreferenced objects related to IRQ handling. The backtrace pointed
> > > to the `request_threaded_irq` and related functions within the
> > > `plda_init_interrupts` path, indicating that some allocated memory
> > > related to IRQs was not being properly freed.
> > >
> > > The issue was that while the driver requested IRQs, it wasn't
> > > correctly handling the lifecycle of the associated resources.
> >
> > What resources?
> >
> The Microchip PCIe host driver [1] handles  PCI, SEC, DEBUG, and LOCAL
> hardware events
> through a dedicated framework [2]. This framework allows the core driver [3]
> to monitor and wait for these specific events.
> 

Microchip driver also has its own 'event_ops' and 'event_irq_chip', so defining
'request_event_irq()' callback makes sense to me.

> [1]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-microchip-host.c#L90-L292
> [2]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-microchip-host.c#L374-L499
> [3]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-plda-host.c#L448-L466
> 
> StarFive PCIe driver currently lacks the necessary `request_event_irq`
> implementation
> to integrate with this event-handling mechanism, which prevents the core driver
> from properly responding to these events on StarFive platforms.
> 
> static const struct plda_event mc_event = {
> .  request_event_irq = mc_request_event_irq,
>   .intx_event        = EVENT_LOCAL_PM_MSI_INT_INTX,
>   .msi_event         = EVENT_LOCAL_PM_MSI_INT_MSI,
> };
> 
> This patch implements dummy `request_event_irq` for the StarFive PCIe driver,
> Since the core [3] driver is monitoring for these events
> 

This still doesn't make sense to me. Under what condition you observed the
kmemleak? Since it points to devm_request_irq(), I can understand that the
memory allocated for the IRQ is not freed. But when does it happen?

> > > This patch introduces an event IRQ handler and the necessary
> > > infrastructure to manage these IRQs, preventing the memory leak.
> > >
> >
> > These handles appear pointless to me. What purpose are they serving?
> >
> Yes, you are correct, the core event monitoring framework [3] triggered a
> kernel memory leak. This patch adds a dummy IRQ callback as a
> placeholder for proper event handling, which can be implemented in a
> future patch.
> 

The dummy request_event_irq() callback is not supposed to be needed in the first
place. So clearly, this patch is not fixing the actual memory leak but trying to
cover it up.

- Mani
Anand Moon Feb. 20, 2025, 8:13 a.m. UTC | #6
Hi Manivannan

On Fri, 14 Feb 2025 at 11:39, Manivannan Sadhasivam
<manivannan.sadhasivam@linaro.org> wrote:
>
> On Tue, Feb 11, 2025 at 01:09:04AM +0530, Anand Moon wrote:
> > Hi Manivannan
> >
> > On Mon, 10 Feb 2025 at 23:14, Manivannan Sadhasivam
> > <manivannan.sadhasivam@linaro.org> wrote:
> > >
> > > On Sat, Feb 08, 2025 at 07:31:08PM +0530, Anand Moon wrote:
> > > > kmemleak reported following debug log
> > > >
> > > > $ sudo cat /sys/kernel/debug/kmemleak
> > > > unreferenced object 0xffffffd6c47c2600 (size 128):
> > > >   comm "kworker/u16:2", pid 38, jiffies 4294942263
> > > >   hex dump (first 32 bytes):
> > > >     cc 7c 5a 8d ff ff ff ff 40 b0 47 c8 d6 ff ff ff  .|Z.....@.G.....
> > > >     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > > >   backtrace (crc 4f07ff07):
> > > >     __create_object+0x2a/0xfc
> > > >     kmemleak_alloc+0x38/0x98
> > > >     __kmalloc_cache_noprof+0x296/0x444
> > > >     request_threaded_irq+0x168/0x284
> > > >     devm_request_threaded_irq+0xa8/0x13c
> > > >     plda_init_interrupts+0x46e/0x858
> > > >     plda_pcie_host_init+0x356/0x468
> > > >     starfive_pcie_probe+0x2f6/0x398
> > > >     platform_probe+0x106/0x150
> > > >     really_probe+0x30e/0x746
> > > >     __driver_probe_device+0x11c/0x2c2
> > > >     driver_probe_device+0x5e/0x316
> > > >     __device_attach_driver+0x296/0x3a4
> > > >     bus_for_each_drv+0x1d0/0x260
> > > >     __device_attach+0x1fa/0x2d6
> > > >     device_initial_probe+0x14/0x28
> > > > unreferenced object 0xffffffd6c47c2900 (size 128):
> > > >   comm "kworker/u16:2", pid 38, jiffies 4294942281
> > > >
> > > > This patch addresses a kmemleak reported during StarFive PCIe driver
> > > > initialization. The leak was observed with kmemleak reporting
> > > > unreferenced objects related to IRQ handling. The backtrace pointed
> > > > to the `request_threaded_irq` and related functions within the
> > > > `plda_init_interrupts` path, indicating that some allocated memory
> > > > related to IRQs was not being properly freed.
> > > >
> > > > The issue was that while the driver requested IRQs, it wasn't
> > > > correctly handling the lifecycle of the associated resources.
> > >
> > > What resources?
> > >
> > The Microchip PCIe host driver [1] handles  PCI, SEC, DEBUG, and LOCAL
> > hardware events
> > through a dedicated framework [2]. This framework allows the core driver [3]
> > to monitor and wait for these specific events.
> >
>
> Microchip driver also has its own 'event_ops' and 'event_irq_chip', so defining
> 'request_event_irq()' callback makes sense to me.
>
> > [1]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-microchip-host.c#L90-L292
> > [2]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-microchip-host.c#L374-L499
> > [3]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-plda-host.c#L448-L466
> >
> > StarFive PCIe driver currently lacks the necessary `request_event_irq`
> > implementation
> > to integrate with this event-handling mechanism, which prevents the core driver
> > from properly responding to these events on StarFive platforms.
> >
> > static const struct plda_event mc_event = {
> > .  request_event_irq = mc_request_event_irq,
> >   .intx_event        = EVENT_LOCAL_PM_MSI_INT_INTX,
> >   .msi_event         = EVENT_LOCAL_PM_MSI_INT_MSI,
> > };
> >
> > This patch implements dummy `request_event_irq` for the StarFive PCIe driver,
> > Since the core [3] driver is monitoring for these events
> >
>
> This still doesn't make sense to me. Under what condition you observed the
> kmemleak? Since it points to devm_request_irq(), I can understand that the
> memory allocated for the IRQ is not freed. But when does it happen?
>
The PCIe host driver is responsible for monitoring the PLDA PCIe EVENT irq.
However, I have been unable to locate the register map necessary to
implement the
required code updates,

alarm@archriscv:/media/nvme0/mainline/linux-riscv-6.y-devel$ cat
/proc/interrupts
           CPU0       CPU1       CPU2       CPU3
 10:     101145      76126      79040      80218 RISC-V INTC   5 Edge
    riscv-timer
 12:          3          0          0          0 SiFive PLIC 111 Edge
    17030000.power-controller
 13:         20          0          0          0 SiFive PLIC  30 Edge
    1600c000.rng
 14:          0          0          0          0 SiFive PLIC   1 Edge
    ccache_ecc
 15:          0          0          0          0 SiFive PLIC   3 Edge
    ccache_ecc
 16:          0          0          0          0 SiFive PLIC   4 Edge
    ccache_ecc
 17:          0          0          0          0 SiFive PLIC   2 Edge
    ccache_ecc
 20:          0          0          0          0 SiFive PLIC  73 Edge
    dw_axi_dmac_platform
 21:       1694          0          0          0 SiFive PLIC  32 Edge      ttyS0
 22:          0          0          0          0 SiFive PLIC  35 Edge
    10030000.i2c
 23:          0          0          0          0 SiFive PLIC  75 Edge
    dw-mci
 24:          0          0          0          0 SiFive PLIC  37 Edge
    10050000.i2c
 25:        171          0          0          0 SiFive PLIC  50 Edge
    12050000.i2c
 26:          0          0          0          0 SiFive PLIC  51 Edge
    12060000.i2c
 27:      10419          0          0          0 SiFive PLIC  74 Edge
    dw-mci
 28:          6          0          0          0 SiFive PLIC  25 Edge
    13010000.spi
 29:          0          0          0          0 SiFive PLIC  38 Edge      pl022
 31:          0          0          0          0 PLDA PCIe EVENT  16
Edge      940000000.pcie
 32:          0          0          0          0 PLDA PCIe EVENT  17
Edge      940000000.pcie
 33:          0          0          0          0 PLDA PCIe EVENT  18
Edge      940000000.pcie
 34:          0          0          0          0 PLDA PCIe EVENT  20
Edge      940000000.pcie
 35:          0          0          0          0 PLDA PCIe EVENT  21
Edge      940000000.pcie
 36:          0          0          0          0 PLDA PCIe EVENT  22
Edge      940000000.pcie
 39:          0          0          0          0 PLDA PCIe EVENT  26
Edge      940000000.pcie
 40:          0          0          0          0 PLDA PCIe EVENT  27
Edge      940000000.pcie
 41:          0          0          0          0 17020000.pinctrl  41
Edge      16020000.mmc cd
 42:          0          0          0          0 PLDA PCIe EVENT  28
Edge      940000000.pcie
 44:          0          0          0          0 PLDA PCIe MSI   0
Edge      PCIe PME, aerdrv, PCIe bwctrl
 46:          0          0          0          0 PLDA PCIe EVENT  16
Edge      9c0000000.pcie
 47:          0          0          0          0 PLDA PCIe EVENT  17
Edge      9c0000000.pcie
 48:          0          0          0          0 PLDA PCIe EVENT  18
Edge      9c0000000.pcie
 49:          0          0          0          0 PLDA PCIe EVENT  20
Edge      9c0000000.pcie
 50:          0          0          0          0 PLDA PCIe EVENT  21
Edge      9c0000000.pcie
 51:          0          0          0          0 PLDA PCIe EVENT  22
Edge      9c0000000.pcie
 54:          0          0          0          0 PLDA PCIe EVENT  26
Edge      9c0000000.pcie
 55:          0          0          0          0 PLDA PCIe EVENT  27
Edge      9c0000000.pcie
 56:          0          0          0          0 PLDA PCIe EVENT  28
Edge      9c0000000.pcie
 58:          0          0          0          0 PLDA PCIe MSI
134217728 Edge      PCIe PME, aerdrv, PCIe bwctrl

Yep, Thanks for your review comment.

The following changes fix the kernel memory leak warning at my end.

$ git diff
diff --git a/drivers/pci/controller/plda/pcie-plda-host.c
b/drivers/pci/controller/plda/pcie-plda-host.c
index 4153214ca410..d4e7c1b49607 100644
--- a/drivers/pci/controller/plda/pcie-plda-host.c
+++ b/drivers/pci/controller/plda/pcie-plda-host.c
@@ -461,6 +461,7 @@ int plda_init_interrupts(struct platform_device *pdev,

                if (ret) {
                        dev_err(dev, "failed to request IRQ %d\n", event_irq);
+                       irq_dispose_mapping(event_irq);
                        return ret;
                }
        }
> > > > This patch introduces an event IRQ handler and the necessary
> > > > infrastructure to manage these IRQs, preventing the memory leak.
> > > >
> > >
> > > These handles appear pointless to me. What purpose are they serving?
> > >
> > Yes, you are correct, the core event monitoring framework [3] triggered a
> > kernel memory leak. This patch adds a dummy IRQ callback as a
> > placeholder for proper event handling, which can be implemented in a
> > future patch.
> >
>
> The dummy request_event_irq() callback is not supposed to be needed in the first
> place. So clearly, this patch is not fixing the actual memory leak but trying to
> cover it up.
You are correct.
>
> - Mani
>
Thanks
-Anand
> --
> மணிவண்ணன் சதாசிவம்
Anand Moon Feb. 20, 2025, 10:23 a.m. UTC | #7
Hi Manivannan,

On Thu, 20 Feb 2025 at 13:43, Anand Moon <linux.amoon@gmail.com> wrote:
>
> Hi Manivannan
>
> On Fri, 14 Feb 2025 at 11:39, Manivannan Sadhasivam
> <manivannan.sadhasivam@linaro.org> wrote:
> >
> > On Tue, Feb 11, 2025 at 01:09:04AM +0530, Anand Moon wrote:
> > > Hi Manivannan
> > >
> > > On Mon, 10 Feb 2025 at 23:14, Manivannan Sadhasivam
> > > <manivannan.sadhasivam@linaro.org> wrote:
> > > >
> > > > On Sat, Feb 08, 2025 at 07:31:08PM +0530, Anand Moon wrote:
> > > > > kmemleak reported following debug log
> > > > >
> > > > > $ sudo cat /sys/kernel/debug/kmemleak
> > > > > unreferenced object 0xffffffd6c47c2600 (size 128):
> > > > >   comm "kworker/u16:2", pid 38, jiffies 4294942263
> > > > >   hex dump (first 32 bytes):
> > > > >     cc 7c 5a 8d ff ff ff ff 40 b0 47 c8 d6 ff ff ff  .|Z.....@.G.....
> > > > >     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > > > >   backtrace (crc 4f07ff07):
> > > > >     __create_object+0x2a/0xfc
> > > > >     kmemleak_alloc+0x38/0x98
> > > > >     __kmalloc_cache_noprof+0x296/0x444
> > > > >     request_threaded_irq+0x168/0x284
> > > > >     devm_request_threaded_irq+0xa8/0x13c
> > > > >     plda_init_interrupts+0x46e/0x858
> > > > >     plda_pcie_host_init+0x356/0x468
> > > > >     starfive_pcie_probe+0x2f6/0x398
> > > > >     platform_probe+0x106/0x150
> > > > >     really_probe+0x30e/0x746
> > > > >     __driver_probe_device+0x11c/0x2c2
> > > > >     driver_probe_device+0x5e/0x316
> > > > >     __device_attach_driver+0x296/0x3a4
> > > > >     bus_for_each_drv+0x1d0/0x260
> > > > >     __device_attach+0x1fa/0x2d6
> > > > >     device_initial_probe+0x14/0x28
> > > > > unreferenced object 0xffffffd6c47c2900 (size 128):
> > > > >   comm "kworker/u16:2", pid 38, jiffies 4294942281
> > > > >
> > > > > This patch addresses a kmemleak reported during StarFive PCIe driver
> > > > > initialization. The leak was observed with kmemleak reporting
> > > > > unreferenced objects related to IRQ handling. The backtrace pointed
> > > > > to the `request_threaded_irq` and related functions within the
> > > > > `plda_init_interrupts` path, indicating that some allocated memory
> > > > > related to IRQs was not being properly freed.
> > > > >
> > > > > The issue was that while the driver requested IRQs, it wasn't
> > > > > correctly handling the lifecycle of the associated resources.
> > > >
> > > > What resources?
> > > >
> > > The Microchip PCIe host driver [1] handles  PCI, SEC, DEBUG, and LOCAL
> > > hardware events
> > > through a dedicated framework [2]. This framework allows the core driver [3]
> > > to monitor and wait for these specific events.
> > >
> >
> > Microchip driver also has its own 'event_ops' and 'event_irq_chip', so defining
> > 'request_event_irq()' callback makes sense to me.
> >
> > > [1]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-microchip-host.c#L90-L292
> > > [2]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-microchip-host.c#L374-L499
> > > [3]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-plda-host.c#L448-L466
> > >
> > > StarFive PCIe driver currently lacks the necessary `request_event_irq`
> > > implementation
> > > to integrate with this event-handling mechanism, which prevents the core driver
> > > from properly responding to these events on StarFive platforms.
> > >
> > > static const struct plda_event mc_event = {
> > > .  request_event_irq = mc_request_event_irq,
> > >   .intx_event        = EVENT_LOCAL_PM_MSI_INT_INTX,
> > >   .msi_event         = EVENT_LOCAL_PM_MSI_INT_MSI,
> > > };
> > >
> > > This patch implements dummy `request_event_irq` for the StarFive PCIe driver,
> > > Since the core [3] driver is monitoring for these events
> > >
> >
> > This still doesn't make sense to me. Under what condition you observed the
> > kmemleak? Since it points to devm_request_irq(), I can understand that the
> > memory allocated for the IRQ is not freed. But when does it happen?
> >
> The PCIe host driver is responsible for monitoring the PLDA PCIe EVENT irq.
> However, I have been unable to locate the register map necessary to
> implement the
> required code updates,
>
> alarm@archriscv:/media/nvme0/mainline/linux-riscv-6.y-devel$ cat
> /proc/interrupts
>            CPU0       CPU1       CPU2       CPU3
>  10:     101145      76126      79040      80218 RISC-V INTC   5 Edge
>     riscv-timer
>  12:          3          0          0          0 SiFive PLIC 111 Edge
>     17030000.power-controller
>  13:         20          0          0          0 SiFive PLIC  30 Edge
>     1600c000.rng
>  14:          0          0          0          0 SiFive PLIC   1 Edge
>     ccache_ecc
>  15:          0          0          0          0 SiFive PLIC   3 Edge
>     ccache_ecc
>  16:          0          0          0          0 SiFive PLIC   4 Edge
>     ccache_ecc
>  17:          0          0          0          0 SiFive PLIC   2 Edge
>     ccache_ecc
>  20:          0          0          0          0 SiFive PLIC  73 Edge
>     dw_axi_dmac_platform
>  21:       1694          0          0          0 SiFive PLIC  32 Edge      ttyS0
>  22:          0          0          0          0 SiFive PLIC  35 Edge
>     10030000.i2c
>  23:          0          0          0          0 SiFive PLIC  75 Edge
>     dw-mci
>  24:          0          0          0          0 SiFive PLIC  37 Edge
>     10050000.i2c
>  25:        171          0          0          0 SiFive PLIC  50 Edge
>     12050000.i2c
>  26:          0          0          0          0 SiFive PLIC  51 Edge
>     12060000.i2c
>  27:      10419          0          0          0 SiFive PLIC  74 Edge
>     dw-mci
>  28:          6          0          0          0 SiFive PLIC  25 Edge
>     13010000.spi
>  29:          0          0          0          0 SiFive PLIC  38 Edge      pl022
>  31:          0          0          0          0 PLDA PCIe EVENT  16
> Edge      940000000.pcie
>  32:          0          0          0          0 PLDA PCIe EVENT  17
> Edge      940000000.pcie
>  33:          0          0          0          0 PLDA PCIe EVENT  18
> Edge      940000000.pcie
>  34:          0          0          0          0 PLDA PCIe EVENT  20
> Edge      940000000.pcie
>  35:          0          0          0          0 PLDA PCIe EVENT  21
> Edge      940000000.pcie
>  36:          0          0          0          0 PLDA PCIe EVENT  22
> Edge      940000000.pcie
>  39:          0          0          0          0 PLDA PCIe EVENT  26
> Edge      940000000.pcie
>  40:          0          0          0          0 PLDA PCIe EVENT  27
> Edge      940000000.pcie
>  41:          0          0          0          0 17020000.pinctrl  41
> Edge      16020000.mmc cd
>  42:          0          0          0          0 PLDA PCIe EVENT  28
> Edge      940000000.pcie
>  44:          0          0          0          0 PLDA PCIe MSI   0
> Edge      PCIe PME, aerdrv, PCIe bwctrl
>  46:          0          0          0          0 PLDA PCIe EVENT  16
> Edge      9c0000000.pcie
>  47:          0          0          0          0 PLDA PCIe EVENT  17
> Edge      9c0000000.pcie
>  48:          0          0          0          0 PLDA PCIe EVENT  18
> Edge      9c0000000.pcie
>  49:          0          0          0          0 PLDA PCIe EVENT  20
> Edge      9c0000000.pcie
>  50:          0          0          0          0 PLDA PCIe EVENT  21
> Edge      9c0000000.pcie
>  51:          0          0          0          0 PLDA PCIe EVENT  22
> Edge      9c0000000.pcie
>  54:          0          0          0          0 PLDA PCIe EVENT  26
> Edge      9c0000000.pcie
>  55:          0          0          0          0 PLDA PCIe EVENT  27
> Edge      9c0000000.pcie
>  56:          0          0          0          0 PLDA PCIe EVENT  28
> Edge      9c0000000.pcie
>  58:          0          0          0          0 PLDA PCIe MSI
> 134217728 Edge      PCIe PME, aerdrv, PCIe bwctrl
>
> Yep, Thanks for your review comment.
>
> The following changes fix the kernel memory leak warning at my end.
>
> $ git diff
> diff --git a/drivers/pci/controller/plda/pcie-plda-host.c
> b/drivers/pci/controller/plda/pcie-plda-host.c
> index 4153214ca410..d4e7c1b49607 100644
> --- a/drivers/pci/controller/plda/pcie-plda-host.c
> +++ b/drivers/pci/controller/plda/pcie-plda-host.c
> @@ -461,6 +461,7 @@ int plda_init_interrupts(struct platform_device *pdev,
>
>                 if (ret) {
>                         dev_err(dev, "failed to request IRQ %d\n", event_irq);
> +                       irq_dispose_mapping(event_irq);
>                         return ret;
>                 }
>         }

Following the change fix this warning in a kernel memory leak.
Would you happen to have any comments on these changes?

diff --git a/drivers/pci/controller/plda/pcie-plda-host.c
b/drivers/pci/controller/plda/pcie-plda-host.c
index 4153214ca410..5a72a5a33074 100644
--- a/drivers/pci/controller/plda/pcie-plda-host.c
+++ b/drivers/pci/controller/plda/pcie-plda-host.c
@@ -280,11 +280,6 @@ static u32 plda_get_events(struct plda_pcie_rp *port)
        return events;
 }

-static irqreturn_t plda_event_handler(int irq, void *dev_id)
-{
-       return IRQ_HANDLED;
-}
-
 static void plda_handle_event(struct irq_desc *desc)
 {
        struct plda_pcie_rp *port = irq_desc_get_handler_data(desc);
@@ -454,13 +449,10 @@ int plda_init_interrupts(struct platform_device *pdev,

                if (event->request_event_irq)
                        ret = event->request_event_irq(port, event_irq, i);
-               else
-                       ret = devm_request_irq(dev, event_irq,
-                                              plda_event_handler,
-                                              0, NULL, port);

                if (ret) {
                        dev_err(dev, "failed to request IRQ %d\n", event_irq);
+                       irq_dispose_mapping(event_irq);
                        return ret;
                }
        }
> > > > > This patch introduces an event IRQ handler and the necessary
> > > > > infrastructure to manage these IRQs, preventing the memory leak.
> > > > >
> > > >
> > > > These handles appear pointless to me. What purpose are they serving?
> > > >
> > > Yes, you are correct, the core event monitoring framework [3] triggered a
> > > kernel memory leak. This patch adds a dummy IRQ callback as a
> > > placeholder for proper event handling, which can be implemented in a
> > > future patch.
> > >
> >
> > The dummy request_event_irq() callback is not supposed to be needed in the first
> > place. So clearly, this patch is not fixing the actual memory leak but trying to
> > cover it up.
> You are correct.
> >
> > - Mani
> >
> Thanks
> -Anand
Thanks
-Anand
> > --
> > மணிவண்ணன் சதாசிவம்
diff mbox series

Patch

diff --git a/drivers/pci/controller/plda/pcie-starfive.c b/drivers/pci/controller/plda/pcie-starfive.c
index e73c1b7bc8ef..a57177a8be8a 100644
--- a/drivers/pci/controller/plda/pcie-starfive.c
+++ b/drivers/pci/controller/plda/pcie-starfive.c
@@ -381,7 +381,46 @@  static const struct plda_pcie_host_ops sf_host_ops = {
 	.host_deinit = starfive_pcie_host_deinit,
 };
 
+/* IRQ handler for PCIe events */
+static irqreturn_t starfive_event_handler(int irq, void *dev_id)
+{
+	struct plda_pcie_rp *port = dev_id;
+	struct irq_data *data;
+
+	/* Retrieve IRQ data */
+	data = irq_domain_get_irq_data(port->event_domain, irq);
+
+	if (data) {
+		dev_err_ratelimited(port->dev, "Event IRQ %ld triggered\n",
+				    data->hwirq);
+	} else {
+		dev_err_ratelimited(port->dev, "Invalid event IRQ %d\n", irq);
+	}
+
+	return IRQ_HANDLED;
+}
+
+/* Request an IRQ for a specific event */
+static int starfive_request_event_irq(struct plda_pcie_rp *plda, int event_irq,
+				      int event)
+{
+	int ret;
+
+	/* Request the IRQ and associate it with the handler */
+	ret = devm_request_irq(plda->dev, event_irq, starfive_event_handler,
+			       IRQF_SHARED, "starfivepcie", plda);
+
+	if (ret) {
+		dev_err(plda->dev, "Failed to request IRQ %d: %d\n", event_irq,
+			ret);
+		return ret;
+	}
+
+	return ret;
+}
+
 static const struct plda_event stf_pcie_event = {
+	.request_event_irq = starfive_request_event_irq,
 	.intx_event = EVENT_PM_MSI_INT_INTX,
 	.msi_event  = EVENT_PM_MSI_INT_MSI
 };