[net-next] net: ag71xx: disable napi interrupts during probe

Message ID 20240828204135.6543-1-rosenp@gmail.com
State Changes Requested
Delegated to: Netdev Maintainers

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 7 this patch: 7
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 1 maintainers not CCed: chris.snook@gmail.com
netdev/build_clang success Errors and warnings before: 7 this patch: 7
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 7 this patch: 7
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 12 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-08-29--15-00 (tests: 711)

Commit Message

Rosen Penev Aug. 28, 2024, 8:41 p.m. UTC
From: Sven Eckelmann <sven@narfation.org>

ag71xx_probe registers ag71xx_interrupt as the handler for the gmac0/gmac1
interrupts. The handler uses napi_schedule to process packets, but
netif_napi_add for this device is called much later in ag71xx_probe.

A gmac0/gmac1 that is still running can therefore trigger the interrupt
handler with a bit from AG71XX_INT_POLL set in AG71XX_REG_INT_STATUS. The
handler then calls napi_schedule, and the napi code crashes the system
because ag->napi is not yet initialized.

The gmac0/gmac1 must be brought into a state in which it does not signal
AG71XX_INT_POLL related status bits as interrupts before the interrupt
handler is registered. ag71xx_hw_start takes care of re-initializing
AG71XX_REG_INT_ENABLE.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Rosen Penev <rosenp@gmail.com>
---
 drivers/net/ethernet/atheros/ag71xx.c | 6 ++++++
 1 file changed, 6 insertions(+)
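
The ordering problem described in the commit message can be sketched as a small user-space model. This is only an illustration, not driver code: every mock_ name and MOCK_ bit value is hypothetical, and the model assumes the MAC raises its IRQ line only for status bits that are also set in the enable register. It shows why clearing the AG71XX_INT_POLL bits in the enable register before registering the handler closes the window in which napi_schedule could run on an uninitialized ag->napi.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; not the real AG71XX_INT_* definitions. */
#define MOCK_INT_TX_PS (1u << 0)
#define MOCK_INT_RX_PR (1u << 4)
#define MOCK_INT_POLL  (MOCK_INT_TX_PS | MOCK_INT_RX_PR)

/* Hypothetical stand-in for the MAC and the driver state touched in probe. */
struct mock_mac {
	uint32_t int_enable;  /* models AG71XX_REG_INT_ENABLE */
	uint32_t int_status;  /* models AG71XX_REG_INT_STATUS */
	bool irq_registered;  /* set once the handler is requested */
	bool napi_ready;      /* set once netif_napi_add would have run */
	bool bad_schedule;    /* handler scheduled NAPI before it was ready */
};

/* Models ag71xx_int_disable(): clear the given bits in the enable register. */
static void mock_int_disable(struct mock_mac *mac, uint32_t ints)
{
	mac->int_enable &= ~ints;
}

/* Models the IRQ line plus handler: fires only for enabled, pending bits. */
static void mock_irq_line(struct mock_mac *mac)
{
	uint32_t pending = mac->int_status & mac->int_enable;

	if (!mac->irq_registered || !(pending & MOCK_INT_POLL))
		return;

	/* The real handler would call napi_schedule() at this point. */
	if (!mac->napi_ready)
		mac->bad_schedule = true;
}

/*
 * Walks through the probe ordering. Returns true if the handler would have
 * scheduled NAPI before it was initialized.
 */
static bool mock_probe_hits_uninitialized_napi(bool mask_first)
{
	struct mock_mac mac = {
		.int_enable = MOCK_INT_POLL,  /* assumed left on by an earlier boot stage */
		.int_status = MOCK_INT_RX_PR, /* a packet arrived while the MAC was running */
	};

	if (mask_first)
		mock_int_disable(&mac, MOCK_INT_POLL); /* the step this patch adds */

	mac.irq_registered = true; /* corresponds to devm_request_irq() */
	mock_irq_line(&mac);       /* the interrupt may fire immediately */
	mac.napi_ready = true;     /* corresponds to the later netif_napi_add() */

	return mac.bad_schedule;
}
```

Calling mock_probe_hits_uninitialized_napi(false) models the unpatched ordering and reports the early schedule; passing true models the patched ordering, where the masked bits keep the line quiet until the later hardware start re-enables them.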

Comments

Keller, Jacob E Aug. 28, 2024, 9:05 p.m. UTC | #1
On 8/28/2024 1:41 PM, Rosen Penev wrote:
> From: Sven Eckelmann <sven@narfation.org>
> 
> ag71xx_probe is registering ag71xx_interrupt as handler for gmac0/gmac1
> interrupts. The handler is trying to use napi_schedule to handle the
> processing of packets. But the netif_napi_add for this device is
> called a lot later in ag71xx_probe.
> 
> It can therefore happen that a still running gmac0/gmac1 is triggering the
> interrupt handler with a bit from AG71XX_INT_POLL set in
> AG71XX_REG_INT_STATUS. The handler will then call napi_schedule and the
> napi code will crash the system because the ag->napi is not yet
> initialized.
> 
> The gmac0/gmac1 must be brought into a state in which it doesn't signal
> AG71XX_INT_POLL related status bits as interrupts before registering the
> interrupt handler. ag71xx_hw_start will take care of re-initializing the
> AG71XX_REG_INT_ENABLE.
> 
> Signed-off-by: Sven Eckelmann <sven@narfation.org>
> Signed-off-by: Rosen Penev <rosenp@gmail.com>
> ---

The description reads like a bug fix, so I would expect this to be
targeted to net and have a Fixes tag indicating what commit introduced
the issue, maybe:

Fixes: d51b6ce441d3 ("net: ethernet: add ag71xx driver")

The change seems reasonable to me otherwise.

>  drivers/net/ethernet/atheros/ag71xx.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/ethernet/atheros/ag71xx.c b/drivers/net/ethernet/atheros/ag71xx.c
> index 0674a042e8d3..435c4b19acdd 100644
> --- a/drivers/net/ethernet/atheros/ag71xx.c
> +++ b/drivers/net/ethernet/atheros/ag71xx.c
> @@ -1855,6 +1855,12 @@ static int ag71xx_probe(struct platform_device *pdev)
>  	if (!ag->mac_base)
>  		return -ENOMEM;
>  
> +	/* ensure that HW is in manual polling mode before interrupts are
> +	 * activated. Otherwise ag71xx_interrupt might call napi_schedule
> +	 * before it is initialized by netif_napi_add.
> +	 */
> +	ag71xx_int_disable(ag, AG71XX_INT_POLL);
> +
>  	ndev->irq = platform_get_irq(pdev, 0);
>  	err = devm_request_irq(&pdev->dev, ndev->irq, ag71xx_interrupt,
>  			       0x0, dev_name(&pdev->dev), ndev);

Rosen Penev Aug. 29, 2024, 5:46 p.m. UTC | #2
On Wed, Aug 28, 2024 at 2:05 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>
>
>
> On 8/28/2024 1:41 PM, Rosen Penev wrote:
> > From: Sven Eckelmann <sven@narfation.org>
> >
> > ag71xx_probe is registering ag71xx_interrupt as handler for gmac0/gmac1
> > interrupts. The handler is trying to use napi_schedule to handle the
> > processing of packets. But the netif_napi_add for this device is
> > called a lot later in ag71xx_probe.
> >
> > It can therefore happen that a still running gmac0/gmac1 is triggering the
> > interrupt handler with a bit from AG71XX_INT_POLL set in
> > AG71XX_REG_INT_STATUS. The handler will then call napi_schedule and the
> > napi code will crash the system because the ag->napi is not yet
> > initialized.
> >
> > The gmac0/gmac1 must be brought into a state in which it doesn't signal
> > AG71XX_INT_POLL related status bits as interrupts before registering the
> > interrupt handler. ag71xx_hw_start will take care of re-initializing the
> > AG71XX_REG_INT_ENABLE.
> >
> > Signed-off-by: Sven Eckelmann <sven@narfation.org>
> > Signed-off-by: Rosen Penev <rosenp@gmail.com>
> > ---
>
> The description reads like a bug fix, so I would expect this to be
> targeted to net and have a Fixes tag indicating what commit introduced
> the issue, maybe:
>
> Fixes: d51b6ce441d3 ("net: ethernet: add ag71xx driver")
>
> The change seems reasonable to me otherwise.

OTOH there are currently no dual GMAC users upstream, only single GMAC ones.

>
> >  drivers/net/ethernet/atheros/ag71xx.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/atheros/ag71xx.c b/drivers/net/ethernet/atheros/ag71xx.c
> > index 0674a042e8d3..435c4b19acdd 100644
> > --- a/drivers/net/ethernet/atheros/ag71xx.c
> > +++ b/drivers/net/ethernet/atheros/ag71xx.c
> > @@ -1855,6 +1855,12 @@ static int ag71xx_probe(struct platform_device *pdev)
> >       if (!ag->mac_base)
> >               return -ENOMEM;
> >
> > +     /* ensure that HW is in manual polling mode before interrupts are
> > +      * activated. Otherwise ag71xx_interrupt might call napi_schedule
> > +      * before it is initialized by netif_napi_add.
> > +      */
> > +     ag71xx_int_disable(ag, AG71XX_INT_POLL);
> > +
> >       ndev->irq = platform_get_irq(pdev, 0);
> >       err = devm_request_irq(&pdev->dev, ndev->irq, ag71xx_interrupt,
> >                              0x0, dev_name(&pdev->dev), ndev);

Keller, Jacob E Aug. 29, 2024, 5:50 p.m. UTC | #3
> -----Original Message-----
> From: Rosen Penev <rosenp@gmail.com>
> Sent: Thursday, August 29, 2024 10:47 AM
> To: Keller, Jacob E <jacob.e.keller@intel.com>
> Cc: netdev@vger.kernel.org; davem@davemloft.net; edumazet@google.com;
> kuba@kernel.org; pabeni@redhat.com; linux@armlinux.org.uk; linux-
> kernel@vger.kernel.org; o.rempel@pengutronix.de; p.zabel@pengutronix.de
> Subject: Re: [PATCH net-next] net: ag71xx: disable napi interrupts during probe
> 
> On Wed, Aug 28, 2024 at 2:05 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
> >
> >
> >
> > On 8/28/2024 1:41 PM, Rosen Penev wrote:
> > > From: Sven Eckelmann <sven@narfation.org>
> > >
> > > ag71xx_probe is registering ag71xx_interrupt as handler for gmac0/gmac1
> > > interrupts. The handler is trying to use napi_schedule to handle the
> > > processing of packets. But the netif_napi_add for this device is
> > > called a lot later in ag71xx_probe.
> > >
> > > It can therefore happen that a still running gmac0/gmac1 is triggering the
> > > interrupt handler with a bit from AG71XX_INT_POLL set in
> > > AG71XX_REG_INT_STATUS. The handler will then call napi_schedule and the
> > > napi code will crash the system because the ag->napi is not yet
> > > initialized.
> > >
> > > The gmac0/gmac1 must be brought into a state in which it doesn't signal
> > > AG71XX_INT_POLL related status bits as interrupts before registering the
> > > interrupt handler. ag71xx_hw_start will take care of re-initializing the
> > > AG71XX_REG_INT_ENABLE.
> > >
> > > Signed-off-by: Sven Eckelmann <sven@narfation.org>
> > > Signed-off-by: Rosen Penev <rosenp@gmail.com>
> > > ---
> >
> > The description reads like a bug fix, so I would expect this to be
> > targeted to net and have a Fixes tag indicating what commit introduced
> > the issue, maybe:
> >
> > Fixes: d51b6ce441d3 ("net: ethernet: add ag71xx driver")
> >
> > The change seems reasonable to me otherwise.
> OTOH there are currently no dual GMAC users upstream. Just single.
> 

If that’s the case, updating the description to make that clear would help.

Patch

diff --git a/drivers/net/ethernet/atheros/ag71xx.c b/drivers/net/ethernet/atheros/ag71xx.c
index 0674a042e8d3..435c4b19acdd 100644
--- a/drivers/net/ethernet/atheros/ag71xx.c
+++ b/drivers/net/ethernet/atheros/ag71xx.c
@@ -1855,6 +1855,12 @@  static int ag71xx_probe(struct platform_device *pdev)
 	if (!ag->mac_base)
 		return -ENOMEM;
 
+	/* ensure that HW is in manual polling mode before interrupts are
+	 * activated. Otherwise ag71xx_interrupt might call napi_schedule
+	 * before it is initialized by netif_napi_add.
+	 */
+	ag71xx_int_disable(ag, AG71XX_INT_POLL);
+
 	ndev->irq = platform_get_irq(pdev, 0);
 	err = devm_request_irq(&pdev->dev, ndev->irq, ag71xx_interrupt,
 			       0x0, dev_name(&pdev->dev), ndev);