Message ID | 20250114-am65-cpsw-fix-tx-irq-free-v1-1-b2069e6ed185@kernel.org (mailing list archive) |
---|---|
State | New |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] net: ethernet: ti: am65-cpsw: fix freeing IRQ in am65_cpsw_nuss_remove_tx_chns() | expand |
On Tue, Jan 14, 2025 at 06:44:02PM +0200, Roger Quadros wrote: > When getting the IRQ we use k3_udma_glue_rx_get_irq() which returns > negative error value on error. So not NULL check is not sufficient > to deteremine if IRQ is valid. Check that IRQ is greater then zero > to ensure it is valid. > > Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") > Signed-off-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org>
On Tue, Jan 14, 2025 at 06:44:02PM +0200, Roger Quadros wrote: Hello Roger, > When getting the IRQ we use k3_udma_glue_rx_get_irq() which returns You probably meant "k3_udma_glue_tx_get_irq()" instead? It is used to assign tx_chn->irq within am65_cpsw_nuss_init_tx_chns() as follows: tx_chn->irq = k3_udma_glue_tx_get_irq(tx_chn->tx_chn); Additionally, following the above section we have: if (tx_chn->irq < 0) { dev_err(dev, "Failed to get tx dma irq %d\n", tx_chn->irq); ret = tx_chn->irq; goto err; } Could you please provide details on the code-path which will lead to a negative "tx_chn->irq" within "am65_cpsw_nuss_remove_tx_chns()"? There seem to be two callers of am65_cpsw_nuss_remove_tx_chns(), namely: 1. am65_cpsw_nuss_update_tx_rx_chns() 2. am65_cpsw_nuss_suspend() Since both of them seem to invoke am65_cpsw_nuss_remove_tx_chns() only in the case where am65_cpsw_nuss_init_tx_chns() *did not* error out, it appears to me that "tx_chn->irq" will never be negative within am65_cpsw_nuss_remove_tx_chns() Please let me know if I have overlooked something. Regards, Siddharth. > negative error value on error. So not NULL check is not sufficient > to deteremine if IRQ is valid. Check that IRQ is greater then zero > to ensure it is valid. > > Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") > Signed-off-by: Roger Quadros <rogerq@kernel.org> > --- > drivers/net/ethernet/ti/am65-cpsw-nuss.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c > index 5465bf872734..e1de45fb18ae 100644 > --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c > +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c > @@ -2248,7 +2248,7 @@ static void am65_cpsw_nuss_remove_tx_chns(struct am65_cpsw_common *common) > for (i = 0; i < common->tx_ch_num; i++) { > struct am65_cpsw_tx_chn *tx_chn = &common->tx_chns[i]; > > - if (tx_chn->irq) > + if (tx_chn->irq > 0) > devm_free_irq(dev, tx_chn->irq, tx_chn); > > netif_napi_del(&tx_chn->napi_tx); > > --- > base-commit: 5bc55a333a2f7316b58edc7573e8e893f7acb532 > change-id: 20250114-am65-cpsw-fix-tx-irq-free-846ac55ee6e1 > > Best regards, > -- > Roger Quadros <rogerq@kernel.org> >
Hi Siddharth, On 15/01/2025 07:18, Siddharth Vadapalli wrote: > On Tue, Jan 14, 2025 at 06:44:02PM +0200, Roger Quadros wrote: > > Hello Roger, > >> When getting the IRQ we use k3_udma_glue_rx_get_irq() which returns > > You probably meant "k3_udma_glue_tx_get_irq()" instead? It is used to > assign tx_chn->irq within am65_cpsw_nuss_init_tx_chns() as follows: Yes I meant tx instead of rx. > > tx_chn->irq = k3_udma_glue_tx_get_irq(tx_chn->tx_chn); > > Additionally, following the above section we have: > > if (tx_chn->irq < 0) { > dev_err(dev, "Failed to get tx dma irq %d\n", > tx_chn->irq); > ret = tx_chn->irq; > goto err; > } > > Could you please provide details on the code-path which will lead to a > negative "tx_chn->irq" within "am65_cpsw_nuss_remove_tx_chns()"? > > There seem to be two callers of am65_cpsw_nuss_remove_tx_chns(), namely: > 1. am65_cpsw_nuss_update_tx_rx_chns() > 2. am65_cpsw_nuss_suspend() > Since both of them seem to invoke am65_cpsw_nuss_remove_tx_chns() only > in the case where am65_cpsw_nuss_init_tx_chns() *did not* error out, it > appears to me that "tx_chn->irq" will never be negative within > am65_cpsw_nuss_remove_tx_chns() > > Please let me know if I have overlooked something. The issue is with am65_cpsw_nuss_update_tx_rx_chns(). It can be called repeatedly (by user changing number of TX queues) even if previous call to am65_cpsw_nuss_init_tx_chns() failed. Please try the below patch to simulate the error condition. Then do the following - bring down all network interfaces - change num TX queues to 2. IRQ for 2nd channel fails. - change num TX queues to 3. Now we try to free an invalid IRQ and we see warning. Also I think it is good practice to code for set value than to code for existing code paths as they can change in the future. --test patch starts-- diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c index 36c29d3db329..c22cadaaf3d3 100644 --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c @@ -155,7 +155,7 @@ NETIF_MSG_IFUP | NETIF_MSG_PROBE | NETIF_MSG_IFDOWN | \ NETIF_MSG_RX_ERR | NETIF_MSG_TX_ERR) -#define AM65_CPSW_DEFAULT_TX_CHNS 8 +#define AM65_CPSW_DEFAULT_TX_CHNS 1 #define AM65_CPSW_DEFAULT_RX_CHN_FLOWS 1 /* CPPI streaming packet interface */ @@ -2346,7 +2348,10 @@ static int am65_cpsw_nuss_init_tx_chns(struct am65_cpsw_common *common) tx_chn->dsize_log2 = __fls(hdesc_size_out); WARN_ON(hdesc_size_out != (1 << tx_chn->dsize_log2)); - tx_chn->irq = k3_udma_glue_tx_get_irq(tx_chn->tx_chn); + if (i == 1) + tx_chn->irq = -ENODEV; + else + tx_chn->irq = k3_udma_glue_tx_get_irq(tx_chn->tx_chn); if (tx_chn->irq < 0) { dev_err(dev, "Failed to get tx dma irq %d\n", tx_chn->irq); -- cheers, -roger
diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c index 5465bf872734..e1de45fb18ae 100644 --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c @@ -2248,7 +2248,7 @@ static void am65_cpsw_nuss_remove_tx_chns(struct am65_cpsw_common *common) for (i = 0; i < common->tx_ch_num; i++) { struct am65_cpsw_tx_chn *tx_chn = &common->tx_chns[i]; - if (tx_chn->irq) + if (tx_chn->irq > 0) devm_free_irq(dev, tx_chn->irq, tx_chn); netif_napi_del(&tx_chn->napi_tx);
When getting the IRQ we use k3_udma_glue_rx_get_irq() which returns negative error value on error. So not NULL check is not sufficient to deteremine if IRQ is valid. Check that IRQ is greater then zero to ensure it is valid. Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Roger Quadros <rogerq@kernel.org> --- drivers/net/ethernet/ti/am65-cpsw-nuss.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- base-commit: 5bc55a333a2f7316b58edc7573e8e893f7acb532 change-id: 20250114-am65-cpsw-fix-tx-irq-free-846ac55ee6e1 Best regards,