Message ID | 20240502183436.117117-1-marex@denx.de (mailing list archive) |
---|---|
State | Accepted |
Commit | e0863634bf9f7cf36291ebb5bfa2d16632f79c49 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v3] net: ks8851: Queue RX packets in IRQ handler instead of disabling BHs | expand |
On Thu, May 2, 2024 at 8:34 PM Marek Vasut <marex@denx.de> wrote: > > Currently the driver uses local_bh_disable()/local_bh_enable() in its > IRQ handler to avoid triggering net_rx_action() softirq on exit from > netif_rx(). The net_rx_action() could trigger this driver .start_xmit > callback, which is protected by the same lock as the IRQ handler, so > calling the .start_xmit from netif_rx() from the IRQ handler critical > section protected by the lock could lead to an attempt to claim the > already claimed lock, and a hang. > > The local_bh_disable()/local_bh_enable() approach works only in case > the IRQ handler is protected by a spinlock, but does not work if the > IRQ handler is protected by mutex, i.e. this works for KS8851 with > Parallel bus interface, but not for KS8851 with SPI bus interface. > > Remove the BH manipulation and instead of calling netif_rx() inside > the IRQ handler code protected by the lock, queue all the received > SKBs in the IRQ handler into a queue first, and once the IRQ handler > exits the critical section protected by the lock, dequeue all the > queued SKBs and push them all into netif_rx(). At this point, it is > safe to trigger the net_rx_action() softirq, since the netif_rx() > call is outside of the lock that protects the IRQ handler. > > Fixes: be0384bf599c ("net: ks8851: Handle softirqs at the end of IRQ thread to fix hang") > Tested-by: Ronald Wahl <ronald.wahl@raritan.com> # KS8851 SPI > Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Eric Dumazet <edumazet@google.com>
On 5/3/24 9:08 AM, Eric Dumazet wrote: > On Thu, May 2, 2024 at 8:34 PM Marek Vasut <marex@denx.de> wrote: >> >> Currently the driver uses local_bh_disable()/local_bh_enable() in its >> IRQ handler to avoid triggering net_rx_action() softirq on exit from >> netif_rx(). The net_rx_action() could trigger this driver .start_xmit >> callback, which is protected by the same lock as the IRQ handler, so >> calling the .start_xmit from netif_rx() from the IRQ handler critical >> section protected by the lock could lead to an attempt to claim the >> already claimed lock, and a hang. >> >> The local_bh_disable()/local_bh_enable() approach works only in case >> the IRQ handler is protected by a spinlock, but does not work if the >> IRQ handler is protected by mutex, i.e. this works for KS8851 with >> Parallel bus interface, but not for KS8851 with SPI bus interface. >> >> Remove the BH manipulation and instead of calling netif_rx() inside >> the IRQ handler code protected by the lock, queue all the received >> SKBs in the IRQ handler into a queue first, and once the IRQ handler >> exits the critical section protected by the lock, dequeue all the >> queued SKBs and push them all into netif_rx(). At this point, it is >> safe to trigger the net_rx_action() softirq, since the netif_rx() >> call is outside of the lock that protects the IRQ handler. >> >> Fixes: be0384bf599c ("net: ks8851: Handle softirqs at the end of IRQ thread to fix hang") >> Tested-by: Ronald Wahl <ronald.wahl@raritan.com> # KS8851 SPI >> Signed-off-by: Marek Vasut <marex@denx.de> > > Reviewed-by: Eric Dumazet <edumazet@google.com> Thank you and Jakub for your help with this.
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Thu, 2 May 2024 20:32:59 +0200 you wrote: > Currently the driver uses local_bh_disable()/local_bh_enable() in its > IRQ handler to avoid triggering net_rx_action() softirq on exit from > netif_rx(). The net_rx_action() could trigger this driver .start_xmit > callback, which is protected by the same lock as the IRQ handler, so > calling the .start_xmit from netif_rx() from the IRQ handler critical > section protected by the lock could lead to an attempt to claim the > already claimed lock, and a hang. > > [...] Here is the summary with links: - [net,v3] net: ks8851: Queue RX packets in IRQ handler instead of disabling BHs https://git.kernel.org/netdev/net/c/e0863634bf9f You are awesome, thank you!
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c index d4cdf3d4f5525..502518cdb4618 100644 --- a/drivers/net/ethernet/micrel/ks8851_common.c +++ b/drivers/net/ethernet/micrel/ks8851_common.c @@ -234,12 +234,13 @@ static void ks8851_dbg_dumpkkt(struct ks8851_net *ks, u8 *rxpkt) /** * ks8851_rx_pkts - receive packets from the host * @ks: The device information. + * @rxq: Queue of packets received in this function. * * This is called from the IRQ work queue when the system detects that there * are packets in the receive queue. Find out how many packets there are and * read them from the FIFO. */ -static void ks8851_rx_pkts(struct ks8851_net *ks) +static void ks8851_rx_pkts(struct ks8851_net *ks, struct sk_buff_head *rxq) { struct sk_buff *skb; unsigned rxfc; @@ -299,7 +300,7 @@ static void ks8851_rx_pkts(struct ks8851_net *ks) ks8851_dbg_dumpkkt(ks, rxpkt); skb->protocol = eth_type_trans(skb, ks->netdev); - __netif_rx(skb); + __skb_queue_tail(rxq, skb); ks->netdev->stats.rx_packets++; ks->netdev->stats.rx_bytes += rxlen; @@ -326,11 +327,11 @@ static void ks8851_rx_pkts(struct ks8851_net *ks) static irqreturn_t ks8851_irq(int irq, void *_ks) { struct ks8851_net *ks = _ks; + struct sk_buff_head rxq; unsigned handled = 0; unsigned long flags; unsigned int status; - - local_bh_disable(); + struct sk_buff *skb; ks8851_lock(ks, &flags); @@ -384,7 +385,8 @@ static irqreturn_t ks8851_irq(int irq, void *_ks) * from the device so do not bother masking just the RX * from the device. */ - ks8851_rx_pkts(ks); + __skb_queue_head_init(&rxq); + ks8851_rx_pkts(ks, &rxq); } /* if something stopped the rx process, probably due to wanting @@ -408,7 +410,9 @@ static irqreturn_t ks8851_irq(int irq, void *_ks) if (status & IRQ_LCI) mii_check_link(&ks->mii); - local_bh_enable(); + if (status & IRQ_RXI) + while ((skb = __skb_dequeue(&rxq))) + netif_rx(skb); return IRQ_HANDLED; }