Message ID | 20240430011518.110416-1-marex@denx.de (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [net] net: ks8851: Queue RX packets in IRQ handler instead of disabling BHs |
On 30.04.24 03:14, Marek Vasut wrote:
> Currently the driver uses local_bh_disable()/local_bh_enable() in its
> IRQ handler to avoid triggering net_rx_action() softirq on exit from
> netif_rx(). The net_rx_action() could trigger this driver .start_xmit
> callback, which is protected by the same lock as the IRQ handler, so
> calling the .start_xmit from netif_rx() from the IRQ handler critical
> section protected by the lock could lead to an attempt to claim the
> already claimed lock, and a hang.
>
> The local_bh_disable()/local_bh_enable() approach works only in case
> the IRQ handler is protected by a spinlock, but does not work if the
> IRQ handler is protected by mutex, i.e. this works for KS8851 with
> Parallel bus interface, but not for KS8851 with SPI bus interface.
>
> Remove the BH manipulation and instead of calling netif_rx() inside
> the IRQ handler code protected by the lock, queue all the received
> SKBs in the IRQ handler into a queue first, and once the IRQ handler
> exits the critical section protected by the lock, dequeue all the
> queued SKBs and push them all into netif_rx(). At this point, it is
> safe to trigger the net_rx_action() softirq, since the netif_rx()
> call is outside of the lock that protects the IRQ handler.
>
> Fixes: be0384bf599c ("net: ks8851: Handle softirqs at the end of IRQ thread to fix hang")
> Signed-off-by: Marek Vasut <marex@denx.de>
> ---
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: Ronald Wahl <ronald.wahl@raritan.com>
> Cc: Simon Horman <horms@kernel.org>
> Cc: netdev@vger.kernel.org

To me the code looks good. An iperf3 test shows that it now has even
slightly more throughput in my setup (two interconnected ks8851-spi).
Thanks for this fix!

Tested-by: Ronald Wahl <ronald.wahl@raritan.com>

> ---
> Note: This is basically what Jakub originally suggested in
> https://patchwork.kernel.org/project/netdevbpf/patch/20240331142353.93792-2-marex@denx.de/#25785606
> ---
>  drivers/net/ethernet/micrel/ks8851.h        | 1 +
>  drivers/net/ethernet/micrel/ks8851_common.c | 8 ++++----
>  2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/micrel/ks8851.h b/drivers/net/ethernet/micrel/ks8851.h
> index 31f75b4a67fd7..f311074ea13bc 100644
> --- a/drivers/net/ethernet/micrel/ks8851.h
> +++ b/drivers/net/ethernet/micrel/ks8851.h
> @@ -399,6 +399,7 @@ struct ks8851_net {
>
>  	struct work_struct	rxctrl_work;
>
> +	struct sk_buff_head	rxq;
>  	struct sk_buff_head	txq;
>  	unsigned int		queued_len;
>
> diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
> index d4cdf3d4f5525..f7b48e596631f 100644
> --- a/drivers/net/ethernet/micrel/ks8851_common.c
> +++ b/drivers/net/ethernet/micrel/ks8851_common.c
> @@ -299,7 +299,7 @@ static void ks8851_rx_pkts(struct ks8851_net *ks)
>  					ks8851_dbg_dumpkkt(ks, rxpkt);
>
>  				skb->protocol = eth_type_trans(skb, ks->netdev);
> -				__netif_rx(skb);
> +				skb_queue_tail(&ks->rxq, skb);
>
>  				ks->netdev->stats.rx_packets++;
>  				ks->netdev->stats.rx_bytes += rxlen;
> @@ -330,8 +330,6 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
>  	unsigned long flags;
>  	unsigned int status;
>
> -	local_bh_disable();
> -
>  	ks8851_lock(ks, &flags);
>
>  	status = ks8851_rdreg16(ks, KS_ISR);
> @@ -408,7 +406,8 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
>  	if (status & IRQ_LCI)
>  		mii_check_link(&ks->mii);
>
> -	local_bh_enable();
> +	while (!skb_queue_empty(&ks->rxq))
> +		netif_rx(skb_dequeue(&ks->rxq));
>
>  	return IRQ_HANDLED;
>  }
> @@ -1189,6 +1188,7 @@ int ks8851_probe_common(struct net_device *netdev, struct device *dev,
>  						NETIF_MSG_PROBE |
>  						NETIF_MSG_LINK);
>
> +	skb_queue_head_init(&ks->rxq);
>  	skb_queue_head_init(&ks->txq);
>
>  	netdev->ethtool_ops = &ks8851_ethtool_ops;
On 4/30/24 10:24 AM, Ronald Wahl wrote:
> On 30.04.24 03:14, Marek Vasut wrote:
>> Currently the driver uses local_bh_disable()/local_bh_enable() in its
>> IRQ handler to avoid triggering net_rx_action() softirq on exit from
>> netif_rx(). The net_rx_action() could trigger this driver .start_xmit
>> callback, which is protected by the same lock as the IRQ handler, so
>> calling the .start_xmit from netif_rx() from the IRQ handler critical
>> section protected by the lock could lead to an attempt to claim the
>> already claimed lock, and a hang.
>>
>> The local_bh_disable()/local_bh_enable() approach works only in case
>> the IRQ handler is protected by a spinlock, but does not work if the
>> IRQ handler is protected by mutex, i.e. this works for KS8851 with
>> Parallel bus interface, but not for KS8851 with SPI bus interface.
>>
>> Remove the BH manipulation and instead of calling netif_rx() inside
>> the IRQ handler code protected by the lock, queue all the received
>> SKBs in the IRQ handler into a queue first, and once the IRQ handler
>> exits the critical section protected by the lock, dequeue all the
>> queued SKBs and push them all into netif_rx(). At this point, it is
>> safe to trigger the net_rx_action() softirq, since the netif_rx()
>> call is outside of the lock that protects the IRQ handler.
>>
>> Fixes: be0384bf599c ("net: ks8851: Handle softirqs at the end of IRQ
>> thread to fix hang")
>> Signed-off-by: Marek Vasut <marex@denx.de>
>> ---
>> Cc: "David S. Miller" <davem@davemloft.net>
>> Cc: Eric Dumazet <edumazet@google.com>
>> Cc: Jakub Kicinski <kuba@kernel.org>
>> Cc: Paolo Abeni <pabeni@redhat.com>
>> Cc: Ronald Wahl <ronald.wahl@raritan.com>
>> Cc: Simon Horman <horms@kernel.org>
>> Cc: netdev@vger.kernel.org
>
> To me the code looks good. An iperf3 test shows that it now has even
> slightly more throughput in my setup (two interconnected ks8851-spi).
> Thanks for this fix!
>
> Tested-by: Ronald Wahl <ronald.wahl@raritan.com>

That's nice.
Thank you for testing. Sorry for the breakage.
On Tue, Apr 30, 2024 at 3:15 AM Marek Vasut <marex@denx.de> wrote:
>
> Currently the driver uses local_bh_disable()/local_bh_enable() in its
> IRQ handler to avoid triggering net_rx_action() softirq on exit from
> netif_rx(). The net_rx_action() could trigger this driver .start_xmit
> callback, which is protected by the same lock as the IRQ handler, so
> calling the .start_xmit from netif_rx() from the IRQ handler critical
> section protected by the lock could lead to an attempt to claim the
> already claimed lock, and a hang.
>
> The local_bh_disable()/local_bh_enable() approach works only in case
> the IRQ handler is protected by a spinlock, but does not work if the
> IRQ handler is protected by mutex, i.e. this works for KS8851 with
> Parallel bus interface, but not for KS8851 with SPI bus interface.
>
> Remove the BH manipulation and instead of calling netif_rx() inside
> the IRQ handler code protected by the lock, queue all the received
> SKBs in the IRQ handler into a queue first, and once the IRQ handler
> exits the critical section protected by the lock, dequeue all the
> queued SKBs and push them all into netif_rx(). At this point, it is
> safe to trigger the net_rx_action() softirq, since the netif_rx()
> call is outside of the lock that protects the IRQ handler.
>
> Fixes: be0384bf599c ("net: ks8851: Handle softirqs at the end of IRQ thread to fix hang")
> Signed-off-by: Marek Vasut <marex@denx.de>
> ---
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: Ronald Wahl <ronald.wahl@raritan.com>
> Cc: Simon Horman <horms@kernel.org>
> Cc: netdev@vger.kernel.org
> ---
> Note: This is basically what Jakub originally suggested in
> https://patchwork.kernel.org/project/netdevbpf/patch/20240331142353.93792-2-marex@denx.de/#25785606
> ---
>  drivers/net/ethernet/micrel/ks8851.h        | 1 +
>  drivers/net/ethernet/micrel/ks8851_common.c | 8 ++++----
>  2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/micrel/ks8851.h b/drivers/net/ethernet/micrel/ks8851.h
> index 31f75b4a67fd7..f311074ea13bc 100644
> --- a/drivers/net/ethernet/micrel/ks8851.h
> +++ b/drivers/net/ethernet/micrel/ks8851.h
> @@ -399,6 +399,7 @@ struct ks8851_net {
>
>  	struct work_struct	rxctrl_work;
>
> +	struct sk_buff_head	rxq;

This is a private queue, so you can avoid the locking overhead for it.

>  	struct sk_buff_head	txq;
>  	unsigned int		queued_len;
>
> diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
> index d4cdf3d4f5525..f7b48e596631f 100644
> --- a/drivers/net/ethernet/micrel/ks8851_common.c
> +++ b/drivers/net/ethernet/micrel/ks8851_common.c
> @@ -299,7 +299,7 @@ static void ks8851_rx_pkts(struct ks8851_net *ks)
>  					ks8851_dbg_dumpkkt(ks, rxpkt);
>
>  				skb->protocol = eth_type_trans(skb, ks->netdev);
> -				__netif_rx(skb);
> +				skb_queue_tail(&ks->rxq, skb);

__skb_queue_tail()

>
>  				ks->netdev->stats.rx_packets++;
>  				ks->netdev->stats.rx_bytes += rxlen;
> @@ -330,8 +330,6 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
>  	unsigned long flags;
>  	unsigned int status;
>
> -	local_bh_disable();
> -
>  	ks8851_lock(ks, &flags);
>
>  	status = ks8851_rdreg16(ks, KS_ISR);
> @@ -408,7 +406,8 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
>  	if (status & IRQ_LCI)
>  		mii_check_link(&ks->mii);
>
> -	local_bh_enable();
> +	while (!skb_queue_empty(&ks->rxq))
> +		netif_rx(skb_dequeue(&ks->rxq));

__skb_dequeue()

>
>  	return IRQ_HANDLED;
>  }
> @@ -1189,6 +1188,7 @@ int ks8851_probe_common(struct net_device *netdev, struct device *dev,
>  						NETIF_MSG_PROBE |
>  						NETIF_MSG_LINK);
>
> +	skb_queue_head_init(&ks->rxq);

__skb_queue_head_init(...);

>  	skb_queue_head_init(&ks->txq);
>
>  	netdev->ethtool_ops = &ks8851_ethtool_ops;
> --
> 2.43.0
>
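The review comments above rest on the difference between the locked skb queue helpers and their double-underscore counterparts: in the kernel, skb_queue_tail() and skb_dequeue() take the spinlock embedded in struct sk_buff_head, while __skb_queue_tail() and __skb_dequeue() leave serialization to the caller. The following is only a userspace model of that split, not kernel code. The names struct queue, struct node, queue_tail_nolock() and queue_dequeue_nolock() are made up for illustration, and a pthread mutex stands in for the queue's internal spinlock.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb. */
struct node {
	struct node *next;
	int val;
};

/* Hypothetical stand-in for struct sk_buff_head: a list plus its own lock. */
struct queue {
	pthread_mutex_t lock;
	struct node *head;
	struct node *tail;
	unsigned int qlen;
};

void queue_init(struct queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->head = NULL;
	q->tail = NULL;
	q->qlen = 0;
}

/* Lockless append, in the spirit of __skb_queue_tail(): caller serializes. */
void queue_tail_nolock(struct queue *q, struct node *n)
{
	n->next = NULL;
	if (q->tail)
		q->tail->next = n;
	else
		q->head = n;
	q->tail = n;
	q->qlen++;
}

/* Lockless removal from the head, in the spirit of __skb_dequeue(). */
struct node *queue_dequeue_nolock(struct queue *q)
{
	struct node *n = q->head;

	if (!n)
		return NULL;
	q->head = n->next;
	if (!q->head)
		q->tail = NULL;
	q->qlen--;
	return n;
}

/* Locked wrappers, in the spirit of skb_queue_tail() and skb_dequeue(). */
void queue_tail(struct queue *q, struct node *n)
{
	pthread_mutex_lock(&q->lock);
	queue_tail_nolock(q, n);
	pthread_mutex_unlock(&q->lock);
}

struct node *queue_dequeue(struct queue *q)
{
	struct node *n;

	pthread_mutex_lock(&q->lock);
	n = queue_dequeue_nolock(q);
	pthread_mutex_unlock(&q->lock);
	return n;
}
```

When a queue is only ever touched from one execution context, as the reviewer says is the case for rxq, the locked wrappers add a lock/unlock pair per packet without protecting against anything, which is the overhead the lockless variants avoid.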
diff --git a/drivers/net/ethernet/micrel/ks8851.h b/drivers/net/ethernet/micrel/ks8851.h
index 31f75b4a67fd7..f311074ea13bc 100644
--- a/drivers/net/ethernet/micrel/ks8851.h
+++ b/drivers/net/ethernet/micrel/ks8851.h
@@ -399,6 +399,7 @@ struct ks8851_net {
 
 	struct work_struct	rxctrl_work;
 
+	struct sk_buff_head	rxq;
 	struct sk_buff_head	txq;
 	unsigned int		queued_len;
 
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
index d4cdf3d4f5525..f7b48e596631f 100644
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -299,7 +299,7 @@ static void ks8851_rx_pkts(struct ks8851_net *ks)
 					ks8851_dbg_dumpkkt(ks, rxpkt);
 
 				skb->protocol = eth_type_trans(skb, ks->netdev);
-				__netif_rx(skb);
+				skb_queue_tail(&ks->rxq, skb);
 
 				ks->netdev->stats.rx_packets++;
 				ks->netdev->stats.rx_bytes += rxlen;
@@ -330,8 +330,6 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
 	unsigned long flags;
 	unsigned int status;
 
-	local_bh_disable();
-
 	ks8851_lock(ks, &flags);
 
 	status = ks8851_rdreg16(ks, KS_ISR);
@@ -408,7 +406,8 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
 	if (status & IRQ_LCI)
 		mii_check_link(&ks->mii);
 
-	local_bh_enable();
+	while (!skb_queue_empty(&ks->rxq))
+		netif_rx(skb_dequeue(&ks->rxq));
 
 	return IRQ_HANDLED;
 }
@@ -1189,6 +1188,7 @@ int ks8851_probe_common(struct net_device *netdev, struct device *dev,
 						NETIF_MSG_PROBE |
 						NETIF_MSG_LINK);
 
+	skb_queue_head_init(&ks->rxq);
 	skb_queue_head_init(&ks->txq);
 
 	netdev->ethtool_ops = &ks8851_ethtool_ops;
Currently the driver uses local_bh_disable()/local_bh_enable() in its
IRQ handler to avoid triggering net_rx_action() softirq on exit from
netif_rx(). The net_rx_action() could trigger this driver .start_xmit
callback, which is protected by the same lock as the IRQ handler, so
calling the .start_xmit from netif_rx() from the IRQ handler critical
section protected by the lock could lead to an attempt to claim the
already claimed lock, and a hang.

The local_bh_disable()/local_bh_enable() approach works only in case
the IRQ handler is protected by a spinlock, but does not work if the
IRQ handler is protected by mutex, i.e. this works for KS8851 with
Parallel bus interface, but not for KS8851 with SPI bus interface.

Remove the BH manipulation and instead of calling netif_rx() inside
the IRQ handler code protected by the lock, queue all the received
SKBs in the IRQ handler into a queue first, and once the IRQ handler
exits the critical section protected by the lock, dequeue all the
queued SKBs and push them all into netif_rx(). At this point, it is
safe to trigger the net_rx_action() softirq, since the netif_rx()
call is outside of the lock that protects the IRQ handler.

Fixes: be0384bf599c ("net: ks8851: Handle softirqs at the end of IRQ thread to fix hang")
Signed-off-by: Marek Vasut <marex@denx.de>
---
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Ronald Wahl <ronald.wahl@raritan.com>
Cc: Simon Horman <horms@kernel.org>
Cc: netdev@vger.kernel.org
---
Note: This is basically what Jakub originally suggested in
https://patchwork.kernel.org/project/netdevbpf/patch/20240331142353.93792-2-marex@denx.de/#25785606
---
 drivers/net/ethernet/micrel/ks8851.h        | 1 +
 drivers/net/ethernet/micrel/ks8851_common.c | 8 ++++----
 2 files changed, 5 insertions(+), 4 deletions(-)
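The queue-then-deliver pattern described in the commit message can be modelled in userspace. This is a rough sketch under stated assumptions, not the driver code: struct dev, struct pkt, dev_xmit(), deliver() and irq_handler() are all hypothetical names, a pthread mutex plays the role of the lock taken by ks8851_lock(), deliver() stands in for netif_rx(), and dev_xmit() stands in for the .start_xmit callback that needs the same lock.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a received packet. */
struct pkt {
	struct pkt *next;
	int id;
};

/* Hypothetical stand-in for the driver state. */
struct dev {
	pthread_mutex_t lock;	/* models the lock shared by IRQ handler and xmit */
	struct pkt *rx_head;	/* packets collected while the lock is held */
	struct pkt *rx_tail;
	int delivered;
	int xmit_calls;
	int last_id;
};

void dev_init(struct dev *d)
{
	pthread_mutex_init(&d->lock, NULL);
	d->rx_head = NULL;
	d->rx_tail = NULL;
	d->delivered = 0;
	d->xmit_calls = 0;
	d->last_id = -1;
}

/* Models .start_xmit: it needs the same lock as the IRQ handler. */
void dev_xmit(struct dev *d)
{
	pthread_mutex_lock(&d->lock);
	d->xmit_calls++;
	pthread_mutex_unlock(&d->lock);
}

/* Models netif_rx(): handing a packet up may immediately cause a transmit. */
void deliver(struct dev *d, struct pkt *p)
{
	d->delivered++;
	d->last_id = p->id;
	dev_xmit(d);
}

/* Models the reworked IRQ handler: collect under the lock, deliver after. */
void irq_handler(struct dev *d, struct pkt *pkts, int n)
{
	struct pkt *p;
	int i;

	pthread_mutex_lock(&d->lock);
	for (i = 0; i < n; i++) {
		/* Only queue here; calling deliver() now would re-take d->lock. */
		p = &pkts[i];
		p->next = NULL;
		if (d->rx_tail)
			d->rx_tail->next = p;
		else
			d->rx_head = p;
		d->rx_tail = p;
	}
	pthread_mutex_unlock(&d->lock);

	/* The lock is released, so deliver() may safely end up in dev_xmit(). */
	while ((p = d->rx_head) != NULL) {
		d->rx_head = p->next;
		deliver(d, p);
	}
	d->rx_tail = NULL;
}
```

If deliver() were called inside the locked loop instead, dev_xmit() would try to take a mutex that the same thread already holds, which is the self-deadlock the commit message describes for the mutex-protected SPI variant.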