diff mbox series

[net,v2] net: ks8851: Fix TX stall caused by TX buffer overrun

Message ID 20231212191632.197656-1-rwahl@gmx.de (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net,v2] net: ks8851: Fix TX stall caused by TX buffer overrun | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1117 this patch: 1117
netdev/cc_maintainers success CCed 5 of 6 maintainers
netdev/build_clang success Errors and warnings before: 1142 this patch: 1142
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1144 this patch: 1144
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 132 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc fail Errors and warnings before: 1 this patch: 2
netdev/source_inline success Was 0 now: 0

Commit Message

Ronald Wahl Dec. 12, 2023, 7:16 p.m. UTC
From: Ronald Wahl <ronald.wahl@raritan.com>

There is a bug in the ks8851 Ethernet driver that more data is written
to the hardware TX buffer than actually available. This is caused by
wrong accounting of the free TX buffer space.

The driver maintains a tx_space variable that represents the TX buffer
space that is deemed to be free. The ks8851_start_xmit_spi() function
adds an SKB to a queue if tx_space is large enough and reduces tx_space
by the amount of buffer space it will later need in the TX buffer and
then schedules a work item. If there is not enough space then the TX
queue is stopped.

The worker function ks8851_tx_work() dequeues all the SKBs and writes
the data into the hardware TX buffer. The last packet will trigger an
interrupt after it was send. Here it is assumed that all data fits into
the TX buffer.

In the interrupt routine (which runs asynchronously because it is a
threaded interrupt) tx_space is updated with the current value from the
hardware. Also the TX queue is woken up again.

Now it could happen that after data was sent to the hardware and before
handling the TX interrupt new data is queued in ks8851_start_xmit_spi()
when the TX buffer space had still some space left. When the interrupt
is actually handled tx_space is updated from the hardware but now we
already have new SKBs queued that have not been written to the hardware
TX buffer yet. Since tx_space has been overwritten by the value from the
hardware the space is not accounted for.

Now we have more data queued then buffer space available in the hardware
and ks8851_tx_work() will potentially overrun the hardware TX buffer. In
many cases it will still work because often the buffer is written out
fast enough so that no overrun occurs but for example if the peer
throttles us via flow control then an overrun may happen.

This can be fixed in different ways. The most simple way would be to set
tx_space to 0 before writing data to the hardware TX buffer preventing
the queuing of more SKBs until the TX interrupt has been handled. I have
chosen a slightly more efficient (and still rather simple) way and
track the amount of data that is already queued and not yet written to
the hardware. When new SKBs are to be queued the already queued amount
of data is honoured when checking free TX buffer space.

I tested this with a setup of two linked KS8851 running iperf3 between
the two in bidirectional mode. Before the fix I got a stall after some
minutes. With the fix I saw now issues anymore after hours.

Fixes: 3ba81f3ece3c ("net: Micrel KS8851 SPI network driver")
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Ben Dooks <ben.dooks@codethink.co.uk>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com>
---
V2: - Added Fixes: tag (issue actually present from the beginning)
    - cosmetics reported by checkpatch

 drivers/net/ethernet/micrel/ks8851.h        |  1 +
 drivers/net/ethernet/micrel/ks8851_common.c | 20 +++++-----
 drivers/net/ethernet/micrel/ks8851_spi.c    | 41 +++++++++++++--------
 3 files changed, 37 insertions(+), 25 deletions(-)

--
2.43.0

Comments

Jakub Kicinski Dec. 13, 2023, 12:14 a.m. UTC | #1
On Tue, 12 Dec 2023 20:16:32 +0100 Ronald Wahl wrote:
>  	struct work_struct	rxctrl_work;
> 
>  	struct sk_buff_head	txq;
> +	unsigned int		queued_len;

This new field is missing a kdoc
Simon Horman Dec. 13, 2023, 4:29 p.m. UTC | #2
On Tue, Dec 12, 2023 at 08:16:32PM +0100, Ronald Wahl wrote:
> From: Ronald Wahl <ronald.wahl@raritan.com>
> 
> There is a bug in the ks8851 Ethernet driver that more data is written
> to the hardware TX buffer than actually available. This is caused by
> wrong accounting of the free TX buffer space.
> 
> The driver maintains a tx_space variable that represents the TX buffer
> space that is deemed to be free. The ks8851_start_xmit_spi() function
> adds an SKB to a queue if tx_space is large enough and reduces tx_space
> by the amount of buffer space it will later need in the TX buffer and
> then schedules a work item. If there is not enough space then the TX
> queue is stopped.
> 
> The worker function ks8851_tx_work() dequeues all the SKBs and writes
> the data into the hardware TX buffer. The last packet will trigger an
> interrupt after it was send. Here it is assumed that all data fits into
> the TX buffer.
> 
> In the interrupt routine (which runs asynchronously because it is a
> threaded interrupt) tx_space is updated with the current value from the
> hardware. Also the TX queue is woken up again.
> 
> Now it could happen that after data was sent to the hardware and before
> handling the TX interrupt new data is queued in ks8851_start_xmit_spi()
> when the TX buffer space had still some space left. When the interrupt
> is actually handled tx_space is updated from the hardware but now we
> already have new SKBs queued that have not been written to the hardware
> TX buffer yet. Since tx_space has been overwritten by the value from the
> hardware the space is not accounted for.
> 
> Now we have more data queued then buffer space available in the hardware
> and ks8851_tx_work() will potentially overrun the hardware TX buffer. In
> many cases it will still work because often the buffer is written out
> fast enough so that no overrun occurs but for example if the peer
> throttles us via flow control then an overrun may happen.
> 
> This can be fixed in different ways. The most simple way would be to set
> tx_space to 0 before writing data to the hardware TX buffer preventing
> the queuing of more SKBs until the TX interrupt has been handled. I have
> chosen a slightly more efficient (and still rather simple) way and
> track the amount of data that is already queued and not yet written to
> the hardware. When new SKBs are to be queued the already queued amount
> of data is honoured when checking free TX buffer space.
> 
> I tested this with a setup of two linked KS8851 running iperf3 between
> the two in bidirectional mode. Before the fix I got a stall after some
> minutes. With the fix I saw now issues anymore after hours.
> 
> Fixes: 3ba81f3ece3c ("net: Micrel KS8851 SPI network driver")
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: Ben Dooks <ben.dooks@codethink.co.uk>
> Cc: Tristram Ha <Tristram.Ha@microchip.com>
> Cc: netdev@vger.kernel.org
> Cc: stable@vger.kernel.org # 5.10+
> Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com>

...

> diff --git a/drivers/net/ethernet/micrel/ks8851_spi.c b/drivers/net/ethernet/micrel/ks8851_spi.c

...

> @@ -310,6 +322,8 @@ static void ks8851_tx_work(struct work_struct *work)
>  	unsigned long flags;
>  	struct sk_buff *txb;
>  	bool last;
> +	unsigned short tx_space;
> +	unsigned int dequeued_len = 0;

Hi Ronald,

Please consider using reverse xmas tree - longest line to shortest -
for local variable declarations in Networking code.

> 
>  	kss = container_of(work, struct ks8851_net_spi, tx_work);
>  	ks = &kss->ks8851;
> @@ -320,6 +334,7 @@ static void ks8851_tx_work(struct work_struct *work)
>  	while (!last) {
>  		txb = skb_dequeue(&ks->txq);
>  		last = skb_queue_empty(&ks->txq);
> +		dequeued_len += calc_txlen(txb->len);

On the line below it is assumed that txb may be NULL.
But on the line above it is dereferenced unconditionally.
This seems inconsistent.

Flagged by Smatch.

> 
>  		if (txb) {
>  			ks8851_wrreg16_spi(ks, KS_RXQCR,

...
diff mbox series

Patch

diff --git a/drivers/net/ethernet/micrel/ks8851.h b/drivers/net/ethernet/micrel/ks8851.h
index fecd43754cea..ce7e524f2542 100644
--- a/drivers/net/ethernet/micrel/ks8851.h
+++ b/drivers/net/ethernet/micrel/ks8851.h
@@ -399,6 +399,7 @@  struct ks8851_net {
 	struct work_struct	rxctrl_work;

 	struct sk_buff_head	txq;
+	unsigned int		queued_len;

 	struct eeprom_93cx6	eeprom;
 	struct regulator	*vdd_reg;
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
index cfbc900d4aeb..0bf13b38b8f5 100644
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -362,16 +362,18 @@  static irqreturn_t ks8851_irq(int irq, void *_ks)
 		handled |= IRQ_RXPSI;

 	if (status & IRQ_TXI) {
-		handled |= IRQ_TXI;
+		unsigned short tx_space = ks8851_rdreg16(ks, KS_TXMIR);

-		/* no lock here, tx queue should have been stopped */
+		netif_dbg(ks, intr, ks->netdev,
+			  "%s: txspace %d\n", __func__, tx_space);

-		/* update our idea of how much tx space is available to the
-		 * system */
-		ks->tx_space = ks8851_rdreg16(ks, KS_TXMIR);
+		spin_lock(&ks->statelock);
+		ks->tx_space = tx_space;
+		if (netif_queue_stopped(ks->netdev))
+			netif_wake_queue(ks->netdev);
+		spin_unlock(&ks->statelock);

-		netif_dbg(ks, intr, ks->netdev,
-			  "%s: txspace %d\n", __func__, ks->tx_space);
+		handled |= IRQ_TXI;
 	}

 	if (status & IRQ_RXI)
@@ -414,9 +416,6 @@  static irqreturn_t ks8851_irq(int irq, void *_ks)
 	if (status & IRQ_LCI)
 		mii_check_link(&ks->mii);

-	if (status & IRQ_TXI)
-		netif_wake_queue(ks->netdev);
-
 	return IRQ_HANDLED;
 }

@@ -500,6 +499,7 @@  static int ks8851_net_open(struct net_device *dev)
 	ks8851_wrreg16(ks, KS_ISR, ks->rc_ier);
 	ks8851_wrreg16(ks, KS_IER, ks->rc_ier);

+	ks->queued_len = 0;
 	netif_start_queue(ks->netdev);

 	netif_dbg(ks, ifup, ks->netdev, "network device up\n");
diff --git a/drivers/net/ethernet/micrel/ks8851_spi.c b/drivers/net/ethernet/micrel/ks8851_spi.c
index 70bc7253454f..eb089b3120bc 100644
--- a/drivers/net/ethernet/micrel/ks8851_spi.c
+++ b/drivers/net/ethernet/micrel/ks8851_spi.c
@@ -286,6 +286,18 @@  static void ks8851_wrfifo_spi(struct ks8851_net *ks, struct sk_buff *txp,
 		netdev_err(ks->netdev, "%s: spi_sync() failed\n", __func__);
 }

+/**
+ * calc_txlen - calculate size of message to send packet
+ * @len: Length of data
+ *
+ * Returns the size of the TXFIFO message needed to send
+ * this packet.
+ */
+static unsigned int calc_txlen(unsigned int len)
+{
+	return ALIGN(len + 4, 4);
+}
+
 /**
  * ks8851_rx_skb_spi - receive skbuff
  * @ks: The device state
@@ -310,6 +322,8 @@  static void ks8851_tx_work(struct work_struct *work)
 	unsigned long flags;
 	struct sk_buff *txb;
 	bool last;
+	unsigned short tx_space;
+	unsigned int dequeued_len = 0;

 	kss = container_of(work, struct ks8851_net_spi, tx_work);
 	ks = &kss->ks8851;
@@ -320,6 +334,7 @@  static void ks8851_tx_work(struct work_struct *work)
 	while (!last) {
 		txb = skb_dequeue(&ks->txq);
 		last = skb_queue_empty(&ks->txq);
+		dequeued_len += calc_txlen(txb->len);

 		if (txb) {
 			ks8851_wrreg16_spi(ks, KS_RXQCR,
@@ -332,6 +347,13 @@  static void ks8851_tx_work(struct work_struct *work)
 		}
 	}

+	tx_space = ks8851_rdreg16_spi(ks, KS_TXMIR);
+
+	spin_lock(&ks->statelock);
+	ks->queued_len -= dequeued_len;
+	ks->tx_space = tx_space;
+	spin_unlock(&ks->statelock);
+
 	ks8851_unlock_spi(ks, &flags);
 }

@@ -346,18 +368,6 @@  static void ks8851_flush_tx_work_spi(struct ks8851_net *ks)
 	flush_work(&kss->tx_work);
 }

-/**
- * calc_txlen - calculate size of message to send packet
- * @len: Length of data
- *
- * Returns the size of the TXFIFO message needed to send
- * this packet.
- */
-static unsigned int calc_txlen(unsigned int len)
-{
-	return ALIGN(len + 4, 4);
-}
-
 /**
  * ks8851_start_xmit_spi - transmit packet using SPI
  * @skb: The buffer to transmit
@@ -386,16 +396,17 @@  static netdev_tx_t ks8851_start_xmit_spi(struct sk_buff *skb,

 	spin_lock(&ks->statelock);

-	if (needed > ks->tx_space) {
+	if (ks->queued_len + needed > ks->tx_space) {
 		netif_stop_queue(dev);
 		ret = NETDEV_TX_BUSY;
 	} else {
-		ks->tx_space -= needed;
+		ks->queued_len += needed;
 		skb_queue_tail(&ks->txq, skb);
 	}

 	spin_unlock(&ks->statelock);
-	schedule_work(&kss->tx_work);
+	if (ret == NETDEV_TX_OK)
+		schedule_work(&kss->tx_work);

 	return ret;
 }