[3/5] net: ethernet: mtk_eth_soc: work around issue with sending small fragments

Message ID 20221123095754.36821-3-nbd@nbd.name (mailing list archive)
State New, archived
Series [1/5] net: ethernet: mtk_eth_soc: account for vlan in rx header length

Commit Message

Felix Fietkau Nov. 23, 2022, 9:57 a.m. UTC
When frames are sent with very small fragments, the DMA engine appears to
lock up and transmit attempts time out. Fix this by detecting the presence
of small fragments and use skb_gso_segment + skb_linearize to deal with
them

Signed-off-by: Felix Fietkau <nbd@nbd.name>
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 36 +++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)

Comments

Alexander Lobakin Nov. 24, 2022, 5:54 p.m. UTC | #1
From: Felix Fietkau <nbd@nbd.name>
Date: Wed, 23 Nov 2022 10:57:52 +0100

> When frames are sent with very small fragments, the DMA engine appears to
> lock up and transmit attempts time out. Fix this by detecting the presence
> of small fragments and use skb_gso_segment + skb_linearize to deal with
> them

Nit: none of your commit messages has a trailing dot (.). Not sure
if it's important, but my eye definitely misses it :D

skb_gso_segment() and skb_linearize() are slow as hell. I think you
can do it differently. I guess only the first (head) and the last
frag can be so small, right?

So, if a frag from shinfo->frags is less than 16, get a new frag of
the minimum acceptable size via netdev_alloc_frag(), copy the data
to it and pad the rest with zeroes. Then increase skb->len and
skb->data_len, skb_frag_unref() the current, "invalid" frag and
replace the pointer with the new frag. I don't believe I missed
anything... Zero-padding the tail is a usual thing for NICs. skb frag
substitution is less common, but should be legit.

If skb_headlen() is less than 16, try doing pskb_may_pull() +
__skb_pull() at first. The argument would be `16 - headlen`. If
pskb_may_pull() returns false, then yeah, you have no choice other
than segmenting and linearizing ._.
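
The fragment-replacement idea above can be sketched in plain userspace C.
This is only a model under stated assumptions, not driver code: struct
model_frag and model_pad_frag() are hypothetical names, calloc() stands in
for netdev_alloc_frag(), and free() stands in for skb_frag_unref(). Zero
padding like this would only be safe for the trailing fragment, since
padding a fragment in the middle would change the payload.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_MIN_FRAG_SIZE 16 /* same threshold as min_size in the patch */

/* Hypothetical stand-in for one paged fragment; not the kernel's skb_frag_t. */
struct model_frag {
	unsigned char *data;
	size_t size;
};

/*
 * Replace a too-short fragment with a zero-padded copy of
 * MODEL_MIN_FRAG_SIZE bytes.
 *
 * Returns the number of padding bytes added (0 if the fragment was already
 * large enough), or -1 if the allocation failed. A real caller would add
 * the returned value to both skb->len and skb->data_len.
 */
int model_pad_frag(struct model_frag *frag)
{
	unsigned char *buf;
	size_t pad;

	assert(frag && frag->data);

	if (frag->size >= MODEL_MIN_FRAG_SIZE)
		return 0;

	/* calloc() zeroes the buffer, which provides the tail padding. */
	buf = calloc(1, MODEL_MIN_FRAG_SIZE);
	if (!buf)
		return -1;

	memcpy(buf, frag->data, frag->size);
	pad = MODEL_MIN_FRAG_SIZE - frag->size;

	/* Drop the old, too-small buffer and point at the padded copy. */
	free(frag->data);
	frag->data = buf;
	frag->size = MODEL_MIN_FRAG_SIZE;

	return (int)pad;
}
```

A 4-byte fragment would come back as a 16-byte fragment with 12 zero bytes
appended; a fragment that is already 16 bytes or longer is left untouched.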

> 
> Signed-off-by: Felix Fietkau <nbd@nbd.name>
> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 36 +++++++++++++++++++--
>  1 file changed, 34 insertions(+), 2 deletions(-)

[...]

>  	if (unlikely(atomic_read(&ring->free_count) <= ring->thresh))
>  		netif_tx_stop_all_queues(dev);
> -- 
> 2.38.1

Thanks,
Olek

Felix Fietkau Dec. 27, 2022, 9:55 a.m. UTC | #2
On 24.11.22 18:54, Alexander Lobakin wrote:
> From: Felix Fietkau <nbd@nbd.name>
> Date: Wed, 23 Nov 2022 10:57:52 +0100
> 
>> When frames are sent with very small fragments, the DMA engine appears to
>> lock up and transmit attempts time out. Fix this by detecting the presence
>> of small fragments and use skb_gso_segment + skb_linearize to deal with
>> them
> 
> Nit: none of your commit messages has a trailing dot (.). Not sure
> if it's important, but my eye definitely misses it :D
> 
> skb_gso_segment() and skb_linearize() are slow as hell. I think you
> can do it differently. I guess only the first (head) and the last
> frag can be so small, right?
> 
> So, if a frag from shinfo->frags is less than 16, get a new frag of
> the minimum acceptable size via netdev_alloc_frag(), copy the data
> to it and pad the rest with zeroes. Then increase skb->len and
> skb->data_len, skb_frag_unref() the current, "invalid" frag and
> replace the pointer with the new frag. I don't believe I missed
> anything... Zero-padding the tail is a usual thing for NICs. skb frag
> substitution is less common, but should be legit.
> 
> If skb_headlen() is less than 16, try doing pskb_may_pull() +
> __skb_pull() at first. The argument would be `16 - headlen`. If
> pskb_may_pull() returns false, then yeah, you have no choice other
> than segmenting and linearizing ._.

I looked into this some more and spoke with people at MTK. It appears
that, in principle, the DMA engine is able to process very small
fragments. However, when it is flooded with them, a FIFO can overflow,
which causes the hang that I was observing.

I think your suggestion likely would not fix the issue completely. An
MTK engineer also confirmed that my approach is the correct one for
handling this.

I will send v2 with an updated description.

Thanks,

- Felix
Patch

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 4c9972a94451..e63d2c034ca3 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1441,12 +1441,28 @@  static void mtk_wake_queue(struct mtk_eth *eth)
 	}
 }
 
+static bool mtk_skb_has_small_frag(struct sk_buff *skb)
+{
+	int min_size = 16;
+	int i;
+
+	if (skb_headlen(skb) < min_size)
+		return true;
+
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
+		if (skb_frag_size(&skb_shinfo(skb)->frags[i]) < min_size)
+			return true;
+
+	return false;
+}
+
 static netdev_tx_t mtk_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct mtk_mac *mac = netdev_priv(dev);
 	struct mtk_eth *eth = mac->hw;
 	struct mtk_tx_ring *ring = &eth->tx_ring;
 	struct net_device_stats *stats = &dev->stats;
+	struct sk_buff *segs, *next;
 	bool gso = false;
 	int tx_num;
 
@@ -1468,6 +1484,17 @@  static netdev_tx_t mtk_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		return NETDEV_TX_BUSY;
 	}
 
+	if (skb_is_gso(skb) && mtk_skb_has_small_frag(skb)) {
+		segs = skb_gso_segment(skb, dev->features & ~NETIF_F_ALL_TSO);
+		if (IS_ERR(segs))
+			goto drop;
+
+		if (segs) {
+			consume_skb(skb);
+			skb = segs;
+		}
+	}
+
 	/* TSO: fill MSS info in tcp checksum field */
 	if (skb_is_gso(skb)) {
 		if (skb_cow_head(skb, 0)) {
@@ -1483,8 +1510,13 @@  static netdev_tx_t mtk_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		}
 	}
 
-	if (mtk_tx_map(skb, dev, tx_num, ring, gso) < 0)
-		goto drop;
+	skb_list_walk_safe(skb, skb, next) {
+		if ((mtk_skb_has_small_frag(skb) && skb_linearize(skb)) ||
+		    mtk_tx_map(skb, dev, tx_num, ring, gso) < 0) {
+			stats->tx_dropped++;
+			dev_kfree_skb_any(skb);
+		}
+	}
 
 	if (unlikely(atomic_read(&ring->free_count) <= ring->thresh))
 		netif_tx_stop_all_queues(dev);
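
For readers without a kernel tree at hand, the check and the linearize
fallback from the patch above can be modelled in userspace C. This is a
rough sketch, not the driver code: struct model_skb, model_has_small_frag()
and model_linearize() are hypothetical names that track only sizes, whereas
the real skb_linearize() copies the paged data into the linear head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_FRAGS 17 /* arbitrary bound chosen for this model */
#define MODEL_MIN_SIZE  16 /* same threshold as min_size in the patch */

/* Hypothetical size-only stand-in for an sk_buff; not a kernel structure. */
struct model_skb {
	size_t headlen;                     /* bytes in the linear head */
	size_t nr_frags;                    /* number of paged fragments */
	size_t frag_size[MODEL_MAX_FRAGS];  /* size of each paged fragment */
};

/* Mirrors the logic of mtk_skb_has_small_frag(): head first, then frags. */
bool model_has_small_frag(const struct model_skb *skb)
{
	size_t i;

	if (skb->headlen < MODEL_MIN_SIZE)
		return true;

	for (i = 0; i < skb->nr_frags; i++)
		if (skb->frag_size[i] < MODEL_MIN_SIZE)
			return true;

	return false;
}

/* Models the effect of linearizing: all paged bytes move into the head. */
void model_linearize(struct model_skb *skb)
{
	size_t i;

	assert(skb->nr_frags <= MODEL_MAX_FRAGS);

	for (i = 0; i < skb->nr_frags; i++) {
		skb->headlen += skb->frag_size[i];
		skb->frag_size[i] = 0;
	}
	skb->nr_frags = 0;
}
```

With a 54-byte head and fragments of 1448 and 8 bytes, the check flags the
8-byte fragment; after linearizing, the model holds a single 1510-byte
head and the check no longer fires.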