[net] tcp: read multiple skbs in tcp_read_skb()

Message ID 20220912173553.235838-1-xiyou.wangcong@gmail.com (mailing list archive)
State Accepted
Commit db4192a754ebd52300a28abe1a50dd18eae0eb12
Delegated to: Netdev Maintainers
Series [net] tcp: read multiple skbs in tcp_read_skb()

Checks

Context                 Check    Description
netdev/tree_selection   success  Clearly marked for net
netdev/apply            fail     Patch does not apply to net

Commit Message

Cong Wang Sept. 12, 2022, 5:35 p.m. UTC
From: Cong Wang <cong.wang@bytedance.com>

Before we switched to ->read_skb(), ->read_sock() was passed
desc.count=1, which technically indicates that we read only one skb per
->sk_data_ready() call. However, for TCP, this is not true.

TCP at least has sk_rcvlowat, which intentionally holds skbs in the
receive queue until this watermark is reached. This means that when
->sk_data_ready() is invoked there could be multiple skbs in the
queue, so we have to read multiple skbs in tcp_read_skb()
instead of one.

Fixes: 965b57b469a5 ("net: Introduce a new proto_ops ->read_skb()")
Reported-by: Peilin Ye <peilin.ye@bytedance.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
 net/ipv4/tcp.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

Comments

patchwork-bot+netdevbpf@kernel.org Sept. 20, 2022, 1 p.m. UTC | #1
Hello:

This patch was applied to netdev/net.git (master)
by Paolo Abeni <pabeni@redhat.com>:

On Mon, 12 Sep 2022 10:35:53 -0700 you wrote:
> From: Cong Wang <cong.wang@bytedance.com>
> 
> Before we switched to ->read_skb(), ->read_sock() was passed
> desc.count=1, which technically indicates that we read only one skb per
> ->sk_data_ready() call. However, for TCP, this is not true.
> 
> TCP at least has sk_rcvlowat, which intentionally holds skbs in the
> receive queue until this watermark is reached. This means that when
> ->sk_data_ready() is invoked there could be multiple skbs in the
> queue, so we have to read multiple skbs in tcp_read_skb()
> instead of one.
> 
> [...]

Here is the summary with links:
  - [net] tcp: read multiple skbs in tcp_read_skb()
    https://git.kernel.org/netdev/net/c/db4192a754eb

You are awesome, thank you!
Patch

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 3488388eea5d..e373dde1f46f 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1761,19 +1761,28 @@  int tcp_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
 	if (sk->sk_state == TCP_LISTEN)
 		return -ENOTCONN;
 
-	skb = tcp_recv_skb(sk, seq, &offset);
-	if (!skb)
-		return 0;
+	while ((skb = tcp_recv_skb(sk, seq, &offset)) != NULL) {
+		u8 tcp_flags;
+		int used;
 
-	__skb_unlink(skb, &sk->sk_receive_queue);
-	WARN_ON_ONCE(!skb_set_owner_sk_safe(skb, sk));
-	copied = recv_actor(sk, skb);
-	if (copied >= 0) {
-		seq += copied;
-		if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
+		__skb_unlink(skb, &sk->sk_receive_queue);
+		WARN_ON_ONCE(!skb_set_owner_sk_safe(skb, sk));
+		tcp_flags = TCP_SKB_CB(skb)->tcp_flags;
+		used = recv_actor(sk, skb);
+		consume_skb(skb);
+		if (used < 0) {
+			if (!copied)
+				copied = used;
+			break;
+		}
+		seq += used;
+		copied += used;
+
+		if (tcp_flags & TCPHDR_FIN) {
 			++seq;
+			break;
+		}
 	}
-	consume_skb(skb);
 	WRITE_ONCE(tp->copied_seq, seq);
 
 	tcp_rcv_space_adjust(sk);
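
For readers who want to experiment with the control flow outside the kernel, the loop in the hunk above can be approximated by a small user-space model. This is only a sketch under stated assumptions: `model_skb`, `model_queue`, `model_dequeue()`, `model_read_skbs()` and both actors are hypothetical names invented here, not kernel APIs, and returning `copied` is assumed because the return statement lies outside the hunk. It mirrors three behaviours of the patch: every queued skb is offered to the actor, an actor error is reported only if nothing was copied before it, and a FIN consumes one extra sequence number and stops the loop.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_FLAG_FIN 0x01	/* hypothetical stand-in for TCPHDR_FIN */

/* Hypothetical stand-in for an skb: payload length plus TCP flags. */
struct model_skb {
	int len;
	unsigned char flags;
};

/* Hypothetical stand-in for sk_receive_queue: an array with a head index. */
struct model_queue {
	struct model_skb *skbs;
	size_t count;
	size_t head;
};

/* Plays the role of recv_actor: bytes used, or a negative error. */
typedef int (*model_actor_t)(const struct model_skb *skb);

/* Unlink the next entry from the queue, or return NULL when it is empty. */
static struct model_skb *model_dequeue(struct model_queue *q)
{
	if (q->head >= q->count)
		return NULL;
	return &q->skbs[q->head++];
}

/* Model of the patched loop: drain the queue instead of reading one skb. */
int model_read_skbs(struct model_queue *q, unsigned int *seq,
		    model_actor_t actor)
{
	struct model_skb *skb;
	int copied = 0;

	assert(q && seq && actor);

	while ((skb = model_dequeue(q)) != NULL) {
		/* Save the flags first; the kernel frees the skb after the actor. */
		unsigned char flags = skb->flags;
		int used = actor(skb);

		if (used < 0) {
			/* Report the error only if no data was copied earlier. */
			if (!copied)
				copied = used;
			break;
		}
		*seq += (unsigned int)used;
		copied += used;

		if (flags & MODEL_FLAG_FIN) {
			++*seq;	/* FIN occupies one sequence number */
			break;
		}
	}
	return copied;
}

/* Example actor: accepts the whole payload. */
int model_actor_take_all(const struct model_skb *skb)
{
	return skb->len;
}

/* Example actor: rejects payloads above 100 bytes with an arbitrary error. */
int model_actor_reject_large(const struct model_skb *skb)
{
	return skb->len > 100 ? -5 : skb->len;
}
```

In this model, a failing entry is still removed from the queue, which corresponds to the patch calling consume_skb() before checking `used`. With three queued entries of 10, 20 and 30 bytes and no FIN, a single call returns 60, where a one-skb read would have returned 10 and left two entries behind.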