diff mbox series

[v1,net,1/2] tcp: Don't drop head OOB when queuing new OOB.

Message ID 20240820034920.77419-2-takamitz@amazon.co.jp (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series tcp: Fix bugs of OOB handling in TCP. | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 16 this patch: 16
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 2 maintainers not CCed: shuah@kernel.org linux-kselftest@vger.kernel.org
netdev/build_clang success Errors and warnings before: 16 this patch: 16
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 17 this patch: 17
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 118 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 1 this patch: 1
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-08-20--09-00 (tests: 712)

Commit Message

Takamitsu Iwai Aug. 20, 2024, 3:49 a.m. UTC
If OOB is not at the head of recvq, it's not dropped when a new OOB is
queued.

OTOH, as commit f5ea0768a255 ("selftest: af_unix: Add
non-TCP-compliant test cases in msg_oob.c.") points out, TCP drops OOB
data at the head of recvq when queuing a new OOB data subsequently.

This behavior has been introduced in tcp_check_urg() by deleting
preceding skb when next MSG_OOB data is received. This process is
weird OOB handling, and the comment also says this is wrong.

We remove this code block for appropriate OOB handling.

Now TCP works exactly the same way as AF_UNIX, so this patch enables
kernel to pass the test when removing tcp_incompliant braces.

 #  RUN           msg_oob.no_peek.inline_ex_oob_drop ...
 #            OK  msg_oob.no_peek.inline_ex_oob_drop
 ok 18 msg_oob.no_peek.inline_ex_oob_drop
 #  RUN           msg_oob.peek.ex_oob_drop ...
 #            OK  msg_oob.peek.ex_oob_drop
 ok 28 msg_oob.peek.ex_oob_drop
 #  RUN           msg_oob.peek.ex_oob_drop_2 ...
 #            OK  msg_oob.peek.ex_oob_drop_2
 ok 29 msg_oob.peek.ex_oob_drop_2

Fixes tag refers to the commit of Linux-2.6.12-rc2, but this code was
written at v2.4.4 which is older than this version.

This is a long-standing bug since then, and technically the patch
slightly changes uAPI, but RFC 6091, published in 2011, suggests TCP
urgent mechanism should not be used for newer applications in 2011.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Takamitsu Iwai <takamitz@amazon.co.jp>
---
 net/ipv4/tcp_input.c                          | 25 ---------
 tools/testing/selftests/net/af_unix/msg_oob.c | 55 ++++++++-----------
 2 files changed, 22 insertions(+), 58 deletions(-)

Comments

Eric Dumazet Aug. 20, 2024, 1:34 p.m. UTC | #1
On Tue, Aug 20, 2024 at 5:50 AM Takamitsu Iwai <takamitz@amazon.co.jp> wrote:
>
> If OOB is not at the head of recvq, it's not dropped when a new OOB is
> queued.
>
> OTOH, as commit f5ea0768a255 ("selftest: af_unix: Add
> non-TCP-compliant test cases in msg_oob.c.") points out, TCP drops OOB
> data at the head of recvq when queuing a new OOB data subsequently.
>
> This behavior has been introduced in tcp_check_urg() by deleting
> preceding skb when next MSG_OOB data is received. This process is
> weird OOB handling, and the comment also says this is wrong.
>
> We remove this code block for appropriate OOB handling.
>
> Now TCP works exactly the same way as AF_UNIX, so this patch enables
> kernel to pass the test when removing tcp_incompliant braces.
>
>  #  RUN           msg_oob.no_peek.inline_ex_oob_drop ...
>  #            OK  msg_oob.no_peek.inline_ex_oob_drop
>  ok 18 msg_oob.no_peek.inline_ex_oob_drop
>  #  RUN           msg_oob.peek.ex_oob_drop ...
>  #            OK  msg_oob.peek.ex_oob_drop
>  ok 28 msg_oob.peek.ex_oob_drop
>  #  RUN           msg_oob.peek.ex_oob_drop_2 ...
>  #            OK  msg_oob.peek.ex_oob_drop_2
>  ok 29 msg_oob.peek.ex_oob_drop_2
>
> Fixes tag refers to the commit of Linux-2.6.12-rc2, but this code was
> written at v2.4.4 which is older than this version.
>
> This is a long-standing bug since then, and technically the patch
> slightly changes uAPI, but RFC 6091, published in 2011, suggests TCP
> urgent mechanism should not be used for newer applications in 2011.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Takamitsu Iwai <takamitz@amazon.co.jp>
> ---
>  net/ipv4/tcp_input.c                          | 25 ---------
>  tools/testing/selftests/net/af_unix/msg_oob.c | 55 ++++++++-----------
>  2 files changed, 22 insertions(+), 58 deletions(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index e37488d3453f..648d0f3ade78 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -5830,31 +5830,6 @@ static void tcp_check_urg(struct sock *sk, const struct tcphdr *th)
>         /* Tell the world about our new urgent pointer. */
>         sk_send_sigurg(sk);
>
> -       /* We may be adding urgent data when the last byte read was
> -        * urgent. To do this requires some care. We cannot just ignore
> -        * tp->copied_seq since we would read the last urgent byte again
> -        * as data, nor can we alter copied_seq until this data arrives
> -        * or we break the semantics of SIOCATMARK (and thus sockatmark())
> -        *
> -        * NOTE. Double Dutch. Rendering to plain English: author of comment
> -        * above did something sort of  send("A", MSG_OOB); send("B", MSG_OOB);
> -        * and expect that both A and B disappear from stream. This is _wrong_.
> -        * Though this happens in BSD with high probability, this is occasional.
> -        * Any application relying on this is buggy. Note also, that fix "works"
> -        * only in this artificial test. Insert some normal data between A and B and we will
> -        * decline of BSD again. Verdict: it is better to remove to trap
> -        * buggy users.
> -        */
> -       if (tp->urg_seq == tp->copied_seq && tp->urg_data &&
> -           !sock_flag(sk, SOCK_URGINLINE) && tp->copied_seq != tp->rcv_nxt) {
> -               struct sk_buff *skb = skb_peek(&sk->sk_receive_queue);
> -               tp->copied_seq++;
> -               if (skb && !before(tp->copied_seq, TCP_SKB_CB(skb)->end_seq)) {
> -                       __skb_unlink(skb, &sk->sk_receive_queue);
> -                       __kfree_skb(skb);
> -               }
> -       }
> -

My opinion is that we should not touch this code.

No one sane uses OOB and TCP, otherwise this issue would have been
discussed a long time ago.

If anyone is using it, then your change will change the behavior.

Fact that you changed tools/testing/selftests/net/af_unix/msg_oob.c
should speak by itself.

Note that OOB has been added to AF_UNIX recently, it should have
followed TCP behavior.
Honestly we should not have accepted OOB on AF_UNIX, this was a real mistake.
diff mbox series

Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e37488d3453f..648d0f3ade78 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5830,31 +5830,6 @@  static void tcp_check_urg(struct sock *sk, const struct tcphdr *th)
 	/* Tell the world about our new urgent pointer. */
 	sk_send_sigurg(sk);
 
-	/* We may be adding urgent data when the last byte read was
-	 * urgent. To do this requires some care. We cannot just ignore
-	 * tp->copied_seq since we would read the last urgent byte again
-	 * as data, nor can we alter copied_seq until this data arrives
-	 * or we break the semantics of SIOCATMARK (and thus sockatmark())
-	 *
-	 * NOTE. Double Dutch. Rendering to plain English: author of comment
-	 * above did something sort of 	send("A", MSG_OOB); send("B", MSG_OOB);
-	 * and expect that both A and B disappear from stream. This is _wrong_.
-	 * Though this happens in BSD with high probability, this is occasional.
-	 * Any application relying on this is buggy. Note also, that fix "works"
-	 * only in this artificial test. Insert some normal data between A and B and we will
-	 * decline of BSD again. Verdict: it is better to remove to trap
-	 * buggy users.
-	 */
-	if (tp->urg_seq == tp->copied_seq && tp->urg_data &&
-	    !sock_flag(sk, SOCK_URGINLINE) && tp->copied_seq != tp->rcv_nxt) {
-		struct sk_buff *skb = skb_peek(&sk->sk_receive_queue);
-		tp->copied_seq++;
-		if (skb && !before(tp->copied_seq, TCP_SKB_CB(skb)->end_seq)) {
-			__skb_unlink(skb, &sk->sk_receive_queue);
-			__kfree_skb(skb);
-		}
-	}
-
 	WRITE_ONCE(tp->urg_data, TCP_URG_NOTYET);
 	WRITE_ONCE(tp->urg_seq, ptr);
 
diff --git a/tools/testing/selftests/net/af_unix/msg_oob.c b/tools/testing/selftests/net/af_unix/msg_oob.c
index 535eb2c3d7d1..f3435575dfa5 100644
--- a/tools/testing/selftests/net/af_unix/msg_oob.c
+++ b/tools/testing/selftests/net/af_unix/msg_oob.c
@@ -484,20 +484,17 @@  TEST_F(msg_oob, ex_oob_drop)
 	epollpair(true);
 	siocatmarkpair(true);
 
-	sendpair("y", 1, MSG_OOB);		/* TCP drops "x" at this moment. */
+	sendpair("y", 1, MSG_OOB);
 	epollpair(true);
+	siocatmarkpair(false);
 
-	tcp_incompliant {
-		siocatmarkpair(false);
-
-		recvpair("x", 1, 1, 0);		/* TCP drops "y" by passing through it. */
-		epollpair(true);
-		siocatmarkpair(true);
+	recvpair("x", 1, 1, 0);
+	epollpair(true);
+	siocatmarkpair(true);
 
-		recvpair("y", 1, 1, MSG_OOB);	/* TCP returns -EINVAL. */
-		epollpair(false);
-		siocatmarkpair(true);
-	}
+	recvpair("y", 1, 1, MSG_OOB);
+	epollpair(false);
+	siocatmarkpair(true);
 }
 
 TEST_F(msg_oob, ex_oob_drop_2)
@@ -506,23 +503,17 @@  TEST_F(msg_oob, ex_oob_drop_2)
 	epollpair(true);
 	siocatmarkpair(true);
 
-	sendpair("y", 1, MSG_OOB);		/* TCP drops "x" at this moment. */
+	sendpair("y", 1, MSG_OOB);
 	epollpair(true);
-
-	tcp_incompliant {
-		siocatmarkpair(false);
-	}
+	siocatmarkpair(false);
 
 	recvpair("y", 1, 1, MSG_OOB);
 	epollpair(false);
+	siocatmarkpair(false);
 
-	tcp_incompliant {
-		siocatmarkpair(false);
-
-		recvpair("x", 1, 1, 0);		/* TCP returns -EAGAIN. */
-		epollpair(false);
-		siocatmarkpair(true);
-	}
+	recvpair("x", 1, 1, 0);
+	epollpair(false);
+	siocatmarkpair(true);
 }
 
 TEST_F(msg_oob, ex_oob_ahead_break)
@@ -692,22 +683,20 @@  TEST_F(msg_oob, inline_ex_oob_drop)
 	epollpair(true);
 	siocatmarkpair(true);
 
-	sendpair("y", 1, MSG_OOB);		/* TCP drops "x" at this moment. */
+	sendpair("y", 1, MSG_OOB);
 	epollpair(true);
 
 	setinlinepair();
 
-	tcp_incompliant {
-		siocatmarkpair(false);
+	siocatmarkpair(false);
 
-		recvpair("x", 1, 1, 0);		/* TCP recv()s "y". */
-		epollpair(true);
-		siocatmarkpair(true);
+	recvpair("x", 1, 1, 0);
+	epollpair(true);
+	siocatmarkpair(true);
 
-		recvpair("y", 1, 1, 0);		/* TCP returns -EAGAIN. */
-		epollpair(false);
-		siocatmarkpair(false);
-	}
+	recvpair("y", 1, 1, 0);
+	epollpair(false);
+	siocatmarkpair(false);
 }
 
 TEST_F(msg_oob, inline_ex_oob_siocatmark)