diff mbox series

net: remove release/lock_sock in tcp_splice_read

Message ID 20240815084330.166987-1-sunyiqixm@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series net: remove release/lock_sock in tcp_splice_read | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 29 this patch: 29
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers fail 1 blamed authors not CCed: axboe@kernel.dk; 1 maintainers not CCed: axboe@kernel.dk
netdev/build_clang success Errors and warnings before: 29 this patch: 29
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 29 this patch: 29
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 1 this patch: 1
netdev/source_inline success Was 0 now: 0

Commit Message

sunyiqi Aug. 15, 2024, 8:43 a.m. UTC
When entering tcp_splice_read(), lock_sock() is called on sk to
prevent other threads from acquiring the socket, and it is released
before returning.

But in the while (tss.len) loop, it releases and re-locks sk, giving
other threads a small window to grab the socket lock.

As a result, the release_sock()/lock_sock() pair in the while loop of
tcp_splice_read() may cause a race condition.

Fixes: 9c55e01c0cc8 ("[TCP]: Splice receive support.")
Signed-off-by: sunyiqi <sunyiqixm@gmail.com>
---
 net/ipv4/tcp.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Paolo Abeni Aug. 15, 2024, 10 a.m. UTC | #1
On 8/15/24 10:43, sunyiqi wrote:
> When entering tcp_splice_read(), lock_sock() is called on sk to
> prevent other threads from acquiring the socket, and it is released
> before returning.
> 
> But in the while (tss.len) loop, it releases and re-locks sk, giving
> other threads a small window to grab the socket lock.
> 
> As a result, the release_sock()/lock_sock() pair in the while loop of
> tcp_splice_read() may cause a race condition.

Which race condition, exactly? Do you have a backtrace?

> 
> Fixes: 9c55e01c0cc8 ("[TCP]: Splice receive support.")
> Signed-off-by: sunyiqi <sunyiqixm@gmail.com>
> ---
>   net/ipv4/tcp.c | 2 --
>   1 file changed, 2 deletions(-)
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index e03a342c9162..7a2ce0e2e5be 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -856,8 +856,6 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
>   
>   		if (!tss.len || !timeo)
>   			break;
> -		release_sock(sk);
> -		lock_sock(sk);

This is needed to flush the sk backlog.

Somewhat related, I think we could replace the pair with sk_flush_backlog().
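Something along these lines, as a completely untested sketch:

		if (!tss.len || !timeo)
			break;
-		release_sock(sk);
-		lock_sock(sk);
+		sk_flush_backlog(sk);

IIRC sk_flush_backlog() only grabs the socket spinlock and runs
__release_sock() when sk->sk_backlog.tail is set, so it drains the
backlog without a full release_sock()/lock_sock() cycle.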

Thanks,

Paolo
Jason Xing Aug. 15, 2024, 10:48 a.m. UTC | #2
On Thu, Aug 15, 2024 at 5:46 PM sunyiqi <sunyiqixm@gmail.com> wrote:
>
> When entering tcp_splice_read(), lock_sock() is called on sk to
> prevent other threads from acquiring the socket, and it is released
> before returning.
>
> But in the while (tss.len) loop, it releases and re-locks sk, giving
> other threads a small window to grab the socket lock.
>
> As a result, the release_sock()/lock_sock() pair in the while loop of
> tcp_splice_read() may cause a race condition.
>
> Fixes: 9c55e01c0cc8 ("[TCP]: Splice receive support.")
> Signed-off-by: sunyiqi <sunyiqixm@gmail.com>

It's more of an optimization than a bug fix, no?

I don't consider it a bug unless you can prove it... Let me ask:
what kind of race could the release/re-lock cause?

I think holding the socket lock for too long is not a good idea,
because releasing the lock gives others some breathing room (a chance
to finish their own work). Please see release_sock().
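For reference, release_sock() does roughly the following (paraphrased
from net/core/sock.c, the current tree may differ slightly):

void release_sock(struct sock *sk)
{
	spin_lock_bh(&sk->sk_lock.slock);
	if (sk->sk_backlog.tail)
		__release_sock(sk);	/* process packets queued while the lock was owned */

	if (sk->sk_prot->release_cb)
		sk->sk_prot->release_cb(sk);

	sock_release_ownership(sk);
	if (waitqueue_active(&sk->sk_lock.wq))
		wake_up(&sk->sk_lock.wq);	/* wake up other lock_sock() waiters */
	spin_unlock_bh(&sk->sk_lock.slock);
}

So dropping and re-taking the lock in the loop both flushes the backlog
and lets other lock_sock() callers make progress.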

Thanks,
Jason

> ---
>  net/ipv4/tcp.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index e03a342c9162..7a2ce0e2e5be 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -856,8 +856,6 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
>
>                 if (!tss.len || !timeo)
>                         break;
> -               release_sock(sk);
> -               lock_sock(sk);
>
>                 if (sk->sk_err || sk->sk_state == TCP_CLOSE ||
>                     (sk->sk_shutdown & RCV_SHUTDOWN) ||
> --
> 2.34.1
>
>
Jason Xing Aug. 15, 2024, 10:55 a.m. UTC | #3
Hello Paolo,

On Thu, Aug 15, 2024 at 6:40 PM Paolo Abeni <pabeni@redhat.com> wrote:
[...]
> > -             release_sock(sk);
> > -             lock_sock(sk);
>
> This is needed to flush the sk backlog.
>
> Somewhat related, I think we could replace the pair with sk_flush_backlog().
>

Do you think we could do this like the following commit:

commit d41a69f1d390fa3f2546498103cdcd78b30676ff
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Apr 29 14:16:53 2016 -0700

    tcp: make tcp_sendmsg() aware of socket backlog

    Large sendmsg()/write() hold socket lock for the duration of the call,
    unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
    are parked into socket backlog for a long time.

?

Then we can avoid holding the lock for too long, which results in too
many packets sitting in the backlog.
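i.e. something like the following in the tcp_splice_read() loop (not
even compile-tested, just to illustrate the idea):

		if (!tss.len || !timeo)
			break;

		/* Drain packets queued on the backlog while we held the
		 * lock, instead of a full release_sock()/lock_sock()
		 * round trip.
		 */
		sk_flush_backlog(sk);

		if (sk->sk_err || sk->sk_state == TCP_CLOSE ||
		    (sk->sk_shutdown & RCV_SHUTDOWN) ||
		    signal_pending(current))
			break;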

Thanks,
Jason
Paolo Abeni Aug. 15, 2024, 11:22 a.m. UTC | #4
On 8/15/24 12:55, Jason Xing wrote:
> On Thu, Aug 15, 2024 at 6:40 PM Paolo Abeni <pabeni@redhat.com> wrote:
> [...]
>>> -             release_sock(sk);
>>> -             lock_sock(sk);
>>
>> This is needed to flush the sk backlog.
>>
>> Somewhat related, I think we could replace the pair with sk_flush_backlog().
>>
> 
> Do you think we could do this like the following commit:
> 
> commit d41a69f1d390fa3f2546498103cdcd78b30676ff
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Fri Apr 29 14:16:53 2016 -0700
> 
>      tcp: make tcp_sendmsg() aware of socket backlog
> 
>      Large sendmsg()/write() hold socket lock for the duration of the call,
>      unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
>      are parked into socket backlog for a long time.
>
> ?

Yep. To be more accurate I was looking at commit 
93afcfd1db35882921b2521a637c78755c27b02c

In any case this should be unrelated to the supposed issue.

Cheers,

Paolo
Jason Xing Aug. 15, 2024, 11:34 a.m. UTC | #5
On Thu, Aug 15, 2024 at 7:23 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
>
>
> On 8/15/24 12:55, Jason Xing wrote:
> > On Thu, Aug 15, 2024 at 6:40 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > [...]
> >>> -             release_sock(sk);
> >>> -             lock_sock(sk);
> >>
> >> This is needed to flush the sk backlog.
> >>
> >> Somewhat related, I think we could replace the pair with sk_flush_backlog().
> >>
> >
> > Do you think we could do this like the following commit:
> >
> > commit d41a69f1d390fa3f2546498103cdcd78b30676ff
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Fri Apr 29 14:16:53 2016 -0700
> >
> >      tcp: make tcp_sendmsg() aware of socket backlog
> >
> >      Large sendmsg()/write() hold socket lock for the duration of the call,
> >      unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
> >      are parked into socket backlog for a long time.
> >
> > ?
>
> Yep. To be more accurate I was looking at commit
> 93afcfd1db35882921b2521a637c78755c27b02c

Thanks. That piques my interest. Now I do believe we can do such an
optimization in this function.

>
> In any case this should be unrelated to the supposed issue.

Sure.

Thanks,
Jason
diff mbox series

Patch

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e03a342c9162..7a2ce0e2e5be 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -856,8 +856,6 @@  ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
 
 		if (!tss.len || !timeo)
 			break;
-		release_sock(sk);
-		lock_sock(sk);
 
 		if (sk->sk_err || sk->sk_state == TCP_CLOSE ||
 		    (sk->sk_shutdown & RCV_SHUTDOWN) ||