diff mbox series

smc: fix refcount bug in sk_psock_get (2)

Message ID 20220709024659.6671-1-yin31149@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series smc: fix refcount bug in sk_psock_get (2)

Checks

Context Check Description
netdev/tree_selection success Guessed tree name to be net-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 6 this patch: 6
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 2 this patch: 2
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 23 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Hawkins Jiawei July 9, 2022, 2:46 a.m. UTC
From: hawk <18801353760@163.com>

Syzkaller reports a refcount bug as follows:
------------[ cut here ]------------
refcount_t: saturated; leaking memory.
WARNING: CPU: 1 PID: 3605 at lib/refcount.c:19 refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19
Modules linked in:
CPU: 1 PID: 3605 Comm: syz-executor208 Not tainted 5.18.0-syzkaller-03023-g7e062cda7d90 #0
...
Call Trace:
 <TASK>
 __refcount_add_not_zero include/linux/refcount.h:163 [inline]
 __refcount_inc_not_zero include/linux/refcount.h:227 [inline]
 refcount_inc_not_zero include/linux/refcount.h:245 [inline]
 sk_psock_get+0x3bc/0x410 include/linux/skmsg.h:439
 tls_data_ready+0x6d/0x1b0 net/tls/tls_sw.c:2091
 tcp_data_ready+0x106/0x520 net/ipv4/tcp_input.c:4983
 tcp_data_queue+0x25f2/0x4c90 net/ipv4/tcp_input.c:5057
 tcp_rcv_state_process+0x1774/0x4e80 net/ipv4/tcp_input.c:6659
 tcp_v4_do_rcv+0x339/0x980 net/ipv4/tcp_ipv4.c:1682
 sk_backlog_rcv include/net/sock.h:1061 [inline]
 __release_sock+0x134/0x3b0 net/core/sock.c:2849
 release_sock+0x54/0x1b0 net/core/sock.c:3404
 inet_shutdown+0x1e0/0x430 net/ipv4/af_inet.c:909
 __sys_shutdown_sock net/socket.c:2331 [inline]
 __sys_shutdown_sock net/socket.c:2325 [inline]
 __sys_shutdown+0xf1/0x1b0 net/socket.c:2343
 __do_sys_shutdown net/socket.c:2351 [inline]
 __se_sys_shutdown net/socket.c:2349 [inline]
 __x64_sys_shutdown+0x50/0x70 net/socket.c:2349
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
 </TASK>

syzbot is trying to set up TLS on an SMC socket.

During the SMC fallback process in the connect syscall, the kernel
sets smc->sk.sk_socket->file->private_data to smc->clcsock
in smc_switch_to_fallback(), and sets smc->clcsock->sk_user_data
to the original smc in smc_fback_replace_callbacks().

When syzbot makes a setsockopt syscall, its sockfd argument
actually points to smc->clcsock, which is not of type smc_sock,
so the syscall does not call smc_setsockopt(). Instead it calls
do_tcp_setsockopt() to set up TLS, which bypasses the fix in
734942cc4ea6, shown below:
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index be3e80b3e27f1..5eff7cccceffc 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -2161,6 +2161,9 @@ static int smc_setsockopt(struct socket *sock,
> 			int level, int optname,
>  	struct smc_sock *smc;
>  	int val, rc;
>
> +	if (level == SOL_TCP && optname == TCP_ULP)
> +		return -EOPNOTSUPP;
> +
>  	smc = smc_sk(sk);
>
>  	/* generic setsockopts reaching us here always apply to the
> @@ -2185,7 +2188,6 @@ static int smc_setsockopt(struct socket *sock,
>			int level, int optname,
>  	if (rc || smc->use_fallback)
>  		goto out;
>  	switch (optname) {
> -	case TCP_ULP:
>  	case TCP_FASTOPEN:
>  	case TCP_FASTOPEN_CONNECT:
>  	case TCP_FASTOPEN_KEY:
> --

Later, sk_psock_get() treats smc->clcsock->sk_user_data
as a struct sk_psock, which triggers the refcnt warning.
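
The confusion can be reproduced in miniature in user space. The
following is only a sketch, not kernel code: every demo_* name is
hypothetical, the struct layouts are invented for illustration, and
a union stands in for "the same memory read as two different types".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel structures look different. */
struct demo_psock { unsigned int refcnt; };
struct demo_smc   { unsigned int state; };

/* One piece of memory that two subsystems interpret differently. */
union demo_private {
	struct demo_psock psock;
	struct demo_smc   smc;
};

struct demo_sock { void *user_data; };

/* SMC-like owner: stores its own object in user_data. */
static void demo_smc_attach(struct demo_sock *sk, union demo_private *priv,
			    unsigned int state)
{
	assert(sk && priv);
	priv->smc.state = state;
	sk->user_data = priv;
}

/* psock-like reader: blindly assumes user_data is a psock and takes a
 * reference. Returns the refcount it believes it holds, or 0 on failure.
 */
static unsigned int demo_psock_get_unchecked(struct demo_sock *sk)
{
	union demo_private *priv = sk->user_data;

	if (!priv || priv->psock.refcnt == 0)
		return 0;
	return ++priv->psock.refcnt;
}
```

After demo_smc_attach(), a call to demo_psock_get_unchecked() bumps
a "refcount" that is really the SMC-like owner's state field, so the
owner's data is silently modified.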

So just disallow this setup in do_tcp_setsockopt()
by checking whether sk_user_data points to an SMC socket.

Reported-and-tested-by: syzbot+5f26f85569bd179c18ce@syzkaller.appspotmail.com
Signed-off-by: hawk <18801353760@163.com>
---
 net/ipv4/tcp.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Jakub Kicinski July 9, 2022, 3:06 a.m. UTC | #1
On Sat,  9 Jul 2022 10:46:59 +0800 Hawkins Jiawei wrote:
> Reported-and-tested-by: syzbot+5f26f85569bd179c18ce@syzkaller.appspotmail.com
> Signed-off-by: hawk <18801353760@163.com>
> ---
>  net/ipv4/tcp.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 9984d23a7f3e..a1e6cab2c748 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -3395,10 +3395,23 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname,
>  	}
>  	case TCP_ULP: {
>  		char name[TCP_ULP_NAME_MAX];
> +		struct sock *smc_sock;
>  
>  		if (optlen < 1)
>  			return -EINVAL;
>  
> +		/* SMC sk_user_data may be treated as psock,
> +		 * which triggers a refcnt warning.
> +		 */
> +		rcu_read_lock();
> +		smc_sock = rcu_dereference_sk_user_data(sk);
> +		if (level == SOL_TCP && smc_sock &&
> +		    smc_sock->__sk_common.skc_family == AF_SMC) {

This should prolly be under the socket lock?

Can we add a bit to SK_USER_DATA_PTRMASK and have ULP-compatible
users (sockmap) opt into ULP cooperation? Modifying TCP is backwards,
layer-wise.

> +			rcu_read_unlock();
> +			return -EOPNOTSUPP;
> +		}
> +		rcu_read_unlock();
> +
Hawkins Jiawei July 9, 2022, 8:36 a.m. UTC | #2
On Sat, 9 Jul 2022 at 11:06, Jakub Kicinski <kuba@kernel.org> wrote:
> On Sat,  9 Jul 2022 10:46:59 +0800 Hawkins Jiawei wrote:
> > Reported-and-tested-by: syzbot+5f26f85569bd179c18ce@syzkaller.appspotmail.com
> > Signed-off-by: hawk <18801353760@163.com>
> > ---
> >  net/ipv4/tcp.c | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index 9984d23a7f3e..a1e6cab2c748 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -3395,10 +3395,23 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname,
> >       }
> >       case TCP_ULP: {
> >               char name[TCP_ULP_NAME_MAX];
> > +             struct sock *smc_sock;
> >
> >               if (optlen < 1)
> >                       return -EINVAL;
> >
> > +             /* SMC sk_user_data may be treated as psock,
> > +              * which triggers a refcnt warning.
> > +              */
> > +             rcu_read_lock();
> > +             smc_sock = rcu_dereference_sk_user_data(sk);
> > +             if (level == SOL_TCP && smc_sock &&
> > +                 smc_sock->__sk_common.skc_family == AF_SMC) {
>
> This should prolly be under the socket lock?
>
> Can we add a bit to SK_USER_DATA_PTRMASK and have ULP-compatible
> users (sockmap) opt into ULP cooperation? Modifying TCP is backwards,
> layer-wise.

Thanks for your suggestion. I also agree that modifying TCP directly
is not wise.

I am sorry that I can't follow you on having ULP-compatible
users (sockmap) opt into ULP cooperation, yet adding a bit to
SK_USER_DATA_PTRMASK seems like a good way.

I plan to add a mask bit and check it in sk_psock_get()
in the v2 patch.
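
A rough user-space sketch of that plan is below. It is not the v2
patch: the DEMO_* flag names, the bit values and the demo_* helpers
are all assumptions made for illustration. The only idea taken from
the thread is that a flag bit kept in the low bits of the
sk_user_data pointer tells the psock getter whether the pointer is
really a psock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; real names and values would be set by the patch. */
#define DEMO_USER_DATA_PSOCK ((uintptr_t)1)
#define DEMO_USER_DATA_FLAGS ((uintptr_t)3)	/* low bits reserved for flags */

struct demo_psock { unsigned int refcnt; };
struct demo_sock  { uintptr_t user_data; };	/* pointer plus flag bits */

/* Store a pointer together with its flags; ptr must be 4-byte aligned. */
static void demo_assign_user_data(struct demo_sock *sk, void *ptr,
				  uintptr_t flags)
{
	assert(((uintptr_t)ptr & DEMO_USER_DATA_FLAGS) == 0);
	assert((flags & ~DEMO_USER_DATA_FLAGS) == 0);
	sk->user_data = (uintptr_t)ptr | flags;
}

/* Return the stored pointer with the flag bits masked off. */
static void *demo_user_data(const struct demo_sock *sk)
{
	return (void *)(sk->user_data & ~DEMO_USER_DATA_FLAGS);
}

/* Take a reference only if the owner marked user_data as a psock. */
static struct demo_psock *demo_psock_get(const struct demo_sock *sk)
{
	struct demo_psock *psock;

	if (!(sk->user_data & DEMO_USER_DATA_PSOCK))
		return NULL;
	psock = demo_user_data(sk);
	if (psock->refcnt == 0)
		return NULL;
	psock->refcnt++;
	return psock;
}
```

With this scheme an owner such as SMC stores its pointer without the
psock flag, so demo_psock_get() returns NULL instead of touching
memory it does not own.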

>
> > +                     rcu_read_unlock();
> > +                     return -EOPNOTSUPP;
> > +             }
> > +             rcu_read_unlock();
> > +
Wen Gu July 11, 2022, 7:21 a.m. UTC | #3
On 2022/7/9 10:46 am, Hawkins Jiawei wrote:

> 
> syzbot is trying to set up TLS on an SMC socket.
> 
> During the SMC fallback process in the connect syscall, the kernel
> sets smc->sk.sk_socket->file->private_data to smc->clcsock
> in smc_switch_to_fallback(), and sets smc->clcsock->sk_user_data
> to the original smc in smc_fback_replace_callbacks().

> 
> Later, sk_psock_get() treats smc->clcsock->sk_user_data
> as a struct sk_psock, which triggers the refcnt warning.
> 


Thanks for your analysis.

Although syzbot found this issue in SMC, it seems to be a generic
issue with sk_user_data usage? Fixing it via SK_USER_DATA_PTRMASK
as you plan should be the right way.
Dan Carpenter July 12, 2022, 9:47 a.m. UTC | #4
On Sat, Jul 09, 2022 at 10:46:59AM +0800, Hawkins Jiawei wrote:
> From: hawk <18801353760@163.com>

Please use your legal name like you would for signing a legal document.

regards,
dan carpenter
Hawkins Jiawei July 13, 2022, 3:10 a.m. UTC | #5
On Mon, 11 Jul 2022 at 15:21, Wen Gu <guwen@linux.alibaba.com> wrote:
> Although syzbot found this issue in SMC, it seems to be a generic
> issue with sk_user_data usage? Fixing it via SK_USER_DATA_PTRMASK
> as you plan should be the right way.

Thanks for your advice. In fact, I found a more
general patch, but it seems that it has not
been merged yet.

In this bug, the problem is that SMC and psock both use the
sk_user_data field to store their private data, and each
interprets the field in its own way.

>> in smc_switch_to_fallback(), and sets smc->clcsock->sk_user_data
>> to the original smc in smc_fback_replace_callbacks().
>> 
>> Later, sk_psock_get() treats smc->clcsock->sk_user_data
>> as a struct sk_psock, which triggers the refcnt warning.

So in the patch [PATCH RFC 1/5] net: Add distinct sk_psock field,
the psock private data is moved to a separate sk_psock field, as
shown below:
> The sk_psock facility populates the sk_user_data field with the
> address of an extra bit of metadata. User space sockets never
> populate the sk_user_data field, so this has worked out fine.
> 
> However, kernel consumers such as the RPC client and server do
> populate the sk_user_data field. The sk_psock() function cannot tell
> that the content of sk_user_data does not point to psock metadata,
> so it will happily return a pointer to something else, cast to a
> struct sk_psock.
> 
> Thus kernel consumers and psock currently cannot co-exist.
> 
> We could educate sk_psock() to return NULL if sk_user_data does
> not point to a struct sk_psock. However, a more general solution
> that enables full co-existence psock and other uses of sk_user_data
> might be more interesting.
> 
> Move the struct sk_psock address to its own pointer field so that
> the contents of the sk_user_data field is preserved.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  include/linux/skmsg.h |    2 +-
>  include/net/sock.h    |    4 +++-
>  net/core/skmsg.c      |    6 +++---
>  3 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> index c5a2d6f50f25..5ef3a07c5b6c 100644
> --- a/include/linux/skmsg.h
> +++ b/include/linux/skmsg.h
> @@ -277,7 +277,7 @@ static inline void sk_msg_sg_copy_clear(
>			struct sk_msg *msg, u32 start)
>  
>  static inline struct sk_psock *sk_psock(const struct sock *sk)
>  {
> -	return rcu_dereference_sk_user_data(sk);
> +	return rcu_dereference(sk->sk_psock);
>  }
>  
>  static inline void sk_psock_set_state(struct sk_psock *psock,
> diff --git a/include/net/sock.h b/include/net/sock.h
> index c4b91fc19b9c..d2a513169527 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -327,7 +327,8 @@ struct sk_filter;
>    *	@sk_tskey: counter to disambiguate concurrent tstamp requests
>    *	@sk_zckey: counter to order MSG_ZEROCOPY notifications
>    *	@sk_socket: Identd and reporting IO signals
> -  *	@sk_user_data: RPC layer private data
> +  *	@sk_user_data: Upper layer private data
> +  *	@sk_psock: socket policy data (bpf)
>    *	@sk_frag: cached page frag
>    *	@sk_peek_off: current peek_offset value
>    *	@sk_send_head: front of stuff to transmit
> @@ -519,6 +520,7 @@ struct sock {
>  
>  	struct socket		*sk_socket;
>  	void			*sk_user_data;
> +	struct sk_psock	__rcu	*sk_psock;
>  #ifdef CONFIG_SECURITY
>  	void			*sk_security;
>  #endif
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index cc381165ea08..2b3d01d92790 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -695,7 +695,7 @@ struct sk_psock *sk_psock_init(struct sock *sk,
>					int node)
>  
>  	write_lock_bh(&sk->sk_callback_lock);
>  
> -	if (sk->sk_user_data) {
> +	if (sk->sk_psock) {
>  		psock = ERR_PTR(-EBUSY);
>  		goto out;
>  	}
> @@ -726,7 +726,7 @@ struct sk_psock *sk_psock_init(struct sock *sk,
>					int node)
>  	sk_psock_set_state(psock, SK_PSOCK_TX_ENABLED);
>  	refcount_set(&psock->refcnt, 1);
>  
> -	rcu_assign_sk_user_data_nocopy(sk, psock);
> +	rcu_assign_pointer(sk->sk_psock, psock);
>  	sock_hold(sk);
>  
>  out:
> @@ -825,7 +825,7 @@ void sk_psock_drop(struct sock *sk,
>					struct sk_psock *psock)
>  {
>  	write_lock_bh(&sk->sk_callback_lock);
>  	sk_psock_restore_proto(sk, psock);
> -	rcu_assign_sk_user_data(sk, NULL);
> +	rcu_assign_pointer(sk->sk_psock, NULL);
>  	if (psock->progs.stream_parser)
>  		sk_psock_stop_strp(sk, psock);
>  	else if (psock->progs.stream_verdict || psock->progs.skb_verdict)

I have tested this patch and the reproducer did not trigger any issue.
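
For comparison with the mask-bit idea, the separate-field approach
of the quoted patch can be modelled in user space as follows. This
is only a simplified sketch: the demo_* names are hypothetical, and
locking, RCU and refcount handling are left out.

```c
#include <assert.h>
#include <stddef.h>

struct demo_psock { unsigned int refcnt; };

struct demo_sock {
	void              *user_data;	/* upper-layer private data, any type */
	struct demo_psock *psock;	/* psock metadata only */
};

/* Attach a psock; returns -1 if one is already attached.
 * user_data is neither checked nor modified.
 */
static int demo_psock_init(struct demo_sock *sk, struct demo_psock *psock)
{
	assert(sk && psock);
	if (sk->psock)
		return -1;
	psock->refcnt = 1;
	sk->psock = psock;
	return 0;
}

/* The getter reads the dedicated field, so it can never see foreign data. */
static struct demo_psock *demo_psock(const struct demo_sock *sk)
{
	return sk->psock;
}

/* Detach the psock and leave user_data untouched. */
static void demo_psock_drop(struct demo_sock *sk)
{
	assert(sk);
	sk->psock = NULL;
}
```

Because the two owners no longer share one pointer, a socket whose
user_data belongs to another subsystem simply has a NULL psock field
until a psock is attached.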

On the Patchwork website, this patch fails the
netdev/cc_maintainers check. If this patch fails for some other reason,
I will still fix this bug via SK_USER_DATA_PTRMASK
as a temporary solution.
Jakub Kicinski July 13, 2022, 3:33 a.m. UTC | #6
On Wed, 13 Jul 2022 11:10:05 +0800 Hawkins Jiawei wrote:
> On the Patchwork website, this patch fails the
> netdev/cc_maintainers check. If this patch fails for some other reason,
> I will still fix this bug via SK_USER_DATA_PTRMASK
> as a temporary solution.

That check just runs scripts/get_maintainer.pl so make sure you CC
folks pointed out by that script and you should be fine.
Hawkins Jiawei July 13, 2022, 3:35 a.m. UTC | #7
On Tue, 12 Jul 2022 at 17:48, Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> On Sat, Jul 09, 2022 at 10:46:59AM +0800, Hawkins Jiawei wrote:
> > From: hawk <18801353760@163.com>
>
> Please use your legal name like you would for signing a legal document.

Thanks, I will pay attention to it in the future.

>
> regards,
> dan carpenter
>
Hawkins Jiawei July 13, 2022, 3:53 a.m. UTC | #8
On Wed, 13 Jul 2022 at 11:33, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 13 Jul 2022 11:10:05 +0800 Hawkins Jiawei wrote:
> > On the Patchwork website, this patch fails the
> > netdev/cc_maintainers check. If this patch fails for some other reason,
> > I will still fix this bug via SK_USER_DATA_PTRMASK
> > as a temporary solution.
>
> That check just runs scripts/get_maintainer.pl so make sure you CC
> folks pointed out by that script and you should be fine.

Thanks for your reply. However, I am not the patch's author; I
found this patch during my bug analysis.

I will reply to the relevant email to remind the patch's author.

Patch

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9984d23a7f3e..a1e6cab2c748 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3395,10 +3395,23 @@  static int do_tcp_setsockopt(struct sock *sk, int level, int optname,
 	}
 	case TCP_ULP: {
 		char name[TCP_ULP_NAME_MAX];
+		struct sock *smc_sock;
 
 		if (optlen < 1)
 			return -EINVAL;
 
+		/* SMC sk_user_data may be treated as psock,
+		 * which triggers a refcnt warning.
+		 */
+		rcu_read_lock();
+		smc_sock = rcu_dereference_sk_user_data(sk);
+		if (level == SOL_TCP && smc_sock &&
+		    smc_sock->__sk_common.skc_family == AF_SMC) {
+			rcu_read_unlock();
+			return -EOPNOTSUPP;
+		}
+		rcu_read_unlock();
+
 		val = strncpy_from_sockptr(name, optval,
 					min_t(long, TCP_ULP_NAME_MAX - 1,
 					      optlen));