diff mbox series

[net] net/smc: Avoid setting clcsock options after clcsock released

Message ID 1641807505-54454-1-git-send-email-guwen@linux.alibaba.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net] net/smc: Avoid setting clcsock options after clcsock released | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present fail Series targets non-next tree, but doesn't contain any Fixes tags
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 40 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Wen Gu Jan. 10, 2022, 9:38 a.m. UTC
We encountered a crash in smc_setsockopt() and it is caused by
accessing smc->clcsock after clcsock was released.

 BUG: kernel NULL pointer dereference, address: 0000000000000020
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E     5.16.0-rc4+ #53
 RIP: 0010:smc_setsockopt+0x59/0x280 [smc]
 Call Trace:
  <TASK>
  __sys_setsockopt+0xfc/0x190
  __x64_sys_setsockopt+0x20/0x30
  do_syscall_64+0x34/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f16ba83918e
  </TASK>

This patch tries to fix it by holding clcsock_release_lock and
checking whether clcsock has already been released. In case that
a crash of the same reason happens in smc_getsockopt(), this patch
also checkes smc->clcsock in smc_getsockopt().

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
---
 net/smc/af_smc.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

Comments

Karsten Graul Jan. 11, 2022, 10:03 a.m. UTC | #1
On 10/01/2022 10:38, Wen Gu wrote:
> We encountered a crash in smc_setsockopt() and it is caused by
> accessing smc->clcsock after clcsock was released.
> 
>  BUG: kernel NULL pointer dereference, address: 0000000000000020
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  PGD 0 P4D 0
>  Oops: 0000 [#1] PREEMPT SMP PTI
>  CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E     5.16.0-rc4+ #53
>  RIP: 0010:smc_setsockopt+0x59/0x280 [smc]
>  Call Trace:
>   <TASK>
>   __sys_setsockopt+0xfc/0x190
>   __x64_sys_setsockopt+0x20/0x30
>   do_syscall_64+0x34/0x90
>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>  RIP: 0033:0x7f16ba83918e
>   </TASK>
> 
> This patch tries to fix it by holding clcsock_release_lock and
> checking whether clcsock has already been released. In case that
> a crash of the same reason happens in smc_getsockopt(), this patch
> also checkes smc->clcsock in smc_getsockopt().
> 
> Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
> ---
>  net/smc/af_smc.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index 1c9289f..af423f4 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -2441,6 +2441,11 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
>  	/* generic setsockopts reaching us here always apply to the
>  	 * CLC socket
>  	 */
> +	mutex_lock(&smc->clcsock_release_lock);
> +	if (!smc->clcsock) {
> +		mutex_unlock(&smc->clcsock_release_lock);
> +		return -EBADF;
> +	}
>  	if (unlikely(!smc->clcsock->ops->setsockopt))
>  		rc = -EOPNOTSUPP;
>  	else
> @@ -2450,6 +2455,7 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
>  		sk->sk_err = smc->clcsock->sk->sk_err;
>  		sk_error_report(sk);
>  	}
> +	mutex_unlock(&smc->clcsock_release_lock);

In the switch() the function smc_switch_to_fallback() might be called which also
accesses smc->clcsock without further checking. This should also be protected then?
Also from all callers of smc_switch_to_fallback() ?

There are more uses of smc->clcsock (e.g. smc_bind(), ...), so why does this problem 
happen in setsockopt() for you only? I suspect it depends on the test case.

I wonder if it makes sense to check and protect smc->clcsock at all places in the code where 
it is used... as of now we had no such races like you encountered. But I see that in theory 
this problem could also happen in other code areas.

>  
>  	if (optlen < sizeof(int))
>  		return -EINVAL;
> @@ -2509,13 +2515,21 @@ static int smc_getsockopt(struct socket *sock, int level, int optname,
>  			  char __user *optval, int __user *optlen)
>  {
>  	struct smc_sock *smc;
> +	int rc;
>  
>  	smc = smc_sk(sock->sk);
> +	mutex_lock(&smc->clcsock_release_lock);
> +	if (!smc->clcsock) {
> +		mutex_unlock(&smc->clcsock_release_lock);
> +		return -EBADF;
> +	}
>  	/* socket options apply to the CLC socket */
>  	if (unlikely(!smc->clcsock->ops->getsockopt))
>  		return -EOPNOTSUPP;
> -	return smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
> +	rc = smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
>  					     optval, optlen);
> +	mutex_unlock(&smc->clcsock_release_lock);
> +	return rc;
>  }
>  
>  static int smc_ioctl(struct socket *sock, unsigned int cmd,
Wen Gu Jan. 11, 2022, 4:34 p.m. UTC | #2
Thanks for your reply.

On 2022/1/11 6:03 pm, Karsten Graul wrote:
> On 10/01/2022 10:38, Wen Gu wrote:
>> We encountered a crash in smc_setsockopt() and it is caused by
>> accessing smc->clcsock after clcsock was released.
>>
> 
> In the switch() the function smc_switch_to_fallback() might be called which also
> accesses smc->clcsock without further checking. This should also be protected then?
> Also from all callers of smc_switch_to_fallback() ?
> 
> There are more uses of smc->clcsock (e.g. smc_bind(), ...), so why does this problem
> happen in setsockopt() for you only? I suspect it depends on the test case.
> 

Yes, it depends on the test case. The crash described here only happens one time when
I run a stress test of nginx/wrk, accompanied with frequent RNIC up/down operations.

Considering accessing smc->clcsock after its release is an uncommon, low probability
issue and only happens in setsockopt() in my test, I choce an simple way to fix it, using
the existing clcsock_release_lock, and only check in smc_setsockopt() and smc_getsockopt().

> I wonder if it makes sense to check and protect smc->clcsock at all places in the code where
> it is used... as of now we had no such races like you encountered. But I see that in theory
> this problem could also happen in other code areas.
> 

But inspired by your questions, I think maybe we should treat the race as a general problem?
Do you think it is necessary to find all the potential race related to the clcsock release and
fix them in a unified approach? like define smc->clcsock as RCU pointer, hold rcu read lock
before accessing smc->clcsock and call synchronize_rcu() before resetting smc->clcsock? just a rough idea :)

Or we should decide it later, do some more tests to see the probability of recurrence of this problem?

Thanks,
Wen Gu
Jakub Kicinski Jan. 11, 2022, 6:14 p.m. UTC | #3
On Mon, 10 Jan 2022 17:38:25 +0800 Wen Gu wrote:
> -	return smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
> +	rc = smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
>  					     optval, optlen);

Please do realign the continuation line when moving the opening bracket.
Wen Gu Jan. 12, 2022, 3:32 a.m. UTC | #4
On 2022/1/12 2:14 am, Jakub Kicinski wrote:
> On Mon, 10 Jan 2022 17:38:25 +0800 Wen Gu wrote:
>> -	return smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
>> +	rc = smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
>>   					     optval, optlen);
> 
> Please do realign the continuation line when moving the opening bracket.

Thanks for pointing this out. Will fix it.

Thanks,
Wen Gu
Dust Li Jan. 12, 2022, 7:11 a.m. UTC | #5
On Mon, Jan 10, 2022 at 05:38:25PM +0800, Wen Gu wrote:
>We encountered a crash in smc_setsockopt() and it is caused by
>accessing smc->clcsock after clcsock was released.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000020
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E     5.16.0-rc4+ #53
> RIP: 0010:smc_setsockopt+0x59/0x280 [smc]
> Call Trace:
>  <TASK>
>  __sys_setsockopt+0xfc/0x190
>  __x64_sys_setsockopt+0x20/0x30
>  do_syscall_64+0x34/0x90
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f16ba83918e
>  </TASK>
>
>This patch tries to fix it by holding clcsock_release_lock and
>checking whether clcsock has already been released. In case that
>a crash of the same reason happens in smc_getsockopt(), this patch
>also checkes smc->clcsock in smc_getsockopt().
>
>Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
>---
> net/smc/af_smc.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
>diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>index 1c9289f..af423f4 100644
>--- a/net/smc/af_smc.c
>+++ b/net/smc/af_smc.c
>@@ -2441,6 +2441,11 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
> 	/* generic setsockopts reaching us here always apply to the
> 	 * CLC socket
> 	 */
>+	mutex_lock(&smc->clcsock_release_lock);
>+	if (!smc->clcsock) {
>+		mutex_unlock(&smc->clcsock_release_lock);
>+		return -EBADF;
>+	}
> 	if (unlikely(!smc->clcsock->ops->setsockopt))
> 		rc = -EOPNOTSUPP;
> 	else
>@@ -2450,6 +2455,7 @@ static int smc_setsockopt(struct socket *sock, int level, int optname,
> 		sk->sk_err = smc->clcsock->sk->sk_err;
> 		sk_error_report(sk);
> 	}
>+	mutex_unlock(&smc->clcsock_release_lock);
> 
> 	if (optlen < sizeof(int))
> 		return -EINVAL;
>@@ -2509,13 +2515,21 @@ static int smc_getsockopt(struct socket *sock, int level, int optname,
> 			  char __user *optval, int __user *optlen)
> {
> 	struct smc_sock *smc;
>+	int rc;
> 
> 	smc = smc_sk(sock->sk);
>+	mutex_lock(&smc->clcsock_release_lock);
>+	if (!smc->clcsock) {
>+		mutex_unlock(&smc->clcsock_release_lock);
>+		return -EBADF;
>+	}
> 	/* socket options apply to the CLC socket */
> 	if (unlikely(!smc->clcsock->ops->getsockopt))
Missed a mutex_unlock() here ?

> 		return -EOPNOTSUPP;

>-	return smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
>+	rc = smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
> 					     optval, optlen);
>+	mutex_unlock(&smc->clcsock_release_lock);
>+	return rc;
> }
> 
> static int smc_ioctl(struct socket *sock, unsigned int cmd,
>-- 
>1.8.3.1
Wen Gu Jan. 12, 2022, 8:16 a.m. UTC | #6
On 2022/1/12 3:11 pm, dust.li wrote:
> On Mon, Jan 10, 2022 at 05:38:25PM +0800, Wen Gu wrote:
>>
>> This patch tries to fix it by holding clcsock_release_lock and
>> checking whether clcsock has already been released. In case that
>> a crash of the same reason happens in smc_getsockopt(), this patch
>> also checkes smc->clcsock in smc_getsockopt().

>> @@ -2509,13 +2515,21 @@ static int smc_getsockopt(struct socket *sock, int level, int optname,
>> 			  char __user *optval, int __user *optlen)
>> {
>> 	struct smc_sock *smc;
>> +	int rc;
>>
>> 	smc = smc_sk(sock->sk);
>> +	mutex_lock(&smc->clcsock_release_lock);
>> +	if (!smc->clcsock) {
>> +		mutex_unlock(&smc->clcsock_release_lock);
>> +		return -EBADF;
>> +	}
>> 	/* socket options apply to the CLC socket */
>> 	if (unlikely(!smc->clcsock->ops->getsockopt))
> Missed a mutex_unlock() here ?
> 
>> 		return -EOPNOTSUPP;

Thanks for pointing it out. Will add an additional mutex_unlock().

Thanks,
Wen Gu
Karsten Graul Jan. 12, 2022, 9:38 a.m. UTC | #7
On 11/01/2022 17:34, Wen Gu wrote:
> Thanks for your reply.
> 
> On 2022/1/11 6:03 pm, Karsten Graul wrote:
>> On 10/01/2022 10:38, Wen Gu wrote:
>>> We encountered a crash in smc_setsockopt() and it is caused by
>>> accessing smc->clcsock after clcsock was released.
>>>
>>
>> In the switch() the function smc_switch_to_fallback() might be called which also
>> accesses smc->clcsock without further checking. This should also be protected then?
>> Also from all callers of smc_switch_to_fallback() ?
>>
>> There are more uses of smc->clcsock (e.g. smc_bind(), ...), so why does this problem
>> happen in setsockopt() for you only? I suspect it depends on the test case.
>>
> 
> Yes, it depends on the test case. The crash described here only happens one time when
> I run a stress test of nginx/wrk, accompanied with frequent RNIC up/down operations.
> 
> Considering accessing smc->clcsock after its release is an uncommon, low probability
> issue and only happens in setsockopt() in my test, I choce an simple way to fix it, using
> the existing clcsock_release_lock, and only check in smc_setsockopt() and smc_getsockopt().
> 
>> I wonder if it makes sense to check and protect smc->clcsock at all places in the code where
>> it is used... as of now we had no such races like you encountered. But I see that in theory
>> this problem could also happen in other code areas.
>>
> 
> But inspired by your questions, I think maybe we should treat the race as a general problem?
> Do you think it is necessary to find all the potential race related to the clcsock release and
> fix them in a unified approach? like define smc->clcsock as RCU pointer, hold rcu read lock
> before accessing smc->clcsock and call synchronize_rcu() before resetting smc->clcsock? just a rough idea :)
> 
> Or we should decide it later, do some more tests to see the probability of recurrence of this problem?

I like the idea to use RCU with rcu_assign_pointer() to protect that pointer!

Lets go with your initial patch (improved to address the access in smc_switch_to_fallback())
for now because it solves your current problem. 

I put that RCU thing on our list, but if either of us here starts working on that please let the
others know so we don't end up doing parallel work on this. But I doubt that we will be able to start working
on that soon.

Thanks for the good idea!
Wen Gu Jan. 13, 2022, 8:23 a.m. UTC | #8
Thanks for your reply.

On 2022/1/12 5:38 pm, Karsten Graul wrote:
> On 11/01/2022 17:34, Wen Gu wrote:
>> Thanks for your reply.
>>
>> On 2022/1/11 6:03 pm, Karsten Graul wrote:
>>> On 10/01/2022 10:38, Wen Gu wrote:
>>>> We encountered a crash in smc_setsockopt() and it is caused by
>>>> accessing smc->clcsock after clcsock was released.
> 
> I like the idea to use RCU with rcu_assign_pointer() to protect that pointer!
> 
> Lets go with your initial patch (improved to address the access in smc_switch_to_fallback())
> for now because it solves your current problem.
> 

OK, I will improve the patch, adding check before clcsock access in smc_switch_to_fallback()
and return an error (-EBADF) if smc->clcsock is NULL. The caller of smc_switch_to_fallback()
will check the return value to identify whether fallback is possible.

> I put that RCU thing on our list, but if either of us here starts working on that please let the
> others know so we don't end up doing parallel work on this. But I doubt that we will be able to start working
> on that soon.
> 
> Thanks for the good idea!

Thank you! If I start working on the RCU things, I will send a RFC to let you know.

Thanks,
Wen Gu
Wen Gu Jan. 13, 2022, 3:15 p.m. UTC | #9
On 2022/1/12 5:38 pm, Karsten Graul wrote:
> On 11/01/2022 17:34, Wen Gu wrote:
>> Thanks for your reply.
>>
>> On 2022/1/11 6:03 pm, Karsten Graul wrote:
>>> On 10/01/2022 10:38, Wen Gu wrote:
>>>> We encountered a crash in smc_setsockopt() and it is caused by
>>>> accessing smc->clcsock after clcsock was released.
>>>>
>>>
>>> In the switch() the function smc_switch_to_fallback() might be called which also
>>> accesses smc->clcsock without further checking. This should also be protected then?
>>> Also from all callers of smc_switch_to_fallback() ?
>>>
> 
> Lets go with your initial patch (improved to address the access in smc_switch_to_fallback())
> for now because it solves your current problem.
> 

Since the upcoming v2 patch added clcsock check in smc_switch_to_fallback(),
the original subject is inappropriate.

So I sent a new patch named 'net/smc: Transitional solution for clcsock race issue'
instead of a v2.

Thanks,
Wen Gu
diff mbox series

Patch

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 1c9289f..af423f4 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -2441,6 +2441,11 @@  static int smc_setsockopt(struct socket *sock, int level, int optname,
 	/* generic setsockopts reaching us here always apply to the
 	 * CLC socket
 	 */
+	mutex_lock(&smc->clcsock_release_lock);
+	if (!smc->clcsock) {
+		mutex_unlock(&smc->clcsock_release_lock);
+		return -EBADF;
+	}
 	if (unlikely(!smc->clcsock->ops->setsockopt))
 		rc = -EOPNOTSUPP;
 	else
@@ -2450,6 +2455,7 @@  static int smc_setsockopt(struct socket *sock, int level, int optname,
 		sk->sk_err = smc->clcsock->sk->sk_err;
 		sk_error_report(sk);
 	}
+	mutex_unlock(&smc->clcsock_release_lock);
 
 	if (optlen < sizeof(int))
 		return -EINVAL;
@@ -2509,13 +2515,21 @@  static int smc_getsockopt(struct socket *sock, int level, int optname,
 			  char __user *optval, int __user *optlen)
 {
 	struct smc_sock *smc;
+	int rc;
 
 	smc = smc_sk(sock->sk);
+	mutex_lock(&smc->clcsock_release_lock);
+	if (!smc->clcsock) {
+		mutex_unlock(&smc->clcsock_release_lock);
+		return -EBADF;
+	}
 	/* socket options apply to the CLC socket */
 	if (unlikely(!smc->clcsock->ops->getsockopt))
 		return -EOPNOTSUPP;
-	return smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
+	rc = smc->clcsock->ops->getsockopt(smc->clcsock, level, optname,
 					     optval, optlen);
+	mutex_unlock(&smc->clcsock_release_lock);
+	return rc;
 }
 
 static int smc_ioctl(struct socket *sock, unsigned int cmd,