diff mbox series

mptcp: Fix crash due to tcp_tsorted_anchor was initialized before release skb

Message ID 1646821209-35620-1-git-send-email-liyonglong@chinatelecom.cn (mailing list archive)
State Accepted, archived
Commit 9f5c50f2e2b56d493a42e8739a65ef3ed4f6064d
Delegated to: Matthieu Baerts
Headers show
Series mptcp: Fix crash due to tcp_tsorted_anchor was initialized before release skb | expand

Checks

Context Check Description
matttbe/build success Build and static analysis OK
matttbe/checkpatch warning total: 0 errors, 2 warnings, 0 checks, 7 lines checked
matttbe/KVM_Validation__normal success Success! ✅
matttbe/KVM_Validation__debug warning Unstable: 2 failed test(s): selftest_diag selftest_mptcp_join - Critical: Global Timeout ❌

Commit Message

YonglongLi March 9, 2022, 10:20 a.m. UTC
get crash when do pressure test of mptcp:

===========================================================================
dst_release: dst:ffffa06ce6e5c058 refcnt:-1
kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle kernel paging request at ffffa06ce6e5c058
PGD 190a01067 P4D 190a01067 PUD 43fffb067 PMD 22e403063 PTE 8000000226e5c063
Oops: 0011 [#1] SMP PTI
CPU: 7 PID: 7823 Comm: kworker/7:0 Kdump: loaded Tainted: G            E
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.2.1 04/01/2014
Call Trace:
 ? skb_release_head_state+0x68/0x100
 ? skb_release_all+0xe/0x30
 ? kfree_skb+0x32/0xa0
 ? mptcp_sendmsg_frag+0x57e/0x750
 ? __mptcp_retrans+0x21b/0x3c0
 ? __switch_to_asm+0x35/0x70
 ? mptcp_worker+0x25e/0x320
 ? process_one_work+0x1a7/0x360
 ? worker_thread+0x30/0x390
 ? create_worker+0x1a0/0x1a0
 ? kthread+0x112/0x130
 ? kthread_flush_work_fn+0x10/0x10
 ? ret_from_fork+0x35/0x40
===========================================================================

in __mptcp_alloc_tx_skb skb was alloced and skb->tcp_tsorted_anchor will be
initialized,  in under memory pressure situation sk_wmem_schedule will
return false and then kfree_skb. In this case skb->_skb_refdst is not null
because_skb_refdst and tcp_tsorted_anchor are stored in the same mem, and
kfree_skb will try to release dst and casue crash.

Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn>
---
 net/mptcp/protocol.c | 1 +
 1 file changed, 1 insertion(+)

Comments

MPTCP CI March 9, 2022, 11:55 a.m. UTC | #1
Hi Yonglong,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal:
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/6222883100819456
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6222883100819456/summary/summary.txt

- KVM Validation: debug:
  - Unstable: 2 failed test(s): selftest_diag selftest_mptcp_join - Critical: Global Timeout ❌:
  - Task: https://cirrus-ci.com/task/4815508217266176
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/4815508217266176/summary/summary.txt

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/7b239139921c

Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
Paolo Abeni March 9, 2022, 12:15 p.m. UTC | #2
On Wed, 2022-03-09 at 18:20 +0800, Yonglong Li wrote:
> get crash when do pressure test of mptcp:

Ouch!

> ===========================================================================
> dst_release: dst:ffffa06ce6e5c058 refcnt:-1
> kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
> BUG: unable to handle kernel paging request at ffffa06ce6e5c058
> PGD 190a01067 P4D 190a01067 PUD 43fffb067 PMD 22e403063 PTE 8000000226e5c063
> Oops: 0011 [#1] SMP PTI
> CPU: 7 PID: 7823 Comm: kworker/7:0 Kdump: loaded Tainted: G            E
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.2.1 04/01/2014
> Call Trace:
>  ? skb_release_head_state+0x68/0x100
>  ? skb_release_all+0xe/0x30
>  ? kfree_skb+0x32/0xa0
>  ? mptcp_sendmsg_frag+0x57e/0x750
>  ? __mptcp_retrans+0x21b/0x3c0
>  ? __switch_to_asm+0x35/0x70
>  ? mptcp_worker+0x25e/0x320
>  ? process_one_work+0x1a7/0x360
>  ? worker_thread+0x30/0x390
>  ? create_worker+0x1a0/0x1a0
>  ? kthread+0x112/0x130
>  ? kthread_flush_work_fn+0x10/0x10
>  ? ret_from_fork+0x35/0x40
> ===========================================================================
> 
> in __mptcp_alloc_tx_skb skb was alloced and skb->tcp_tsorted_anchor will be
> initialized,  in under memory pressure situation sk_wmem_schedule will
> return false and then kfree_skb. In this case skb->_skb_refdst is not null
> because_skb_refdst and tcp_tsorted_anchor are stored in the same mem, and
> kfree_skb will try to release dst and casue crash.

Fixes: f70cad1085d1 ("mptcp: stop relying on tcp_tx_skb_cache"


> Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn>
> ---
>  net/mptcp/protocol.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 3cb9752..fbb14df 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1199,6 +1199,7 @@ static struct sk_buff *__mptcp_alloc_tx_skb(struct sock *sk, struct sock *ssk, g
>  		tcp_skb_entail(ssk, skb);
>  		return skb;
>  	}
> +	tcp_skb_tsorted_anchor_cleanup(skb);
>  	kfree_skb(skb);
>  	return NULL;
>  }

LGTM!

Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Mat Martineau March 9, 2022, 10:11 p.m. UTC | #3
On Wed, 9 Mar 2022, Paolo Abeni wrote:

> On Wed, 2022-03-09 at 18:20 +0800, Yonglong Li wrote:
>> get crash when do pressure test of mptcp:
>
> Ouch!
>
>> ===========================================================================
>> dst_release: dst:ffffa06ce6e5c058 refcnt:-1
>> kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
>> BUG: unable to handle kernel paging request at ffffa06ce6e5c058
>> PGD 190a01067 P4D 190a01067 PUD 43fffb067 PMD 22e403063 PTE 8000000226e5c063
>> Oops: 0011 [#1] SMP PTI
>> CPU: 7 PID: 7823 Comm: kworker/7:0 Kdump: loaded Tainted: G            E
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.2.1 04/01/2014
>> Call Trace:
>>  ? skb_release_head_state+0x68/0x100
>>  ? skb_release_all+0xe/0x30
>>  ? kfree_skb+0x32/0xa0
>>  ? mptcp_sendmsg_frag+0x57e/0x750
>>  ? __mptcp_retrans+0x21b/0x3c0
>>  ? __switch_to_asm+0x35/0x70
>>  ? mptcp_worker+0x25e/0x320
>>  ? process_one_work+0x1a7/0x360
>>  ? worker_thread+0x30/0x390
>>  ? create_worker+0x1a0/0x1a0
>>  ? kthread+0x112/0x130
>>  ? kthread_flush_work_fn+0x10/0x10
>>  ? ret_from_fork+0x35/0x40
>> ===========================================================================
>>
>> in __mptcp_alloc_tx_skb skb was alloced and skb->tcp_tsorted_anchor will be
>> initialized,  in under memory pressure situation sk_wmem_schedule will
>> return false and then kfree_skb. In this case skb->_skb_refdst is not null
>> because_skb_refdst and tcp_tsorted_anchor are stored in the same mem, and
>> kfree_skb will try to release dst and casue crash.
>
> Fixes: f70cad1085d1 ("mptcp: stop relying on tcp_tx_skb_cache"
>
>
>> Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn>
>> ---
>>  net/mptcp/protocol.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
>> index 3cb9752..fbb14df 100644
>> --- a/net/mptcp/protocol.c
>> +++ b/net/mptcp/protocol.c
>> @@ -1199,6 +1199,7 @@ static struct sk_buff *__mptcp_alloc_tx_skb(struct sock *sk, struct sock *ssk, g
>>  		tcp_skb_entail(ssk, skb);
>>  		return skb;
>>  	}
>> +	tcp_skb_tsorted_anchor_cleanup(skb);
>>  	kfree_skb(skb);
>>  	return NULL;
>>  }
>
> LGTM!
>
> Reviewed-by: Paolo Abeni <pabeni@redhat.com>

I agree, looks good for mptcp-net (with the recommended Fixes tag)

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>

--
Mat Martineau
Intel
Matthieu Baerts March 16, 2022, 3:58 p.m. UTC | #4
Hi Yonglong, Paolo, Mat,

On 09/03/2022 11:20, Yonglong Li wrote:
> get crash when do pressure test of mptcp:
> 

(...)

> 
> in __mptcp_alloc_tx_skb skb was alloced and skb->tcp_tsorted_anchor 
> will be initialized,  in under memory pressure situation 
> sk_wmem_schedule will return false and then kfree_skb. In this case 
> skb->_skb_refdst is not null because_skb_refdst and 
> tcp_tsorted_anchor are stored in the same mem, and kfree_skb will
> try to release dst and casue crash.

Thank you for the patch and the reviews!

(sorry for the delay due to the reworking of the git repo)

Now in our tree with Paolo and Mat's RvB tags (and without typos
detected by 'checkpatch.pl --codespell')

- 9f5c50f2e2b5: mptcp: Fix crash due to tcp_tsorted_anchor was
initialized before release skb
- Results (net-next): 1d99000bd70..b0d3123bdf0
- Results (net): f8a0df6cce2..153502a63ba

Builds and tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20220316T155243
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20220316T155243
https://github.com/multipath-tcp/mptcp_net-next/actions/workflows/build-validation.yml?query=branch:export
https://github.com/multipath-tcp/mptcp_net-next/actions/workflows/build-validation.yml?query=branch:export-net

Cheers,
Matt
diff mbox series

Patch

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3cb9752..fbb14df 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1199,6 +1199,7 @@  static struct sk_buff *__mptcp_alloc_tx_skb(struct sock *sk, struct sock *ssk, g
 		tcp_skb_entail(ssk, skb);
 		return skb;
 	}
+	tcp_skb_tsorted_anchor_cleanup(skb);
 	kfree_skb(skb);
 	return NULL;
 }