diff mbox series

[mptcp-net,v2] mptcp: fix a NULL pointer result in panic

Message ID 20250319022733.51444-1-yangang@kylinos.cn (mailing list archive)
State Accepted, archived
Commit 40ad50026099f08be42da50061c6c5f5a6afa4e7
Delegated to: Matthieu Baerts
Headers show
Series [mptcp-net,v2] mptcp: fix a NULL pointer result in panic | expand

Checks

Context Check Description
matttbe/build success Build and static analysis OK
matttbe/checkpatch warning total: 0 errors, 3 warnings, 0 checks, 34 lines checked
matttbe/shellcheck success MPTCP selftests files have not been modified
matttbe/KVM_Validation__normal success Success! ✅
matttbe/KVM_Validation__debug success Success! ✅
matttbe/KVM_Validation__btf-normal__only_bpftest_all_ success Success! ✅
matttbe/KVM_Validation__btf-debug__only_bpftest_all_ success Success! ✅

Commit Message

Gang Yan March 19, 2025, 2:27 a.m. UTC
When test valkey in mptcp, the kernel will panic due to the function
'mptcp_can_accept_new_subflow' when the subflow_req->msk is NULL.

This problem shows in kernel 6.11.0-rc3 and 6.14.0-rc4. The message
attached is valkey_test in 6.14.0-rc4:

Call trace:
mptcp_can_accept_new_subflow (./net/mptcp/subflow.c:63 (discriminator 4)) (P)
subflow_syn_recv_sock (./net/mptcp/subflow.c:854)
tcp_check_req (./net/ipv4/tcp_minisocks.c:863)
tcp_v4_rcv (./net/ipv4/tcp_ipv4.c:2268)
ip_protocol_deliver_rcu (./net/ipv4/ip_input.c:207)
ip_local_deliver_finish (./net/ipv4/ip_input.c:234)
ip_local_deliver (./net/ipv4/ip_input.c:254)
ip_rcv_finish (./net/ipv4/ip_input.c:449)
...

According to the debug log, the same req received two SYN-ACK in a very
short time. (When the client retransmits the syn ack dueto multiple reasons.
Even if the packets are transmitted with a relevant time interval, they can be
processed by the server on different CPUs concurrently). The 'subflow_req->msk'
ownership is transferred to subflow in the first, and there will be a risk of a
null pointer dereference here.

This patch can fix this issue by moving the 'subflow_req->msk' under the
`own_req == true` conditional.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Gang Yan <yangang@kylinos.cn>

---
Notes:
- v2:
  - The MPTCP and the TCP code protect vs such race using the
    `own_req` boolean. So moving the subflow_req->msk under
    the 'own_req'.
---
 net/mptcp/subflow.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

Comments

MPTCP CI March 19, 2025, 3:35 a.m. UTC | #1
Hi Gang,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/13937733185

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/904006e09ceb
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=945398


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Paolo Abeni March 19, 2025, 7:38 a.m. UTC | #2
On 3/19/25 3:27 AM, Gang Yan wrote:
> When test valkey in mptcp, the kernel will panic due to the function
> 'mptcp_can_accept_new_subflow' when the subflow_req->msk is NULL.
> 
> This problem shows in kernel 6.11.0-rc3 and 6.14.0-rc4. The message
> attached is valkey_test in 6.14.0-rc4:
> 
> Call trace:
> mptcp_can_accept_new_subflow (./net/mptcp/subflow.c:63 (discriminator 4)) (P)
> subflow_syn_recv_sock (./net/mptcp/subflow.c:854)
> tcp_check_req (./net/ipv4/tcp_minisocks.c:863)
> tcp_v4_rcv (./net/ipv4/tcp_ipv4.c:2268)
> ip_protocol_deliver_rcu (./net/ipv4/ip_input.c:207)
> ip_local_deliver_finish (./net/ipv4/ip_input.c:234)
> ip_local_deliver (./net/ipv4/ip_input.c:254)
> ip_rcv_finish (./net/ipv4/ip_input.c:449)
> ...
> 
> According to the debug log, the same req received two SYN-ACK in a very
> short time. (When the client retransmits the syn ack dueto multiple reasons.
> Even if the packets are transmitted with a relevant time interval, they can be
> processed by the server on different CPUs concurrently). The 'subflow_req->msk'
> ownership is transferred to subflow in the first, and there will be a risk of a
> null pointer dereference here.
> 
> This patch can fix this issue by moving the 'subflow_req->msk' under the
> `own_req == true` conditional.
> 
> Suggested-by: Paolo Abeni <pabeni@redhat.com>
> Signed-off-by: Gang Yan <yangang@kylinos.cn>

The code LGTM.

Possibly the commit message could be expanded noting that the !msk check
in subflow_hmac_valid() could be dropped, because there is already the
same check under the own_req mpj branch.

Also this needs a Fixes tag:

Fixes: 9466a1ccebbe ("mptcp: enable JOIN requests even if cookies are in
use")

Matt: could you please adjust the above while applying the patch?

Thanks!

Paolo
Matthieu Baerts March 19, 2025, 5:47 p.m. UTC | #3
Hi Paolo,

On 19/03/2025 08:38, Paolo Abeni wrote:
> On 3/19/25 3:27 AM, Gang Yan wrote:
>> When test valkey in mptcp, the kernel will panic due to the function
>> 'mptcp_can_accept_new_subflow' when the subflow_req->msk is NULL.
>>
>> This problem shows in kernel 6.11.0-rc3 and 6.14.0-rc4. The message
>> attached is valkey_test in 6.14.0-rc4:
>>
>> Call trace:
>> mptcp_can_accept_new_subflow (./net/mptcp/subflow.c:63 (discriminator 4)) (P)
>> subflow_syn_recv_sock (./net/mptcp/subflow.c:854)
>> tcp_check_req (./net/ipv4/tcp_minisocks.c:863)
>> tcp_v4_rcv (./net/ipv4/tcp_ipv4.c:2268)
>> ip_protocol_deliver_rcu (./net/ipv4/ip_input.c:207)
>> ip_local_deliver_finish (./net/ipv4/ip_input.c:234)
>> ip_local_deliver (./net/ipv4/ip_input.c:254)
>> ip_rcv_finish (./net/ipv4/ip_input.c:449)
>> ...
>>
>> According to the debug log, the same req received two SYN-ACK in a very
>> short time. (When the client retransmits the syn ack dueto multiple reasons.
>> Even if the packets are transmitted with a relevant time interval, they can be
>> processed by the server on different CPUs concurrently). The 'subflow_req->msk'
>> ownership is transferred to subflow in the first, and there will be a risk of a
>> null pointer dereference here.
>>
>> This patch can fix this issue by moving the 'subflow_req->msk' under the
>> `own_req == true` conditional.
>>
>> Suggested-by: Paolo Abeni <pabeni@redhat.com>
>> Signed-off-by: Gang Yan <yangang@kylinos.cn>
> 
> The code LGTM.

Me too, thank you both for having looked at that!

Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

> Possibly the commit message could be expanded noting that the !msk check
> in subflow_hmac_valid() could be dropped, because there is already the
> same check under the own_req mpj branch.
> 
> Also this needs a Fixes tag:
> 
> Fixes: 9466a1ccebbe ("mptcp: enable JOIN requests even if cookies are in
> use")
> 
> Matt: could you please adjust the above while applying the patch?

Sure, I will do that!

Cheers,
Matt
Matthieu Baerts March 19, 2025, 5:56 p.m. UTC | #4
Hi Paolo, Gang,

On 19/03/2025 18:47, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 19/03/2025 08:38, Paolo Abeni wrote:
>> On 3/19/25 3:27 AM, Gang Yan wrote:
>>> When test valkey in mptcp, the kernel will panic due to the function
>>> 'mptcp_can_accept_new_subflow' when the subflow_req->msk is NULL.
>>>
>>> This problem shows in kernel 6.11.0-rc3 and 6.14.0-rc4. The message
>>> attached is valkey_test in 6.14.0-rc4:
>>>
>>> Call trace:
>>> mptcp_can_accept_new_subflow (./net/mptcp/subflow.c:63 (discriminator 4)) (P)
>>> subflow_syn_recv_sock (./net/mptcp/subflow.c:854)
>>> tcp_check_req (./net/ipv4/tcp_minisocks.c:863)
>>> tcp_v4_rcv (./net/ipv4/tcp_ipv4.c:2268)
>>> ip_protocol_deliver_rcu (./net/ipv4/ip_input.c:207)
>>> ip_local_deliver_finish (./net/ipv4/ip_input.c:234)
>>> ip_local_deliver (./net/ipv4/ip_input.c:254)
>>> ip_rcv_finish (./net/ipv4/ip_input.c:449)
>>> ...
>>>
>>> According to the debug log, the same req received two SYN-ACK in a very
>>> short time. (When the client retransmits the syn ack dueto multiple reasons.
>>> Even if the packets are transmitted with a relevant time interval, they can be
>>> processed by the server on different CPUs concurrently). The 'subflow_req->msk'
>>> ownership is transferred to subflow in the first, and there will be a risk of a
>>> null pointer dereference here.
>>>
>>> This patch can fix this issue by moving the 'subflow_req->msk' under the
>>> `own_req == true` conditional.
>>>
>>> Suggested-by: Paolo Abeni <pabeni@redhat.com>
>>> Signed-off-by: Gang Yan <yangang@kylinos.cn>
>>
>> The code LGTM.
> 
> Me too, thank you both for having looked at that!

Now in our tree (fix for net):

New patches for t/upstream-net and t/upstream:
- 40ad50026099: mptcp: fix NULL pointer in can_accept_new_subflow
- Results: 15fd746f34c5..dba54e66e69e (export-net)
- Results: 2bea3081157b..332a74c2a7d4 (export)

Tests are now in progress:

- export-net:
https://github.com/multipath-tcp/mptcp_net-next/commit/06f515aab4fa9d84ba4dfe81b17aa7d77ea22b4d/checks
- export:
https://github.com/multipath-tcp/mptcp_net-next/commit/83da8f611c90d09ade3f1fc8b04302d272c8f7b9/checks

Cheers,
Matt
diff mbox series

Patch

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index efe8d86496db..409bd415ef1d 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -754,8 +754,6 @@  static bool subflow_hmac_valid(const struct request_sock *req,
 
 	subflow_req = mptcp_subflow_rsk(req);
 	msk = subflow_req->msk;
-	if (!msk)
-		return false;
 
 	subflow_generate_hmac(READ_ONCE(msk->remote_key),
 			      READ_ONCE(msk->local_key),
@@ -850,12 +848,8 @@  static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 
 	} else if (subflow_req->mp_join) {
 		mptcp_get_options(skb, &mp_opt);
-		if (!(mp_opt.suboptions & OPTION_MPTCP_MPJ_ACK) ||
-		    !subflow_hmac_valid(req, &mp_opt) ||
-		    !mptcp_can_accept_new_subflow(subflow_req->msk)) {
-			SUBFLOW_REQ_INC_STATS(req, MPTCP_MIB_JOINACKMAC);
+		if (!(mp_opt.suboptions & OPTION_MPTCP_MPJ_ACK))
 			fallback = true;
-		}
 	}
 
 create_child:
@@ -905,6 +899,13 @@  static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 				goto dispose_child;
 			}
 
+			if (!subflow_hmac_valid(req, &mp_opt) ||
+			    !mptcp_can_accept_new_subflow(subflow_req->msk)) {
+				SUBFLOW_REQ_INC_STATS(req, MPTCP_MIB_JOINACKMAC);
+				subflow_add_reset_reason(skb, MPTCP_RST_EPROHIBIT);
+				goto dispose_child;
+			}
+
 			/* move the msk reference ownership to the subflow */
 			subflow_req->msk = NULL;
 			ctx->conn = (struct sock *)owner;