From patchwork Thu Jul  4 03:57:03 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kuniyuki Iwashima
X-Patchwork-Id: 13723174
X-Patchwork-Delegate: kuba@kernel.org
From: Kuniyuki Iwashima
To: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, David Ahern
CC: Lawrence Brakmo, Kuniyuki Iwashima, Kuniyuki Iwashima
Subject: [PATCH v1 net] tcp: Don't drop SYN+ACK for simultaneous connect().
Date: Wed, 3 Jul 2024 20:57:03 -0700
Message-ID: <20240704035703.95065-1-kuniyu@amazon.com>
X-Mailer: git-send-email 2.30.2
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

RFC 9293 states that in the case of simultaneous connect(), the
connection gets established when SYN+ACK is received. [0]

      TCP Peer A                                       TCP Peer B

  1.  CLOSED                                           CLOSED
  2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...
  3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT
  4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED
  5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...
  6.  ESTABLISHED  <-- <SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED
  7.               ... <SEQ=100><ACK=301><CTL=SYN,ACK> --> ESTABLISHED

However, since commit 0c24604b68fc ("tcp: implement RFC 5961 4.2"), such
a SYN+ACK is dropped in tcp_validate_incoming() and answered with a
Challenge ACK.

For example, the write() syscall in the following packetdrill script
fails with -EAGAIN, and the wrong SNMP stats get incremented.

   0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 3
  +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)

  +0 > S  0:0(0)
  +0 < S  0:0(0) win 1000
  +0 > S. 0:0(0) ack 1
  +0 < S. 0:0(0) ack 1 win 1000

  +0 write(3, ..., 100) = 100
  +0 > P. 1:101(100) ack 1

  --

  # packetdrill cross-synack.pkt
  cross-synack.pkt:13: runtime error in write call: Expected result 100 but got -1 with errno 11 (Resource temporarily unavailable)
  # nstat
  ...
  TcpExtTCPChallengeACK           1                  0.0
  TcpExtTCPSYNChallenge           1                  0.0

That said, this is no big deal because the Challenge ACK eventually lets
the connection transition to TCP_ESTABLISHED in both directions. If the
peer is not using Linux, there might be a small delay before the ACK,
though.

The problem is that bpf_skops_established() is triggered by the
Challenge ACK instead of the SYN+ACK. This causes the bpf prog to miss
the chance to check whether the peer supports a TCP option that is
expected to be exchanged in SYN and SYN+ACK.

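The difference in which segment completes the handshake can be
illustrated with a small user-space model. This is only a sketch of the
behaviour described above, not kernel code: enum st, enum ev,
struct peer, step() and the "patched" flag are hypothetical names, and
only the simultaneous-open path is modelled.

```c
#include <stdbool.h>

/* Hypothetical model of one peer during a simultaneous open. */
enum st { ST_CLOSED, ST_SYN_SENT, ST_SYN_RECV, ST_ESTABLISHED };
enum ev { EV_CONNECT, EV_RX_SYN, EV_RX_SYNACK, EV_RX_ACK };

struct peer {
	enum st state;
	enum ev established_by;	/* meaningful once state == ST_ESTABLISHED */
	int challenge_acks;	/* Challenge ACKs sent so far */
};

/*
 * Advance the model by one event.  "patched" selects whether a crossed
 * SYN+ACK is accepted in SYN_RECV (assumed behaviour with this change)
 * or dropped and answered with a Challenge ACK (behaviour described in
 * the commit message for the current code).
 */
static void step(struct peer *p, enum ev e, bool patched)
{
	switch (p->state) {
	case ST_CLOSED:
		if (e == EV_CONNECT)
			p->state = ST_SYN_SENT;		/* SYN sent */
		break;
	case ST_SYN_SENT:
		if (e == EV_RX_SYN)
			p->state = ST_SYN_RECV;		/* SYN+ACK sent */
		break;
	case ST_SYN_RECV:
		if (e == EV_RX_SYNACK && !patched) {
			p->challenge_acks++;		/* segment dropped */
		} else if (e == EV_RX_SYNACK || e == EV_RX_ACK) {
			p->state = ST_ESTABLISHED;
			p->established_by = e;
		}
		break;
	case ST_ESTABLISHED:
		break;
	}
}
```

In this model, feeding EV_CONNECT, EV_RX_SYN, EV_RX_SYNACK with
patched == false leaves the peer in ST_SYN_RECV until a later EV_RX_ACK
arrives, so the segment that establishes the connection carries no SYN
options. With patched == true, the SYN+ACK itself establishes it.
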
Let's accept a bare SYN+ACK for non-TFO TCP_SYN_RECV sockets to avoid
such a situation.

Link: https://www.rfc-editor.org/rfc/rfc9293.html#section-3.5-7 [0]
Fixes: 9872a4bde31b ("bpf: Add TCP connection BPF callbacks")
Signed-off-by: Kuniyuki Iwashima
---
 net/ipv4/tcp_input.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 77294fd5fd3e..70595009bb58 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5980,6 +5980,11 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 	 * RFC 5961 4.2 : Send a challenge ack
 	 */
 	if (th->syn) {
+		if (sk->sk_state == TCP_SYN_RECV && !tp->syn_fastopen && th->ack &&
+		    TCP_SKB_CB(skb)->seq + 1 == TCP_SKB_CB(skb)->end_seq &&
+		    TCP_SKB_CB(skb)->seq + 1 == tp->rcv_nxt &&
+		    TCP_SKB_CB(skb)->ack_seq == tp->snd_nxt)
+			goto pass;
 syn_challenge:
 		if (syn_inerr)
 			TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
@@ -5990,7 +5995,7 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 	}
 
 	bpf_skops_parse_hdr(sk, skb);
-
+pass:
 	return true;
 
 discard:
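
For anyone who wants to experiment with the new condition outside the
kernel, the check added to tcp_validate_incoming() can be modelled in
plain C. This is a sketch under stated assumptions: struct seg_model,
struct sock_model, enum model_state and accept_cross_synack() are
hypothetical names that do not exist in the kernel; their fields only
mirror th->syn, th->ack, TCP_SKB_CB(skb)->seq/end_seq/ack_seq,
sk->sk_state, tp->syn_fastopen, tp->rcv_nxt and tp->snd_nxt.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
enum model_state { MODEL_SYN_SENT, MODEL_SYN_RECV, MODEL_ESTABLISHED };

struct seg_model {
	bool syn, ack;		/* th->syn, th->ack */
	uint32_t seq;		/* TCP_SKB_CB(skb)->seq */
	uint32_t end_seq;	/* seq + SYN + FIN + payload length */
	uint32_t ack_seq;	/* TCP_SKB_CB(skb)->ack_seq */
};

struct sock_model {
	enum model_state state;	/* sk->sk_state */
	bool syn_fastopen;	/* tp->syn_fastopen */
	uint32_t rcv_nxt;	/* tp->rcv_nxt */
	uint32_t snd_nxt;	/* tp->snd_nxt */
};

/* Return true when the segment is a bare, in-order crossed SYN+ACK. */
static bool accept_cross_synack(const struct sock_model *sk,
				const struct seg_model *seg)
{
	return seg->syn && seg->ack &&
	       sk->state == MODEL_SYN_RECV && !sk->syn_fastopen &&
	       /* bare: the SYN flag is the only sequence space consumed */
	       (uint32_t)(seg->seq + 1) == seg->end_seq &&
	       /* same initial sequence number as the SYN already received */
	       (uint32_t)(seg->seq + 1) == sk->rcv_nxt &&
	       /* acknowledges exactly what we have sent so far */
	       seg->ack_seq == sk->snd_nxt;
}
```

With the relative sequence numbers from the packetdrill script
(seq 0, end_seq 1, ack 1, rcv_nxt 1, snd_nxt 1) the predicate holds.
A SYN+ACK that carries payload, arrives on a TFO socket, or has a
mismatching ack falls through to the existing syn_challenge path.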