From patchwork Tue Mar 18 00:26:56 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)"
X-Patchwork-Id: 14020090
X-Patchwork-Delegate: kuba@kernel.org
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org,
 stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us,
 davem@davemloft.net, edumazet@google.com, horms@kernel.org,
 andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net,
 liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org,
 ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com,
 g.white@cablelabs.com, ingemar.s.johansson@ericsson.com,
 mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at,
 Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Chia-Yu Chang
Subject: [PATCH v2 net-next 01/15] tcp: fast path functions later
Date: Tue, 18 Mar 2025 01:26:56 +0100
Message-Id: <20250318002710.29483-2-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Ilpo Järvinen

Move the fast path functions later in include/net/tcp.h. The following
patch will use tcp_ecn_mode_accecn(), TCP_ACCECN_CEP_INIT_OFFSET and
TCP_ACCECN_CEP_ACE_MASK in __tcp_fast_path_on() to build the new flag
for AccECN.

No functional changes.

Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
 include/net/tcp.h | 54 +++++++++++++++++++++++------------------------
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index d08fbf90495d..830db65e5487 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -811,33 +811,6 @@ static inline u32 __tcp_set_rto(const struct tcp_sock *tp)
 	return usecs_to_jiffies((tp->srtt_us >> 3) + tp->rttvar_us);
 }
 
-static inline void __tcp_fast_path_on(struct tcp_sock *tp, u32 snd_wnd)
-{
-	/* mptcp hooks are only on the slow path */
-	if (sk_is_mptcp((struct sock *)tp))
-		return;
-
-	tp->pred_flags = htonl((tp->tcp_header_len << 26) |
-			       ntohl(TCP_FLAG_ACK) |
-			       snd_wnd);
-}
-
-static inline void tcp_fast_path_on(struct tcp_sock *tp)
-{
-	__tcp_fast_path_on(tp, tp->snd_wnd >> tp->rx_opt.snd_wscale);
-}
-
-static inline void tcp_fast_path_check(struct sock *sk)
-{
-	struct tcp_sock *tp = tcp_sk(sk);
-
-	if (RB_EMPTY_ROOT(&tp->out_of_order_queue) &&
-	    tp->rcv_wnd &&
-	    atomic_read(&sk->sk_rmem_alloc) < sk->sk_rcvbuf &&
-	    !tp->urg_data)
-		tcp_fast_path_on(tp);
-}
-
 u32 tcp_delack_max(const struct sock *sk);
 
 /* Compute the actual rto_min value */
@@ -1797,6 +1770,33 @@ static inline bool tcp_paws_reject(const struct tcp_options_received *rx_opt,
 	return true;
 }
 
+static inline void __tcp_fast_path_on(struct tcp_sock *tp, u32 snd_wnd)
+{
+	/* mptcp hooks are only on the slow path */
+	if (sk_is_mptcp((struct sock *)tp))
+		return;
+
+	tp->pred_flags = htonl((tp->tcp_header_len << 26) |
+			       ntohl(TCP_FLAG_ACK) |
+			       snd_wnd);
+}
+
+static inline void tcp_fast_path_on(struct tcp_sock *tp)
+{
+	__tcp_fast_path_on(tp, tp->snd_wnd >> tp->rx_opt.snd_wscale);
+}
+
+static inline void tcp_fast_path_check(struct sock *sk)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+
+	if (RB_EMPTY_ROOT(&tp->out_of_order_queue) &&
+	    tp->rcv_wnd &&
+	    atomic_read(&sk->sk_rmem_alloc) < sk->sk_rcvbuf &&
+	    !tp->urg_data)
+		tcp_fast_path_on(tp);
+}
+
 bool tcp_oow_rate_limited(struct net *net, const struct sk_buff *skb,
 			  int mib_idx, u32 *last_oow_ack_time);
 

From patchwork Tue Mar 18 00:26:57 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)"
X-Patchwork-Id: 14020091
X-Patchwork-Delegate: kuba@kernel.org
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com,
 jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org,
 xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net,
 edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch,
 donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com,
 shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org,
 ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com,
 g.white@cablelabs.com, ingemar.s.johansson@ericsson.com,
 mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at,
 Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Olivier Tilmans, Chia-Yu Chang
Subject: [PATCH v2 net-next 02/15] tcp: AccECN core
Date: Tue, 18 Mar 2025 01:26:57 +0100
Message-Id: <20250318002710.29483-3-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Ilpo Järvinen

This change implements the Accurate ECN core, without negotiation and
without the AccECN Option (both will be added by later changes). It is
based on the AccECN specification:
  https://tools.ietf.org/id/draft-ietf-tcpm-accurate-ecn-28.txt

Accurate ECN allows feeding back the number of CE (congestion
experienced) marks accurately to the sender, in contrast to RFC 3168
ECN, which can only signal a single marks-seen yes/no per RTT.
Congestion control algorithms can take advantage of the accurate ECN
information to fine-tune their congestion response and avoid a drastic
rate reduction when only mild congestion is encountered.

With Accurate ECN, tp->received_ce (r.cep in the AccECN spec) keeps
track of how many segments have arrived with a CE mark. Accurate ECN
uses the ACE field (ECE, CWR, AE) to communicate the value back to the
sender, which updates tp->delivered_ce (s.cep) based on the feedback.
This signalling channel is lossy when ACE field overflow occurs.
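
As an illustration only (not part of this patch), the ACE feedback
arithmetic described above can be sketched in standalone C. The helper
names ace_encode(), ace_delta() and ace_safe_delta() are hypothetical;
the constants mirror TCP_ACCECN_CEP_ACE_MASK and
TCP_ACCECN_CEP_INIT_OFFSET from the diff, and the delta computation
follows __tcp_accecn_process() under the assumption that the ACK is one
the kernel would actually process:

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the diff in this patch. */
#define ACE_MASK	0x7u	/* TCP_ACCECN_CEP_ACE_MASK: ACE is 3 bits wide */
#define ACE_INIT_OFFSET	5u	/* TCP_ACCECN_CEP_INIT_OFFSET */

/* Hypothetical helper, receiver side: the 3-bit value placed in the
 * ACE field (AE, CWR, ECE) for a given count of CE-marked segments.
 */
static inline uint8_t ace_encode(uint32_t received_ce)
{
	return (uint8_t)((received_ce + ACE_INIT_OFFSET) & ACE_MASK);
}

/* Hypothetical helper, sender side: CE marks reported since
 * delivered_ce, known only modulo 8.
 */
static inline uint32_t ace_delta(uint8_t ace, uint32_t delivered_ce)
{
	return ((uint32_t)ace - ACE_INIT_OFFSET - delivered_ce) & ACE_MASK;
}

/* Hypothetical helper: conservative estimate for an ACK that covers
 * more than 7 packets. Returns the largest count that does not exceed
 * delivered_pkts and is congruent to delta modulo 8.
 */
static inline uint32_t ace_safe_delta(uint32_t delivered_pkts, uint32_t delta)
{
	if (delivered_pkts <= ACE_MASK)
		return delta;
	return delivered_pkts - ((delivered_pkts - delta) & ACE_MASK);
}

/* Worked example: nine CE marks, none yet accounted for by the sender. */
void ace_worked_example(void)
{
	uint8_t wire = ace_encode(9);		/* (9 + 5) & 7 */
	uint32_t delta = ace_delta(wire, 0);

	assert(wire == 6);
	assert(delta == 1);			/* 9 wrapped around to 1 */
	assert(ace_safe_delta(12, delta) == 9);	/* large ACK: assume the worst */
	assert(ace_safe_delta(5, delta) == 1);	/* small ACK: take delta as is */
}
```

In the worked example the receiver has seen nine CE marks, but the
3-bit field only lets the sender observe a delta of one. For an ACK
covering 12 packets the conservative estimate picks nine, the largest
count not exceeding 12 that is congruent to the observed delta modulo 8.
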

A conservative strategy is selected here to deal with ACE overflow;
strategies that use the AccECN option, added later in the overall
patchset, mitigate falsely detected overflows.

The ACE field values on the wire are offset by
TCP_ACCECN_CEP_INIT_OFFSET. tp->delivered_ce and tp->received_ce count
the real CE marks rather than forcing all downstream users to adapt to
the wire offset.

Co-developed-by: Olivier Tilmans
Signed-off-by: Olivier Tilmans
Signed-off-by: Ilpo Järvinen
Co-developed-by: Chia-Yu Chang
Signed-off-by: Chia-Yu Chang
---
 include/linux/tcp.h   |   3 ++
 include/net/tcp.h     |  26 +++++++++
 net/ipv4/tcp.c        |   4 +-
 net/ipv4/tcp_input.c  | 121 +++++++++++++++++++++++++++++++++++++-----
 net/ipv4/tcp_output.c |  21 +++++++-
 5 files changed, 160 insertions(+), 15 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 159b2c59eb62..5cc6ecfccb17 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -295,6 +295,9 @@ struct tcp_sock {
 	u32	snd_up;		/* Urgent pointer */
 	u32	delivered;	/* Total data packets delivered incl. rexmits */
 	u32	delivered_ce;	/* Like the above but only ECE marked packets */
+	u32	received_ce;	/* Like the above but for rcvd CE marked pkts */
+	u8	received_ce_pending:4, /* Not yet transmit cnt of received_ce */
+		unused2:4;
 	u32	app_limited;	/* limited until "delivered" reaches this val */
 	u32	rcv_wnd;	/* Current receiver window */
 /*
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 830db65e5487..d8ac90ef391f 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -415,6 +415,11 @@ static inline void tcp_ecn_mode_set(struct tcp_sock *tp, u8 mode)
 	tp->ecn_flags |= mode;
 }
 
+static inline u8 tcp_accecn_ace(const struct tcphdr *th)
+{
+	return (th->ae << 2) | (th->cwr << 1) | th->ece;
+}
+
 enum tcp_tw_status {
 	TCP_TW_SUCCESS = 0,
 	TCP_TW_RST = 1,
@@ -963,6 +968,20 @@ static inline u32 tcp_rsk_tsval(const struct tcp_request_sock *treq)
 #define TCPHDR_ACE (TCPHDR_ECE | TCPHDR_CWR | TCPHDR_AE)
 #define TCPHDR_SYN_ECN	(TCPHDR_SYN | TCPHDR_ECE | TCPHDR_CWR)
 
+#define TCP_ACCECN_CEP_ACE_MASK 0x7
+#define TCP_ACCECN_ACE_MAX_DELTA 6
+
+/* To avoid/detect middlebox interference, not all counters start at 0.
+ * See draft-ietf-tcpm-accurate-ecn for the latest values.
+ */
+#define TCP_ACCECN_CEP_INIT_OFFSET 5
+
+static inline void tcp_accecn_init_counters(struct tcp_sock *tp)
+{
+	tp->received_ce = 0;
+	tp->received_ce_pending = 0;
+}
+
 /* State flags for sacked in struct tcp_skb_cb */
 enum tcp_skb_cb_sacked_flags {
 	TCPCB_SACKED_ACKED	= (1 << 0),	/* SKB ACK'd by a SACK block */
@@ -1772,11 +1791,18 @@ static inline bool tcp_paws_reject(const struct tcp_options_received *rx_opt,
 
 static inline void __tcp_fast_path_on(struct tcp_sock *tp, u32 snd_wnd)
 {
+	u32 ace;
+
 	/* mptcp hooks are only on the slow path */
 	if (sk_is_mptcp((struct sock *)tp))
 		return;
 
+	ace = tcp_ecn_mode_accecn(tp) ?
+	      ((tp->delivered_ce + TCP_ACCECN_CEP_INIT_OFFSET) &
+	       TCP_ACCECN_CEP_ACE_MASK) : 0;
+
 	tp->pred_flags = htonl((tp->tcp_header_len << 26) |
+			       (ace << 22) |
 			       ntohl(TCP_FLAG_ACK) |
 			       snd_wnd);
 }
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 989c3c3d8e75..494741e4d977 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3362,6 +3362,7 @@ int tcp_disconnect(struct sock *sk, int flags)
 	tp->window_clamp = 0;
 	tp->delivered = 0;
 	tp->delivered_ce = 0;
+	tcp_accecn_init_counters(tp);
 	if (icsk->icsk_ca_initialized && icsk->icsk_ca_ops->release)
 		icsk->icsk_ca_ops->release(sk);
 	memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv));
@@ -5059,6 +5060,7 @@ static void __init tcp_struct_check(void)
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, snd_up);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered_ce);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ce);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, app_limited);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_wnd);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rx_opt);
@@ -5066,7 +5068,7 @@ static void __init tcp_struct_check(void)
 	/* 32bit arches with 8byte alignment on u64 fields might need padding
 	 * before tcp_clock_cache.
 	 */
-	CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 92 + 4);
+	CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 97 + 7);
 
 	/* RX read-write hotpath cache lines */
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_rx, bytes_received);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 5c270cf96678..e3fc1e0bcf57 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -342,14 +342,17 @@ static bool tcp_in_quickack_mode(struct sock *sk)
 
 static void tcp_ecn_queue_cwr(struct tcp_sock *tp)
 {
+	/* Do not set CWR if in AccECN mode! */
 	if (tcp_ecn_mode_rfc3168(tp))
 		tp->ecn_flags |= TCP_ECN_QUEUE_CWR;
 }
 
 static void tcp_ecn_accept_cwr(struct sock *sk, const struct sk_buff *skb)
 {
-	if (tcp_hdr(skb)->cwr) {
-		tcp_sk(sk)->ecn_flags &= ~TCP_ECN_DEMAND_CWR;
+	struct tcp_sock *tp = tcp_sk(sk);
+
+	if (tcp_ecn_mode_rfc3168(tp) && tcp_hdr(skb)->cwr) {
+		tp->ecn_flags &= ~TCP_ECN_DEMAND_CWR;
 
 		/* If the sender is telling us it has entered CWR, then its
 		 * cwnd may be very low (even just 1 packet), so we should ACK
@@ -385,17 +388,16 @@ static void tcp_data_ecn_check(struct sock *sk, const struct sk_buff *skb)
 		if (tcp_ca_needs_ecn(sk))
 			tcp_ca_event(sk, CA_EVENT_ECN_IS_CE);
 
-		if (!(tp->ecn_flags & TCP_ECN_DEMAND_CWR)) {
+		if (!(tp->ecn_flags & TCP_ECN_DEMAND_CWR) &&
+		    tcp_ecn_mode_rfc3168(tp)) {
 			/* Better not delay acks, sender can have a very low cwnd */
 			tcp_enter_quickack_mode(sk, 2);
 			tp->ecn_flags |= TCP_ECN_DEMAND_CWR;
 		}
-		tp->ecn_flags |= TCP_ECN_SEEN;
 		break;
 	default:
 		if (tcp_ca_needs_ecn(sk))
 			tcp_ca_event(sk, CA_EVENT_ECN_NO_CE);
-		tp->ecn_flags |= TCP_ECN_SEEN;
 		break;
 	}
 }
@@ -429,10 +431,64 @@ static void tcp_count_delivered(struct tcp_sock *tp, u32 delivered,
 				bool ece_ack)
 {
 	tp->delivered += delivered;
-	if (ece_ack)
+	if (tcp_ecn_mode_rfc3168(tp) && ece_ack)
 		tcp_count_delivered_ce(tp, delivered);
 }
 
+/* Returns the ECN CE delta */
+static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb,
+				u32 delivered_pkts, int flag)
+{
+	const struct tcphdr *th = tcp_hdr(skb);
+	struct tcp_sock *tp = tcp_sk(sk);
+	u32 delta, safe_delta;
+	u32 corrected_ace;
+
+	/* Reordered ACK or uncertain due to lack of data to send and ts */
+	if (!(flag & (FLAG_FORWARD_PROGRESS | FLAG_TS_PROGRESS)))
+		return 0;
+
+	if (!(flag & FLAG_SLOWPATH)) {
+		/* AccECN counter might overflow on large ACKs */
+		if (delivered_pkts <= TCP_ACCECN_CEP_ACE_MASK)
+			return 0;
+	}
+
+	/* ACE field is not available during handshake */
+	if (flag & FLAG_SYN_ACKED)
+		return 0;
+
+	if (tp->received_ce_pending >= TCP_ACCECN_ACE_MAX_DELTA)
+		inet_csk(sk)->icsk_ack.pending |= ICSK_ACK_NOW;
+
+	corrected_ace = tcp_accecn_ace(th) - TCP_ACCECN_CEP_INIT_OFFSET;
+	delta = (corrected_ace - tp->delivered_ce) & TCP_ACCECN_CEP_ACE_MASK;
+	if (delivered_pkts <= TCP_ACCECN_CEP_ACE_MASK)
+		return delta;
+
+	safe_delta = delivered_pkts -
+		     ((delivered_pkts - delta) & TCP_ACCECN_CEP_ACE_MASK);
+
+	return safe_delta;
+}
+
+static u32 tcp_accecn_process(struct sock *sk, const struct sk_buff *skb,
+			      u32 delivered_pkts, int *flag)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	u32 delta;
+
+	delta = __tcp_accecn_process(sk, skb, delivered_pkts, *flag);
+	if (delta > 0) {
+		tcp_count_delivered_ce(tp, delta);
+		*flag |= FLAG_ECE;
+		/* Recalculate header predictor */
+		if (tp->pred_flags)
+			tcp_fast_path_on(tp);
+	}
+	return delta;
+}
+
 /* Buffer size and advertised window tuning.
  *
  * 1. Tuning sk->sk_sndbuf, when connection enters established state.
@@ -3920,7 +3976,8 @@ static void tcp_xmit_recovery(struct sock *sk, int rexmit)
 }
 
 /* Returns the number of packets newly acked or sacked by the current ACK */
-static u32 tcp_newly_delivered(struct sock *sk, u32 prior_delivered, int flag)
+static u32 tcp_newly_delivered(struct sock *sk, u32 prior_delivered,
+			       u32 ecn_count, int flag)
 {
 	const struct net *net = sock_net(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
@@ -3928,8 +3985,12 @@ static u32 tcp_newly_delivered(struct sock *sk, u32 prior_delivered, int flag)
 
 	delivered = tp->delivered - prior_delivered;
 	NET_ADD_STATS(net, LINUX_MIB_TCPDELIVERED, delivered);
-	if (flag & FLAG_ECE)
-		NET_ADD_STATS(net, LINUX_MIB_TCPDELIVEREDCE, delivered);
+
+	if (flag & FLAG_ECE) {
+		if (tcp_ecn_mode_rfc3168(tp))
+			ecn_count = delivered;
+		NET_ADD_STATS(net, LINUX_MIB_TCPDELIVEREDCE, ecn_count);
+	}
 
 	return delivered;
 }
@@ -3950,6 +4011,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 	u32 delivered = tp->delivered;
 	u32 lost = tp->lost;
 	int rexmit = REXMIT_NONE; /* Flag to (re)transmit to recover losses */
+	u32 ecn_count = 0;	  /* Did we receive ECE/an AccECN ACE update? */
 	u32 prior_fack;
 
 	sack_state.first_sackt = 0;
@@ -4057,6 +4119,11 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 
 	tcp_rack_update_reo_wnd(sk, &rs);
 
+	if (tcp_ecn_mode_accecn(tp))
+		ecn_count = tcp_accecn_process(sk, skb,
+					       tp->delivered - delivered,
+					       &flag);
+
 	tcp_in_ack_event(sk, flag);
 
 	if (tp->tlp_high_seq)
@@ -4081,7 +4148,8 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 	if ((flag & FLAG_FORWARD_PROGRESS) || !(flag & FLAG_NOT_DUP))
 		sk_dst_confirm(sk);
 
-	delivered = tcp_newly_delivered(sk, delivered, flag);
+	delivered = tcp_newly_delivered(sk, delivered, ecn_count, flag);
+
 	lost = tp->lost - lost;			/* freshly marked lost */
 	rs.is_ack_delayed = !!(flag & FLAG_ACK_MAYBE_DELAYED);
 	tcp_rate_gen(sk, delivered, lost, is_sack_reneg, sack_state.rate);
@@ -4090,12 +4158,16 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 	return 1;
 
 no_queue:
+	if (tcp_ecn_mode_accecn(tp))
+		ecn_count = tcp_accecn_process(sk, skb,
+					       tp->delivered - delivered,
+					       &flag);
 	tcp_in_ack_event(sk, flag);
 	/* If data was DSACKed, see if we can undo a cwnd reduction. */
 	if (flag & FLAG_DSACKING_ACK) {
 		tcp_fastretrans_alert(sk, prior_snd_una, num_dupack, &flag,
 				      &rexmit);
-		tcp_newly_delivered(sk, delivered, flag);
+		tcp_newly_delivered(sk, delivered, ecn_count, flag);
 	}
 	/* If this ack opens up a zero window, clear backoff. It was
 	 * being used to time the probes, and is probably far higher than
@@ -4116,7 +4188,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 						&sack_state);
 		tcp_fastretrans_alert(sk, prior_snd_una, num_dupack, &flag,
 				      &rexmit);
-		tcp_newly_delivered(sk, delivered, flag);
+		tcp_newly_delivered(sk, delivered, ecn_count, flag);
 		tcp_xmit_recovery(sk, rexmit);
 	}
 
@@ -5953,6 +6025,26 @@ static void tcp_urg(struct sock *sk, struct sk_buff *skb, const struct tcphdr *t
 	}
 }
 
+/* Updates Accurate ECN received counters from the received IP ECN field */
+static void tcp_ecn_received_counters(struct sock *sk,
+				      const struct sk_buff *skb)
+{
+	u8 ecnfield = TCP_SKB_CB(skb)->ip_dsfield & INET_ECN_MASK;
+	u8 is_ce = INET_ECN_is_ce(ecnfield);
+	struct tcp_sock *tp = tcp_sk(sk);
+
+	if (!INET_ECN_is_not_ect(ecnfield)) {
+		u32 pcount = is_ce * max_t(u16, 1, skb_shinfo(skb)->gso_segs);
+
+		tp->ecn_flags |= TCP_ECN_SEEN;
+
+		/* ACE counter tracks *all* segments including pure ACKs */
+		tp->received_ce += pcount;
+		tp->received_ce_pending = min(tp->received_ce_pending + pcount,
+					      0xfU);
+	}
+}
+
 /* Accept RST for rcv_nxt - 1 after a FIN.
  * When tcp connections are abruptly terminated from Mac OSX (via ^C), a
  * FIN is sent followed by a RST packet. The RST is sent with the same
@@ -6215,6 +6307,8 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
 				flag |= __tcp_replace_ts_recent(tp,
 								delta);
 
+				tcp_ecn_received_counters(sk, skb);
+
 				/* We know that such packets are checksummed
 				 * on entry.
 				 */
@@ -6259,6 +6353,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
 			/* Bulk data transfer: receiver */
 			tcp_cleanup_skb(skb);
 			__skb_pull(skb, tcp_header_len);
+			tcp_ecn_received_counters(sk, skb);
 			eaten = tcp_queue_rcv(sk, skb, &fragstolen);
 
 			tcp_event_data_recv(sk, skb);
@@ -6299,6 +6394,8 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
 		return;
 
 step5:
+	tcp_ecn_received_counters(sk, skb);
+
 	reason = tcp_ack(sk, skb, FLAG_SLOWPATH | FLAG_UPDATE_TS_RECENT);
 	if ((int)reason < 0) {
 		reason = -reason;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 124b2e95bb0a..36afa3b40396 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -374,6 +374,17 @@ tcp_ecn_make_synack(const struct request_sock *req, struct tcphdr *th)
 		th->ece = 1;
 }
 
+static void tcp_accecn_set_ace(struct tcphdr *th, struct tcp_sock *tp)
+{
+	u32 wire_ace;
+
+	wire_ace = tp->received_ce + TCP_ACCECN_CEP_INIT_OFFSET;
+	th->ece = !!(wire_ace & 0x1);
+	th->cwr = !!(wire_ace & 0x2);
+	th->ae = !!(wire_ace & 0x4);
+	tp->received_ce_pending = 0;
+}
+
 /* Set up ECN state for a packet on a ESTABLISHED socket that is about to
  * be sent.
  */
@@ -382,11 +393,17 @@ static void tcp_ecn_send(struct sock *sk, struct sk_buff *skb,
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 
-	if (tcp_ecn_mode_rfc3168(tp)) {
+	if (!tcp_ecn_mode_any(tp))
+		return;
+
+	INET_ECN_xmit(sk);
+	if (tcp_ecn_mode_accecn(tp)) {
+		tcp_accecn_set_ace(th, tp);
+		skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ACCECN;
+	} else {
 		/* Not-retransmitted data segment: set ECT and inject CWR.
*/ if (skb->len != tcp_header_len && !before(TCP_SKB_CB(skb)->seq, tp->snd_nxt)) { - INET_ECN_xmit(sk); if (tp->ecn_flags & TCP_ECN_QUEUE_CWR) { tp->ecn_flags &= ~TCP_ECN_QUEUE_CWR; th->cwr = 1; From patchwork Tue Mar 18 00:26:58 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)" X-Patchwork-Id: 14020094 X-Patchwork-Delegate: kuba@kernel.org Received: from DB3PR0202CU003.outbound.protection.outlook.com (mail-northeuropeazon11011027.outbound.protection.outlook.com [52.101.65.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7A75E383A2; Tue, 18 Mar 2025 00:27:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=52.101.65.27 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257678; cv=fail; b=OFIlvZzEaFfDalckkBmNBU6Llw6NzfnqdFwxHODLDyEzGU5u4D2ZjcDjOxRVSTSmTKKwOiXd32G5y49kozbyuUQ7s10rkSzageFlvsajvuGqufvAPmGYYl58SecKWgy5ClQSEmmznNkMCmdJfdGTcT6vTmTcrKMmVvlngIzK1oA= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257678; c=relaxed/simple; bh=02fmCom07/6QuKVnab1CkHF3QLQxUAqSxrCxUKcqq9g=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=e6KA+HtqUmVCPsHBbMjT39h6aF5LEeBvm3lFmXJzdlvGGusjWSkrQJWLIJzPVIhJVzBQICTvs0CZOGYZ+dhYlWFKzb9WspFZ5D1fKLI1ATAAuooBFfZ0HhR55WPBsecnPgszTXlYBW6umtLQrFqSQSAiUUkyYeN4lMa6hVE/DsA= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com; spf=fail smtp.mailfrom=nokia-bell-labs.com; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b=ZnUj4IHL; arc=fail smtp.client-ip=52.101.65.27 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com 
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Olivier Tilmans, Chia-Yu Chang
Subject: [PATCH v2 net-next 03/15] tcp: accecn: AccECN negotiation
Date: Tue, 18 Mar 2025 01:26:58 +0100
Message-Id: <20250318002710.29483-4-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
MIME-Version: 1.0

From: Ilpo Järvinen

Accurate ECN negotiation parts based on the specification:
  https://tools.ietf.org/id/draft-ietf-tcpm-accurate-ecn-28.txt

Accurate ECN is negotiated using the ECE, CWR and AE flags in the
TCP header. TCP falls back to RFC3168 ECN if one of the ends
supports only RFC3168-style ECN.

The AccECN negotiation includes reflecting the IP ECN field value
seen in the SYN and SYNACK back using the same bits as the
negotiation, to allow responding to SYN CE marks and to detect ECN
field mangling. CE marks should not occur currently because SYN=1
segments are sent with Non-ECT in the IP ECN field (but a proposal
exists to remove this restriction).

Reflecting the SYN IP ECN field in the SYNACK is relatively simple.
Reflecting the SYNACK IP ECN field in the final/third ACK of the
handshake is more challenging. Linux TCP code is not well prepared
for using the final/third ACK as a signalling channel, which makes
things somewhat complicated here.
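
As an illustration only, the ACE encoding used for this reflection can
be sketched in user-space C. This is a sketch under assumptions, not
code from the patch: the helper names accecn_ace(), accecn_reflect(),
accecn_extract() and accecn_syn_requested() are hypothetical, and the
values simply mirror tcp_accecn_ace(), tcp_accecn_reflector_flags(),
tcp_accecn_extract_syn_ect() and tcp_accecn_syn_requested() from the
diff below.

```c
#include <stdbool.h>
#include <stdint.h>

/* IP ECN codepoints, same numeric values as the kernel's INET_ECN_* enum. */
enum { ECN_NOT_ECT = 0, ECN_ECT_1 = 1, ECN_ECT_0 = 2, ECN_CE = 3 };

/* ACE is the 3-bit field (AE << 2) | (CWR << 1) | ECE. */
static inline uint8_t accecn_ace(unsigned int ae, unsigned int cwr,
				 unsigned int ece)
{
	return (uint8_t)(((ae & 1u) << 2) | ((cwr & 1u) << 1) | (ece & 1u));
}

/* Hypothetical helper: ACE value reflecting the ECN field seen on a SYN.
 * Not-ECT -> 0b010, ECT(1) -> 0b011, ECT(0) -> 0b100, CE -> 0b110.
 */
static inline uint8_t accecn_reflect(uint8_t ect)
{
	uint8_t ace = (uint8_t)(ect + 2);

	if (ect == ECN_CE)
		ace++;
	return ace;
}

/* Hypothetical helper: recover the ECN codepoint from a reflected ACE. */
static inline uint8_t accecn_extract(uint8_t ace)
{
	if (ace & 0x1)
		return ECN_ECT_1;
	if (!(ace & 0x2))
		return ECN_ECT_0;
	if (ace & 0x4)
		return ECN_CE;
	return ECN_NOT_ECT;
}

/* Hypothetical helper: a SYN asks for AccECN unless ACE is 000 (no ECN)
 * or 011 (RFC3168 ECN setup).
 */
static inline bool accecn_syn_requested(uint8_t ace)
{
	return ace != 0 && ace != 0x3;
}
```

A round trip such as accecn_extract(accecn_reflect(ECN_CE)) gives back
ECN_CE, which is what lets the sender of a SYN or SYNACK compare the
echoed codepoint against the one it sent.
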
Co-developed-by: Olivier Tilmans Signed-off-by: Olivier Tilmans Signed-off-by: Ilpo Järvinen Co-developed-by: Chia-Yu Chang Signed-off-by: Chia-Yu Chang --- include/linux/tcp.h | 9 ++- include/net/tcp.h | 80 ++++++++++++++++++- net/ipv4/syncookies.c | 3 + net/ipv4/sysctl_net_ipv4.c | 3 +- net/ipv4/tcp.c | 2 + net/ipv4/tcp_input.c | 155 +++++++++++++++++++++++++++++++++---- net/ipv4/tcp_ipv4.c | 3 +- net/ipv4/tcp_minisocks.c | 51 ++++++++++-- net/ipv4/tcp_output.c | 78 +++++++++++++++---- net/ipv6/syncookies.c | 1 + net/ipv6/tcp_ipv6.c | 1 + 11 files changed, 343 insertions(+), 43 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 5cc6ecfccb17..2ddc9076235b 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -156,6 +156,10 @@ struct tcp_request_sock { #if IS_ENABLED(CONFIG_MPTCP) bool drop_req; #endif + u8 accecn_ok : 1, + syn_ect_snt: 2, + syn_ect_rcv: 2; + u8 accecn_fail_mode:4; u32 txhash; u32 rcv_isn; u32 snt_isn; @@ -373,7 +377,10 @@ struct tcp_sock { u8 compressed_ack; u8 dup_ack_counter:2, tlp_retrans:1, /* TLP is a retransmission */ - unused:5; + syn_ect_snt:2, /* AccECN ECT memory, only */ + syn_ect_rcv:2, /* ... 
needed during 3WHS + first seqno */ + wait_third_ack:1; /* Wait 3rd ACK in simultaneous open */ + u8 accecn_fail_mode:4; /* AccECN failure handling */ u8 thin_lto : 1,/* Use linear timeouts for thin streams */ fastopen_connect:1, /* FASTOPEN_CONNECT sockopt */ fastopen_no_cookie:1, /* Allow send/recv SYN+data without a cookie */ diff --git a/include/net/tcp.h b/include/net/tcp.h index d8ac90ef391f..93fffeb59a15 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -27,6 +27,7 @@ #include #include #include +#include #include #include @@ -234,6 +235,37 @@ static_assert((1 << ATO_BITS) > TCP_DELACK_MAX); #define TCPOLEN_MSS_ALIGNED 4 #define TCPOLEN_EXP_SMC_BASE_ALIGNED 8 +/* tp->accecn_fail_mode */ +#define TCP_ACCECN_ACE_FAIL_SEND BIT(0) +#define TCP_ACCECN_ACE_FAIL_RECV BIT(1) +#define TCP_ACCECN_OPT_FAIL_SEND BIT(2) +#define TCP_ACCECN_OPT_FAIL_RECV BIT(3) + +static inline bool tcp_accecn_ace_fail_send(const struct tcp_sock *tp) +{ + return tp->accecn_fail_mode & TCP_ACCECN_ACE_FAIL_SEND; +} + +static inline bool tcp_accecn_ace_fail_recv(const struct tcp_sock *tp) +{ + return tp->accecn_fail_mode & TCP_ACCECN_ACE_FAIL_RECV; +} + +static inline bool tcp_accecn_opt_fail_send(const struct tcp_sock *tp) +{ + return tp->accecn_fail_mode & TCP_ACCECN_OPT_FAIL_SEND; +} + +static inline bool tcp_accecn_opt_fail_recv(const struct tcp_sock *tp) +{ + return tp->accecn_fail_mode & TCP_ACCECN_OPT_FAIL_RECV; +} + +static inline void tcp_accecn_fail_mode_set(struct tcp_sock *tp, u8 mode) +{ + tp->accecn_fail_mode |= mode; +} + /* Flags in tp->nonagle */ #define TCP_NAGLE_OFF 1 /* Nagle's algo is disabled */ #define TCP_NAGLE_CORK 2 /* Socket is corked */ @@ -420,6 +452,23 @@ static inline u8 tcp_accecn_ace(const struct tcphdr *th) return (th->ae << 2) | (th->cwr << 1) | th->ece; } +/* Infer the ECT value our SYN arrived with from the echoed ACE field */ +static inline int tcp_accecn_extract_syn_ect(u8 ace) +{ + if (ace & 0x1) + return INET_ECN_ECT_1; + if (!(ace & 0x2)) +
return INET_ECN_ECT_0; + if (ace & 0x4) + return INET_ECN_CE; + return INET_ECN_NOT_ECT; +} + +bool tcp_accecn_validate_syn_feedback(struct sock *sk, u8 ace, u8 sent_ect); +void tcp_accecn_third_ack(struct sock *sk, const struct sk_buff *skb, + u8 syn_ect_snt); +void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb); + enum tcp_tw_status { TCP_TW_SUCCESS = 0, TCP_TW_RST = 1, @@ -656,6 +705,15 @@ static inline bool cookie_ecn_ok(const struct net *net, const struct dst_entry * dst_feature(dst, RTAX_FEATURE_ECN); } +/* AccECN specification, 5.1: [...] a server can determine that it + * negotiated AccECN as [...] if the ACK contains an ACE field with + * the value 0b010 to 0b111 (decimal 2 to 7). + */ +static inline bool cookie_accecn_ok(const struct tcphdr *th) +{ + return tcp_accecn_ace(th) > 0x1; +} + #if IS_ENABLED(CONFIG_BPF) static inline bool cookie_bpf_ok(struct sk_buff *skb) { @@ -967,6 +1025,7 @@ static inline u32 tcp_rsk_tsval(const struct tcp_request_sock *treq) #define TCPHDR_ACE (TCPHDR_ECE | TCPHDR_CWR | TCPHDR_AE) #define TCPHDR_SYN_ECN (TCPHDR_SYN | TCPHDR_ECE | TCPHDR_CWR) +#define TCPHDR_SYNACK_ACCECN (TCPHDR_SYN | TCPHDR_ACK | TCPHDR_CWR) #define TCP_ACCECN_CEP_ACE_MASK 0x7 #define TCP_ACCECN_ACE_MAX_DELTA 6 @@ -1050,6 +1109,15 @@ struct tcp_skb_cb { #define TCP_SKB_CB(__skb) ((struct tcp_skb_cb *)&((__skb)->cb[0])) +static inline u16 tcp_accecn_reflector_flags(u8 ect) +{ + u32 flags = ect + 2; + + if (ect == 3) + flags++; + return FIELD_PREP(TCPHDR_ACE, flags); +} + extern const struct inet_connection_sock_af_ops ipv4_specific; #if IS_ENABLED(CONFIG_IPV6) @@ -1172,7 +1240,10 @@ enum tcp_ca_ack_event_flags { #define TCP_CONG_NON_RESTRICTED BIT(0) /* Requires ECN/ECT set on all packets */ #define TCP_CONG_NEEDS_ECN BIT(1) -#define TCP_CONG_MASK (TCP_CONG_NON_RESTRICTED | TCP_CONG_NEEDS_ECN) +/* Require successfully negotiated AccECN capability */ +#define TCP_CONG_NEEDS_ACCECN BIT(2) +#define TCP_CONG_MASK (TCP_CONG_NON_RESTRICTED 
| TCP_CONG_NEEDS_ECN | \ + TCP_CONG_NEEDS_ACCECN) union tcp_cc_info; @@ -1304,6 +1375,13 @@ static inline bool tcp_ca_needs_ecn(const struct sock *sk) return icsk->icsk_ca_ops->flags & TCP_CONG_NEEDS_ECN; } +static inline bool tcp_ca_needs_accecn(const struct sock *sk) +{ + const struct inet_connection_sock *icsk = inet_csk(sk); + + return icsk->icsk_ca_ops->flags & TCP_CONG_NEEDS_ACCECN; +} + static inline void tcp_ca_event(struct sock *sk, const enum tcp_ca_event event) { const struct inet_connection_sock *icsk = inet_csk(sk); diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index 5459a78b9809..3a44eb9c1d1a 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -403,6 +403,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb) struct tcp_sock *tp = tcp_sk(sk); struct inet_request_sock *ireq; struct net *net = sock_net(sk); + struct tcp_request_sock *treq; struct request_sock *req; struct sock *ret = sk; struct flowi4 fl4; @@ -428,6 +429,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb) } ireq = inet_rsk(req); + treq = tcp_rsk(req); sk_rcv_saddr_set(req_to_sk(req), ip_hdr(skb)->daddr); sk_daddr_set(req_to_sk(req), ip_hdr(skb)->saddr); @@ -482,6 +484,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb) if (!req->syncookie) ireq->rcv_wscale = rcv_wscale; ireq->ecn_ok &= cookie_ecn_ok(net, &rt->dst); + treq->accecn_ok = ireq->ecn_ok && cookie_accecn_ok(th); ret = tcp_get_cookie_sock(sk, skb, req, &rt->dst); /* ip_queue_xmit() depends on our flow being setup diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index 3a43010d726f..75ec1a599b52 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -47,6 +47,7 @@ static unsigned int udp_child_hash_entries_max = UDP_HTABLE_SIZE_MAX; static int tcp_plb_max_rounds = 31; static int tcp_plb_max_cong_thresh = 256; static unsigned int tcp_tw_reuse_delay_max = TCP_PAWS_MSL * MSEC_PER_SEC; +static int 
tcp_ecn_mode_max = 5; /* obsolete */ static int sysctl_tcp_low_latency __read_mostly; @@ -728,7 +729,7 @@ static struct ctl_table ipv4_net_table[] = { .mode = 0644, .proc_handler = proc_dou8vec_minmax, .extra1 = SYSCTL_ZERO, - .extra2 = SYSCTL_TWO, + .extra2 = &tcp_ecn_mode_max, }, { .procname = "tcp_ecn_fallback", diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 494741e4d977..3c8894de5495 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3362,6 +3362,8 @@ int tcp_disconnect(struct sock *sk, int flags) tp->window_clamp = 0; tp->delivered = 0; tp->delivered_ce = 0; + tp->wait_third_ack = 0; + tp->accecn_fail_mode = 0; tcp_accecn_init_counters(tp); if (icsk->icsk_ca_initialized && icsk->icsk_ca_ops->release) icsk->icsk_ca_ops->release(sk); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index e3fc1e0bcf57..47da42707738 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -402,14 +402,93 @@ static void tcp_data_ecn_check(struct sock *sk, const struct sk_buff *skb) } } -static void tcp_ecn_rcv_synack(struct tcp_sock *tp, const struct tcphdr *th) +/* AccECN specification, 3.1.2: If a TCP server that implements AccECN + * receives a SYN with the three TCP header flags (AE, CWR and ECE) set + * to any combination other than 000, 011 or 111, it MUST negotiate the + * use of AccECN as if they had been set to 111. + */ +static bool tcp_accecn_syn_requested(const struct tcphdr *th) +{ + u8 ace = tcp_accecn_ace(th); + + return ace && ace != 0x3; +} + +/* Check ECN field transition to detect invalid transitions */ +static bool tcp_ect_transition_valid(u8 snt, u8 rcv) +{ + if (rcv == snt) + return true; + + /* Non-ECT altered to something or something became non-ECT */ + if (snt == INET_ECN_NOT_ECT || rcv == INET_ECN_NOT_ECT) + return false; + /* CE -> ECT(0/1)?
*/ + if (snt == INET_ECN_CE) + return false; + return true; +} + +bool tcp_accecn_validate_syn_feedback(struct sock *sk, u8 ace, u8 sent_ect) { - if (tcp_ecn_mode_rfc3168(tp) && (!th->ece || th->cwr)) + u8 ect = tcp_accecn_extract_syn_ect(ace); + struct tcp_sock *tp = tcp_sk(sk); + + if (!sock_net(sk)->ipv4.sysctl_tcp_ecn_fallback) + return true; + + if (!tcp_ect_transition_valid(sent_ect, ect)) { + tcp_accecn_fail_mode_set(tp, TCP_ACCECN_ACE_FAIL_RECV); + return false; + } + + return true; +} + +/* See Table 2 of the AccECN draft */ +static void tcp_ecn_rcv_synack(struct sock *sk, const struct tcphdr *th, + u8 ip_dsfield) +{ + struct tcp_sock *tp = tcp_sk(sk); + u8 ace = tcp_accecn_ace(th); + + switch (ace) { + case 0x0: + case 0x7: tcp_ecn_mode_set(tp, TCP_ECN_DISABLED); + break; + case 0x1: + case 0x5: + if (tcp_ecn_mode_pending(tp)) + /* Downgrade from AccECN, or requested initially */ + tcp_ecn_mode_set(tp, TCP_ECN_MODE_RFC3168); + break; + default: + tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); + tp->syn_ect_rcv = ip_dsfield & INET_ECN_MASK; + if (INET_ECN_is_ce(ip_dsfield) && + tcp_accecn_validate_syn_feedback(sk, ace, + tp->syn_ect_snt)) { + tp->received_ce++; + tp->received_ce_pending++; + } + break; + } } -static void tcp_ecn_rcv_syn(struct tcp_sock *tp, const struct tcphdr *th) +static void tcp_ecn_rcv_syn(struct tcp_sock *tp, const struct tcphdr *th, + const struct sk_buff *skb) { + if (tcp_ecn_mode_pending(tp)) { + if (!tcp_accecn_syn_requested(th)) { + /* Downgrade to classic ECN feedback */ + tcp_ecn_mode_set(tp, TCP_ECN_MODE_RFC3168); + } else { + tp->syn_ect_rcv = TCP_SKB_CB(skb)->ip_dsfield & + INET_ECN_MASK; + tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); + } + } if (tcp_ecn_mode_rfc3168(tp) && (!th->ece || !th->cwr)) tcp_ecn_mode_set(tp, TCP_ECN_DISABLED); } @@ -3835,7 +3914,7 @@ bool tcp_oow_rate_limited(struct net *net, const struct sk_buff *skb, } /* RFC 5961 7 [ACK Throttling] */ -static void tcp_send_challenge_ack(struct sock *sk) +static void 
tcp_send_challenge_ack(struct sock *sk, bool accecn_reflector) { struct tcp_sock *tp = tcp_sk(sk); struct net *net = sock_net(sk); @@ -3865,7 +3944,9 @@ static void tcp_send_challenge_ack(struct sock *sk) WRITE_ONCE(net->ipv4.tcp_challenge_count, count - 1); send_ack: NET_INC_STATS(net, LINUX_MIB_TCPCHALLENGEACK); - tcp_send_ack(sk); + __tcp_send_ack(sk, tp->rcv_nxt, + !accecn_reflector ? 0 : + tcp_accecn_reflector_flags(tp->syn_ect_rcv)); } } @@ -4032,7 +4113,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag) /* RFC 5961 5.2 [Blind Data Injection Attack].[Mitigation] */ if (before(ack, prior_snd_una - max_window)) { if (!(flag & FLAG_NO_CHALLENGE_ACK)) - tcp_send_challenge_ack(sk); + tcp_send_challenge_ack(sk, false); return -SKB_DROP_REASON_TCP_TOO_OLD_ACK; } goto old_ack; @@ -6026,8 +6107,7 @@ static void tcp_urg(struct sock *sk, struct sk_buff *skb, const struct tcphdr *t } /* Updates Accurate ECN received counters from the received IP ECN field */ -static void tcp_ecn_received_counters(struct sock *sk, - const struct sk_buff *skb) +void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb) { u8 ecnfield = TCP_SKB_CB(skb)->ip_dsfield & INET_ECN_MASK; u8 is_ce = INET_ECN_is_ce(ecnfield); @@ -6068,6 +6148,7 @@ static bool tcp_reset_check(const struct sock *sk, const struct sk_buff *skb) static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb, const struct tcphdr *th, int syn_inerr) { + bool send_accecn_reflector = false; struct tcp_sock *tp = tcp_sk(sk); SKB_DR(reason); @@ -6161,7 +6242,7 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb, if (tp->syn_fastopen && !tp->data_segs_in && sk->sk_state == TCP_ESTABLISHED) tcp_fastopen_active_disable(sk); - tcp_send_challenge_ack(sk); + tcp_send_challenge_ack(sk, false); SKB_DR_SET(reason, TCP_RESET); goto discard; } @@ -6172,16 +6253,27 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb, * RFC 5961 4.2 : Send a 
challenge ack */ if (th->syn) { + if (tcp_ecn_mode_accecn(tp)) + send_accecn_reflector = true; if (sk->sk_state == TCP_SYN_RECV && sk->sk_socket && th->ack && TCP_SKB_CB(skb)->seq + 1 == TCP_SKB_CB(skb)->end_seq && TCP_SKB_CB(skb)->seq + 1 == tp->rcv_nxt && - TCP_SKB_CB(skb)->ack_seq == tp->snd_nxt) + TCP_SKB_CB(skb)->ack_seq == tp->snd_nxt) { + if (!tcp_ecn_disabled(tp)) { + u8 ect = tp->syn_ect_rcv; + + tp->wait_third_ack = true; + __tcp_send_ack(sk, tp->rcv_nxt, + !send_accecn_reflector ? 0 : + tcp_accecn_reflector_flags(ect)); + } goto pass; + } syn_challenge: if (syn_inerr) TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS); NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNCHALLENGE); - tcp_send_challenge_ack(sk); + tcp_send_challenge_ack(sk, send_accecn_reflector); SKB_DR_SET(reason, TCP_INVALID_SYN); goto discard; } @@ -6394,6 +6486,12 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb) return; step5: + if (unlikely(tp->wait_third_ack)) { + tp->wait_third_ack = 0; + if (tcp_ecn_mode_accecn(tp)) + tcp_accecn_third_ack(sk, skb, tp->syn_ect_snt); + tcp_fast_path_on(tp); + } tcp_ecn_received_counters(sk, skb); reason = tcp_ack(sk, skb, FLAG_SLOWPATH | FLAG_UPDATE_TS_RECENT); @@ -6646,7 +6744,8 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, * state to ESTABLISHED..." */ - tcp_ecn_rcv_synack(tp, th); + if (tcp_ecn_mode_any(tp)) + tcp_ecn_rcv_synack(sk, th, TCP_SKB_CB(skb)->ip_dsfield); tcp_init_wl(tp, TCP_SKB_CB(skb)->seq); tcp_try_undo_spurious_syn(sk); @@ -6718,7 +6817,9 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, TCP_DELACK_MAX, false); goto consume; } - tcp_send_ack(sk); + __tcp_send_ack(sk, tp->rcv_nxt, + !tcp_ecn_mode_accecn(tp) ? 
0 : + tcp_accecn_reflector_flags(tp->syn_ect_rcv)); return -1; } @@ -6777,7 +6878,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, tp->snd_wl1 = TCP_SKB_CB(skb)->seq; tp->max_window = tp->snd_wnd; - tcp_ecn_rcv_syn(tp, th); + tcp_ecn_rcv_syn(tp, th, skb); tcp_mtup_init(sk); tcp_sync_mss(sk, icsk->icsk_pmtu_cookie); @@ -6959,7 +7060,7 @@ tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) } /* accept old ack during closing */ if ((int)reason < 0) { - tcp_send_challenge_ack(sk); + tcp_send_challenge_ack(sk, false); reason = -reason; goto discard; } @@ -7006,9 +7107,16 @@ tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) tp->lsndtime = tcp_jiffies32; tcp_initialize_rcv_mss(sk); - tcp_fast_path_on(tp); + if (likely(!tp->wait_third_ack)) { + if (tcp_ecn_mode_accecn(tp)) + tcp_accecn_third_ack(sk, skb, tp->syn_ect_snt); + tcp_fast_path_on(tp); + } if (sk->sk_shutdown & SEND_SHUTDOWN) tcp_shutdown(sk, SEND_SHUTDOWN); + + if (sk->sk_socket && tp->wait_third_ack) + goto consume; break; case TCP_FIN_WAIT1: { @@ -7178,6 +7286,15 @@ static void tcp_ecn_create_request(struct request_sock *req, bool ect, ecn_ok; u32 ecn_ok_dst; + if (tcp_accecn_syn_requested(th) && + (net->ipv4.sysctl_tcp_ecn >= 3 || tcp_ca_needs_accecn(listen_sk))) { + inet_rsk(req)->ecn_ok = 1; + tcp_rsk(req)->accecn_ok = 1; + tcp_rsk(req)->syn_ect_rcv = TCP_SKB_CB(skb)->ip_dsfield & + INET_ECN_MASK; + return; + } + if (!th_ecn) return; @@ -7185,7 +7302,8 @@ static void tcp_ecn_create_request(struct request_sock *req, ecn_ok_dst = dst_feature(dst, DST_FEATURE_ECN_MASK); ecn_ok = READ_ONCE(net->ipv4.sysctl_tcp_ecn) || ecn_ok_dst; - if (((!ect || th->res1) && ecn_ok) || tcp_ca_needs_ecn(listen_sk) || + if (((!ect || th->res1 || th->ae) && ecn_ok) || + tcp_ca_needs_ecn(listen_sk) || (ecn_ok_dst & DST_FEATURE_ECN_CA) || tcp_bpf_ca_needs_ecn((struct sock *)req)) inet_rsk(req)->ecn_ok = 1; @@ -7203,6 +7321,9 @@ static void tcp_openreq_init(struct request_sock 
*req, tcp_rsk(req)->snt_synack = 0; tcp_rsk(req)->snt_tsval_first = 0; tcp_rsk(req)->last_oow_ack_time = 0; + tcp_rsk(req)->accecn_ok = 0; + tcp_rsk(req)->syn_ect_rcv = 0; + tcp_rsk(req)->syn_ect_snt = 0; req->mss = rx_opt->mss_clamp; req->ts_recent = rx_opt->saw_tstamp ? rx_opt->rcv_tsval : 0; ireq->tstamp_ok = rx_opt->tstamp_ok; diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 4fa4fbb0ad12..7c52645567eb 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1189,7 +1189,7 @@ static int tcp_v4_send_synack(const struct sock *sk, struct dst_entry *dst, enum tcp_synack_type synack_type, struct sk_buff *syn_skb) { - const struct inet_request_sock *ireq = inet_rsk(req); + struct inet_request_sock *ireq = inet_rsk(req); struct flowi4 fl4; int err = -1; struct sk_buff *skb; @@ -1202,6 +1202,7 @@ static int tcp_v4_send_synack(const struct sock *sk, struct dst_entry *dst, skb = tcp_make_synack(sk, dst, req, foc, synack_type, syn_skb); if (skb) { + tcp_rsk(req)->syn_ect_snt = inet_sk(sk)->tos & INET_ECN_MASK; __tcp_v4_send_check(skb, ireq->ir_loc_addr, ireq->ir_rmt_addr); tos = READ_ONCE(inet_sk(sk)->tos); diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index fb9349be36b8..0e63f691a387 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -458,12 +458,51 @@ void tcp_openreq_init_rwin(struct request_sock *req, ireq->rcv_wscale = rcv_wscale; } -static void tcp_ecn_openreq_child(struct tcp_sock *tp, - const struct request_sock *req) +void tcp_accecn_third_ack(struct sock *sk, const struct sk_buff *skb, + u8 syn_ect_snt) { - tcp_ecn_mode_set(tp, inet_rsk(req)->ecn_ok ? 
- TCP_ECN_MODE_RFC3168 : - TCP_ECN_DISABLED); + u8 ace = tcp_accecn_ace(tcp_hdr(skb)); + struct tcp_sock *tp = tcp_sk(sk); + + switch (ace) { + case 0x0: + tcp_accecn_fail_mode_set(tp, TCP_ACCECN_ACE_FAIL_RECV); + break; + case 0x7: + case 0x5: + case 0x1: + /* Unused but legal values */ + break; + default: + /* Validation only applies to first non-data packet */ + if (TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq && + !TCP_SKB_CB(skb)->sacked && + tcp_accecn_validate_syn_feedback(sk, ace, syn_ect_snt)) { + if ((tcp_accecn_extract_syn_ect(ace) == INET_ECN_CE) && + !tp->delivered_ce) + tp->delivered_ce++; + } + break; + } +} + +static void tcp_ecn_openreq_child(struct sock *sk, + const struct request_sock *req, + const struct sk_buff *skb) +{ + const struct tcp_request_sock *treq = tcp_rsk(req); + struct tcp_sock *tp = tcp_sk(sk); + + if (treq->accecn_ok) { + tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); + tp->syn_ect_snt = treq->syn_ect_snt; + tcp_accecn_third_ack(sk, skb, treq->syn_ect_snt); + tcp_ecn_received_counters(sk, skb); + } else { + tcp_ecn_mode_set(tp, inet_rsk(req)->ecn_ok ? 
+ TCP_ECN_MODE_RFC3168 : + TCP_ECN_DISABLED); + } } void tcp_ca_openreq_child(struct sock *sk, const struct dst_entry *dst) @@ -628,7 +667,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len) newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len; newtp->rx_opt.mss_clamp = req->mss; - tcp_ecn_openreq_child(newtp, req); + tcp_ecn_openreq_child(newsk, req, skb); newtp->fastopen_req = NULL; RCU_INIT_POINTER(newtp->fastopen_rsk, NULL); diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 36afa3b40396..fa7facec04ed 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -322,7 +322,7 @@ static u16 tcp_select_window(struct sock *sk) /* Packet ECN state for a SYN-ACK */ static void tcp_ecn_send_synack(struct sock *sk, struct sk_buff *skb) { - const struct tcp_sock *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_CWR; if (tcp_ecn_disabled(tp)) @@ -330,6 +330,13 @@ static void tcp_ecn_send_synack(struct sock *sk, struct sk_buff *skb) else if (tcp_ca_needs_ecn(sk) || tcp_bpf_ca_needs_ecn(sk)) INET_ECN_xmit(sk); + + if (tp->ecn_flags & TCP_ECN_MODE_ACCECN) { + TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_ACE; + TCP_SKB_CB(skb)->tcp_flags |= + tcp_accecn_reflector_flags(tp->syn_ect_rcv); + tp->syn_ect_snt = inet_sk(sk)->tos & INET_ECN_MASK; + } } /* Packet ECN state for a SYN. 
*/ @@ -337,8 +344,20 @@ static void tcp_ecn_send_syn(struct sock *sk, struct sk_buff *skb) { struct tcp_sock *tp = tcp_sk(sk); bool bpf_needs_ecn = tcp_bpf_ca_needs_ecn(sk); - bool use_ecn = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn) == 1 || - tcp_ca_needs_ecn(sk) || bpf_needs_ecn; + bool use_ecn, use_accecn; + u8 tcp_ecn = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn); + + /* ============== ========================== + * tcp_ecn values Outgoing connections + * ============== ========================== + * 0,2,5 Do not request ECN + * 1,4 Request ECN connection + * 3 Request AccECN connection + * ============== ========================== + */ + use_accecn = tcp_ecn == 3 || tcp_ca_needs_accecn(sk); + use_ecn = tcp_ecn == 1 || tcp_ecn == 4 || + tcp_ca_needs_ecn(sk) || bpf_needs_ecn || use_accecn; if (!use_ecn) { const struct dst_entry *dst = __sk_dst_get(sk); @@ -354,35 +373,58 @@ static void tcp_ecn_send_syn(struct sock *sk, struct sk_buff *skb) INET_ECN_xmit(sk); TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_ECE | TCPHDR_CWR; - tcp_ecn_mode_set(tp, TCP_ECN_MODE_RFC3168); + if (use_accecn) { + TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_AE; + tcp_ecn_mode_set(tp, TCP_ECN_MODE_PENDING); + tp->syn_ect_snt = inet_sk(sk)->tos & INET_ECN_MASK; + } else { + tcp_ecn_mode_set(tp, TCP_ECN_MODE_RFC3168); + } } } static void tcp_ecn_clear_syn(struct sock *sk, struct sk_buff *skb) { - if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn_fallback)) + if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn_fallback)) { /* tp->ecn_flags are cleared at a later point in time when * SYN ACK is ultimatively being received. 
*/ - TCP_SKB_CB(skb)->tcp_flags &= ~(TCPHDR_ECE | TCPHDR_CWR); + TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_ACE; + } +} + +static void tcp_accecn_echo_syn_ect(struct tcphdr *th, u8 ect) +{ + th->ae = !!(ect & INET_ECN_ECT_0); + th->cwr = ect != INET_ECN_ECT_0; + th->ece = ect == INET_ECN_ECT_1; } static void tcp_ecn_make_synack(const struct request_sock *req, struct tcphdr *th) { - if (inet_rsk(req)->ecn_ok) + if (tcp_rsk(req)->accecn_ok) + tcp_accecn_echo_syn_ect(th, tcp_rsk(req)->syn_ect_rcv); + else if (inet_rsk(req)->ecn_ok) th->ece = 1; } -static void tcp_accecn_set_ace(struct tcphdr *th, struct tcp_sock *tp) +static void tcp_accecn_set_ace(struct tcp_sock *tp, struct sk_buff *skb, + struct tcphdr *th) { u32 wire_ace; - wire_ace = tp->received_ce + TCP_ACCECN_CEP_INIT_OFFSET; - th->ece = !!(wire_ace & 0x1); - th->cwr = !!(wire_ace & 0x2); - th->ae = !!(wire_ace & 0x4); - tp->received_ce_pending = 0; + /* The final packet of the 3WHS or anything like it must reflect + * the SYN/ACK ECT instead of putting CEP into ACE field, such + * case show up in tcp_flags. + */ + if (likely(!(TCP_SKB_CB(skb)->tcp_flags & TCPHDR_ACE))) { + wire_ace = tp->received_ce + TCP_ACCECN_CEP_INIT_OFFSET; + th->ece = !!(wire_ace & 0x1); + th->cwr = !!(wire_ace & 0x2); + th->ae = !!(wire_ace & 0x4); + tp->received_ce_pending = 0; + } } /* Set up ECN state for a packet on a ESTABLISHED socket that is about to @@ -396,9 +438,10 @@ static void tcp_ecn_send(struct sock *sk, struct sk_buff *skb, if (!tcp_ecn_mode_any(tp)) return; - INET_ECN_xmit(sk); + if (!tcp_accecn_ace_fail_recv(tp)) + INET_ECN_xmit(sk); if (tcp_ecn_mode_accecn(tp)) { - tcp_accecn_set_ace(th, tp); + tcp_accecn_set_ace(tp, skb, th); skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ACCECN; } else { /* Not-retransmitted data segment: set ECT and inject CWR. */ @@ -3414,7 +3457,10 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs) tcp_retrans_try_collapse(sk, skb, avail_wnd); } - /* RFC3168, section 6.1.1.1. 
ECN fallback */ + /* RFC3168, section 6.1.1.1. ECN fallback + * As AccECN uses the same SYN flags (+ AE), this check covers both + * cases. + */ if ((TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN_ECN) == TCPHDR_SYN_ECN) tcp_ecn_clear_syn(sk, skb); diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c index 9d83eadd308b..50046460ee0b 100644 --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -264,6 +264,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb) if (!req->syncookie) ireq->rcv_wscale = rcv_wscale; ireq->ecn_ok &= cookie_ecn_ok(net, dst); + tcp_rsk(req)->accecn_ok = ireq->ecn_ok && cookie_accecn_ok(th); ret = tcp_get_cookie_sock(sk, skb, req, dst); if (!ret) { diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index e182ee0a2330..73aeef120b44 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -542,6 +542,7 @@ static int tcp_v6_send_synack(const struct sock *sk, struct dst_entry *dst, skb = tcp_make_synack(sk, dst, req, foc, synack_type, syn_skb); if (skb) { + tcp_rsk(req)->syn_ect_snt = np->tclass & INET_ECN_MASK; __tcp_v6_send_check(skb, &ireq->ir_v6_loc_addr, &ireq->ir_v6_rmt_addr);
From patchwork Tue Mar 18 00:26:59 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)"
X-Patchwork-Id: 14020092
X-Patchwork-Delegate: kuba@kernel.org
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Chia-Yu Chang
Subject: [PATCH v2 net-next 04/15] tcp: accecn: add AccECN rx byte counters
Date: Tue, 18 Mar 2025 01:26:59 +0100
Message-Id: <20250318002710.29483-5-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Delegate: kuba@kernel.org

From: Ilpo Järvinen

These counters track the sum of payload bytes received per IP ECN
codepoint for all arriving (acceptable) packets. The AccECN option
(added by a later patch in the series) echoes these counters back to
the sender.
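The accounting described above can be sketched in userspace C. This is only an illustration, not the kernel code: the names `rx_ecn_bytes`, `rx_ecn_bytes_account` and the `ECN_*` constants are hypothetical stand-ins for `tp->received_ecn_bytes`, `tcp_ecn_received_counters()` and the `INET_ECN_*` values. It assumes, as the `ecnfield - 1` indexing in the patch implies, that Not-ECT packets are skipped before the array is touched.

```c
#include <assert.h>
#include <stdint.h>

/* IP ECN codepoints: the two low bits of the DS field / traffic class.
 * Hypothetical local copies of the kernel's INET_ECN_* values.
 */
enum {
	ECN_NOT_ECT = 0x0,
	ECN_ECT_1   = 0x1,
	ECN_ECT_0   = 0x2,
	ECN_CE      = 0x3,
	ECN_MASK    = 0x3,
};

/* One byte counter per ECN-capable codepoint, indexed by codepoint - 1:
 * [0] = ECT(1), [1] = ECT(0), [2] = CE.
 */
struct rx_ecn_bytes {
	uint32_t bytes[3];
};

/* Add the payload length of one received packet to the counter that
 * matches its IP ECN field. Not-ECT packets and packets without payload
 * (pure ACKs) leave the counters untouched.
 */
static inline void rx_ecn_bytes_account(struct rx_ecn_bytes *c,
					uint8_t dsfield,
					uint32_t payload_len)
{
	uint8_t ecnfield = dsfield & ECN_MASK;

	if (ecnfield == ECN_NOT_ECT || payload_len == 0)
		return;

	/* Only ECT(1), ECT(0) and CE can reach this point. */
	assert(ecnfield >= ECN_ECT_1 && ecnfield <= ECN_CE);
	c->bytes[ecnfield - 1] += payload_len;
}
```

Indexing by `codepoint - 1` keeps the array at three entries; the `BUILD_BUG_ON()` checks in the patch guard the same assumption about the codepoint values at compile time.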
Signed-off-by: Ilpo Järvinen Signed-off-by: Neal Cardwell Signed-off-by: Chia-Yu Chang --- include/linux/tcp.h | 1 + include/net/tcp.h | 18 +++++++++++++++++- net/ipv4/tcp.c | 3 ++- net/ipv4/tcp_input.c | 13 +++++++++---- net/ipv4/tcp_minisocks.c | 3 ++- 5 files changed, 31 insertions(+), 7 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 2ddc9076235b..9de5090fadfb 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -300,6 +300,7 @@ struct tcp_sock { u32 delivered; /* Total data packets delivered incl. rexmits */ u32 delivered_ce; /* Like the above but only ECE marked packets */ u32 received_ce; /* Like the above but for rcvd CE marked pkts */ + u32 received_ecn_bytes[3]; u8 received_ce_pending:4, /* Not yet transmit cnt of received_ce */ unused2:4; u32 app_limited; /* limited until "delivered" reaches this val */ diff --git a/include/net/tcp.h b/include/net/tcp.h index 93fffeb59a15..aff31ba1dea9 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -467,7 +467,8 @@ static inline int tcp_accecn_extract_syn_ect(u8 ace) bool tcp_accecn_validate_syn_feedback(struct sock *sk, u8 ace, u8 sent_ect); void tcp_accecn_third_ack(struct sock *sk, const struct sk_buff *skb, u8 syn_ect_snt); -void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb); +void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb, + u32 payload_len); enum tcp_tw_status { TCP_TW_SUCCESS = 0, @@ -1034,11 +1035,26 @@ static inline u32 tcp_rsk_tsval(const struct tcp_request_sock *treq) * See draft-ietf-tcpm-accurate-ecn for the latest values. 
*/ #define TCP_ACCECN_CEP_INIT_OFFSET 5 +#define TCP_ACCECN_E1B_INIT_OFFSET 1 +#define TCP_ACCECN_E0B_INIT_OFFSET 1 +#define TCP_ACCECN_CEB_INIT_OFFSET 0 + +static inline void __tcp_accecn_init_bytes_counters(int *counter_array) +{ + BUILD_BUG_ON(INET_ECN_ECT_1 != 0x1); + BUILD_BUG_ON(INET_ECN_ECT_0 != 0x2); + BUILD_BUG_ON(INET_ECN_CE != 0x3); + + counter_array[INET_ECN_ECT_1 - 1] = 0; + counter_array[INET_ECN_ECT_0 - 1] = 0; + counter_array[INET_ECN_CE - 1] = 0; +} static inline void tcp_accecn_init_counters(struct tcp_sock *tp) { tp->received_ce = 0; tp->received_ce_pending = 0; + __tcp_accecn_init_bytes_counters(tp->received_ecn_bytes); } /* State flags for sacked in struct tcp_skb_cb */ diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 3c8894de5495..49289b5243e3 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -5063,6 +5063,7 @@ static void __init tcp_struct_check(void) CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered_ce); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ce); + CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ecn_bytes); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, app_limited); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_wnd); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rx_opt); @@ -5070,7 +5071,7 @@ static void __init tcp_struct_check(void) /* 32bit arches with 8byte alignment on u64 fields might need padding * before tcp_clock_cache. 
*/ - CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 97 + 7); + CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 109 + 3); /* RX read-write hotpath cache lines */ CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_rx, bytes_received); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 47da42707738..57e3cdb44a51 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -6107,7 +6107,8 @@ static void tcp_urg(struct sock *sk, struct sk_buff *skb, const struct tcphdr *t } /* Updates Accurate ECN received counters from the received IP ECN field */ -void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb) +void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb, + u32 payload_len) { u8 ecnfield = TCP_SKB_CB(skb)->ip_dsfield & INET_ECN_MASK; u8 is_ce = INET_ECN_is_ce(ecnfield); @@ -6122,6 +6123,9 @@ void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb) tp->received_ce += pcount; tp->received_ce_pending = min(tp->received_ce_pending + pcount, 0xfU); + + if (payload_len > 0) + tp->received_ecn_bytes[ecnfield - 1] += payload_len; } } @@ -6399,7 +6403,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb) flag |= __tcp_replace_ts_recent(tp, delta); - tcp_ecn_received_counters(sk, skb); + tcp_ecn_received_counters(sk, skb, 0); /* We know that such packets are checksummed * on entry. 
@@ -6445,7 +6449,8 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb) /* Bulk data transfer: receiver */ tcp_cleanup_skb(skb); __skb_pull(skb, tcp_header_len); - tcp_ecn_received_counters(sk, skb); + tcp_ecn_received_counters(sk, skb, + len - tcp_header_len); eaten = tcp_queue_rcv(sk, skb, &fragstolen); tcp_event_data_recv(sk, skb); @@ -6492,7 +6497,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb) tcp_accecn_third_ack(sk, skb, tp->syn_ect_snt); tcp_fast_path_on(tp); } - tcp_ecn_received_counters(sk, skb); + tcp_ecn_received_counters(sk, skb, len - th->doff * 4); reason = tcp_ack(sk, skb, FLAG_SLOWPATH | FLAG_UPDATE_TS_RECENT); if ((int)reason < 0) { diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 0e63f691a387..550c2d9d08b7 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -494,10 +494,11 @@ static void tcp_ecn_openreq_child(struct sock *sk, struct tcp_sock *tp = tcp_sk(sk); if (treq->accecn_ok) { + const struct tcphdr *th = (const struct tcphdr *)skb->data; tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); tp->syn_ect_snt = treq->syn_ect_snt; tcp_accecn_third_ack(sk, skb, treq->syn_ect_snt); - tcp_ecn_received_counters(sk, skb); + tcp_ecn_received_counters(sk, skb, skb->len - th->doff * 4); } else { tcp_ecn_mode_set(tp, inet_rsk(req)->ecn_ok ? 
TCP_ECN_MODE_RFC3168 :
From patchwork Tue Mar 18 00:27:00 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)"
X-Patchwork-Id: 14020093
X-Patchwork-Delegate: kuba@kernel.org
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org,
stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com Cc: Chia-Yu Chang Subject: [PATCH v2 net-next 05/15] tcp: accecn: AccECN needs to know delivered bytes Date: Tue, 18 Mar 2025 01:27:00 +0100 Message-Id: <20250318002710.29483-6-chia-yu.chang@nokia-bell-labs.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DB5PEPF00014B9F:EE_|PR3PR07MB6812:EE_ X-MS-Office365-Filtering-Correlation-Id: 67ffdb4e-3899-4400-37fb-08dd65b3b823 X-LD-Processed: 5d471751-9675-428d-917b-70f44f9630b0,ExtAddr X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|7416014|376014|82310400026|36860700013|921020; X-Microsoft-Antispam-Message-Info: =?utf-8?q?uIKYX2cD4WaQHaZkKaABvPOpP+OYx3u?= =?utf-8?q?MqRpR55N07py5r1k4fy7h1iYfce9FvDXGYx9MZPX0anLvhxwE/wWQ5uuVXYBoyqS2?= =?utf-8?q?LgzScM4r8YnPX9SGIzGilWy3kD7u6s2O7bwsf0tHCrq1rYRJRhiBX7ToER1hc/gTF?= =?utf-8?q?Bdx6LUtq3lYfae12gKlhD0sEhXKHw4eJhHHeXt/eVNkG5qddoPA7RXsHtqJBprYru?= =?utf-8?q?5eJDuekJdV307WwQC8F4YM1+HUkxJ5iV4oHQAvQXe/SlSUAbq0FV93BZkNCRwvjNn?= =?utf-8?q?EGoizu/XgNBriPNxDtHiAKMv8clu80zfRmDc09unMML1FWeu3HndYw7GS8TsWvy28?= =?utf-8?q?FMF1Wk6WOs9j9dS2LmXXKXK28neCJbe3ZwcMBNXATX68mTNUkRklyU4suJRKye278?= 
=?utf-8?q?3HPkoPu8Ga43EwsUlx2fFkbtcDQ00yvfWj2mGHA6T6BE9xKbwk23/Whss6urNe6Az?= =?utf-8?q?CS1XnaodWyEGXrALpCnsH0oOdTAtxwOrVA/Fhy/MMvaZkblSYcp2Dt0E3yO885SoR?= =?utf-8?q?MmNatSiuu4KMtvGlsZdueCkz/bwZabwvBEA/cXH93s6L3pKtZJMoGouhWJLMViFvh?= =?utf-8?q?JSDMJ2fEh46hl8vm9wAVTooBCDS+8LnpufkEuirrVbcBJ1MNpjwaeu2ZLIK1mfCtt?= =?utf-8?q?9ZRC+hbyUp/XIzDI2amCQuHFfyKIRX6PZjreJlSOUueKY6/pYvZ62YieMoMe5RcOw?= =?utf-8?q?whI0W4H3PajDhFinnQfyVYyBjf/oAfr+Mu/JwGVk1HTjM/4FMwSKzl6gNKsuxlNau?= =?utf-8?q?hvwVmp12KEdKYgUhKSlw1u+RxRgZbRsll5j8N8xUpNKGDLvSmnUSpmvvJomNGYFjA?= =?utf-8?q?cbPwmMIK+tnreuI6sgikkC9dtiqGm828xA4p0Lj3FkBA2R2wc0pmqujhQf8Wu3BHx?= =?utf-8?q?FhDE1XwRgM1jXjBoW5SX84n5JtUoTeLYF8TUCmt5yp88iiWwgbtwNYe81GPc3t5iL?= =?utf-8?q?FOb1dSq385xSCOhIM4Ak5KFLydSvtG4KmtJlawUP0rS9MVUqEyqAHVsODUI1NyT/3?= =?utf-8?q?9MOruPEy8BSsmddYksnQGnYXBPw4Cousgv3ha07u2bUpm1U62O499posgyVh75QjU?= =?utf-8?q?Vi6RWAxec9l5ilPqe4koBayHOwOmkqmNNsVmSU0SwQQ71SBCzFtD66+95WAXalWLu?= =?utf-8?q?KuHOzOKtpB8tvTpCU+fD/3OKGjPl1Fra4+2iOFSZumZ0hSzup0L7edLxdj0giFgKL?= =?utf-8?q?FTtiO+9WBS/i8JHyqtikdYbeJ1lxwo6uTXG6Y0upugGHxMJ8l+QukJTCXmUy69py5?= =?utf-8?q?ZFcyo/SbUyrrvMJKy0T35tt9U9CS3y46/m9gtL+gkEAxpTxZfvkmjg9XwCgS94Soq?= =?utf-8?q?EUMQquQFKo+DxqQ8Vcso8rDgNY/TBXOLEVXDPdyIV+c8BjPrasM58B0cuHY2YTRGB?= =?utf-8?q?sSOGwDnzDpifv/TBdOKn0wlS0CBQAjvlRYpHH1cqSrNsht6Or/kStO7NWrDGKa9Iu?= =?utf-8?q?eXhV2D3rcc?= X-Forefront-Antispam-Report: CIP:131.228.6.100;CTRY:FI;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:fr711usmtp2.zeu.alcatel-lucent.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(1800799024)(7416014)(376014)(82310400026)(36860700013)(921020);DIR:OUT;SFP:1101; X-OriginatorOrg: nokia-bell-labs.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Mar 2025 00:27:51.6533 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 67ffdb4e-3899-4400-37fb-08dd65b3b823 X-MS-Exchange-CrossTenant-Id: 5d471751-9675-428d-917b-70f44f9630b0 X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: 
 TenantId=5d471751-9675-428d-917b-70f44f9630b0;Ip=[131.228.6.100];Helo=[fr711usmtp2.zeu.alcatel-lucent.com]
X-MS-Exchange-CrossTenant-AuthSource: DB5PEPF00014B9F.eurprd02.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: PR3PR07MB6812
X-Patchwork-Delegate: kuba@kernel.org

From: Ilpo Järvinen

AccECN byte counter estimation requires delivered bytes, which can be
calculated while processing SACK blocks and the cumulative ACK. The
delivered bytes will be used to estimate the byte counters between
AccECN options (on ACKs without the option).

Non-SACK calculation is quite annoying, inaccurate, and likely bogus.

Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
 net/ipv4/tcp_input.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 57e3cdb44a51..b7a9534eb47c 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1171,6 +1171,7 @@ struct tcp_sacktag_state {
 	u64 last_sackt;
 	u32 reord;
 	u32 sack_delivered;
+	u32 delivered_bytes;
 	int flag;
 	unsigned int mss_now;
 	struct rate_sample *rate;
@@ -1532,7 +1533,7 @@ static int tcp_match_skb_to_sack(struct sock *sk, struct sk_buff *skb,
 static u8 tcp_sacktag_one(struct sock *sk,
 			  struct tcp_sacktag_state *state,
 			  u8 sacked, u32 start_seq, u32 end_seq,
-			  int dup_sack, int pcount,
+			  int dup_sack, int pcount, u32 plen,
 			  u64 xmit_time)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
@@ -1592,6 +1593,7 @@ static u8 tcp_sacktag_one(struct sock *sk,
 		tp->sacked_out += pcount;
 		/* Out-of-order packets delivered */
 		state->sack_delivered += pcount;
+		state->delivered_bytes += plen;
 
 		/* Lost marker hint past SACKed? Tweak RFC3517 cnt */
 		if (tp->lost_skb_hint &&
@@ -1633,7 +1635,7 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *prev,
 	 * tcp_highest_sack_seq() when skb is highest_sack.
 	 */
 	tcp_sacktag_one(sk, state, TCP_SKB_CB(skb)->sacked,
-			start_seq, end_seq, dup_sack, pcount,
+			start_seq, end_seq, dup_sack, pcount, skb->len,
 			tcp_skb_timestamp_us(skb));
 	tcp_rate_skb_delivered(sk, skb, state->rate);
 
@@ -1925,6 +1927,7 @@ static struct sk_buff *tcp_sacktag_walk(struct sk_buff *skb, struct sock *sk,
 						TCP_SKB_CB(skb)->end_seq,
 						dup_sack,
 						tcp_skb_pcount(skb),
+						skb->len,
 						tcp_skb_timestamp_us(skb));
 			tcp_rate_skb_delivered(sk, skb, state->rate);
 			if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED)
@@ -3541,6 +3544,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, const struct sk_buff *ack_skb,
 
 		if (sacked & TCPCB_SACKED_ACKED) {
 			tp->sacked_out -= acked_pcount;
+			/* snd_una delta covers these skbs */
+			sack->delivered_bytes -= skb->len;
 		} else if (tcp_is_sack(tp)) {
 			tcp_count_delivered(tp, acked_pcount, ece_ack);
 			if (!tcp_skb_spurious_retrans(tp, skb))
@@ -3644,6 +3649,10 @@ static int tcp_clean_rtx_queue(struct sock *sk, const struct sk_buff *ack_skb,
 			delta = prior_sacked - tp->sacked_out;
 			tp->lost_cnt_hint -= min(tp->lost_cnt_hint, delta);
 		}
+
+		sack->delivered_bytes = (skb ?
+					 TCP_SKB_CB(skb)->seq : tp->snd_una) -
+					 prior_snd_una;
 	} else if (skb && rtt_update && sack_rtt_us >= 0 &&
 		   sack_rtt_us > tcp_stamp_us_delta(tp->tcp_mstamp,
 						    tcp_skb_timestamp_us(skb))) {
@@ -4098,6 +4107,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 	sack_state.first_sackt = 0;
 	sack_state.rate = &rs;
 	sack_state.sack_delivered = 0;
+	sack_state.delivered_bytes = 0;
 
 	/* We very likely will need to access rtx queue.
*/ prefetch(sk->tcp_rtx_queue.rb_node); From patchwork Tue Mar 18 00:27:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)" X-Patchwork-Id: 14020095 X-Patchwork-Delegate: kuba@kernel.org Received: from EUR05-VI1-obe.outbound.protection.outlook.com (mail-vi1eur05on2067.outbound.protection.outlook.com [40.107.21.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4EAF117BBF; Tue, 18 Mar 2025 00:27:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.21.67 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257679; cv=fail; b=nQxbjeg3RQPNLD457lBJ4bftWsvcAUMWeVTWSAqyNXP+5cJEmmXdE8IibpYLImuyFbuj48jydE8EXm8AubwKG+QYsysGTaOzNjh3xR9gjCaddZic5h6TFd9BAeDJYFCYpA5ZeLcSQ9eHDVIrn14ODzEyBtSLQqNioiXc7EtNbdI= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257679; c=relaxed/simple; bh=mahsolYvqjcKyHZQnjwgoQTNEvOu+cuEN9bCZBp2VAo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=SUDD/j7WVKgZcHNa9myDqvThnGFIeGc1lu1YqdjsSME1AxcQ9lLho1XSshVLpy+RcvRAvO3hlPbndmXwDID9XqwJ7uVwowHM99+RHM8jBgCzBINHW2xIaz6vbwmmMP2eb8Z2nbt8p7plTJJsUhHQs9K6tp6oSw31tMan8HgUskc= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com; spf=fail smtp.mailfrom=nokia-bell-labs.com; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b=HX0VKdRw; arc=fail smtp.client-ip=40.107.21.67 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) 
header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b="HX0VKdRw" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=CfJleTrYl205cwmlg4hXmXswdFLTol2t/ceHvDZywPD5JDeSg4FaLoWfFq5/tIXas8+N9DmkTAIziBzIwh5tak9j0AE8nX3PsQWUj0iPEn2o/ep2EzDHFmFZCWBZWgTTtZeCqiMDJ6j3kwAx/veXolOdx18t34V/ETlutfxV6FJ8LrgcJ7seGvJwNbf/UtPtufE3U5rEODzBz6iFHLg1W4zVpLp3MshbIkyjJfI00Mn+3cU5F8dyOzLRQGijt6FxKrVtXC26p+uctur2nl4wMV1qzHS9xsDLGeFjZBMoEaHMaA8folIAxdyT4GfLYZazod5XY8PqjmGNJqXOvKzvAQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=+Dk8yx+3negvL+98nHzFT4Zm8MI6WwXlKQKBxKCLLtQ=; b=eBDj0HGMTVprGGhOoKyYh4wmrJkgtZFIlegZdrp2tyW9OhFLs1qbkqlyVljdGC6yeqgKCG05aNDTO4dAG4igETiuo9AsAVRqdmv99a6/sDEEneqenjghazBsEgEH5YGJ83mDgQdRR3sUibVxXjGdE/6/2o2o4c+FNePszBoaQ4T+e8wNVgm4X5FLAEmwCqMitJzEA7uOmlX4eQ5IqWHOBZdKmS5DHWPxYfE29ZqQyUzWakWc2Oc/Jt2pxAFplTMlBnjenYT35KKnK366mE5cwVck2gVWQjbkxxDofCRmc3uAXhkg7gXJ2lKh3b0u8mfaQwYs5IhDnWcXsXMVx4tD+A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 131.228.6.100) smtp.rcpttodomain=nokia-bell-labs.com smtp.mailfrom=nokia-bell-labs.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nokia-bell-labs.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nokia-bell-labs.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=+Dk8yx+3negvL+98nHzFT4Zm8MI6WwXlKQKBxKCLLtQ=; 
b=HX0VKdRwbVEYWOeHoHRde+VhXthFuuHJ27ALoaTn4yFG12XMjg+KVKH8ujxlXllmFbbYMS0DAfi9PK/YkgWtfcFVzLVf/9p/hhah9/fCCfWbB3UH175kyZbPUr5TgoTf/a6Gtgtd1ldT9X/r8WMK6zCTRzYtUz0PkZq4+kGr2Cz+UVaGPyGWeHl5Ok1waKk3voD2lS5DT1bCycQ6+nRatLQWu64QU0JSuMuVK2m+neZr+GZsJKnx2iS3pHZvr86NCcQ/CV+T4fLMKwPvx295dBhhLvumdUxwL8Y3vi8FqUq5Vg+sQJTAovCBolJmaigGhU1xb6zn/xPINge3eBRGTQ== Received: from DBBPR09CA0013.eurprd09.prod.outlook.com (2603:10a6:10:c0::25) by AS1PR07MB9645.eurprd07.prod.outlook.com (2603:10a6:20b:483::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.33; Tue, 18 Mar 2025 00:27:53 +0000 Received: from DB5PEPF00014B95.eurprd02.prod.outlook.com (2603:10a6:10:c0:cafe::86) by DBBPR09CA0013.outlook.office365.com (2603:10a6:10:c0::25) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8534.33 via Frontend Transport; Tue, 18 Mar 2025 00:27:53 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 131.228.6.100) smtp.mailfrom=nokia-bell-labs.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nokia-bell-labs.com; Received-SPF: Pass (protection.outlook.com: domain of nokia-bell-labs.com designates 131.228.6.100 as permitted sender) receiver=protection.outlook.com; client-ip=131.228.6.100; helo=fr711usmtp2.zeu.alcatel-lucent.com; pr=C Received: from fr711usmtp2.zeu.alcatel-lucent.com (131.228.6.100) by DB5PEPF00014B95.mail.protection.outlook.com (10.167.8.233) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.20 via Frontend Transport; Tue, 18 Mar 2025 00:27:53 +0000 Received: from sarah.nbl.nsn-rdnet.net (sarah.nbl.nsn-rdnet.net [10.0.73.150]) by fr711usmtp2.zeu.alcatel-lucent.com (GMO) with ESMTP id 52I0RNBm024935; Tue, 18 Mar 2025 00:28:00 GMT From: chia-yu.chang@nokia-bell-labs.com To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, 
stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com Cc: Chia-Yu Chang Subject: [PATCH v2 net-next 06/15] tcp: allow embedding leftover into option padding Date: Tue, 18 Mar 2025 01:27:01 +0100 Message-Id: <20250318002710.29483-7-chia-yu.chang@nokia-bell-labs.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DB5PEPF00014B95:EE_|AS1PR07MB9645:EE_ X-MS-Office365-Filtering-Correlation-Id: 5bb58baa-c305-4955-1084-08dd65b3b8d7 X-LD-Processed: 5d471751-9675-428d-917b-70f44f9630b0,ExtAddr X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|82310400026|1800799024|36860700013|7416014|376014|921020; X-Microsoft-Antispam-Message-Info: =?utf-8?q?6GNdJJtU7cBBq36KSIvIe1FQJEcAjVO?= =?utf-8?q?/Ey/NGiguvQcaK/NknprZ5PbBT+iylY5WVzKpX/UlRvLy9cgt07jifXsB/P3q4Y6o?= =?utf-8?q?C+VyL4M8PBdhMPc6Bk1bdaNhrDZhy6ONBJD2ItO9UpARXieG5gIbG02a269uNrjSZ?= =?utf-8?q?InVwWHSlWpJvQGvs9YoFoGlFYh9FysIGkMPzCBYKoEJ2VKsOAbS4XRmILFFa4Gw/k?= =?utf-8?q?3N1j3A/Tg29n/u0lBbUZ0Fw1ym1JuCmL69ePXUqlN9sB15bZh3eNID8SwlXGeLz7v?= =?utf-8?q?L2uENcx9FRy2poIZjFwTBmH/aDwS1gkozWV7csrazRGsCsgORbZsMlfWxC7xS+wSV?= =?utf-8?q?uaWD6nlA4S5fJX2o/LBS59doccp3coWqXmmB1+3u6PTxTrePxHCRgFxeT8RHGNLhi?= 
=?utf-8?q?y8XLxBWsjYe6l9nlch05IigUBG4SKxvSpTjauORRPXUTNdpnC2XqcLq0Ti+XET3s/?= =?utf-8?q?q9EhdGjDCIpkaqCP0eTevhdewcm8b2ovwRLAKGWXoB585R8/qZFTVObn1WMjp88WX?= =?utf-8?q?A+UyjDKxi2C2PuvznmasUJXB3H7ohnkiHDLke/lhda0si/W57jCHRPfqJ4Y4KGw4x?= =?utf-8?q?xJ6IShOv4Gfk2TPanG8rD2fJCKPJiLNDsnGtfQsEeWILnKP/WMjNi6NukeXjfDFgy?= =?utf-8?q?0ejMfqejp0RACQS2MmGmy0CpuokmhnC7/5EosFMdD9nr9/PNYNDHMjSUqT+QB5K4W?= =?utf-8?q?hdUOK8Xu9NhpIABuFSWo+7Pmr9RGlb+uLgWGlDQjlYGIL+vlwEwkdb9QgwvYw4YTg?= =?utf-8?q?hg1/zu42XrjstD7A3gcGs1dCemCeSi7WMfhKHiTSJx/WBSqM1MlTYiqlTdek0aDbc?= =?utf-8?q?g5OUxAlyNQ9P1Z2vlNJ2aDiG8H25/mNATLZU1423TTO9Wk5WGXVplEJ6iBnaeXtPX?= =?utf-8?q?pSV++RsqtyTTTl0r+7MRfR9/eul9RrRGm663S9vrXbnP57wsKJ9iqKyUV/IvX8WrL?= =?utf-8?q?mjxJUlSTOb9mV+QKEh1yiIxnocwH8Csgl8OD9auO0DsNR2J1gQ9N/tuPidClwMWYN?= =?utf-8?q?3Utzxc+qGuDD32BRW6LKoenwBBQ9+X4e2kN12Q3mRcrdYvfqfVKrnQn3Eaw+x/8i6?= =?utf-8?q?JhVqVzs0NPXp1h6QbHpkWzFacH9Q2IrrE95Mc+zVf+7wKZSYvZ5/T6ILyix/3YtG2?= =?utf-8?q?c+bNaI5kH1V3mZnCR15AR9QtrDel3V4aQPegk2jvTbbKiDYWrXm5o6+oAjZrsiMch?= =?utf-8?q?CE4Z09QKV4ipyKzGfupjvFlLVHF2l+5JwS0VHWc++JfmSwPK9CtcpOrnhSnYTe4eK?= =?utf-8?q?B2ooLoNPPeOkbFRQXcvTsIgWoJ5Q9wp6dtE4itvyYOZy3ebVB74sPLCE8aEaU0PVU?= =?utf-8?q?FVR6j+rvqpA42UH1CW4Z0U5LJAio9Ji17q5bhsm7Jf/XFV6H46PAXCJ5fa17g3vL6?= =?utf-8?q?yJj7zakClFCsQHfLwo5GOgG+0Zu7UxSR3T0Wazj60esR4Zw47FSPCpbjcNoQ+NsfE?= =?utf-8?q?Bmjm3N1ZbV?= X-Forefront-Antispam-Report: CIP:131.228.6.100;CTRY:FI;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:fr711usmtp2.zeu.alcatel-lucent.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(82310400026)(1800799024)(36860700013)(7416014)(376014)(921020);DIR:OUT;SFP:1101; X-OriginatorOrg: nokia-bell-labs.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Mar 2025 00:27:53.0194 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 5bb58baa-c305-4955-1084-08dd65b3b8d7 X-MS-Exchange-CrossTenant-Id: 5d471751-9675-428d-917b-70f44f9630b0 X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: 
 TenantId=5d471751-9675-428d-917b-70f44f9630b0;Ip=[131.228.6.100];Helo=[fr711usmtp2.zeu.alcatel-lucent.com]
X-MS-Exchange-CrossTenant-AuthSource: DB5PEPF00014B95.eurprd02.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS1PR07MB9645
X-Patchwork-Delegate: kuba@kernel.org

From: Ilpo Järvinen

There is some wasted space in the option usage due to padding of 32-bit
fields. The AccECN option can take advantage of those few bytes, as its
tail often consumes just a few odd bytes.

Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
 net/ipv4/tcp_output.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index fa7facec04ed..87a2a20243c7 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -709,6 +709,8 @@ static __be32 *process_tcp_ao_options(struct tcp_sock *tp,
 	return ptr;
 }
 
+#define NOP_LEFTOVER ((TCPOPT_NOP << 8) | TCPOPT_NOP)
+
 /* Write previously computed TCP options to the packet.
  *
  * Beware: Something in the Internet is very sensitive to the ordering of
@@ -727,8 +729,10 @@ static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp,
 			      struct tcp_out_options *opts,
 			      struct tcp_key *key)
 {
+	u16 leftover_bytes = NOP_LEFTOVER; /* replace next NOPs if avail */
 	__be32 *ptr = (__be32 *)(th + 1);
 	u16 options = opts->options; /* mungable copy */
+	int leftover_size = 2;
 
 	if (tcp_key_is_md5(key)) {
 		*ptr++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) |
@@ -763,17 +767,22 @@ static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp,
 	}
 
 	if (unlikely(OPTION_SACK_ADVERTISE & options)) {
-		*ptr++ = htonl((TCPOPT_NOP << 24) |
-			       (TCPOPT_NOP << 16) |
+		*ptr++ = htonl((leftover_bytes << 16) |
 			       (TCPOPT_SACK_PERM << 8) |
 			       TCPOLEN_SACK_PERM);
+		leftover_bytes = NOP_LEFTOVER;
 	}
 
 	if (unlikely(OPTION_WSCALE & options)) {
-		*ptr++ = htonl((TCPOPT_NOP << 24) |
+		u8 highbyte = TCPOPT_NOP;
+
+		if (unlikely(leftover_size == 1))
+			highbyte = leftover_bytes >> 8;
+		*ptr++ = htonl((highbyte << 24) |
 			       (TCPOPT_WINDOW << 16) |
 			       (TCPOLEN_WINDOW << 8) |
 			       opts->ws);
+		leftover_bytes = NOP_LEFTOVER;
 	}
 
 	if (unlikely(opts->num_sack_blocks)) {
@@ -781,8 +790,7 @@ static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp,
 			tp->duplicate_sack : tp->selective_acks;
 		int this_sack;
 
-		*ptr++ = htonl((TCPOPT_NOP << 24) |
-			       (TCPOPT_NOP << 16) |
+		*ptr++ = htonl((leftover_bytes << 16) |
 			       (TCPOPT_SACK << 8) |
 			       (TCPOLEN_SACK_BASE + (opts->num_sack_blocks *
 						     TCPOLEN_SACK_PERBLOCK)));
@@ -794,6 +802,10 @@ static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp,
 		}
 
 		tp->rx_opt.dsack = 0;
+	} else if (unlikely(leftover_bytes != NOP_LEFTOVER)) {
+		*ptr++ = htonl((leftover_bytes << 16) |
+			       (TCPOPT_NOP << 8) |
+			       TCPOPT_NOP);
 	}
 
 	if (unlikely(OPTION_FAST_OPEN_COOKIE & options)) {

From patchwork Tue Mar 18 00:27:02 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)"
X-Patchwork-Id:
14020096 X-Patchwork-Delegate: kuba@kernel.org Received: from EUR02-DB5-obe.outbound.protection.outlook.com (mail-db5eur02on2044.outbound.protection.outlook.com [40.107.249.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 356C313B2A4; Tue, 18 Mar 2025 00:27:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.249.44 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257680; cv=fail; b=CYFEk0CZn5xacI2rfCaZsFpVKL4Wj4gpWeYkSveU757U8OduOhVlH3wC9nHP1wL/iLKRhT2QT5InyamGMronXcR6TdOZua9rJa4zLAnFkvEqflMjB0f3AReAbk0mI2WVcRfE5MNeWZz+iMbcyMHEkcTSvvsENliQfJshQjWomb0= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257680; c=relaxed/simple; bh=sWqYAXfDGe5jiofS64w54a1k3338zORsmWSgQdXHs4M=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=DjKc7NSn3QHo2vow2ihhd30SEw3SZTYXQzLJIbe+CuALivmR/uQX9PcdDfuihzxn+663CP2H9GA++IZKxu/NmVMjfxhkRJRkWW/kr50NpMNJkbdtU5KNj6JW51W9dOw8mFmt4dZMJDg2hoPrhjQT7iXGhyM6239icj5kgGS8vhg= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com; spf=fail smtp.mailfrom=nokia-bell-labs.com; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b=SG1C8Wf+; arc=fail smtp.client-ip=40.107.249.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b="SG1C8Wf+" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=ZaeK7k6Z1bZ50K6URAnLIauf//jiQsS6YVptoZT6YvteIfPmRIg9sq3gGmHCfwaEd/cMSGHyiqvK+/prFQhskaDHrCC84iXHB9SVmgGEJEwLBfkK8tIJ+ef4bRO/2XFqBAYTmLmQw72KSFQuf13lrKCddJ2nf/Ng3fvSrckl7ddne4zZcVITs0Hf5LAKlotOyCxsh/eNz0KxNwgj4+ewoDZqTt6xCIr0NwD8UNCIYGjCLuakM2Ta9ZVXQomty96mjRG/7D3sPykA2LsAkLRFeEtbsfhkhqntfa7pMioEqZytUM27Z5dtnJX4CdxOEe9bpS1Yz4wg2lg7b6i913yXkQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=xsEuFaE4xxb9J5cMRIB94+9EU7jYsAI9Fz9C19CcU94=; b=g+5gJqsA+MLsWzFSzbLO1zFzgAN+fP56VYk54N/mcKy/9NsxBr4XnNsv0KlNylGLBgytAP1JdAnyjlZSaNdnndQuGY+VxKXKhDQCX15A+CHrWnKDQxsE3TmejW4T2Z2fG4TYmjjmok8a/991aA43Rp2u8BH7XgFQMMyhcaXKAh8DEGxwRbjEz7Ia4OB/zSvLTzDDKiyIvpkAbmW3dlmYoG6eU6lC3Q4dT0xD8eiF3UehkBAwPBzpb9RVX8O09RvsbnasVM9apFyu6S8IzY6snoMbk5BT1TgfeV2zIp9VrJM1Xh8t9Ci3lKUxpG0SN/l4Dql/TWNvx3VU+Y+eNpQr2A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 131.228.6.100) smtp.rcpttodomain=nokia-bell-labs.com smtp.mailfrom=nokia-bell-labs.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nokia-bell-labs.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nokia-bell-labs.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=xsEuFaE4xxb9J5cMRIB94+9EU7jYsAI9Fz9C19CcU94=; b=SG1C8Wf+1pTlTMUpGb+h4Mw8ZMg5pdgT3drP1tNHKyHtbrphhTqHSX5spR1ye7KztAEqO6BAKPQBGjNzzEDzRu9PVJBSuXHfrZ9OKSfjoJbj6fKNmTEAYdSpZstaI257XizSwFIvmJFGSnu7uq/nN4VSkxfpBsLsKs9D6DySGGbkCgZLOx3AZ/vBtB4v3DeBaZ6F4231Cx3P1/Y9zrx75uF/DvzdCMbtZGn+lL/XF5Wo5hbg3lkkS1D4BPo3w1ZUHmW6cxo9iJEcBaReOtEVa1eyVMFqSNVguA0LBir0sQX3lRNYUqThA4pFi3TqJlxsXvq7voS33mJBDWuGsLfujw== Received: from PA7P264CA0260.FRAP264.PROD.OUTLOOK.COM (2603:10a6:102:375::15) by 
DU0PR07MB9292.eurprd07.prod.outlook.com (2603:10a6:10:44c::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.33; Tue, 18 Mar 2025 00:27:54 +0000 Received: from AM1PEPF000252E0.eurprd07.prod.outlook.com (2603:10a6:102:375:cafe::f) by PA7P264CA0260.outlook.office365.com (2603:10a6:102:375::15) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8534.33 via Frontend Transport; Tue, 18 Mar 2025 00:27:54 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 131.228.6.100) smtp.mailfrom=nokia-bell-labs.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nokia-bell-labs.com; Received-SPF: Pass (protection.outlook.com: domain of nokia-bell-labs.com designates 131.228.6.100 as permitted sender) receiver=protection.outlook.com; client-ip=131.228.6.100; helo=fr711usmtp2.zeu.alcatel-lucent.com; pr=C Received: from fr711usmtp2.zeu.alcatel-lucent.com (131.228.6.100) by AM1PEPF000252E0.mail.protection.outlook.com (10.167.16.58) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.20 via Frontend Transport; Tue, 18 Mar 2025 00:27:54 +0000 Received: from sarah.nbl.nsn-rdnet.net (sarah.nbl.nsn-rdnet.net [10.0.73.150]) by fr711usmtp2.zeu.alcatel-lucent.com (GMO) with ESMTP id 52I0RNBn024935; Tue, 18 Mar 2025 00:28:02 GMT From: chia-yu.chang@nokia-bell-labs.com To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, 
cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com Cc: Chia-Yu Chang Subject: [PATCH v2 net-next 07/15] tcp: sack option handling improvements Date: Tue, 18 Mar 2025 01:27:02 +0100 Message-Id: <20250318002710.29483-8-chia-yu.chang@nokia-bell-labs.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: AM1PEPF000252E0:EE_|DU0PR07MB9292:EE_ X-MS-Office365-Filtering-Correlation-Id: a97a4cd9-7a0c-43f7-6a8f-08dd65b3b9b0 X-LD-Processed: 5d471751-9675-428d-917b-70f44f9630b0,ExtAddr X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|82310400026|36860700013|1800799024|376014|7416014|921020; X-Microsoft-Antispam-Message-Info: =?utf-8?q?U08HFwYP9osNXHXpw+eF9TCjjuSzrOw?= =?utf-8?q?xHTs/8vYYQqbQUloxellTsczWXJt5crzMawUzCeoncJFq730u7zDLAZ7hX9X8E1M8?= =?utf-8?q?GU7HX8DG/tnAqstVRwgNWu3LYY0GJXKvz/1fGKdT4jbfkLne9LSDX9D9tUDAGEjdB?= =?utf-8?q?wH6NszCJBNmMrpXjGDvsE558/ZfYV8f46shwUocecYNt72Wp5su95QLHXb2wCfm5C?= =?utf-8?q?hnNrSh295pCyhA4R46xBT1uFikpu+ZtxhVMoSdmdwK7UMch4rLWZQg2g98wQ9RDDT?= =?utf-8?q?INOPFxFPmA90i2vIVgXBvTg82RZVnviDBwE/hcbPw9Bd+crKHur+rj49M0GU5rkFt?= =?utf-8?q?tLcTxMVX/EWZFYARzxs4HiH/KPGt7A1eOt6wiicO/Lh868aeyEItIBREIKZLrsmPV?= =?utf-8?q?g6uFuvnix87TRY+DiyHDHOUpJ2qe0xMhoTEbIOOm6Btrhb1cXFnLfASzdWMxr1ULD?= =?utf-8?q?Lqsdpb3egdAl7VlWZriW8p10ewsR+6oPS0h/XT3TS2GcJEMlhkQ4GEdeBUMWYigyH?= =?utf-8?q?/pG+VeqKfQK+gjfXOkuDzXuZSz6vlXiC7zN6cy+hlYDI9sVjJTrUHxcdQpyEPdpM7?= =?utf-8?q?ydjul1o7mkEPzdY6EpN+oPgPAk4L8MzX9F4diLNYlSnJZeYL03dkbZQ0GKlZBdAQo?= =?utf-8?q?Umwz2am2xB96t/ygDXeTr9gWX/N91AMhEFhtnHucB4HXCwlrDSjRci9whTarekCWa?= 
=?utf-8?q?EKVxrlW337QYSEP3Y4a3guh8U+YsOWbHrwZKY8acDNEA4bNo69Q3SQMU101yxGwQR?= =?utf-8?q?9sJkH16hETp4+Sp/E9tdEOsqISOcOIh9FnCVG+/etpmrR8EydrQEG9es531aR2FmY?= =?utf-8?q?RE0Str4s68j2f6cglP1h0Rgw+BbPPrxIRfu7UgoaKXx1qyuLTLZJVR+v1iS9mXF6U?= =?utf-8?q?b7I5si18mKVBbqKWf6yV6j+BrN9YST0O0kkJHemNGlMCd5kanMOzbVbqM+VZysHnb?= =?utf-8?q?iifZdTVYug06+9BjZdnFouBOF2kZohHdfGjeUi/8loIVfp8mGOFxSPmuYJ+7Uh8Mo?= =?utf-8?q?5u+s0u0OPgNcXdQXzVpmjySP+QSLvll2dM8TzUvyZn0Dhl68Gc3CGzvTCrZXBlWNy?= =?utf-8?q?8IdzdcDa6WIrUcS9BmAYnfnuDLW4UW3TYvOBAvw1B/kHYK1gKfns3frlttLWFNRM1?= =?utf-8?q?lDJUXm2a1pC7whpE2d47oa5oO1XsX7CtB+Y3wTjdAm4YYGqyZzXQv489gdkRmeFaE?= =?utf-8?q?EwNPffCUjmF6VhsfiOZdjZwrimx2jgiG7v7Ijahaa9POJx1bl+uCswKBjDKcvy7Wf?= =?utf-8?q?Y7nJEU0/LibsOpFZ72RveC6m9nIi4DW5ekiUDS0b8GnKNnouTre7Hq/1fS4tyyocf?= =?utf-8?q?ubrcp8PYjN10Tw/HNt65xlNcy1s57A7/eS4XPRaG0p+/Hu2jmn7U/7VHknvpVrFY0?= =?utf-8?q?AgO5GZ8ZdlLKD8f1767B74xF4krCNJt3Umn6/wK2iTGvtqW7TeakypZB72OQe1HE5?= =?utf-8?q?6qQyAM9YhY?= X-Forefront-Antispam-Report: CIP:131.228.6.100;CTRY:FI;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:fr711usmtp2.zeu.alcatel-lucent.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(82310400026)(36860700013)(1800799024)(376014)(7416014)(921020);DIR:OUT;SFP:1101; X-OriginatorOrg: nokia-bell-labs.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Mar 2025 00:27:54.3501 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: a97a4cd9-7a0c-43f7-6a8f-08dd65b3b9b0 X-MS-Exchange-CrossTenant-Id: 5d471751-9675-428d-917b-70f44f9630b0 X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=5d471751-9675-428d-917b-70f44f9630b0;Ip=[131.228.6.100];Helo=[fr711usmtp2.zeu.alcatel-lucent.com] X-MS-Exchange-CrossTenant-AuthSource: AM1PEPF000252E0.eurprd07.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DU0PR07MB9292 X-Patchwork-Delegate: kuba@kernel.org From: Ilpo Järvinen 1) Don't early return when sack doesn't 
fit. The AccECN code will be placed after this fragment, so no early
returns please.

2) Make sure opts->num_sack_blocks is not left undefined. E.g.,
tcp_current_mss() does not memset its opts struct to zero. The AccECN
code checks whether the SACK option is present and may even alter it to
make room for the AccECN option when many SACK blocks are present. Thus,
num_sack_blocks must always be valid.

Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
 net/ipv4/tcp_output.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 87a2a20243c7..3638a865a430 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1103,17 +1103,18 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 	eff_sacks = tp->rx_opt.num_sacks + tp->rx_opt.dsack;
 	if (unlikely(eff_sacks)) {
 		const unsigned int remaining = MAX_TCP_OPTION_SPACE - size;
-		if (unlikely(remaining < TCPOLEN_SACK_BASE_ALIGNED +
-					 TCPOLEN_SACK_PERBLOCK))
-			return size;
-
-		opts->num_sack_blocks =
-			min_t(unsigned int, eff_sacks,
-			      (remaining - TCPOLEN_SACK_BASE_ALIGNED) /
-			      TCPOLEN_SACK_PERBLOCK);
-
-		size += TCPOLEN_SACK_BASE_ALIGNED +
-			opts->num_sack_blocks * TCPOLEN_SACK_PERBLOCK;
+		if (likely(remaining >= TCPOLEN_SACK_BASE_ALIGNED +
+					TCPOLEN_SACK_PERBLOCK)) {
+			opts->num_sack_blocks =
+				min_t(unsigned int, eff_sacks,
+				      (remaining - TCPOLEN_SACK_BASE_ALIGNED) /
+				      TCPOLEN_SACK_PERBLOCK);
+
+			size += TCPOLEN_SACK_BASE_ALIGNED +
+				opts->num_sack_blocks * TCPOLEN_SACK_PERBLOCK;
+		}
+	} else {
+		opts->num_sack_blocks = 0;
 	}
 
 	if (unlikely(BPF_SOCK_OPS_TEST_FLAG(tp,

From patchwork Tue Mar 18 00:27:03 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)"
X-Patchwork-Id: 14020097
X-Patchwork-Delegate: kuba@kernel.org
Received: from AS8PR03CU001.outbound.protection.outlook.com
(mail-westeuropeazon11012013.outbound.protection.outlook.com [52.101.71.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 738411474B8; Tue, 18 Mar 2025 00:28:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=52.101.71.13 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257684; cv=fail; b=fzeAW7d2cPCTNVd3J1BRhln1Agx/6oO8LzSNdH4wlPi1OLmvIC1mAuEiyr4Tea7M3CYJKr8yEP1NpwIuzGbhcHeKqRLBN9OuEv70PKr2AZq4WgqgEC86P5FxmS6Nm++BuZ3VQ22AA6seqCwexyxPK2DgN69VeaO2GF9RC5nCvs4= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257684; c=relaxed/simple; bh=pz7kprBESYjNFsXpP+1WBM4MP7MQF3GQOq3tY/nnVpY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=E7bHmqAhDpTsdrNKvOtrfbBeYndpNJ0s2mYVMHur2XZ3szxnGfE8zOin2+cjCRqfOEbuKbcZuO3khJR8n6sajBG7GQG5MmyFQNa79fch/UqWjd0IoIEbGHaVRdfUPdDjtOBCdnftPdKeqCtx+s9ArkUV+lrkOiUH9ACRiUrVGk0= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com; spf=fail smtp.mailfrom=nokia-bell-labs.com; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b=iL4Jdn5K; arc=fail smtp.client-ip=52.101.71.13 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b="iL4Jdn5K" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=RtACFMrt7WeHxeg+/PXYsmT5xTwk0BvUTkTeZ1sCpzeMN5/Phd8IUHbUNRnTisqH6wSz365egKtmOv69WwusAyoVmL/ZiUzl8UqVz1B3xAQ2bMdPGuOd9b94jwXuZrxrz6AxE2ukZFk4J1QtJcdsSQBTQyIl2VIuxKAeciwzonb4WW+ZUExaj9FY9h7IfD9uk7QSgWQV9VTkE1vTjh4oWXGFwWMhf208gSn8AUZ+Bbcchg4o7bCVFwE7tCHF/tYiul12GFl5jw1S7IfTJI7Qg3IcvNm0d8y30PUpUPHHhKeizREXHWJKLJB24mEuqxME4a60W+Dgms1JSlWH5Ug0MA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=kt7JWBnuVr/0sWJwrTkcOIq62KmS6SewS74CAwav9Wc=; b=bxDcPsYcJVgsHblIthFPVp3BseT8R+sxY5S1xsFfnHRr36ij8HGpI8GjhW7YMyCYvEzfnxvC+7XXe346Z6Gt1WdKmJFHG04VVvuYq519UP6Uv6n9XSEqIP7ghZfm46jIDvm64e5tybi7Y+/Ktrf1lR/SLh7NymjI+QxAHoCbMGA0J7BxHNex14yww8vgheGBgd/Nz+Nt1vtrcGnQjPNQO3uEOA8HaSdrYkq21akO+CiBEfBO+X7zreJpfHFWKvTG1GAiEkcpPsdNxpvg/BNdSYR2QZQxOv4mUK50GlheqcuSWxfDw5ElBhnbvYdODBcltxE2679jR0YtH7gt9ohcfw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 131.228.6.100) smtp.rcpttodomain=nokia-bell-labs.com smtp.mailfrom=nokia-bell-labs.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nokia-bell-labs.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nokia-bell-labs.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=kt7JWBnuVr/0sWJwrTkcOIq62KmS6SewS74CAwav9Wc=; b=iL4Jdn5K0JgN3CNRu/a1IKy8FIVhz2Uu/En8i2tWn7diAxJkyLBVsPSL5/Z5ozWC1mhmlG6YrkrPwiClte8UQa4AB7tBT5YU8mzY0LXkOSDZIgZvrrHOC44PlPeWPFsT+Hh5iURBbGtdjOdz2BOjqPC5+lqQYtgrBsnE4BEqz3JG1+DoILIxpBl6+VwQn/XMRpU8Fs1VR5+C/sKFJwAC4Abye+cyXPTgPOsDB5wFbVWFU/WUZNBzNiFUxz/I1uXDiiR1XEO9gDeo4yBYNdqtnRphguBsou5Up3Rx1ywBxOvsEAXZ1ZLYipplizAyUlvx/z0S1GA7y4HJduGBC/OY4A== Received: from DB8PR06CA0052.eurprd06.prod.outlook.com (2603:10a6:10:120::26) by 
AS2PR07MB9570.eurprd07.prod.outlook.com (2603:10a6:20b:64c::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.33; Tue, 18 Mar 2025 00:27:55 +0000 Received: from DB5PEPF00014B9B.eurprd02.prod.outlook.com (2603:10a6:10:120:cafe::26) by DB8PR06CA0052.outlook.office365.com (2603:10a6:10:120::26) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8534.33 via Frontend Transport; Tue, 18 Mar 2025 00:27:55 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 131.228.6.100) smtp.mailfrom=nokia-bell-labs.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nokia-bell-labs.com; Received-SPF: Pass (protection.outlook.com: domain of nokia-bell-labs.com designates 131.228.6.100 as permitted sender) receiver=protection.outlook.com; client-ip=131.228.6.100; helo=fr711usmtp2.zeu.alcatel-lucent.com; pr=C Received: from fr711usmtp2.zeu.alcatel-lucent.com (131.228.6.100) by DB5PEPF00014B9B.mail.protection.outlook.com (10.167.8.168) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.20 via Frontend Transport; Tue, 18 Mar 2025 00:27:55 +0000 Received: from sarah.nbl.nsn-rdnet.net (sarah.nbl.nsn-rdnet.net [10.0.73.150]) by fr711usmtp2.zeu.alcatel-lucent.com (GMO) with ESMTP id 52I0RNBo024935; Tue, 18 Mar 2025 00:28:03 GMT From: chia-yu.chang@nokia-bell-labs.com To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, 
cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com Cc: Chia-Yu Chang Subject: [PATCH v2 net-next 08/15] tcp: accecn: AccECN option Date: Tue, 18 Mar 2025 01:27:03 +0100 Message-Id: <20250318002710.29483-9-chia-yu.chang@nokia-bell-labs.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DB5PEPF00014B9B:EE_|AS2PR07MB9570:EE_ X-MS-Office365-Filtering-Correlation-Id: 3f4c4908-3498-4775-d867-08dd65b3ba7b X-LD-Processed: 5d471751-9675-428d-917b-70f44f9630b0,ExtAddr X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|7416014|376014|82310400026|30052699003|36860700013|921020|13003099007; X-Microsoft-Antispam-Message-Info: =?utf-8?q?ESeWF3OAZ/JaXt+CXozn+Sd2bUsz8cv?= =?utf-8?q?vF5g/jwe2I/ee37pDuXsUgX0Aw3djOxmqbVv6hkM2/PpZYH+4+Vf9UPWYBEO4UGF2?= =?utf-8?q?Q5Wrx9f40grn3jTJWH+oMPYDRJPRNYsxljQxkR0+DTHeoIG3inllr5tMOjs2kNTaH?= =?utf-8?q?wlmvCB5IZHd++U5ubHsWL1NtLt1uQJmXbZs0Zy59KCsZv+unPOhePapxmg1AtfVlM?= =?utf-8?q?3XNVirEdz5zQVEJJ8qGPZhhEyxfFzksCj4hnp6/q9Y8OjS1RQb/LL3tnypGogjUpv?= =?utf-8?q?zIBMRlvonLvmXLmQ8hsGjkqaBuYeY8VK1WApF+itQIQLugUlDVpTFW3YpbwaLiaQS?= =?utf-8?q?H4e77xT1zXyOr3Q1YrzW6E5KQ+UVYE344NTAtzW8PvQUub8gcVl8D8Qo/4CzUj8R1?= =?utf-8?q?xtHrncZ2LYc/7HIcz7bc7s/uvvI/6oamqc1mtsLT0WWYRUOssu2V+6ZSmCMwIHT5d?= =?utf-8?q?y/zZVuXuNCSwHUcNV5eZNdzYuaJPXgcmTQdaWoqWovcxIb3lXD/ymGsOQg3zFonVh?= =?utf-8?q?kr3FtIs8h1wPLmeF4SJgrsPKmWBucuswu1Fcn3m1VIh8vf2H0ucFR4TnYUZwtwY7P?= =?utf-8?q?KVrcT+0Y2OsvFtaSPZ28LUkndbgLLdD9z01vX1zTmIqJHgGEBpVf/R4MMJUCcQ5KY?= =?utf-8?q?h1Dd0grA0c3DRlgu+mO5b5v56+KBFhjSTgjlhVGYJakxUdhISFH7z0k95X4G4AT7f?= 
=?utf-8?q?731aQbDldoFNv91JKUUdShn02Yd6JyYif7LSy24zHNNtAXx6CEd6rGoIH19fTPwoT?= =?utf-8?q?tyahJ8TzbrazVoFuqMKfOQ0NZpj/Yt3XCz+H6HST7FOFvNBDix77YLsJn4psIF3Nw?= =?utf-8?q?u+Um5eDXTtKSE26q/GxVpYyzEU3GSZihCj31+4x/MKelRnwAY1exb+ee6J3Y51ATK?= =?utf-8?q?dCzbt7fagqCGcJ8iRyeW5ziPlOOJDL3+hVZHI9cody6SmZljPGFJFn7Hh0GXe5c1v?= =?utf-8?q?oKHaqb0XE8DPl9b21OORocRgnDDY1n+1HNAP4e6nI0t9+kpa3O/ny/KKEMdNTO9Fv?= =?utf-8?q?HxZkBuQ29OsFiD+mhoOQQjEd+XgEN7CySiL71CBDZCl57sgLVO/5RCwazaXIvbd+U?= =?utf-8?q?ZDUi9vwvoRBwRVZf7YfvpTkfCuPnXqDUvRnZuaf8MR8JZCGuyFiRHZVs63y4HhCwh?= =?utf-8?q?rS8X1/H1/G1KsQO8FouCCqljwPQDvbaCwNsZc3bTJI8nwXK3pZW8HIpxIXPsFvcqN?= =?utf-8?q?KG2ni9WaKt7y6Jq6Ij88Dtnw3FZp/o64cN8ffqU/2UGw1qsfydY/vczHWy1vQDzK5?= =?utf-8?q?aoQG29c+MzhIu04f6DnjDr7qcMcR4cHybi2BNbcKcOYAJTaBPjpUdgRGRJR2E1/+7?= =?utf-8?q?LJUZjy1L/xG7ThmYFpKySGto5/dFTsJcCxmvXLVsSaY/pbS7jh7ICCdoUQOC8yl38?= =?utf-8?q?c2rd0kyR0jfsSjS/WI2KLV9EnCaqZvIoGzWe6klNCO+fLWxAdkXqJYRkxpFbZUiXQ?= =?utf-8?q?JOS9CKx4S5?= X-Forefront-Antispam-Report: CIP:131.228.6.100;CTRY:FI;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:fr711usmtp2.zeu.alcatel-lucent.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(1800799024)(7416014)(376014)(82310400026)(30052699003)(36860700013)(921020)(13003099007);DIR:OUT;SFP:1101; X-OriginatorOrg: nokia-bell-labs.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Mar 2025 00:27:55.7614 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 3f4c4908-3498-4775-d867-08dd65b3ba7b X-MS-Exchange-CrossTenant-Id: 5d471751-9675-428d-917b-70f44f9630b0 X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=5d471751-9675-428d-917b-70f44f9630b0;Ip=[131.228.6.100];Helo=[fr711usmtp2.zeu.alcatel-lucent.com] X-MS-Exchange-CrossTenant-AuthSource: DB5PEPF00014B9B.eurprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS2PR07MB9570 X-Patchwork-Delegate: kuba@kernel.org From: Ilpo Järvinen The Accurate 
ECN allows echoing back the sum of bytes for each IP ECN field value in the received packets using the AccECN option. This change implements AccECN option tx & rx side processing, without the option send control features that are added by a later change.

Based on specification: https://tools.ietf.org/id/draft-ietf-tcpm-accurate-ecn-28.txt (Some features of the spec will be added in later changes rather than in this one.)

A full-length AccECN option is always attempted, but if it does not fit, the minimum length is selected based on the counters that have changed since the last update. The AccECN option (with 24-bit fields) often ends in odd sizes, so the option write code tries to take advantage of the NOPs used to pad the other TCP options.

delivered_ecn_bytes pairs with received_ecn_bytes in the same way that delivered_ce pairs with received_ce. In contrast to the ACE field, however, the option is not always available to update delivered_ecn_bytes. For an ACK without the AccECN option, the delivered bytes calculated from the cumulative ACK+SACK information are assigned to one of the counters, using an estimation heuristic to select the most likely ECN byte counter. Any estimation error is corrected when the next AccECN option arrives. The heuristic may get confused when there are enough differing byte counter deltas between ACKs with the AccECN option, in which case it gives up on updating the counters for a while.

The tcp_ecn_option sysctl can be used to select the option sending mode for AccECN.
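To illustrate the "odd sizes" point: the option is 2 bytes of kind and length plus 3 bytes per field, so it can be 2, 5, 8 or 11 bytes long. The following is a minimal userspace sketch of the fitting arithmetic, not the kernel code itself. The names accecn_fit and the ACCECN_* macros are made up for this illustration, and the -1 stored on failure is a convention of the sketch only.

```c
#include <assert.h>
#include <stddef.h>

#define ACCECN_BASE      2	/* kind + length bytes */
#define ACCECN_PERFIELD  3	/* one 24-bit byte counter */
#define ACCECN_NUMFIELDS 3	/* at most three counters */

/*
 * Illustrative restatement of the fitting loop (hypothetical helper).
 *
 * Tries the longest option first and drops one field at a time until
 * the option fits into @remaining bytes. Up to @max_combine_saving
 * tail bytes may reuse alignment padding of other options; otherwise
 * the size is rounded up to a 4-byte boundary.
 *
 * Returns the extra option space consumed and stores the number of
 * fields chosen in *num_fields (-1 if fewer than @required fit).
 */
static int accecn_fit(int required, int remaining, int max_combine_saving,
		      int *num_fields)
{
	int size = ACCECN_BASE + ACCECN_PERFIELD * ACCECN_NUMFIELDS;
	int fields = ACCECN_NUMFIELDS;

	assert(num_fields != NULL);

	while (fields >= required && fields >= 0) {
		int leftover = size & 0x3;

		/* Tail cannot reuse padding: round up to a dword instead */
		if (leftover > max_combine_saving)
			leftover = -((4 - leftover) & 0x3);

		if (remaining >= size - leftover) {
			*num_fields = fields;
			return size - leftover;
		}

		fields--;
		size -= ACCECN_PERFIELD;
	}

	*num_fields = -1;
	return 0;
}
```

With plenty of room and no padding to reuse, the 11-byte option is padded to 12 bytes. With only 4 bytes left and one reusable padding byte, a single field (5 bytes) fits in 4 bytes of new space.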
Signed-off-by: Ilpo Järvinen Signed-off-by: Neal Cardwell Signed-off-by: Chia-Yu Chang --- include/linux/tcp.h | 8 +- include/net/netns/ipv4.h | 1 + include/net/tcp.h | 13 +++ include/uapi/linux/tcp.h | 7 ++ net/ipv4/sysctl_net_ipv4.c | 9 ++ net/ipv4/tcp.c | 15 +++- net/ipv4/tcp_input.c | 171 +++++++++++++++++++++++++++++++++++-- net/ipv4/tcp_ipv4.c | 1 + net/ipv4/tcp_output.c | 129 ++++++++++++++++++++++++++++ 9 files changed, 346 insertions(+), 8 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 9de5090fadfb..b282c076b6b5 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -122,8 +122,9 @@ struct tcp_options_received { smc_ok : 1, /* SMC seen on SYN packet */ snd_wscale : 4, /* Window scaling received from sender */ rcv_wscale : 4; /* Window scaling to send to receiver */ - u8 saw_unknown:1, /* Received unknown option */ - unused:7; + u8 accecn:6, /* AccECN index in header, 0=no options */ + saw_unknown:1, /* Received unknown option */ + unused:1; u8 num_sacks; /* Number of SACK blocks */ u16 user_mss; /* mss requested by user in ioctl */ u16 mss_clamp; /* Maximal mss, negotiated at connection setup */ @@ -299,10 +300,13 @@ struct tcp_sock { u32 snd_up; /* Urgent pointer */ u32 delivered; /* Total data packets delivered incl. 
rexmits */ u32 delivered_ce; /* Like the above but only ECE marked packets */ + u32 delivered_ecn_bytes[3]; u32 received_ce; /* Like the above but for rcvd CE marked pkts */ u32 received_ecn_bytes[3]; u8 received_ce_pending:4, /* Not yet transmit cnt of received_ce */ unused2:4; + u8 accecn_minlen:2,/* Minimum length of AccECN option sent */ + est_ecnfield:2;/* ECN field for AccECN delivered estimates */ u32 app_limited; /* limited until "delivered" reaches this val */ u32 rcv_wnd; /* Current receiver window */ /* diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index 650b2dc9199f..8f9feebbf9e1 100644 --- a/include/net/netns/ipv4.h +++ b/include/net/netns/ipv4.h @@ -138,6 +138,7 @@ struct netns_ipv4 { struct local_ports ip_local_ports; u8 sysctl_tcp_ecn; + u8 sysctl_tcp_ecn_option; u8 sysctl_tcp_ecn_fallback; u8 sysctl_ip_default_ttl; diff --git a/include/net/tcp.h b/include/net/tcp.h index aff31ba1dea9..1fdd127b18ab 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -204,6 +204,8 @@ static_assert((1 << ATO_BITS) > TCP_DELACK_MAX); #define TCPOPT_AO 29 /* Authentication Option (RFC5925) */ #define TCPOPT_MPTCP 30 /* Multipath TCP (RFC6824) */ #define TCPOPT_FASTOPEN 34 /* Fast open (RFC7413) */ +#define TCPOPT_ACCECN0 172 /* 0xAC: Accurate ECN Order 0 */ +#define TCPOPT_ACCECN1 174 /* 0xAE: Accurate ECN Order 1 */ #define TCPOPT_EXP 254 /* Experimental */ /* Magic number to be after the option value for sharing TCP * experimental options. 
See draft-ietf-tcpm-experimental-options-00.txt @@ -221,6 +223,7 @@ static_assert((1 << ATO_BITS) > TCP_DELACK_MAX); #define TCPOLEN_TIMESTAMP 10 #define TCPOLEN_MD5SIG 18 #define TCPOLEN_FASTOPEN_BASE 2 +#define TCPOLEN_ACCECN_BASE 2 #define TCPOLEN_EXP_FASTOPEN_BASE 4 #define TCPOLEN_EXP_SMC_BASE 6 @@ -234,6 +237,13 @@ static_assert((1 << ATO_BITS) > TCP_DELACK_MAX); #define TCPOLEN_MD5SIG_ALIGNED 20 #define TCPOLEN_MSS_ALIGNED 4 #define TCPOLEN_EXP_SMC_BASE_ALIGNED 8 +#define TCPOLEN_ACCECN_PERFIELD 3 + +/* Maximum number of byte counters in AccECN option + size */ +#define TCP_ACCECN_NUMFIELDS 3 +#define TCP_ACCECN_MAXSIZE (TCPOLEN_ACCECN_BASE + \ + TCPOLEN_ACCECN_PERFIELD * \ + TCP_ACCECN_NUMFIELDS) /* tp->accecn_fail_mode */ #define TCP_ACCECN_ACE_FAIL_SEND BIT(0) @@ -1055,6 +1065,9 @@ static inline void tcp_accecn_init_counters(struct tcp_sock *tp) tp->received_ce = 0; tp->received_ce_pending = 0; __tcp_accecn_init_bytes_counters(tp->received_ecn_bytes); + __tcp_accecn_init_bytes_counters(tp->delivered_ecn_bytes); + tp->accecn_minlen = 0; + tp->est_ecnfield = 0; } /* State flags for sacked in struct tcp_skb_cb */ diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index 92a2e79222ea..38d5d666be2d 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -296,6 +296,13 @@ struct tcp_info { __u32 tcpi_snd_wnd; /* peer's advertised receive window after * scaling (bytes) */ + __u32 tcpi_received_ce; /* # of CE marks received */ + __u32 tcpi_delivered_e1_bytes; /* Accurate ECN byte counters */ + __u32 tcpi_delivered_e0_bytes; + __u32 tcpi_delivered_ce_bytes; + __u32 tcpi_received_e1_bytes; + __u32 tcpi_received_e0_bytes; + __u32 tcpi_received_ce_bytes; __u32 tcpi_rcv_wnd; /* local advertised receive window after * scaling (bytes) */ diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index 75ec1a599b52..1d7fd86ca7b9 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -731,6 +731,15 @@ static 
struct ctl_table ipv4_net_table[] = { .extra1 = SYSCTL_ZERO, .extra2 = &tcp_ecn_mode_max, }, + { + .procname = "tcp_ecn_option", + .data = &init_net.ipv4.sysctl_tcp_ecn_option, + .maxlen = sizeof(u8), + .mode = 0644, + .proc_handler = proc_dou8vec_minmax, + .extra1 = SYSCTL_ZERO, + .extra2 = SYSCTL_TWO, + }, { .procname = "tcp_ecn_fallback", .data = &init_net.ipv4.sysctl_tcp_ecn_fallback, diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 49289b5243e3..d867957334e1 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -270,6 +270,7 @@ #include #include +#include #include #include #include @@ -4091,6 +4092,9 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info) { const struct tcp_sock *tp = tcp_sk(sk); /* iff sk_type == SOCK_STREAM */ const struct inet_connection_sock *icsk = inet_csk(sk); + const u8 ect1_idx = INET_ECN_ECT_1 - 1; + const u8 ect0_idx = INET_ECN_ECT_0 - 1; + const u8 ce_idx = INET_ECN_CE - 1; unsigned long rate; u32 now; u64 rate64; @@ -4209,6 +4213,14 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info) info->tcpi_rehash = tp->plb_rehash + tp->timeout_rehash; info->tcpi_fastopen_client_fail = tp->fastopen_client_fail; + info->tcpi_received_ce = tp->received_ce; + info->tcpi_delivered_e1_bytes = tp->delivered_ecn_bytes[ect1_idx]; + info->tcpi_delivered_e0_bytes = tp->delivered_ecn_bytes[ect0_idx]; + info->tcpi_delivered_ce_bytes = tp->delivered_ecn_bytes[ce_idx]; + info->tcpi_received_e1_bytes = tp->received_ecn_bytes[ect1_idx]; + info->tcpi_received_e0_bytes = tp->received_ecn_bytes[ect0_idx]; + info->tcpi_received_ce_bytes = tp->received_ecn_bytes[ce_idx]; + info->tcpi_total_rto = tp->total_rto; info->tcpi_total_rto_recoveries = tp->total_rto_recoveries; info->tcpi_total_rto_time = tp->total_rto_time; @@ -5062,6 +5074,7 @@ static void __init tcp_struct_check(void) CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, snd_up); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered); 
CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered_ce); + CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered_ecn_bytes); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ce); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ecn_bytes); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, app_limited); @@ -5071,7 +5084,7 @@ static void __init tcp_struct_check(void) /* 32bit arches with 8byte alignment on u64 fields might need padding * before tcp_clock_cache. */ - CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 109 + 3); + CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 122 + 6); /* RX read-write hotpath cache lines */ CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_rx, bytes_received); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index b7a9534eb47c..e4db5ccff75c 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -70,6 +70,7 @@ #include #include #include +#include #include #include #include @@ -500,6 +501,144 @@ static bool tcp_ecn_rcv_ecn_echo(const struct tcp_sock *tp, const struct tcphdr return false; } +/* Maps IP ECN field ECT/CE code point to AccECN option field number, given + * we are sending fields with Accurate ECN Order 1: ECT(1), CE, ECT(0). + */ +static u8 tcp_ecnfield_to_accecn_optfield(u8 ecnfield) +{ + switch (ecnfield) { + case INET_ECN_NOT_ECT: + return 0; /* AccECN does not send counts of NOT_ECT */ + case INET_ECN_ECT_1: + return 1; + case INET_ECN_CE: + return 2; + case INET_ECN_ECT_0: + return 3; + default: + WARN_ONCE(1, "bad ECN code point: %d\n", ecnfield); + } + return 0; +} + +/* Maps IP ECN field ECT/CE code point to AccECN option field value offset. + * Some fields do not start from zero, to detect zeroing by middleboxes. 
+ */ +static u32 tcp_accecn_field_init_offset(u8 ecnfield) +{ + switch (ecnfield) { + case INET_ECN_NOT_ECT: + return 0; /* AccECN does not send counts of NOT_ECT */ + case INET_ECN_ECT_1: + return TCP_ACCECN_E1B_INIT_OFFSET; + case INET_ECN_CE: + return TCP_ACCECN_CEB_INIT_OFFSET; + case INET_ECN_ECT_0: + return TCP_ACCECN_E0B_INIT_OFFSET; + default: + WARN_ONCE(1, "bad ECN code point: %d\n", ecnfield); + } + return 0; +} + +/* Maps AccECN option field #nr to IP ECN field ECT/CE bits */ +static unsigned int tcp_accecn_optfield_to_ecnfield(unsigned int optfield, + bool order) +{ + u8 tmp; + + optfield = order ? 2 - optfield : optfield; + tmp = optfield + 2; + + return (tmp + (tmp >> 2)) & INET_ECN_MASK; +} + +/* Handles AccECN option ECT and CE 24-bit byte counters update into + * the u32 value in tcp_sock. As we're processing TCP options, it is + * safe to access from - 1. + */ +static s32 tcp_update_ecn_bytes(u32 *cnt, const char *from, u32 init_offset) +{ + u32 truncated = (get_unaligned_be32(from - 1) - init_offset) & + 0xFFFFFFU; + u32 delta = (truncated - *cnt) & 0xFFFFFFU; + + /* If delta has the highest bit set (24th bit) indicating + * negative, sign extend to correct an estimation using + * sign_extend32(delta, 24 - 1) + */ + delta = sign_extend32(delta, 23); + *cnt += delta; + return (s32)delta; +} + +/* Returns true if the byte counters can be used */ +static bool tcp_accecn_process_option(struct tcp_sock *tp, + const struct sk_buff *skb, + u32 delivered_bytes, int flag) +{ + u8 estimate_ecnfield = tp->est_ecnfield; + bool ambiguous_ecn_bytes_incr = false; + bool first_changed = false; + unsigned int optlen; + unsigned char *ptr; + bool order1, res; + unsigned int i; + + if (!(flag & FLAG_SLOWPATH) || !tp->rx_opt.accecn) { + if (estimate_ecnfield) { + u8 ecnfield = estimate_ecnfield - 1; + + tp->delivered_ecn_bytes[ecnfield] += delivered_bytes; + return true; + } + return false; + } + + ptr = skb_transport_header(skb) + tp->rx_opt.accecn; + optlen = 
ptr[1] - 2; + WARN_ON_ONCE(ptr[0] != TCPOPT_ACCECN0 && ptr[0] != TCPOPT_ACCECN1); + order1 = (ptr[0] == TCPOPT_ACCECN1); + ptr += 2; + + res = !!estimate_ecnfield; + for (i = 0; i < 3; i++) { + if (optlen >= TCPOLEN_ACCECN_PERFIELD) { + u32 init_offset; + u8 ecnfield; + s32 delta; + u32 *cnt; + + ecnfield = tcp_accecn_optfield_to_ecnfield(i, order1); + init_offset = tcp_accecn_field_init_offset(ecnfield); + cnt = &tp->delivered_ecn_bytes[ecnfield - 1]; + delta = tcp_update_ecn_bytes(cnt, ptr, init_offset); + if (delta) { + if (delta < 0) { + res = false; + ambiguous_ecn_bytes_incr = true; + } + if (ecnfield != estimate_ecnfield) { + if (!first_changed) { + tp->est_ecnfield = ecnfield; + first_changed = true; + } else { + res = false; + ambiguous_ecn_bytes_incr = true; + } + } + } + + optlen -= TCPOLEN_ACCECN_PERFIELD; + ptr += TCPOLEN_ACCECN_PERFIELD; + } + } + if (ambiguous_ecn_bytes_incr) + tp->est_ecnfield = 0; + + return res; +} + static void tcp_count_delivered_ce(struct tcp_sock *tp, u32 ecn_count) { tp->delivered_ce += ecn_count; @@ -516,7 +655,8 @@ static void tcp_count_delivered(struct tcp_sock *tp, u32 delivered, /* Returns the ECN CE delta */ static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb, - u32 delivered_pkts, int flag) + u32 delivered_pkts, u32 delivered_bytes, + int flag) { const struct tcphdr *th = tcp_hdr(skb); struct tcp_sock *tp = tcp_sk(sk); @@ -527,6 +667,8 @@ static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb, if (!(flag & (FLAG_FORWARD_PROGRESS | FLAG_TS_PROGRESS))) return 0; + tcp_accecn_process_option(tp, skb, delivered_bytes, flag); + if (!(flag & FLAG_SLOWPATH)) { /* AccECN counter might overflow on large ACKs */ if (delivered_pkts <= TCP_ACCECN_CEP_ACE_MASK) @@ -552,12 +694,14 @@ static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb, } static u32 tcp_accecn_process(struct sock *sk, const struct sk_buff *skb, - u32 delivered_pkts, int *flag) + u32 delivered_pkts, u32 
delivered_bytes, + int *flag) { struct tcp_sock *tp = tcp_sk(sk); u32 delta; - delta = __tcp_accecn_process(sk, skb, delivered_pkts, *flag); + delta = __tcp_accecn_process(sk, skb, delivered_pkts, + delivered_bytes, *flag); if (delta > 0) { tcp_count_delivered_ce(tp, delta); *flag |= FLAG_ECE; @@ -4213,6 +4357,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag) if (tcp_ecn_mode_accecn(tp)) ecn_count = tcp_accecn_process(sk, skb, tp->delivered - delivered, + sack_state.delivered_bytes, &flag); tcp_in_ack_event(sk, flag); @@ -4252,6 +4397,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag) if (tcp_ecn_mode_accecn(tp)) ecn_count = tcp_accecn_process(sk, skb, tp->delivered - delivered, + sack_state.delivered_bytes, &flag); tcp_in_ack_event(sk, flag); /* If data was DSACKed, see if we can undo a cwnd reduction. */ @@ -4379,6 +4525,7 @@ void tcp_parse_options(const struct net *net, ptr = (const unsigned char *)(th + 1); opt_rx->saw_tstamp = 0; + opt_rx->accecn = 0; opt_rx->saw_unknown = 0; while (length > 0) { @@ -4470,6 +4617,12 @@ void tcp_parse_options(const struct net *net, ptr, th->syn, foc, false); break; + case TCPOPT_ACCECN0: + case TCPOPT_ACCECN1: + /* Save offset of AccECN option in TCP header */ + opt_rx->accecn = (ptr - 2) - (__u8 *)th; + break; + case TCPOPT_EXP: /* Fast Open option shares code 254 using a * 16 bits magic number. 
@@ -4530,11 +4683,14 @@ static bool tcp_fast_parse_options(const struct net *net, */ if (th->doff == (sizeof(*th) / 4)) { tp->rx_opt.saw_tstamp = 0; + tp->rx_opt.accecn = 0; return false; } else if (tp->rx_opt.tstamp_ok && th->doff == ((sizeof(*th) + TCPOLEN_TSTAMP_ALIGNED) / 4)) { - if (tcp_parse_aligned_timestamp(tp, th)) + if (tcp_parse_aligned_timestamp(tp, th)) { + tp->rx_opt.accecn = 0; return true; + } } tcp_parse_options(net, skb, &tp->rx_opt, 1, NULL); @@ -6134,8 +6290,12 @@ void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb, tp->received_ce_pending = min(tp->received_ce_pending + pcount, 0xfU); - if (payload_len > 0) + if (payload_len > 0) { + u8 minlen = tcp_ecnfield_to_accecn_optfield(ecnfield); tp->received_ecn_bytes[ecnfield - 1] += payload_len; + tp->accecn_minlen = max_t(u8, tp->accecn_minlen, + minlen); + } } } @@ -6359,6 +6519,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb) */ tp->rx_opt.saw_tstamp = 0; + tp->rx_opt.accecn = 0; /* pred_flags is 0xS?10 << 16 + snd_wnd * if header_prediction is to be made diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 7c52645567eb..605a0e54a1ff 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -3451,6 +3451,7 @@ static void __net_init tcp_set_hashinfo(struct net *net) static int __net_init tcp_sk_init(struct net *net) { net->ipv4.sysctl_tcp_ecn = 2; + net->ipv4.sysctl_tcp_ecn_option = 2; net->ipv4.sysctl_tcp_ecn_fallback = 1; net->ipv4.sysctl_tcp_base_mss = TCP_BASE_MSS; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 3638a865a430..7b102e7c76fd 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -491,6 +491,7 @@ static inline bool tcp_urg_mode(const struct tcp_sock *tp) #define OPTION_SMC BIT(9) #define OPTION_MPTCP BIT(10) #define OPTION_AO BIT(11) +#define OPTION_ACCECN BIT(12) static void smc_options_write(__be32 *ptr, u16 *options) { @@ -512,12 +513,14 @@ struct tcp_out_options { u16 mss; /* 0 to disable */ u8 ws; 
/* window scale, 0 to disable */ u8 num_sack_blocks; /* number of SACK blocks to include */ + u8 num_accecn_fields; /* number of AccECN fields needed */ u8 hash_size; /* bytes in hash_location */ u8 bpf_opt_len; /* length of BPF hdr option */ __u8 *hash_location; /* temporary pointer, overloaded */ __u32 tsval, tsecr; /* need to include OPTION_TS */ struct tcp_fastopen_cookie *fastopen_cookie; /* Fast open cookie */ struct mptcp_out_options mptcp; + u32 *ecn_bytes; /* AccECN ECT/CE byte counters */ }; static void mptcp_options_write(struct tcphdr *th, __be32 *ptr, @@ -766,6 +769,47 @@ static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp, *ptr++ = htonl(opts->tsecr); } + if (OPTION_ACCECN & options) { + const u8 ect0_idx = INET_ECN_ECT_0 - 1; + const u8 ect1_idx = INET_ECN_ECT_1 - 1; + const u8 ce_idx = INET_ECN_CE - 1; + u32 e0b; + u32 e1b; + u32 ceb; + u8 len; + + e0b = opts->ecn_bytes[ect0_idx] + TCP_ACCECN_E0B_INIT_OFFSET; + e1b = opts->ecn_bytes[ect1_idx] + TCP_ACCECN_E1B_INIT_OFFSET; + ceb = opts->ecn_bytes[ce_idx] + TCP_ACCECN_CEB_INIT_OFFSET; + len = TCPOLEN_ACCECN_BASE + + opts->num_accecn_fields * TCPOLEN_ACCECN_PERFIELD; + + if (opts->num_accecn_fields == 2) { + *ptr++ = htonl((TCPOPT_ACCECN1 << 24) | (len << 16) | + ((e1b >> 8) & 0xffff)); + *ptr++ = htonl(((e1b & 0xff) << 24) | + (ceb & 0xffffff)); + } else if (opts->num_accecn_fields == 1) { + *ptr++ = htonl((TCPOPT_ACCECN1 << 24) | (len << 16) | + ((e1b >> 8) & 0xffff)); + leftover_bytes = ((e1b & 0xff) << 8) | + TCPOPT_NOP; + leftover_size = 1; + } else if (opts->num_accecn_fields == 0) { + leftover_bytes = (TCPOPT_ACCECN1 << 8) | len; + leftover_size = 2; + } else if (opts->num_accecn_fields == 3) { + *ptr++ = htonl((TCPOPT_ACCECN1 << 24) | (len << 16) | + ((e1b >> 8) & 0xffff)); + *ptr++ = htonl(((e1b & 0xff) << 24) | + (ceb & 0xffffff)); + *ptr++ = htonl(((e0b & 0xffffff) << 8) | + TCPOPT_NOP); + } + if (tp) + tp->accecn_minlen = 0; + } + if (unlikely(OPTION_SACK_ADVERTISE & 
options)) { *ptr++ = htonl((leftover_bytes << 16) | (TCPOPT_SACK_PERM << 8) | @@ -886,6 +930,60 @@ static void mptcp_set_option_cond(const struct request_sock *req, } } +/* Initial values for AccECN option, ordered is based on ECN field bits + * similar to received_ecn_bytes. Used for SYN/ACK AccECN option. + */ +static u32 synack_ecn_bytes[3] = { 0, 0, 0 }; + +static u32 tcp_synack_options_combine_saving(struct tcp_out_options *opts) +{ + /* How much there's room for combining with the alignment padding? */ + if ((opts->options & (OPTION_SACK_ADVERTISE | OPTION_TS)) == + OPTION_SACK_ADVERTISE) + return 2; + else if (opts->options & OPTION_WSCALE) + return 1; + return 0; +} + +/* Calculates how long AccECN option will fit to @remaining option space. + * + * AccECN option can sometimes replace NOPs used for alignment of other + * TCP options (up to @max_combine_saving available). + * + * Only solutions with at least @required AccECN fields are accepted. + * + * Returns: The size of the AccECN option excluding space repurposed from + * the alignment of the other options. + */ +static int tcp_options_fit_accecn(struct tcp_out_options *opts, int required, + int remaining, int max_combine_saving) +{ + int size = TCP_ACCECN_MAXSIZE; + + opts->num_accecn_fields = TCP_ACCECN_NUMFIELDS; + + while (opts->num_accecn_fields >= required) { + int leftover_size = size & 0x3; + /* Pad to dword if cannot combine */ + if (leftover_size > max_combine_saving) + leftover_size = -((4 - leftover_size) & 0x3); + + if (remaining >= size - leftover_size) { + size -= leftover_size; + break; + } + + opts->num_accecn_fields--; + size -= TCPOLEN_ACCECN_PERFIELD; + } + if (opts->num_accecn_fields < required) + return 0; + + opts->options |= OPTION_ACCECN; + return size; +} + /* Compute TCP options for SYN packets. This is not the final * network wire format yet. 
*/ @@ -968,6 +1066,17 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, } } + /* Simultaneous open SYN/ACK needs AccECN option but not SYN */ + if (unlikely((TCP_SKB_CB(skb)->tcp_flags & TCPHDR_ACK) && + tcp_ecn_mode_accecn(tp) && + sock_net(sk)->ipv4.sysctl_tcp_ecn_option && + remaining >= TCPOLEN_ACCECN_BASE)) { + u32 saving = tcp_synack_options_combine_saving(opts); + + opts->ecn_bytes = synack_ecn_bytes; + remaining -= tcp_options_fit_accecn(opts, 0, remaining, saving); + } + bpf_skops_hdr_opt_len(sk, skb, NULL, NULL, 0, opts, &remaining); return MAX_TCP_OPTION_SPACE - remaining; @@ -985,6 +1094,7 @@ static unsigned int tcp_synack_options(const struct sock *sk, { struct inet_request_sock *ireq = inet_rsk(req); unsigned int remaining = MAX_TCP_OPTION_SPACE; + struct tcp_request_sock *treq = tcp_rsk(req); if (tcp_key_is_md5(key)) { opts->options |= OPTION_MD5; @@ -1047,6 +1157,14 @@ static unsigned int tcp_synack_options(const struct sock *sk, smc_set_option_cond(tcp_sk(sk), ireq, opts, &remaining); + if (treq->accecn_ok && sock_net(sk)->ipv4.sysctl_tcp_ecn_option && + remaining >= TCPOLEN_ACCECN_BASE) { + u32 saving = tcp_synack_options_combine_saving(opts); + + opts->ecn_bytes = synack_ecn_bytes; + remaining -= tcp_options_fit_accecn(opts, 0, remaining, saving); + } + bpf_skops_hdr_opt_len((struct sock *)sk, skb, req, syn_skb, synack_type, opts, &remaining); @@ -1117,6 +1235,17 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb opts->num_sack_blocks = 0; } + if (tcp_ecn_mode_accecn(tp) && + sock_net(sk)->ipv4.sysctl_tcp_ecn_option) { + int saving = opts->num_sack_blocks > 0 ? 
2 : 0; + int remaining = MAX_TCP_OPTION_SPACE - size; + + opts->ecn_bytes = tp->received_ecn_bytes; + size += tcp_options_fit_accecn(opts, tp->accecn_minlen, + remaining, + saving); + } + if (unlikely(BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG))) { unsigned int remaining = MAX_TCP_OPTION_SPACE - size; From patchwork Tue Mar 18 00:27:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)" X-Patchwork-Id: 14020098 X-Patchwork-Delegate: kuba@kernel.org Received: from EUR02-VI1-obe.outbound.protection.outlook.com (mail-vi1eur02on2089.outbound.protection.outlook.com [40.107.241.89]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ACA7C21364; Tue, 18 Mar 2025 00:28:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.241.89 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257684; cv=fail; b=VvVp5kwdB0jbuU6iJGPBOQUOJN6+O+DOjAVmoRVSSszex/hh38VAxnoPPTsgekKRjO664gK3JROJ6ZVLRa2ZvxUJ1hhI8GxpMEKBY22z1/37w2h4uSySi7sEGLqf/k8W+5jFNi3zLCTzoYX+S4VJCkGC6lfpdPDOlfPvf4FazDs= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257684; c=relaxed/simple; bh=WM3JGgZkKBWnizfsw/AJz2FKU3+wP7V8NYafTxs5mww=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=rFSZbusxCzmRZEUaFPnvER2DnITBHlg4bonhds8MOxWJ3bdCGl/HGS72cT+NUIirusDLmE+4K5bg6Elv+SjAbhAuVHY+CV1W2mARAIbMD9MIkf7IXtzOPuDtnV0J0v8zOZlpqcEhqv8d1TGJRKEQ5sCtlAmpsubYS/znT6fWlLI= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com; spf=fail smtp.mailfrom=nokia-bell-labs.com; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b=dZGsImFz; arc=fail smtp.client-ip=40.107.241.89 
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b="dZGsImFz" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=YfYmpSphWpcxLSryHgt/+Juq4kEDJ4kdep2x1FdFSytty07gHW7rTdz5p4yZF0lpTPeqENz4pW+Ws0wjxQMTMKgYMcVKAk0I5LAmqnDJHiUq801MHRlmhLV06yOR3V0G5cqXkFtDTNRoLpZAPdHk8vhI9ebXS5eN79lgb2IYdiJ4UXWK/vDYekEYbe9Hj62SZQnf2odsM79URmD3PVRMWpBk41lTkkzGqpUIBqaRVDnbZqbuLLInJ5zOFd7AB74Syh8jm3aifNrfEaGOEcbTEnACC3004t+VVJocR54uVPMGpBuE1vadl4G11LvZoJ4BQJ+zj7BefQnCp6nyKT09bw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=4r7zoQge4NAdzXesTo6zfqb5PjGbZVAqT36vn6W3Ni4=; b=LRyHPG0z8ohNgM3M9DuZaWpYFLgv/aEnFWrGc0Z4V9BR2+8AeAT9JBfcFEANsTx9f2KtSeQo+19h6lbZbbgm4UGBD3GHFmvdUaPcyCtuBWLNCQPTDjWcrh+uArwuPLQOmhhmxbMZ/8iui/ggSExeyTsGPRz1fDlAHXpbZsHQr7+FAm7Mx21sdChEtX0dWcG6T0dzainkhadoXu5NNIe/SF5jmDYuFQFANB76dTZHulgFodOfZjM+rUZ6re4zLtY2tR2htyUDppM4Mkghaq6+Pki1qyF1Ks3V0MKRxZIEdKRfl3AN+WvgS7Zf6x6/sKghfM++MFHLxGovVpXoQtx7Vw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 131.228.6.100) smtp.rcpttodomain=nokia-bell-labs.com smtp.mailfrom=nokia-bell-labs.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nokia-bell-labs.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nokia-bell-labs.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=4r7zoQge4NAdzXesTo6zfqb5PjGbZVAqT36vn6W3Ni4=; 
b=dZGsImFzMxF0N7eoqRCcQc41CyQGO6zN19A67WL/bMgaW1GH31OkDnt2wKBlOk2pwtbxoWKZSiuFFsFLqvTRxkvD+rJ1RU+PB6dmSBnP+ZXKY6A1SLFtFCfnM6hbnFW1P08bG8QzNGq622jJY6cWGt2HrKT5I8WVmsCS1+dhNbHJ4ncRnHoQBcC7qhEgAWvt0Gxf+YWgrdvY4V//julltTodg2fpXBx0if+MVLncNKFlXU/lIOmuRK1OcM056kCJLkxQMOFCcbxdoBw3DGyfAuDnEmSChb+y9DhmK+AtKUGVdwMIKOV6zzThHhADufyOXRQw8v9oJaTbHR7pQVa38w== Received: from DB6PR0301CA0079.eurprd03.prod.outlook.com (2603:10a6:6:30::26) by GVXPR07MB10053.eurprd07.prod.outlook.com (2603:10a6:150:11f::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8511.31; Tue, 18 Mar 2025 00:27:57 +0000 Received: from DB5PEPF00014B93.eurprd02.prod.outlook.com (2603:10a6:6:30:cafe::44) by DB6PR0301CA0079.outlook.office365.com (2603:10a6:6:30::26) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8534.33 via Frontend Transport; Tue, 18 Mar 2025 00:27:57 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 131.228.6.100) smtp.mailfrom=nokia-bell-labs.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nokia-bell-labs.com; Received-SPF: Pass (protection.outlook.com: domain of nokia-bell-labs.com designates 131.228.6.100 as permitted sender) receiver=protection.outlook.com; client-ip=131.228.6.100; helo=fr711usmtp2.zeu.alcatel-lucent.com; pr=C Received: from fr711usmtp2.zeu.alcatel-lucent.com (131.228.6.100) by DB5PEPF00014B93.mail.protection.outlook.com (10.167.8.231) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.20 via Frontend Transport; Tue, 18 Mar 2025 00:27:57 +0000 Received: from sarah.nbl.nsn-rdnet.net (sarah.nbl.nsn-rdnet.net [10.0.73.150]) by fr711usmtp2.zeu.alcatel-lucent.com (GMO) with ESMTP id 52I0RNBp024935; Tue, 18 Mar 2025 00:28:04 GMT From: chia-yu.chang@nokia-bell-labs.com To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, 
stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com Cc: Chia-Yu Chang Subject: [PATCH v2 net-next 09/15] tcp: accecn: AccECN option send control Date: Tue, 18 Mar 2025 01:27:04 +0100 Message-Id: <20250318002710.29483-10-chia-yu.chang@nokia-bell-labs.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DB5PEPF00014B93:EE_|GVXPR07MB10053:EE_ X-MS-Office365-Filtering-Correlation-Id: a95dc151-16cf-47ed-03fa-08dd65b3bb42 X-LD-Processed: 5d471751-9675-428d-917b-70f44f9630b0,ExtAddr X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|82310400026|7416014|36860700013|376014|1800799024|921020; X-Microsoft-Antispam-Message-Info: =?utf-8?q?dHSFMT9jDKxSyr9zOEa2Q8bMe3sMDMb?= =?utf-8?q?Gr9gnjmA2wRs+xmV6UTsvrZx3r6ACTC7LFRCU+apfLpatJy8ke+Yz0Ghe8cyae6Hn?= =?utf-8?q?h4E/m5WJ44RayuM0bzgK2FKvMiDNwGwPwTONHbKCMKGNnIWnUwXE29akDykGTe3hc?= =?utf-8?q?WF5fKSfVzFEsTWrB7ddknbpuvv7fUVeZb1YuYu8e5UbA5iuWPcF5+uehfRY2LeZO0?= =?utf-8?q?5PKSrJfWpKKwnY78URHUhoTKGTsNEinb+Z0M4e5lCLA5jLFTGSLNDABJcYnEDJykc?= =?utf-8?q?csFKxtfxKFW3BzDHmCE53BMfzpTVsbOhzL7NgPVXhfjQTmnVT53plWv9WJxxGcBRt?= =?utf-8?q?BK8LKneuDV8sa21QCPhNNoVG0Ic5PlHPOS7azij7PY7JQ0/WNG6qwdMj0gEjWKmHk?= 
=?utf-8?q?onpq58tyGgLTVu5t2YBWIrQgEFISOSsQGOWQRx4iipSC9cYZarw+qDBeHTovQNGB4?= =?utf-8?q?1tIM0YrUQx7UK7BhHxlt7LV7U7UfhFYDExOSmuiEcva0zJWgZfuFrArts3HNLhO6d?= =?utf-8?q?IeNoExQD1ReKrRhti2LB3eJI8pLGwiEGt3Tug8RmJNYF1XuaA1Btkyb/5h0GsAte6?= =?utf-8?q?NE9lFt8/aX8uocmqhsKKHT5qqQ+2bXbjKmaw74jL32d6CF7rKdBeLn/1VDdDKu7JY?= =?utf-8?q?CG1nf/c4wavdUFKkyN5iltiBkEJ3SqcZtjtJq4awEfwtYw7i8bnlGq44fIeQi9Qq1?= =?utf-8?q?e/TYn3LjVxUqcWjGDBvGaPOa63VaTRZBy1ecNI245Dr3Et2h3fsDNTeadd4gHpda2?= =?utf-8?q?M34TlN3d5Qj9ZPC/qKy/7dgdio84ACX62TCVtbdDWZdHhq3nY/vAt4YLNta6wDo7o?= =?utf-8?q?6bZMyz78pMfXBjkG+ePjzZsDUzLVRRkTbWZKEB5DdzDJhYtreRzoyv6pN6L8FICbA?= =?utf-8?q?XRfwPnEcAbbmkr7SSi8qZZPIgf0xvJb452EZsBGPPe8eCHi/GmQ2n9VoJnCXUvLkZ?= =?utf-8?q?bTNqhS82nDaU9PLJG2rxroXAPJHJfTB4dGM9ivgt2Rq+vF8fSCmXeHb+/jzXC04+C?= =?utf-8?q?9g1EfxHVWvfhpo/jq5KHwd/CSarpozXfdLYWrv2aGqH5veRJ3jZsMNX20OJUqoa2M?= =?utf-8?q?5kfxvKMyBfIUG1qhgwz41Zfww46mUrpBjNE5TrGCqrs9DgptPcrnAkBWrH2eC0tl8?= =?utf-8?q?ezVpStHJ5lw3GaNYwTVzPJKyDLBJBzogWuBe/bAtw5D9fTpzheSwa5eoRiRE5Vco7?= =?utf-8?q?FhFm26VSnYiUYw2g3q7aNQo0onF5euKPeeLJc2CuZBuoUU08+Jske27lLxiBcxKRC?= =?utf-8?q?P+tTRfAa3VOzyMcFKoGkCaU7/xmm73TQ62o78frPeIL708MK1FZTLhhtvu4G5+Mbt?= =?utf-8?q?3DtkkqhhytF/+aSo/cfHotk35qyn5ovNXH5EARr/FRZKWqPtJkPX3mW/pacMyPhP1?= =?utf-8?q?Vcg9WNrIDC2DFtGw95j/LHEvbxS/+zsb9lA8cvQKZ/JuiMiMIsg+kA=3D?= X-Forefront-Antispam-Report: CIP:131.228.6.100;CTRY:FI;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:fr711usmtp2.zeu.alcatel-lucent.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(82310400026)(7416014)(36860700013)(376014)(1800799024)(921020);DIR:OUT;SFP:1101; X-OriginatorOrg: nokia-bell-labs.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Mar 2025 00:27:57.0758 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: a95dc151-16cf-47ed-03fa-08dd65b3bb42 X-MS-Exchange-CrossTenant-Id: 5d471751-9675-428d-917b-70f44f9630b0 X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: 
TenantId=5d471751-9675-428d-917b-70f44f9630b0;Ip=[131.228.6.100];Helo=[fr711usmtp2.zeu.alcatel-lucent.com] X-MS-Exchange-CrossTenant-AuthSource: DB5PEPF00014B93.eurprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: GVXPR07MB10053 X-Patchwork-Delegate: kuba@kernel.org
From: Ilpo Järvinen

Instead of sending the option in every ACK, limit sending to those
ACKs where the option is necessary:
- Handshake
- "Change-triggered ACK" + the ACK following it. The 2nd ACK is
  necessary to unambiguously indicate which of the ECN byte counters
  is increasing. The first ACK has two counters increasing due to
  the ecnfield edge.
- ACKs with CE to allow CEP delta validations to take advantage of
  the option.
- Force the option to be sent at least once per 2^22 bytes. The check
  is done using the bit edges of the byte counters (avoids need for
  extra variables).
- AccECN option beacon to send a few times per RTT even if nothing in
  the ECN state requires that. The default is 3 times per RTT, and
  its period can be set via sysctl_tcp_ecn_option_beacon.

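The two periodic triggers in the list above (the 2^22-byte bit-edge
check and the per-RTT beacon) can be illustrated with a small
user-space sketch. This is not the kernel code: the helper names
crossed_demand_edge() and beacon_due() are hypothetical, the counter is
assumed to be a plain 32-bit value, and srtt_us is taken here as an
unscaled RTT in microseconds (the kernel stores it left-shifted by 3).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold from the commit message: demand every 2^22 bytes. */
#define ACCECN_DEMAND_SHIFT 22

/*
 * Hypothetical helper: returns true when adding payload to a byte
 * counter changed any bit at or above bit 22, i.e. the counter moved
 * into a new 2^22-byte block. No extra "bytes since last option"
 * variable is needed; the old and new counter values are enough.
 */
static bool crossed_demand_edge(uint32_t oldbytes, uint32_t newbytes)
{
	uint32_t low_mask = (1u << ACCECN_DEMAND_SHIFT) - 1;

	return ((oldbytes ^ newbytes) & ~low_mask) != 0;
}

/*
 * Hypothetical helper: returns true when at least RTT / beacon has
 * elapsed since the option was last sent. beacon == 0 disables the
 * beacon. Multiplying the elapsed time instead of dividing the RTT
 * avoids an integer division.
 */
static bool beacon_due(uint64_t now_us, uint64_t last_opt_us,
		       uint32_t srtt_us, uint8_t beacon)
{
	if (!beacon)
		return false;

	return (now_us - last_opt_us) * beacon >= srtt_us;
}
```

With beacon = 3 and an RTT of 3000 us, beacon_due() becomes true once
1000 us have passed since the previous option, which matches the
"3 times per RTT" default described above.
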
Signed-off-by: Ilpo Järvinen Co-developed-by: Chia-Yu Chang Signed-off-by: Chia-Yu Chang --- include/linux/tcp.h | 3 +++ include/net/netns/ipv4.h | 1 + include/net/tcp.h | 1 + net/ipv4/sysctl_net_ipv4.c | 9 ++++++++ net/ipv4/tcp.c | 5 ++++- net/ipv4/tcp_input.c | 36 +++++++++++++++++++++++++++++++- net/ipv4/tcp_ipv4.c | 1 + net/ipv4/tcp_minisocks.c | 2 ++ net/ipv4/tcp_output.c | 42 ++++++++++++++++++++++++++++++-------- 9 files changed, 90 insertions(+), 10 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index b282c076b6b5..8d0f5a73b0a3 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -306,7 +306,10 @@ struct tcp_sock { u8 received_ce_pending:4, /* Not yet transmit cnt of received_ce */ unused2:4; u8 accecn_minlen:2,/* Minimum length of AccECN option sent */ + prev_ecnfield:2,/* ECN bits from the previous segment */ + accecn_opt_demand:2,/* Demand AccECN option for n next ACKs */ est_ecnfield:2;/* ECN field for AccECN delivered estimates */ + u64 accecn_opt_tstamp; /* Last AccECN option sent timestamp */ u32 app_limited; /* limited until "delivered" reaches this val */ u32 rcv_wnd; /* Current receiver window */ /* diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index 8f9feebbf9e1..f0ef79fa6485 100644 --- a/include/net/netns/ipv4.h +++ b/include/net/netns/ipv4.h @@ -139,6 +139,7 @@ struct netns_ipv4 { u8 sysctl_tcp_ecn; u8 sysctl_tcp_ecn_option; + u8 sysctl_tcp_ecn_option_beacon; u8 sysctl_tcp_ecn_fallback; u8 sysctl_ip_default_ttl; diff --git a/include/net/tcp.h b/include/net/tcp.h index 1fdd127b18ab..48fb4e5579d1 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1067,6 +1067,7 @@ static inline void tcp_accecn_init_counters(struct tcp_sock *tp) __tcp_accecn_init_bytes_counters(tp->received_ecn_bytes); __tcp_accecn_init_bytes_counters(tp->delivered_ecn_bytes); tp->accecn_minlen = 0; + tp->accecn_opt_demand = 0; tp->est_ecnfield = 0; } diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index 
1d7fd86ca7b9..3ceefd2a77d7 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -740,6 +740,15 @@ static struct ctl_table ipv4_net_table[] = { .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_TWO, }, + { + .procname = "tcp_ecn_option_beacon", + .data = &init_net.ipv4.sysctl_tcp_ecn_option_beacon, + .maxlen = sizeof(u8), + .mode = 0644, + .proc_handler = proc_dou8vec_minmax, + .extra1 = SYSCTL_ZERO, + .extra2 = SYSCTL_FOUR, + }, { .procname = "tcp_ecn_fallback", .data = &init_net.ipv4.sysctl_tcp_ecn_fallback, diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index d867957334e1..701013b0aa87 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3366,6 +3366,8 @@ int tcp_disconnect(struct sock *sk, int flags) tp->wait_third_ack = 0; tp->accecn_fail_mode = 0; tcp_accecn_init_counters(tp); + tp->prev_ecnfield = 0; + tp->accecn_opt_tstamp = 0; if (icsk->icsk_ca_initialized && icsk->icsk_ca_ops->release) icsk->icsk_ca_ops->release(sk); memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv)); @@ -5077,6 +5079,7 @@ static void __init tcp_struct_check(void) CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered_ecn_bytes); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ce); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ecn_bytes); + CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, accecn_opt_tstamp); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, app_limited); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_wnd); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rx_opt); @@ -5084,7 +5087,7 @@ static void __init tcp_struct_check(void) /* 32bit arches with 8byte alignment on u64 fields might need padding * before tcp_clock_cache. 
*/ - CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 122 + 6); + CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 130 + 6); /* RX read-write hotpath cache lines */ CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_rx, bytes_received); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index e4db5ccff75c..8cdeb7765d91 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -467,6 +467,7 @@ static void tcp_ecn_rcv_synack(struct sock *sk, const struct tcphdr *th, default: tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); tp->syn_ect_rcv = ip_dsfield & INET_ECN_MASK; + tp->accecn_opt_demand = 2; if (INET_ECN_is_ce(ip_dsfield) && tcp_accecn_validate_syn_feedback(sk, ace, tp->syn_ect_snt)) { @@ -487,6 +488,7 @@ static void tcp_ecn_rcv_syn(struct tcp_sock *tp, const struct tcphdr *th, } else { tp->syn_ect_rcv = TCP_SKB_CB(skb)->ip_dsfield & INET_ECN_MASK; + tp->prev_ecnfield = tp->syn_ect_rcv; tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); } } @@ -6279,6 +6281,7 @@ void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb, u8 ecnfield = TCP_SKB_CB(skb)->ip_dsfield & INET_ECN_MASK; u8 is_ce = INET_ECN_is_ce(ecnfield); struct tcp_sock *tp = tcp_sk(sk); + bool ecn_edge; if (!INET_ECN_is_not_ect(ecnfield)) { u32 pcount = is_ce * max_t(u16, 1, skb_shinfo(skb)->gso_segs); @@ -6292,9 +6295,36 @@ void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb, if (payload_len > 0) { u8 minlen = tcp_ecnfield_to_accecn_optfield(ecnfield); + u32 oldbytes = tp->received_ecn_bytes[ecnfield - 1]; + tp->received_ecn_bytes[ecnfield - 1] += payload_len; tp->accecn_minlen = max_t(u8, tp->accecn_minlen, minlen); + + /* Demand AccECN option at least every 2^22 bytes to + * avoid overflowing the ECN byte counters. 
+ */ + if ((tp->received_ecn_bytes[ecnfield - 1] ^ oldbytes) & + ~((1 << 22) - 1)) { + u8 opt_demand = max_t(u8, 1, + tp->accecn_opt_demand); + + tp->accecn_opt_demand = opt_demand; + } + } + } + + ecn_edge = tp->prev_ecnfield != ecnfield; + if (ecn_edge || is_ce) { + tp->prev_ecnfield = ecnfield; + /* Demand Accurate ECN change-triggered ACKs. Two ACK are + * demanded to indicate unambiguously the ecnfield value + * in the latter ACK. + */ + if (tcp_ecn_mode_accecn(tp)) { + if (ecn_edge) + inet_csk(sk)->icsk_ack.pending |= ICSK_ACK_NOW; + tp->accecn_opt_demand = 2; } } } @@ -6427,8 +6457,12 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb, * RFC 5961 4.2 : Send a challenge ack */ if (th->syn) { - if (tcp_ecn_mode_accecn(tp)) + if (tcp_ecn_mode_accecn(tp)) { + u8 opt_demand = max_t(u8, 1, tp->accecn_opt_demand); + send_accecn_reflector = true; + tp->accecn_opt_demand = opt_demand; + } if (sk->sk_state == TCP_SYN_RECV && sk->sk_socket && th->ack && TCP_SKB_CB(skb)->seq + 1 == TCP_SKB_CB(skb)->end_seq && TCP_SKB_CB(skb)->seq + 1 == tp->rcv_nxt && diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 605a0e54a1ff..47f47ffa4f28 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -3452,6 +3452,7 @@ static int __net_init tcp_sk_init(struct net *net) { net->ipv4.sysctl_tcp_ecn = 2; net->ipv4.sysctl_tcp_ecn_option = 2; + net->ipv4.sysctl_tcp_ecn_option_beacon = 3; net->ipv4.sysctl_tcp_ecn_fallback = 1; net->ipv4.sysctl_tcp_base_mss = TCP_BASE_MSS; diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 550c2d9d08b7..82065b49e7dd 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -498,6 +498,8 @@ static void tcp_ecn_openreq_child(struct sock *sk, tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); tp->syn_ect_snt = treq->syn_ect_snt; tcp_accecn_third_ack(sk, skb, treq->syn_ect_snt); + tp->prev_ecnfield = treq->syn_ect_rcv; + tp->accecn_opt_demand = 1; tcp_ecn_received_counters(sk, skb, skb->len - th->doff 
* 4); } else { tcp_ecn_mode_set(tp, inet_rsk(req)->ecn_ok ? diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 7b102e7c76fd..61bb5f5ee357 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -806,8 +806,12 @@ static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp, *ptr++ = htonl(((e0b & 0xffffff) << 8) | TCPOPT_NOP); } - if (tp) + if (tp) { tp->accecn_minlen = 0; + tp->accecn_opt_tstamp = tp->tcp_mstamp; + if (tp->accecn_opt_demand) + tp->accecn_opt_demand--; + } } if (unlikely(OPTION_SACK_ADVERTISE & options)) { @@ -984,6 +988,18 @@ static int tcp_options_fit_accecn(struct tcp_out_options *opts, int required, return size; } +static bool tcp_accecn_option_beacon_check(const struct sock *sk) +{ + const struct tcp_sock *tp = tcp_sk(sk); + + if (!sock_net(sk)->ipv4.sysctl_tcp_ecn_option_beacon) + return false; + + return tcp_stamp_us_delta(tp->tcp_mstamp, tp->accecn_opt_tstamp) * + sock_net(sk)->ipv4.sysctl_tcp_ecn_option_beacon >= + (tp->srtt_us >> 3); +} + /* Compute TCP options for SYN packets. This is not the final * network wire format yet. */ @@ -1237,13 +1253,18 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb if (tcp_ecn_mode_accecn(tp) && sock_net(sk)->ipv4.sysctl_tcp_ecn_option) { - int saving = opts->num_sack_blocks > 0 ? 2 : 0; - int remaining = MAX_TCP_OPTION_SPACE - size; - - opts->ecn_bytes = tp->received_ecn_bytes; - size += tcp_options_fit_accecn(opts, tp->accecn_minlen, - remaining, - saving); + if (sock_net(sk)->ipv4.sysctl_tcp_ecn_option >= 2 || + tp->accecn_opt_demand || + tcp_accecn_option_beacon_check(sk)) { + int saving = opts->num_sack_blocks > 0 ? 
2 : 0; + int remaining = MAX_TCP_OPTION_SPACE - size; + + opts->ecn_bytes = tp->received_ecn_bytes; + size += tcp_options_fit_accecn(opts, + tp->accecn_minlen, + remaining, + saving); + } } if (unlikely(BPF_SOCK_OPS_TEST_FLAG(tp, @@ -2959,6 +2980,11 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle, sent_pkts = 0; tcp_mstamp_refresh(tp); + + /* AccECN option beacon depends on mstamp, it may change mss */ + if (tcp_ecn_mode_accecn(tp) && tcp_accecn_option_beacon_check(sk)) + mss_now = tcp_current_mss(sk); + if (!push_one) { /* Do MTU probing. */ result = tcp_mtu_probe(sk); From patchwork Tue Mar 18 00:27:05 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)" X-Patchwork-Id: 14020099 X-Patchwork-Delegate: kuba@kernel.org Received: from AS8PR03CU001.outbound.protection.outlook.com (mail-westeuropeazon11012012.outbound.protection.outlook.com [52.101.71.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BA13C224F0; Tue, 18 Mar 2025 00:28:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=52.101.71.12 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257685; cv=fail; b=pAY8secKzcoVEY0JbBdRGXKLiR+BJSQ0usUdN2U2t6HK9GK50SDoeyCqmbtsnlNb9AiiD1Ll3jcKM84G+AB0357VHTK5jXj/2lLink4lytVP27xbm0D2NEsrmHjK4YpRQJ6ZriVGfrQvulQ4hJaI4OE7mTKYQQVdr3BhWdS1f8k= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257685; c=relaxed/simple; bh=Jfw8pCRZM/WvUnDQpH9IFbTRpO7rJFAbatV4ss6bVm4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=mqWi9lMCziif6CIRq5XCYn7ZCePt83UYFEWeau9Ahab4iuE6OGfhB34dJz9AwZXGhZwjQRPT4nFqYEpzOS3iFDeFs/E/rrYzAJuvJa+z75eOTt0wSCizI0EDDmuADTYaleN2I8hz+bXEOaSox1yeD6fM8DcPB9KQI1Qnxwe18e8= 
ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com; spf=fail smtp.mailfrom=nokia-bell-labs.com; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b=Q5Q1IopP; arc=fail smtp.client-ip=52.101.71.12 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b="Q5Q1IopP" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=rfwwl+8GqWYadKozPkyZtUU9bFyKPTs8Tt88sOWOx8ho+9443T3zZFGReV532u42Tka8kgX5Ilu/0n/Ym19SIay4Jqhmp/jzVIk1E7Ex1XwqWT2vsCZ4GF5PnVQh5/oZKcHa1F9YJreCjG3/cILnCixkSy0p0FG4xLKN2Lny6W8sxofwQWQ8h2k8wEyW/MpzRJmh3nnggC2REMB9D3SlF3D3lV4izIbMgNKry9pDycFCcNgw7kTnIdMDuA8Rnguoee/bJVKALvBWrqOaaBu2FODt35mhHBWTRODgNxkUISRET7QSQR4HhrYAypcZ+XvREstXmSBSu7mo1gljzyil1Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=0iTVB9QAuoJKr4I1LIOJqJ3t3nWd6OvyM23++tXfgd0=; b=w4+MP1l2U1zXa6FypRY85mQEvwDnLToDWVwtPk5uWOP51q4h0005jjiPCNj1l+1ZX3EXPQkY0xpHWRTrjipncRXko141yTMQebBY1t9cqGJNPpL1fMF+ZkfZjsk95b8NKFduXpZRAhBxy1GfyHbGPWUW/MNRVDZiQyVlSNp83HYOUkGQMEhriykpN2ojw6lbu3nmQIICC671fIfOLIZ3/Vp4+dTT2ctUK3uVkgOQPK5Q39kW29aCgJ7qNLz34y53vAIwj/Wtx6nJd+tT7pyNZUHkaAPyPkomsd3Ef2tU2b6HgsMmheIB5vZXJCQLREz1qR1ZCt1DaoLyAsFK1xlRUg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 131.228.6.100) smtp.rcpttodomain=nokia-bell-labs.com smtp.mailfrom=nokia-bell-labs.com; dmarc=pass (p=reject sp=reject pct=100) action=none 
header.from=nokia-bell-labs.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nokia-bell-labs.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=0iTVB9QAuoJKr4I1LIOJqJ3t3nWd6OvyM23++tXfgd0=; b=Q5Q1IopPzvtNa07dZevDfLjoh7cugayTs9+Bpg4VAnlwxmrf2M8rWg45cqyNCHatuu48iRuHybjd1Nqg9QW5GzV+PlvXpcIp6jhms1KOcHKjbNgfVylh+qRhXkh8PVLgYKdj7/24ncKMW3/MZLvHYBlFa6qpKO+yj3FU8kWWwX5KpD79RKKmrfIeQA77dQNeUO1zJPMGNVQUDpmJfkHu1HNdCFtM0Y1H8uJh8iNGyCBnZio3mpAg8dsXScLvraPZtAhR5xMKAictDyPJl0j34uL8KpQiaSEhEw76UsTexcy05WWDgm5dhOMmNkNM6CfoBe74mZnQa4r7GPvF7rgBYA== Received: from DB9PR02CA0005.eurprd02.prod.outlook.com (2603:10a6:10:1d9::10) by PR3PR07MB6778.eurprd07.prod.outlook.com (2603:10a6:102:73::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.33; Tue, 18 Mar 2025 00:27:58 +0000 Received: from DB1PEPF00039234.eurprd03.prod.outlook.com (2603:10a6:10:1d9:cafe::a8) by DB9PR02CA0005.outlook.office365.com (2603:10a6:10:1d9::10) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8534.33 via Frontend Transport; Tue, 18 Mar 2025 00:27:58 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 131.228.6.100) smtp.mailfrom=nokia-bell-labs.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nokia-bell-labs.com; Received-SPF: Pass (protection.outlook.com: domain of nokia-bell-labs.com designates 131.228.6.100 as permitted sender) receiver=protection.outlook.com; client-ip=131.228.6.100; helo=fr711usmtp2.zeu.alcatel-lucent.com; pr=C Received: from fr711usmtp2.zeu.alcatel-lucent.com (131.228.6.100) by DB1PEPF00039234.mail.protection.outlook.com (10.167.8.107) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.20 via Frontend Transport; Tue, 18 Mar 2025 00:27:58 +0000 Received: from sarah.nbl.nsn-rdnet.net 
(sarah.nbl.nsn-rdnet.net [10.0.73.150]) by fr711usmtp2.zeu.alcatel-lucent.com (GMO) with ESMTP id 52I0RNBq024935; Tue, 18 Mar 2025 00:28:06 GMT From: chia-yu.chang@nokia-bell-labs.com To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com Cc: Chia-Yu Chang Subject: [PATCH v2 net-next 10/15] tcp: accecn: AccECN option failure handling Date: Tue, 18 Mar 2025 01:27:05 +0100 Message-Id: <20250318002710.29483-11-chia-yu.chang@nokia-bell-labs.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DB1PEPF00039234:EE_|PR3PR07MB6778:EE_ X-MS-Office365-Filtering-Correlation-Id: 24b30673-13ec-463d-ef45-08dd65b3bc1f X-LD-Processed: 5d471751-9675-428d-917b-70f44f9630b0,ExtAddr X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|7416014|82310400026|36860700013|1800799024|921020; X-Microsoft-Antispam-Message-Info: =?utf-8?q?b5B7KIs/ilSdxhOCY+tM9RuBnpHAJKq?= =?utf-8?q?IdwVFy85uXUmlH06n5bS+xSv7GAXrDCswFpgrpaEodABEx4CJg0QmlCg0MMBiSVP3?= =?utf-8?q?/GNPRy9hseXOiKQoUS8AbwOrCJncBmHb1qLiC0kjtcGWfI63bwdtbYNi/5G9SrdaA?= 
=?utf-8?q?dOg2dRKt5X/d6d7pY19K7B2iZjZVcz9RuKr6topT4GMfto8Y2g4p+EwiBJSgFbf0m?= =?utf-8?q?uownXLNOgzQLGMWWnHinK7eWjPBVFjgxdutXy/FPVBly9VgufL1JyTSVa2jRMlUAq?= =?utf-8?q?wxRAAFvmGCaf65B1v5BGoW1JOPhbAsIOh6enLrRMMqTwfDyp42uRAoLQOoRFR6lez?= =?utf-8?q?Bi6zDq13497QjhP77g1LJqq7jLhHQoLeibOehtX6fQOEqwVdaSjCltzxJ5Lrwaq5X?= =?utf-8?q?GPlEJrqukr8smyQnb/JA8k0poG9M8ysPhLGOkGarWDMbabzg2VWv0IG+UF+nAP3Gl?= =?utf-8?q?8qKX9biVDsa2fD+5uFysF5KFZy1s9SohmfDlBdYGu5EXDUy6vw9uP+df2NwgVVGLB?= =?utf-8?q?nNg9Xzj1feeT+8Dwl2TXl5roix+QPEyf5Dw/IF9yD2DWgyhi/mwUddLbUCbqs8scC?= =?utf-8?q?QBPj4hec3QrTMwVswyC01j+t5oVHgwVqcdlzY/Wv1f4VrjnpbHGo6dWgrqX+IwN34?= =?utf-8?q?gQXJam96Bqft0pERPLJpWnC/iT9bHw4k02TAyHtVdTyxcMyHvQbadRNq37WzXNpcp?= =?utf-8?q?BEu/sXiymtSY7uM4dDtDOwndtbZnzC+B+wKtz+XZLk86p9KOl3MIn9jLmZE7fUHXc?= =?utf-8?q?JkX8C8t7dXLDFHRO1U1UyRrVjGvheumu5Zl6JJeKAr+arEbXjKIXbja4MrE4sUEot?= =?utf-8?q?QMNceJ6CEVUWf4Pu8CJmJVL4euPZ05SFeWRSLvqDcXNJKJnuWTyg0ZmNdoS6147w9?= =?utf-8?q?qk2aPIcI7dSLLk4NT9zcyDv17wUo/6TdvBmbcoPA+jKwY3EYUykHzb3Ly4ccPNKY5?= =?utf-8?q?7L6sDWV/0DAUrEqQnEisOpp4JSngKP4qg+0TN5TgLwDkoTx7fh43kHCu5zku7j1nl?= =?utf-8?q?z6EqXsrpPMqA127Hl1l4x9YQSJZQaOMz5FJ8SS8IhP5Y6jHMOO/N7wSqIS3L+ZBSk?= =?utf-8?q?FXG95I1TzKjdEDhT9tgTgyPXYcMVyRwMRddr4SSFsIjcJf8pauDIVsPXRwESnS5P/?= =?utf-8?q?3LvuhtQd08KV6BfvKzgW360FxqTJTn1ESZ4cTUuvAIFLv5ICbm/2cAIuAavFjf6LX?= =?utf-8?q?wfNKzaBVqWAH3E1JMdb6MmwBmYBFHQz61Rk8rilM4K985/ZMACZ2Ly8nYvKsV25AN?= =?utf-8?q?p4ulqSZtwqxFi5hRASZV4HyFRdOjlqMHVu6lZ0l9HuJt9B5i81OukBxWFje1jUCS4?= =?utf-8?q?NDikEL9craAQaINU5mPFDdv8NYHnan38ee0jzLLPISnZeSkDtjONlalgrz9kPotnp?= =?utf-8?q?YYeFUHLZs0YFYUvmty4LhhBKIbzJYy1xYRMLX/uwhhIGqPXS8C7/IcW7QBUTDwUG2?= =?utf-8?q?80LKOzlrzh?= X-Forefront-Antispam-Report: CIP:131.228.6.100;CTRY:FI;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:fr711usmtp2.zeu.alcatel-lucent.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(376014)(7416014)(82310400026)(36860700013)(1800799024)(921020);DIR:OUT;SFP:1101; X-OriginatorOrg: nokia-bell-labs.com 
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Mar 2025 00:27:58.5145 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 24b30673-13ec-463d-ef45-08dd65b3bc1f X-MS-Exchange-CrossTenant-Id: 5d471751-9675-428d-917b-70f44f9630b0 X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=5d471751-9675-428d-917b-70f44f9630b0;Ip=[131.228.6.100];Helo=[fr711usmtp2.zeu.alcatel-lucent.com] X-MS-Exchange-CrossTenant-AuthSource: DB1PEPF00039234.eurprd03.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PR3PR07MB6778 X-Patchwork-Delegate: kuba@kernel.org
From: Chia-Yu Chang

The AccECN option may fail in various ways; handle these:
- Remove the option from SYN/ACK rexmits to handle blackholes
- If no option arrives in SYN/ACK, assume the option is not usable
  - If an option arrives later, re-enable it
- If the option is zeroed, disable AccECN option processing

Signed-off-by: Ilpo Järvinen Signed-off-by: Chia-Yu Chang --- include/linux/tcp.h | 6 ++-- include/net/tcp.h | 7 +++++ net/ipv4/tcp.c | 1 + net/ipv4/tcp_input.c | 67 +++++++++++++++++++++++++++++++++++----- net/ipv4/tcp_minisocks.c | 38 +++++++++++++++++++++++ net/ipv4/tcp_output.c | 7 +++-- 6 files changed, 115 insertions(+), 11 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 8d0f5a73b0a3..ccb5918c8b41 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -160,7 +160,8 @@ struct tcp_request_sock { u8 accecn_ok : 1, syn_ect_snt: 2, syn_ect_rcv: 2; - u8 accecn_fail_mode:4; + u8 accecn_fail_mode:4, + saw_accecn_opt :2; u32 txhash; u32 rcv_isn; u32 snt_isn; @@ -388,7 +389,8 @@ struct tcp_sock { syn_ect_snt:2, /* AccECN ECT memory, only */ syn_ect_rcv:2, /* ... 
needed durign 3WHS + first seqno */ wait_third_ack:1; /* Wait 3rd ACK in simultaneous open */ - u8 accecn_fail_mode:4; /* AccECN failure handling */ + u8 accecn_fail_mode:4, /* AccECN failure handling */ + saw_accecn_opt:2; /* An AccECN option was seen */ u8 thin_lto : 1,/* Use linear timeouts for thin streams */ fastopen_connect:1, /* FASTOPEN_CONNECT sockopt */ fastopen_no_cookie:1, /* Allow send/recv SYN+data without a cookie */ diff --git a/include/net/tcp.h b/include/net/tcp.h index 48fb4e5579d1..d531da9f9af8 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -276,6 +276,12 @@ static inline void tcp_accecn_fail_mode_set(struct tcp_sock *tp, u8 mode) tp->accecn_fail_mode |= mode; } +/* tp->saw_accecn_opt states */ +#define TCP_ACCECN_OPT_NOT_SEEN 0x0 +#define TCP_ACCECN_OPT_EMPTY_SEEN 0x1 +#define TCP_ACCECN_OPT_COUNTER_SEEN 0x2 +#define TCP_ACCECN_OPT_FAIL_SEEN 0x3 + /* Flags in tp->nonagle */ #define TCP_NAGLE_OFF 1 /* Nagle's algo is disabled */ #define TCP_NAGLE_CORK 2 /* Socket is corked */ @@ -477,6 +483,7 @@ static inline int tcp_accecn_extract_syn_ect(u8 ace) bool tcp_accecn_validate_syn_feedback(struct sock *sk, u8 ace, u8 sent_ect); void tcp_accecn_third_ack(struct sock *sk, const struct sk_buff *skb, u8 syn_ect_snt); +u8 tcp_accecn_option_init(const struct sk_buff *skb, u8 opt_offset); void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb, u32 payload_len); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 701013b0aa87..c4eadf6dd6fb 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3365,6 +3365,7 @@ int tcp_disconnect(struct sock *sk, int flags) tp->delivered_ce = 0; tp->wait_third_ack = 0; tp->accecn_fail_mode = 0; + tp->saw_accecn_opt = TCP_ACCECN_OPT_NOT_SEEN; tcp_accecn_init_counters(tp); tp->prev_ecnfield = 0; tp->accecn_opt_tstamp = 0; diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 8cdeb7765d91..d7498b1c9fb9 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -447,8 +447,8 @@ bool 
tcp_accecn_validate_syn_feedback(struct sock *sk, u8 ace, u8 sent_ect) } /* See Table 2 of the AccECN draft */ -static void tcp_ecn_rcv_synack(struct sock *sk, const struct tcphdr *th, - u8 ip_dsfield) +static void tcp_ecn_rcv_synack(struct sock *sk, const struct sk_buff *skb, + const struct tcphdr *th, u8 ip_dsfield) { struct tcp_sock *tp = tcp_sk(sk); u8 ace = tcp_accecn_ace(th); @@ -467,7 +467,19 @@ static void tcp_ecn_rcv_synack(struct sock *sk, const struct tcphdr *th, default: tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); tp->syn_ect_rcv = ip_dsfield & INET_ECN_MASK; - tp->accecn_opt_demand = 2; + if (tp->rx_opt.accecn && + tp->saw_accecn_opt < TCP_ACCECN_OPT_COUNTER_SEEN) { + u8 saw_opt = tcp_accecn_option_init(skb, + tp->rx_opt.accecn); + + tp->saw_accecn_opt = saw_opt; + if (tp->saw_accecn_opt == TCP_ACCECN_OPT_FAIL_SEEN) { + u8 fail_mode = TCP_ACCECN_OPT_FAIL_RECV; + + tcp_accecn_fail_mode_set(tp, fail_mode); + } + tp->accecn_opt_demand = 2; + } if (INET_ECN_is_ce(ip_dsfield) && tcp_accecn_validate_syn_feedback(sk, ace, tp->syn_ect_snt)) { @@ -587,7 +599,23 @@ static bool tcp_accecn_process_option(struct tcp_sock *tp, bool order1, res; unsigned int i; + if (tcp_accecn_opt_fail_recv(tp)) + return false; + if (!(flag & FLAG_SLOWPATH) || !tp->rx_opt.accecn) { + if (!tp->saw_accecn_opt) { + /* Too late to enable after this point due to + * potential counter wraps + */ + if (tp->bytes_sent >= (1 << 23) - 1) { + u8 fail_mode = TCP_ACCECN_OPT_FAIL_RECV; + + tp->saw_accecn_opt = TCP_ACCECN_OPT_FAIL_SEEN; + tcp_accecn_fail_mode_set(tp, fail_mode); + } + return false; + } + if (estimate_ecnfield) { u8 ecnfield = estimate_ecnfield - 1; @@ -603,6 +631,13 @@ static bool tcp_accecn_process_option(struct tcp_sock *tp, order1 = (ptr[0] == TCPOPT_ACCECN1); ptr += 2; + if (tp->saw_accecn_opt < TCP_ACCECN_OPT_COUNTER_SEEN) { + tp->saw_accecn_opt = tcp_accecn_option_init(skb, + tp->rx_opt.accecn); + if (tp->saw_accecn_opt == TCP_ACCECN_OPT_FAIL_SEEN) + 
tcp_accecn_fail_mode_set(tp, TCP_ACCECN_OPT_FAIL_RECV);
+	}
+
 	res = !!estimate_ecnfield;
 	for (i = 0; i < 3; i++) {
 		if (optlen >= TCPOLEN_ACCECN_PERFIELD) {
@@ -6458,10 +6493,25 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 	 */
 	if (th->syn) {
 		if (tcp_ecn_mode_accecn(tp)) {
-			u8 opt_demand = max_t(u8, 1, tp->accecn_opt_demand);
-
 			send_accecn_reflector = true;
-			tp->accecn_opt_demand = opt_demand;
+			if (tp->rx_opt.accecn &&
+			    tp->saw_accecn_opt < TCP_ACCECN_OPT_COUNTER_SEEN) {
+				u8 offset = tp->rx_opt.accecn;
+				u8 opt_demand;
+				u8 saw_opt;
+
+				saw_opt = tcp_accecn_option_init(skb, offset);
+				tp->saw_accecn_opt = saw_opt;
+				if (tp->saw_accecn_opt ==
+				    TCP_ACCECN_OPT_FAIL_SEEN) {
+					u8 fail_mode = TCP_ACCECN_OPT_FAIL_RECV;
+
+					tcp_accecn_fail_mode_set(tp, fail_mode);
+				}
+				opt_demand = max_t(u8, 1,
+						   tp->accecn_opt_demand);
+				tp->accecn_opt_demand = opt_demand;
+			}
 		}
 		if (sk->sk_state == TCP_SYN_RECV && sk->sk_socket && th->ack &&
 		    TCP_SKB_CB(skb)->seq + 1 == TCP_SKB_CB(skb)->end_seq &&
@@ -6955,7 +7005,8 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 		 */
 
 		if (tcp_ecn_mode_any(tp))
-			tcp_ecn_rcv_synack(sk, th, TCP_SKB_CB(skb)->ip_dsfield);
+			tcp_ecn_rcv_synack(sk, skb, th,
+					   TCP_SKB_CB(skb)->ip_dsfield);
 
 		tcp_init_wl(tp, TCP_SKB_CB(skb)->seq);
 		tcp_try_undo_spurious_syn(sk);
@@ -7532,6 +7583,8 @@ static void tcp_openreq_init(struct request_sock *req,
 	tcp_rsk(req)->snt_tsval_first = 0;
 	tcp_rsk(req)->last_oow_ack_time = 0;
 	tcp_rsk(req)->accecn_ok = 0;
+	tcp_rsk(req)->saw_accecn_opt = TCP_ACCECN_OPT_NOT_SEEN;
+	tcp_rsk(req)->accecn_fail_mode = 0;
 	tcp_rsk(req)->syn_ect_rcv = 0;
 	tcp_rsk(req)->syn_ect_snt = 0;
 	req->mss = rx_opt->mss_clamp;
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 82065b49e7dd..07259c828594 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -498,6 +498,7 @@ static void tcp_ecn_openreq_child(struct sock *sk,
 		tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN);
 		tp->syn_ect_snt = treq->syn_ect_snt;
 		tcp_accecn_third_ack(sk, skb, treq->syn_ect_snt);
+		tp->saw_accecn_opt = treq->saw_accecn_opt;
 		tp->prev_ecnfield = treq->syn_ect_rcv;
 		tp->accecn_opt_demand = 1;
 		tcp_ecn_received_counters(sk, skb, skb->len - th->doff * 4);
@@ -552,6 +553,30 @@ static void smc_check_reset_syn_req(const struct tcp_sock *oldtp,
 #endif
 }
 
+u8 tcp_accecn_option_init(const struct sk_buff *skb, u8 opt_offset)
+{
+	unsigned char *ptr = skb_transport_header(skb) + opt_offset;
+	unsigned int optlen = ptr[1] - 2;
+
+	WARN_ON_ONCE(ptr[0] != TCPOPT_ACCECN0 && ptr[0] != TCPOPT_ACCECN1);
+	ptr += 2;
+
+	/* Detect option zeroing: an AccECN connection "MAY check that the
+	 * initial value of the EE0B field or the EE1B field is non-zero"
+	 */
+	if (optlen < TCPOLEN_ACCECN_PERFIELD)
+		return TCP_ACCECN_OPT_EMPTY_SEEN;
+	if (get_unaligned_be24(ptr) == 0)
+		return TCP_ACCECN_OPT_FAIL_SEEN;
+	if (optlen < TCPOLEN_ACCECN_PERFIELD * 3)
+		return TCP_ACCECN_OPT_COUNTER_SEEN;
+	ptr += TCPOLEN_ACCECN_PERFIELD * 2;
+	if (get_unaligned_be24(ptr) == 0)
+		return TCP_ACCECN_OPT_FAIL_SEEN;
+
+	return TCP_ACCECN_OPT_COUNTER_SEEN;
+}
+
 /* This is not only more efficient than what we used to do, it eliminates
  * a lot of code duplication between IPv4/IPv6 SYN recv processing. -DaveM
  *
@@ -713,6 +738,7 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 	bool own_req;
 
 	tmp_opt.saw_tstamp = 0;
+	tmp_opt.accecn = 0;
 	if (th->doff > (sizeof(struct tcphdr)>>2)) {
 		tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL);
 
@@ -890,6 +916,18 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 	if (!(flg & TCP_FLAG_ACK))
 		return NULL;
 
+	if (tcp_rsk(req)->accecn_ok && tmp_opt.accecn &&
+	    tcp_rsk(req)->saw_accecn_opt < TCP_ACCECN_OPT_COUNTER_SEEN) {
+		u8 saw_opt = tcp_accecn_option_init(skb, tmp_opt.accecn);
+
+		tcp_rsk(req)->saw_accecn_opt = saw_opt;
+		if (tcp_rsk(req)->saw_accecn_opt == TCP_ACCECN_OPT_FAIL_SEEN) {
+			u8 fail_mode = TCP_ACCECN_OPT_FAIL_RECV;
+
+			tcp_rsk(req)->accecn_fail_mode |= fail_mode;
+		}
+	}
+
 	/* For Fast Open no more processing is needed (sk is the
 	 * child socket).
 	 */
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 61bb5f5ee357..6bee68795b0e 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1085,6 +1085,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
 	/* Simultaneous open SYN/ACK needs AccECN option but not SYN */
 	if (unlikely((TCP_SKB_CB(skb)->tcp_flags & TCPHDR_ACK) &&
 		     tcp_ecn_mode_accecn(tp) &&
+		     inet_csk(sk)->icsk_retransmits < 2 &&
 		     sock_net(sk)->ipv4.sysctl_tcp_ecn_option &&
 		     remaining >= TCPOLEN_ACCECN_BASE)) {
 		u32 saving = tcp_synack_options_combine_saving(opts);
@@ -1174,7 +1175,7 @@ static unsigned int tcp_synack_options(const struct sock *sk,
 	smc_set_option_cond(tcp_sk(sk), ireq, opts, &remaining);
 
 	if (treq->accecn_ok && sock_net(sk)->ipv4.sysctl_tcp_ecn_option &&
-	    remaining >= TCPOLEN_ACCECN_BASE) {
+	    req->num_timeout < 1 && remaining >= TCPOLEN_ACCECN_BASE) {
 		u32 saving = tcp_synack_options_combine_saving(opts);
 
 		opts->ecn_bytes = synack_ecn_bytes;
@@ -1252,7 +1253,9 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 	}
 
 	if (tcp_ecn_mode_accecn(tp) &&
-	    sock_net(sk)->ipv4.sysctl_tcp_ecn_option) {
+	    sock_net(sk)->ipv4.sysctl_tcp_ecn_option &&
+	    tp->saw_accecn_opt &&
+	    !tcp_accecn_opt_fail_send(tp)) {
 		if (sock_net(sk)->ipv4.sysctl_tcp_ecn_option >= 2 ||
 		    tp->accecn_opt_demand ||
 		    tcp_accecn_option_beacon_check(sk)) {

From patchwork Tue Mar 18 00:27:06 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)"
X-Patchwork-Id: 14020100
X-Patchwork-Delegate: kuba@kernel.org
Received: from DUZPR83CU001.outbound.protection.outlook.com (mail-northeuropeazon11013000.outbound.protection.outlook.com [52.101.67.0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BF95E149C7B; Tue, 18 Mar 2025 00:28:04 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=52.101.67.0
ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257686; cv=fail; b=MaWyq4sEPbW8c2LNqt087H0a5InamaRs+9VXAQ+r1KiHFFIKfaD3nEEMCSkdwjIF5HaAUwfu+FxvkqC3Kw5VsEiyNwVbtkPqZS7RwPRB4sBIZ2jMuGgfTQaMgzYellyZnE4GP3VIrbV70MnXrf8ef1qXpsvkc0JLCe+SZSQUKUY=
ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257686; c=relaxed/simple; bh=mcQJpmjSzL62FdjH+VYiKQ6QmWX5Xcp1sbHA6UPYShM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=jh8ziVykuCHn8RfArswHyUJcR2xGOin1KZ/+u/MGJMDAe4Os/Vq7Op5b7nJLT8InX135aUt0aOF8Ess34vM4pC6nq/EbSRKRJygdutG2C4H7v+cXmwnZrenwMUQXnpEUQlzYWTj4CELUJDPTqE7tiLvuGrWSOXELrOflN/6lY+E=
ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com; spf=fail smtp.mailfrom=nokia-bell-labs.com; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b=Uw5q3fGT; arc=fail smtp.client-ip=52.101.67.0
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none)
header.from=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b="Uw5q3fGT" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=f2wjjZ6KBjJQf3T8jMJRg+0yKOa+S9L/SOhb7H7xLOtdn6UcJT3BkPMdckLyhOJyOP2ugFXSYR2/kTRdF3PYNFPtx6qWVIAAB4lsqDLsCiRUjhqRU3Kc6ave2SOb0yMKgugA8SaST3hIWYhcwlC3YPCpCj/Oo1YhxxWXzmnsBYxDNAm4MZJQGjFaLCfLrhhf0rcjodRVkaC6ULixA5BKaXVJ5uzGSKZunAsF+y4MCgFL05OtystpTkn4n+NyxqBh6H6v8hsmXHfeAfYfgzZNmfC5a6yNa6MsjqbQ2YAUNH+6o/LS47Juz94FQ6Q6/RWpu95t/T1ujfM3FUWPWdn2gQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Rs6bt+XzymTPEqXWxB38Qk5p3/58eWQSq/Au1qzRsdI=; b=lnEZyi13MjLCBNY4sGqhOFzp0+Jz1sC2l2i+qD5g4Y+4fPb3GyNS3/c5MoUVtZ8hpVoF6iv4Ls55EHfxftYVDmOdGdIei6t7hN6QTR23ug2gOoYR3qI0LIt+Nayj7LADip8w4HGixI7UP+4IY6zIe48cgpKzYjGQnp8SaFMyBmqWp+EHEN0P4vF4b3MQQTQmdDEx3XiiIdtljHzBBzu6vQm7FPjzhPf0TLfmksw1V6iIDyLeBwkjtrT3Y7e5On2/q7KY5nx3vNpmMWGWrAgRoj/GAI49FAMt4HsxsF4CPAhk9YATeC3i0LzPSuakpT+WPeUAz7bff7YXorqFhyvF3A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 131.228.6.100) smtp.rcpttodomain=nokia-bell-labs.com smtp.mailfrom=nokia-bell-labs.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nokia-bell-labs.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nokia-bell-labs.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Rs6bt+XzymTPEqXWxB38Qk5p3/58eWQSq/Au1qzRsdI=; 
b=Uw5q3fGTJUxD3lEp/HDOmtbckOYv3rXiyFiXlMmBF7dzuDALvbSqDX0ykCKhiiDz8q3CLs7jgx+iTehSLrIk/rTKlmfTaGSHXSZw01yk/iC5h2o7RTd/JuJUUXb0FaYi3c50/TFC4fM4IBI0isClFM9CCEpSzBwZzsBaOSfunSpuOrEsJwdeVSa4jIju/UED5p3q6IjuW9FfTw0SrsBHHNo1FdJbdDxwbbeduh7WKD5kGuLlLSK4sErevVxRWoX31X4AOhzxT/wG4nqCW4lDHhaEY1C9kctdqHUCd+cs2NoAiAkjheNOgwbh7kLS/LaDxKd83r8mMS6C2afju+9+iA== Received: from DB9PR06CA0016.eurprd06.prod.outlook.com (2603:10a6:10:1db::21) by GV1PR07MB9023.eurprd07.prod.outlook.com (2603:10a6:150:86::9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.33; Tue, 18 Mar 2025 00:28:00 +0000 Received: from DB1PEPF0003922E.eurprd03.prod.outlook.com (2603:10a6:10:1db:cafe::56) by DB9PR06CA0016.outlook.office365.com (2603:10a6:10:1db::21) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8534.33 via Frontend Transport; Tue, 18 Mar 2025 00:28:00 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 131.228.6.100) smtp.mailfrom=nokia-bell-labs.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nokia-bell-labs.com; Received-SPF: Pass (protection.outlook.com: domain of nokia-bell-labs.com designates 131.228.6.100 as permitted sender) receiver=protection.outlook.com; client-ip=131.228.6.100; helo=fr711usmtp2.zeu.alcatel-lucent.com; pr=C Received: from fr711usmtp2.zeu.alcatel-lucent.com (131.228.6.100) by DB1PEPF0003922E.mail.protection.outlook.com (10.167.8.101) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.20 via Frontend Transport; Tue, 18 Mar 2025 00:27:59 +0000 Received: from sarah.nbl.nsn-rdnet.net (sarah.nbl.nsn-rdnet.net [10.0.73.150]) by fr711usmtp2.zeu.alcatel-lucent.com (GMO) with ESMTP id 52I0RNBr024935; Tue, 18 Mar 2025 00:28:07 GMT From: chia-yu.chang@nokia-bell-labs.com To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, 
stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com Cc: Chia-Yu Chang Subject: [PATCH v2 net-next 11/15] tcp: accecn: AccECN option ceb/cep heuristic Date: Tue, 18 Mar 2025 01:27:06 +0100 Message-Id: <20250318002710.29483-12-chia-yu.chang@nokia-bell-labs.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DB1PEPF0003922E:EE_|GV1PR07MB9023:EE_ X-MS-Office365-Filtering-Correlation-Id: d3a9ecc1-a4af-43b0-8229-08dd65b3bce5 X-LD-Processed: 5d471751-9675-428d-917b-70f44f9630b0,ExtAddr X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|82310400026|1800799024|36860700013|376014|7416014|921020; X-Microsoft-Antispam-Message-Info: =?utf-8?q?4nxYxlI93+L0GxBxZoZVLAQXjY17GtS?= =?utf-8?q?luzWDM1DxvXwttVWUjxQjxc9DeT6R9WlXPg+2AYQVU1ROyCHhxv94hNAZCIZEv8Hu?= =?utf-8?q?kGI9f0E0arDE/K668+ooB5TViKnodUuhVRRBvUva73SgT1AsF3ml+SAYqnlVjiZ00?= =?utf-8?q?ccJL6RZG0Ciz/1f9DabPRQR1LKwCAgNWn/4804mR0AIixTfK/EwT1Yzd61h/StpVR?= =?utf-8?q?UAEoGa+6RCCfC3NVDRoucNOcOn/ZjxQr+3fmnra8iTg4H5CKXFtgP7KYn7ZL3Kyr/?= =?utf-8?q?/NzGhR6UW5Cpuw5SS80exAFkDuLLwh3+Nn3ZkU7z8KQsD5axfSkvqv+cLL9VA0D3f?= =?utf-8?q?871/OB+ARzwW0Icj0FanEJTEvEnixzVHG+LM59olb78K/oiKWNa2sSG3achQy0tjk?= 
=?utf-8?q?EIJ2Q5Oiyhb0pyWebpS2qiAFswg1XR4HRkxMPh4ODn3oOnRzlDRrdcoHutJMhThSH?= =?utf-8?q?3bAGumtvoTebkSopHFuzwa06Krm4nZ84C2vWpq6U0d6QYymHHqktjOxR4erMAJXy2?= =?utf-8?q?YXAyYZAsglUt3dzmiwCge1bAPOYjJOcTeRqkVEAt5z8vwndT/5zKVtbHwRZ0rEIcq?= =?utf-8?q?E9ea+50C5pUabOv2D1CoMJuTrAWRm1N816nXi5dsT5SKplbwve7GZK0S75f/DvsMD?= =?utf-8?q?SzQ0/ODbWNq5btHfC0T7gYrii/Tpc+YLMORjCGIg9jGQs7w0Ts25HqXujKK/7iDDX?= =?utf-8?q?dS1yhDXsze0bR/tLfNEwbwFD2R0cSHYt9kozb1zPguP2UQK1GxwowY3+uuRgppbSD?= =?utf-8?q?JeZO3ND/jwFwe3EHBt7Krtds+7V5rPfE8LBKP7umZUhiiaJF4q9KB3roRl4k1Ewuh?= =?utf-8?q?Fm/DDe6tkwS8PtRo2v334PWJBMzGu/Ic5DK1Clvv2Xl+QRRJfptl1CbYsaynLXVNA?= =?utf-8?q?2LL8ddyxdA9S8PgqtK0Dav25dhHLaOLoFMtJbukZSQTq3WO2jI4JdOXa0A5OeoYay?= =?utf-8?q?lPjIL5Xl80GDmu33h/v9Eh1RMopabY+CLw7eSUmkLgHuTdXCRAh1gZZw/x0ip7SfV?= =?utf-8?q?rXBH95TACdy21mQ5Ckp91hmfww7/x9ik8VbLpuVgo3K04YFYdSScRiqk3BdgbgqnS?= =?utf-8?q?WoAWo4ZEAakTvw4h0/K/b0CJKE0IdbEGsf3tp4DariS5tfYdu+7/3EddIkzvokpkN?= =?utf-8?q?SkPvA2I/FTHXlO31KMkfRWxJSTa5zmgrLNuzOv1rbL+FJNUe6D23gRIS5alv72ntp?= =?utf-8?q?xdpKFMudvkm5sO+i1yO2p6XfMkqOHW46hCnU9ApSCCztEXp4ieFV4Bi8FX8SdyiaR?= =?utf-8?q?pekYoe6Vzm3aWx/CBxfoNBGxWWVanIuzSfZhToJD5STmVIa+tM52kRX66Li1tf893?= =?utf-8?q?aiMMQIWBM/Fvz6XHJ2iLrW2ZLk73b2wKyn0jU6Yx/uU4a2GPby5Tv6vDUxx1Bf1x6?= =?utf-8?q?/GZGr8x0AVxdtiK2rSDjkqqx/FIKdMTvxmswfhSYvaoZSpLdLyuoV/JCOLJA6d67m?= =?utf-8?q?1m6rFq9AV8?= X-Forefront-Antispam-Report: CIP:131.228.6.100;CTRY:FI;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:fr711usmtp2.zeu.alcatel-lucent.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(82310400026)(1800799024)(36860700013)(376014)(7416014)(921020);DIR:OUT;SFP:1101; X-OriginatorOrg: nokia-bell-labs.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Mar 2025 00:27:59.8249 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: d3a9ecc1-a4af-43b0-8229-08dd65b3bce5 X-MS-Exchange-CrossTenant-Id: 5d471751-9675-428d-917b-70f44f9630b0 X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: 
TenantId=5d471751-9675-428d-917b-70f44f9630b0;Ip=[131.228.6.100];Helo=[fr711usmtp2.zeu.alcatel-lucent.com]
X-MS-Exchange-CrossTenant-AuthSource: DB1PEPF0003922E.eurprd03.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: GV1PR07MB9023
X-Patchwork-Delegate: kuba@kernel.org

From: Ilpo Järvinen

Implement the heuristic algorithm from draft-11 Appendix A.2.2 to
mitigate false ACE field overflows.

Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
 include/net/tcp.h    |  1 +
 net/ipv4/tcp_input.c | 18 ++++++++++++++++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index d531da9f9af8..09220668d2e3 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -244,6 +244,7 @@ static_assert((1 << ATO_BITS) > TCP_DELACK_MAX);
 #define TCP_ACCECN_MAXSIZE		(TCPOLEN_ACCECN_BASE + \
 					 TCPOLEN_ACCECN_PERFIELD * \
 					 TCP_ACCECN_NUMFIELDS)
+#define TCP_ACCECN_SAFETY_SHIFT		1 /* SAFETY_FACTOR in accecn draft */
 
 /* tp->accecn_fail_mode */
 #define TCP_ACCECN_ACE_FAIL_SEND	BIT(0)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d7498b1c9fb9..f62bbf6f4eb3 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -695,16 +695,19 @@ static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb,
 				u32 delivered_pkts, u32 delivered_bytes,
 				int flag)
 {
+	u32 old_ceb = tcp_sk(sk)->delivered_ecn_bytes[INET_ECN_CE - 1];
 	const struct tcphdr *th = tcp_hdr(skb);
 	struct tcp_sock *tp = tcp_sk(sk);
-	u32 delta, safe_delta;
+	u32 delta, safe_delta, d_ceb;
+	bool opt_deltas_valid;
 	u32 corrected_ace;
 
 	/* Reordered ACK or uncertain due to lack of data to send and ts */
 	if (!(flag & (FLAG_FORWARD_PROGRESS | FLAG_TS_PROGRESS)))
 		return 0;
 
-	tcp_accecn_process_option(tp, skb, delivered_bytes, flag);
+	opt_deltas_valid = tcp_accecn_process_option(tp, skb,
+						     delivered_bytes, flag);
 
 	if (!(flag & FLAG_SLOWPATH)) {
 		/* AccECN counter might overflow on large ACKs */
@@ -727,6 +730,17 @@ static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb,
 	safe_delta = delivered_pkts -
 		     ((delivered_pkts - delta) & TCP_ACCECN_CEP_ACE_MASK);
 
+	if (opt_deltas_valid) {
+		d_ceb = tp->delivered_ecn_bytes[INET_ECN_CE - 1] - old_ceb;
+		if (!d_ceb)
+			return delta;
+		if (d_ceb > delta * tp->mss_cache)
+			return safe_delta;
+		if (d_ceb <
+		    safe_delta * tp->mss_cache >> TCP_ACCECN_SAFETY_SHIFT)
+			return delta;
+	}
+
 	return safe_delta;
 }
 

From patchwork Tue Mar 18 00:27:07 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)"
X-Patchwork-Id: 14020101
X-Patchwork-Delegate: kuba@kernel.org
Received: from AS8PR04CU009.outbound.protection.outlook.com (mail-westeuropeazon11011012.outbound.protection.outlook.com [52.101.70.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C35BC1552FD; Tue, 18 Mar 2025 00:28:05 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=52.101.70.12
ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257687; cv=fail; b=NR2fIwyRvR2wiOA/419hi6WK9YMLwSgIMBdSc8aZV2cWhArREibIHLudvAWAR1LhYLytSQ0jY7794m1iQ+0KGF9HCD1vCgiapweYrOQci2G66/Q3oRAIUr+RQR/Dstmw8Oay9vYZ0ojSdiI1ZozM53kRF9mMRKM76Y2jvnwaS9M=
ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257687; c=relaxed/simple; bh=P+y5pe+LjBEM3YJfO5X11HdJpVJ/6TKxbk0/u/q0G70=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=sC2nHFEeBYCCpOmkajAggM3Fi5VpNXw9/oWcnbhIh0fJJUIHPYAOs83LYHtAjE+jkmohkFO15bgraxW8uiW9sNDdplumVjQYTc8H5yb0jFumebrqgVs+MQbt9sYyc5tn0r/AtZeoUu3xcD5HY3I1BFsw8LfG6MkK0iBqwmKFYzk=
ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none)
header.from=nokia-bell-labs.com; spf=fail smtp.mailfrom=nokia-bell-labs.com; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b=ABQhpUeY; arc=fail smtp.client-ip=52.101.70.12 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b="ABQhpUeY" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=JUi5TAz7QFWwKV+r6nKFGV1ZSSqsz0raQ+8jzvihzfPsQ1bPin92SM+9ZENfX4Iixbpf5wLNy+BpjC7Usn+v59OTR1S7rr46Y7gQ1LTQ/h1hsYqZo8YOnjWqDnxTUR/ex2pND/Ibb9NIXLpw78E6YcwHtN8RtLF97NkCm5DJehfh9fyo7ts/DxWPuqKUBgd+MwZn2iEIziidc4qASDPN2Un8A6UgROP+B0ToJkupDk22HH77lIfLW7O+nm26RZeJM6xd427LdbDz0lnMJp5GfyQKXsT0z7P6OUonhthXgFfkp7t5ubJ3/tR40aEIKYAIGL9bzCIwiMtIqxNckfBG7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=RJ5VMo6AccBQ2E1Aco5pSiTxmnzfuaVGs7CuYQiAUMI=; b=DJOCVhSd6VHzO1cyqq/XXOPTZOkXaPIMcjhxiufXPW8VfbekA7Y3j+OOZoMUMtXUCGCIb49h0bsJ0HdRIfSR6pOidgi9JE3hrY5lGXBiWpX6PSKuEfspKMJSz9ibG2IEv9AIPHIj1ATl868zKbURr44cC3WxYfRZDDBDH79svHwcGsQZ8Dqwy9Jlltgx8s31uaG1dDLqU3hVvFLO0aiqGEPJR664wyh1Uj1/6c65BuJriqWkQAbS146e8T6WctyugpmHGOg6JRx77rmXx5xb6Od3lJO+cCPQrz/Yu0ZPUwN7qy3bI3WQBFO9zzsDRvhnUKx18UbBZGSvIiEXY3pAmQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 131.228.6.100) smtp.rcpttodomain=nokia-bell-labs.com smtp.mailfrom=nokia-bell-labs.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nokia-bell-labs.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=nokia-bell-labs.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=RJ5VMo6AccBQ2E1Aco5pSiTxmnzfuaVGs7CuYQiAUMI=; b=ABQhpUeYVODVo5dVRjGSyYQHXYq2PeoHh0ZKYTwTnAjrvasFg7ZJXNUBiHmU3KudX6b3t47ChOhv/ID88OTtMP9W7W8Yw89qSoQ1+pwgc8Nwi31V1a3uIDdxA1TXyCFjQ5fmBq4CV97LiAuokjo2VTuqTXg5GXx1Lb4KvYxtys9ohNJkIO4/LS3I7gMHbnyzz5j42C5IsPnICFtQSCr9lPwcZYRFhzbsSeCmmkXQ98VaF93kD4MjW7T+DdI57jtitlTRMKBylZ5EK7WN0Mc4217zC156Ow2Kg25frL9DaSWlqwqslWaKorS6ElqeVlAtHpZzf+xbMfFHRVBtx6vHYg== Received: from PAZP264CA0136.FRAP264.PROD.OUTLOOK.COM (2603:10a6:102:1f8::6) by GVXPR07MB9793.eurprd07.prod.outlook.com (2603:10a6:150:113::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.33; Tue, 18 Mar 2025 00:28:01 +0000 Received: from AMS0EPF00000196.eurprd05.prod.outlook.com (2603:10a6:102:1f8:cafe::92) by PAZP264CA0136.outlook.office365.com (2603:10a6:102:1f8::6) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8534.33 via Frontend Transport; Tue, 18 Mar 2025 00:28:01 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 131.228.6.100) smtp.mailfrom=nokia-bell-labs.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nokia-bell-labs.com; Received-SPF: Pass (protection.outlook.com: domain of nokia-bell-labs.com designates 131.228.6.100 as permitted sender) receiver=protection.outlook.com; client-ip=131.228.6.100; helo=fr711usmtp2.zeu.alcatel-lucent.com; pr=C Received: from fr711usmtp2.zeu.alcatel-lucent.com (131.228.6.100) by AMS0EPF00000196.mail.protection.outlook.com (10.167.16.217) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.20 via Frontend Transport; Tue, 18 Mar 2025 00:28:01 +0000 Received: from sarah.nbl.nsn-rdnet.net (sarah.nbl.nsn-rdnet.net [10.0.73.150]) by fr711usmtp2.zeu.alcatel-lucent.com (GMO) with ESMTP id 52I0RNBs024935; 
Tue, 18 Mar 2025 00:28:08 GMT From: chia-yu.chang@nokia-bell-labs.com To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com Cc: Chia-Yu Chang Subject: [PATCH v2 net-next 12/15] tcp: accecn: AccECN ACE field multi-wrap heuristic Date: Tue, 18 Mar 2025 01:27:07 +0100 Message-Id: <20250318002710.29483-13-chia-yu.chang@nokia-bell-labs.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: AMS0EPF00000196:EE_|GVXPR07MB9793:EE_ X-MS-Office365-Filtering-Correlation-Id: 91e90bc2-eafd-414d-50bb-08dd65b3bd9b X-LD-Processed: 5d471751-9675-428d-917b-70f44f9630b0,ExtAddr X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|82310400026|36860700013|1800799024|376014|7416014|921020; X-Microsoft-Antispam-Message-Info: =?utf-8?q?K+c3M1/WX90tBtg1MdG+/jZ5aMQnxWV?= =?utf-8?q?2iecy53nC3KeUcNUAHeSHmoFxVpp/WVo5GZncJWP4l9nWK/RuIKzxU00saLMt4gnx?= =?utf-8?q?GinxuvpSTyZfcs8RSuOF1R90wyweU4V/VP1nx2NNzQUekUNpYJQqMwn8kiHJlpdza?= =?utf-8?q?0Cj1Eq9nHJUclxvE37lRriNIn3WT70K01kwfaFH+cm0VwZ6B4lqbpkPlbA0nJ6/hr?= 
=?utf-8?q?MhHps3ryeCEOd6TTpEnn9YMD/HWphW9GqJKRnHc32Ga34kOP7okrQOW8JAIWqdgdS?= =?utf-8?q?BZF1DcmupIHoFo+jhZNb5uUKMvjkalc0EeABMNPjmdmoMXawyR8cV7ZCnzhB4pR/m?= =?utf-8?q?sSO8hTh+CHZLyxIgZUUmN83YiyW6PtftnYloO8PSwIZJ0eBleIaVpNqlO/KvZJeRJ?= =?utf-8?q?/zy80eIfxu2aSAx2uJUoCrHh7BLftt0aCtF/v/KahS2kXBi4aMemPCOnF2hrZ+jg2?= =?utf-8?q?g5+uPxYbN6c61Aa/UhWqIeQnXJ81kHSarYLU/hLSuULxVCI4cy132d5bgsAo3brCM?= =?utf-8?q?C6E9gYpK1QWmqQghqxj0bHDzOHZ2mgZxVEcwsBAb7cCNqiE806KKV9bjhbZhMukA1?= =?utf-8?q?HjwboLMK6OkUkVEOozQQuuQ3dZ+c2u0eMfmdIOtXXLcUg1mmOGJVnAiTBD8i38aZJ?= =?utf-8?q?vEj3bYqD8d8BxcEXyoC0EkRrx7gYBxr/cqviXnnQOADDUGWabEtxzLIK2IrWdDU2U?= =?utf-8?q?1cFTy01OqjdgDj+FIMRPCBp1c5WxlEofEN6WBvGHBXFoiVED+kFIvq0xLpsNeM6LI?= =?utf-8?q?F6IgA+Xzimb4eMNceLEeE22WdR5dz24ymqTqV+KfA6cvlh/k177DAVUvQofu1BKdB?= =?utf-8?q?p1nyUODgAA0duIyX8WGe7CGj9+kktd6D1wktTNFq5dt9XDPLOK8Xj75EkNLYfaYCp?= =?utf-8?q?XMe2oOZwHODJVzXtVhcH/y7uwAM9DMFTf1+cAW8tsxptgYWW+t8ACfI69uRPMzsIE?= =?utf-8?q?DqQ/lSKJcjDWlg1ihttqKUUNPLdUav3tn7WtZ90iJl9cvlDWt6p9qypw4t2af7Xec?= =?utf-8?q?fK5MX3IrdgGeCzsTGhoGnv6wpY5zVdCtovLyyOqo9QyxlyB3iryxXhQULAhbwln86?= =?utf-8?q?h7RBfZtqu7WgGIMyEiLGUIq2Me5Kq/9z3g6m0xXEEcO28OUhH3qWfrxhZVxyJiqw+?= =?utf-8?q?2nXmiRoWDD56CZsJm9BN5JQcQGWEJKUQ3KNPU9+Xl4Lc7j2mq4+aLNIU/LOboP9lE?= =?utf-8?q?3+sVRwkysjiu/sjDUHIReRIZ3OhTU/KhLVZOiau/+kTc1FS/AI5WFqQRRIoUolBNa?= =?utf-8?q?1DAMt5IDOMX3yY8+wIlGSVC30Fq+VvVal75GPk+RFXF4CvzltQE4OrKKuIQMzq2HC?= =?utf-8?q?tz4aC+oidQ98+zBFyzWP9B3wbbpMa2zvmRN6YTCwVpAXg3FnskzFwJki2SSnvN6+z?= =?utf-8?q?YP17rVwoQ9qn0dWkaU+ut9JEFUXR+fnJXPWbpM4Q0OUVZhAnT/uwteFjtAqEelCgJ?= =?utf-8?q?GoCPuHztxI?= X-Forefront-Antispam-Report: CIP:131.228.6.100;CTRY:FI;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:fr711usmtp2.zeu.alcatel-lucent.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(82310400026)(36860700013)(1800799024)(376014)(7416014)(921020);DIR:OUT;SFP:1101; X-OriginatorOrg: nokia-bell-labs.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Mar 2025 00:28:01.0321 (UTC) 
X-MS-Exchange-CrossTenant-Network-Message-Id: 91e90bc2-eafd-414d-50bb-08dd65b3bd9b
X-MS-Exchange-CrossTenant-Id: 5d471751-9675-428d-917b-70f44f9630b0
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=5d471751-9675-428d-917b-70f44f9630b0;Ip=[131.228.6.100];Helo=[fr711usmtp2.zeu.alcatel-lucent.com]
X-MS-Exchange-CrossTenant-AuthSource: AMS0EPF00000196.eurprd05.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: GVXPR07MB9793
X-Patchwork-Delegate: kuba@kernel.org

From: Ilpo Järvinen

Armed with the ceb delta from the option, delivered bytes, and
delivered packets, it is possible to estimate how many times the ACE
field wrapped.

This calculation is necessary only if more than one wrap is possible.
Without SACK, delivered bytes and packets are not always trustworthy,
in which case TCP falls back to the simpler no-or-all wraps ceb
algorithm.

Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
 net/ipv4/tcp_input.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index f62bbf6f4eb3..5c71135b43f7 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -734,6 +734,24 @@ static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb,
 		d_ceb = tp->delivered_ecn_bytes[INET_ECN_CE - 1] - old_ceb;
 		if (!d_ceb)
 			return delta;
+
+		if ((delivered_pkts >= (TCP_ACCECN_CEP_ACE_MASK + 1) * 2) &&
+		    (tcp_is_sack(tp) ||
+		     ((1 << inet_csk(sk)->icsk_ca_state) &
+		      (TCPF_CA_Open | TCPF_CA_CWR)))) {
+			u32 est_d_cep;
+
+			if (delivered_bytes <= d_ceb)
+				return safe_delta;
+
+			est_d_cep = DIV_ROUND_UP_ULL((u64)d_ceb *
+						     delivered_pkts,
+						     delivered_bytes);
+			return min(safe_delta,
+				   delta +
+				   (est_d_cep & ~TCP_ACCECN_CEP_ACE_MASK));
+		}
+
 		if (d_ceb > delta * tp->mss_cache)
 			return safe_delta;
 		if (d_ceb <

From patchwork Tue Mar 18 00:27:08 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)" X-Patchwork-Id: 14020102 X-Patchwork-Delegate: kuba@kernel.org Received: from EUR05-VI1-obe.outbound.protection.outlook.com (mail-vi1eur05on2084.outbound.protection.outlook.com [40.107.21.84]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0966A81741; Tue, 18 Mar 2025 00:28:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.21.84 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257688; cv=fail; b=Ip2Y82tDaz1kiX51jdxhVWAMKQ+bQ2XRAIAINjlzkBmFM4C5I7SEHNdoCsgPFlin+IEs3uk4HFf/5EDaeQiJdJCGd9C5B7+f8V7dekeiBg/jlJvbFPIuBeK8kKLavo8HpsHEOb08YJ1xygpLevJYKP4EYRLBklAmk3NS9O9BJcA= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257688; c=relaxed/simple; bh=lbD1m6Bf5JVDs6uG5YSX8tUoJRk8XTkiOdk2SGkJnpI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=lnUImwNEZjvFEle7GlPzMc626TN6K4LvGbGSG6so6Il0eegk/IjN0eYJViiqh0enfpQIS7wvLH6u71fiDe1cozL/QRNJ0+bwmMhUac156HSVeClnCZJmlvW8So8uE/s6I9K5rrmrwMWSJgi8Wc7lQlEHUJJhsc1EO/QRWDqlZxU= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com; spf=fail smtp.mailfrom=nokia-bell-labs.com; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b=HDLc9GpH; arc=fail smtp.client-ip=40.107.21.84 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b="HDLc9GpH" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; 
d=microsoft.com; cv=none; b=JU6MXNQdc/lLh3AoaCC72u+eH4lgsqJ4l79nNCJTsznnX73+EPNPvYIl+5MKCrVBSjt9qOTuKA8KyAD4i4/ESco+GlHC4X1rlqXWrcghQ1tO46MiQLqfgmSn4BXs+3/DXV3ITR6rqfFyT38C8H4JXDs3ZXM0qMhup5aZZySTlPPmGnsMGzZI4zFyre3kkbNEMLle5PWnEDYmgav1xpjSsYibilnkDzAbeyaFuV2KTEz5z66mDGrHVgi90MNYJDX6SLZdz+yHwyyZ8z/xET0iYXNbNekRsaTqQ6fPFeCwNomilHMz9Vd0oND7K0ek9B9GS+ZVf7xH6l1yjXDpcaXGnQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=H+c3S3ZKWX7fFFw+ZnK2eOgtBq08riLwt5aOlut6+Pw=; b=VSWwRTsfJe0nqMSpUhtP9QY55X/peK9QI1baL/RwtvJeoQaOK34J9ELAlWjCcfgygt1+kAXIiJ68oVOu/H/dFCyDqS9iGK2a2pAypU8Tw5K1BVrfMYaIzMu6UPYXmxQDol+1uJVkjMYOKGwilBiO7EB/ackDLB1JkkXvMHCk35RIWSAJMqABED+A2EXFj65DBBwItglXDnKINXWkLqRkeq68YoxgB/gFWRxaZDWcUQFPqEb+lDSD9iXpA5RD/lNluGtVLJv8Shx3yfNwQwF+OZQAZDQP8B3R50yaKBr6IZsl7vyFzJOrl9x3unWTnPAgYt0ASCwFLo3FrtYuBzn6KQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 131.228.6.100) smtp.rcpttodomain=nokia-bell-labs.com smtp.mailfrom=nokia-bell-labs.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nokia-bell-labs.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nokia-bell-labs.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=H+c3S3ZKWX7fFFw+ZnK2eOgtBq08riLwt5aOlut6+Pw=; b=HDLc9GpHTp1vQRqy3LRpVWA3x3mZijpAWXeleFuaqJGcO7c/1AnHL/bh8NNyLkLGlEqADfHItlCwRzRVsWBRqgv6Y8R8XxlY/ywuqvwxcmvIKUZTYuLpPEBhVznl5bvEtdBYXEnZz5HuHsREFWAt9rcl1SYjz5rvsdYGrEG9IIkYRUSBAJsJDhxXgONuPnWqhs+BDOykX8xVFBRANGTAm+1VVHOy7xz0Pk3Dm81/WYVwiHjuDs2+J9y5kT/HcLoLwiYC7AfKJIjan2k3IJxwjog8y25rQtYp7e+unl850zY/LjA29iUzvtF7DqDHwSVIh/c1p4uPLK9LF6Y5TGcC9w== Received: from CWLP123CA0247.GBRP123.PROD.OUTLOOK.COM (2603:10a6:400:19e::9) by 
PA4PR07MB8390.eurprd07.prod.outlook.com (2603:10a6:102:2a1::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.33; Tue, 18 Mar 2025 00:28:02 +0000 Received: from AM4PEPF00027A5E.eurprd04.prod.outlook.com (2603:10a6:400:19e:cafe::32) by CWLP123CA0247.outlook.office365.com (2603:10a6:400:19e::9) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8534.33 via Frontend Transport; Tue, 18 Mar 2025 00:28:02 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 131.228.6.100) smtp.mailfrom=nokia-bell-labs.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nokia-bell-labs.com; Received-SPF: Pass (protection.outlook.com: domain of nokia-bell-labs.com designates 131.228.6.100 as permitted sender) receiver=protection.outlook.com; client-ip=131.228.6.100; helo=fr711usmtp2.zeu.alcatel-lucent.com; pr=C Received: from fr711usmtp2.zeu.alcatel-lucent.com (131.228.6.100) by AM4PEPF00027A5E.mail.protection.outlook.com (10.167.16.72) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.20 via Frontend Transport; Tue, 18 Mar 2025 00:28:02 +0000 Received: from sarah.nbl.nsn-rdnet.net (sarah.nbl.nsn-rdnet.net [10.0.73.150]) by fr711usmtp2.zeu.alcatel-lucent.com (GMO) with ESMTP id 52I0RNBt024935; Tue, 18 Mar 2025 00:28:10 GMT From: chia-yu.chang@nokia-bell-labs.com To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, 
cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com Cc: Chia-Yu Chang Subject: [PATCH v2 net-next 13/15] tcp: accecn: try to fit AccECN option with SACK Date: Tue, 18 Mar 2025 01:27:08 +0100 Message-Id: <20250318002710.29483-14-chia-yu.chang@nokia-bell-labs.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: AM4PEPF00027A5E:EE_|PA4PR07MB8390:EE_ X-MS-Office365-Filtering-Correlation-Id: 7fc59d8f-a9cc-4966-c753-08dd65b3be6d X-LD-Processed: 5d471751-9675-428d-917b-70f44f9630b0,ExtAddr X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|82310400026|36860700013|1800799024|376014|7416014|921020; X-Microsoft-Antispam-Message-Info: =?utf-8?q?vDD4sXD5Wu+pn1X2iYmbbDgpI1/rfgg?= =?utf-8?q?QUZz6UuVm5YQ/32P2arIqyoz/wIGemnrgPfag5NXO4Q0ag4gX1IuPH2jm/XuBD8Ii?= =?utf-8?q?FCBHRvKa79Xp6BppIkQHLyVaUiqDQvsoZC55NKyC0kbmSPHB8/t91zuL5noxSBD74?= =?utf-8?q?+/1YuE6I9E1Lpxbd62dKvb9xqr3PykWb195/RN4wCBu295Dn5H37lbb9nAdRv2URa?= =?utf-8?q?OWRMkE7c0hzYIWyeUnmHcCBe8BqhVIypuRFgh6a1/Yu+gw/qXaDBdYVoR6twj68sv?= =?utf-8?q?R1xy9MZWpWY1lyrmnOeOGAq51LXPQbjBU7cBjjhe/NX2Oc5O5JvWLrpW6dubhj+JW?= =?utf-8?q?lPywXIO4S8qShxbCmUBXSdl/YewwGmzUGEbgYS9QLDgEP3IVZJnAzhIz+dgGlINBD?= =?utf-8?q?8sqykgJ5bpBgjmqcLoHHvcGcwtgusg6AKj9a0G0CDsIT0zenB9/HKNrIxwmHwVrgs?= =?utf-8?q?1KmdbJybIWDaEmZFi1QANIHzJBncASZne6CNFkjQqNJywYj9R15If89zlMXERzNes?= =?utf-8?q?cfR9WlVc7vEvxbO3mX7BxfOP9OB0dxk3+rHwDH9nFrfszuJ12eD6dyaENRXdconjh?= =?utf-8?q?lWueKc7wnkQcD0e942x0mOsnt+rsQDlmvLK0C0E5MgaP6p7BBZsVF98E2Z0+cjYQP?= =?utf-8?q?om8BdVJNDx61VG/wxdZBah+gDoqDb6r5/r4IFJ7WdYrIDP39QDdDKTNf3KOZ5giSl?= 
=?utf-8?q?bb0SuZvk9ghwUUpvv7lz6IcxA0EiIRj9vUoRatoOlhedjo975nrpH/TifbSIY1Bts?= =?utf-8?q?O29ABMFzUj0LjGvuc+njqCba6ZT4x5pBYzR2MDPPggi87lT0rCAhY15f1qlt4iU/z?= =?utf-8?q?FoVY9HlxX09U36NoqKUbQ07pWEbmCSgDBa+RKfoOI1Hxt3D7gE5wGchA3SClJdqMM?= =?utf-8?q?CVRuCu9cN9rU6SBSQ7qGgahfZFWuE2SfXPvRB4eMDEVoE7ppK7+aGFLZy6JRpZVw6?= =?utf-8?q?5npUhcdpfoZzjChcfU6cMpd/s50fCHeWS1LNBkuFF2cu/sxx9lUpABD25Wpr8aJf2?= =?utf-8?q?Z/48/wlmY2yZgyGIeSZuuaEN8xpQBK1zBAFFeKBTkA65JDpxnikalDATBkEWuLmdV?= =?utf-8?q?gIaWGczmmN3LHg83GhMvWXJ4o6ufe9Kqbx7o9SsDANb1fEP84Mp83SIHaMLMi86QY?= =?utf-8?q?QgyXj5WdgT2mvAKkv1e+vAnKMEHmgjrtgP24ABwvuagFPbYjkujCWGi4u+CceGzYU?= =?utf-8?q?IuWPPriFAHnuwYTqqPvuSuvL0bpFMhYdbq8+4Nl6AjiGK+PVPhpbCnuk0rU3LqDzF?= =?utf-8?q?ThsYR6/HH/b3YKOpqdUlru8FUhnAsK1PYb1t0Wm5G/Z7O7aMVtu4W1FxFHtU9oNVq?= =?utf-8?q?6FNAjW1lmwpXLdULdthy/mXNxxr0tKU+0bzYWUTHBuN3cpXEm/t34WqfdmMgpmRIe?= =?utf-8?q?m3AZBm4/hQb0zkf78jbjzOyM++KAtUxhIoY5Z2GibhFfysStMMG4m4=3D?= X-Forefront-Antispam-Report: CIP:131.228.6.100;CTRY:FI;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:fr711usmtp2.zeu.alcatel-lucent.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(82310400026)(36860700013)(1800799024)(376014)(7416014)(921020);DIR:OUT;SFP:1101; X-OriginatorOrg: nokia-bell-labs.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Mar 2025 00:28:02.3957 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 7fc59d8f-a9cc-4966-c753-08dd65b3be6d X-MS-Exchange-CrossTenant-Id: 5d471751-9675-428d-917b-70f44f9630b0 X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=5d471751-9675-428d-917b-70f44f9630b0;Ip=[131.228.6.100];Helo=[fr711usmtp2.zeu.alcatel-lucent.com] X-MS-Exchange-CrossTenant-AuthSource: AM4PEPF00027A5E.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PA4PR07MB8390 X-Patchwork-Delegate: kuba@kernel.org From: Ilpo Järvinen As SACK blocks tend to eat all option space when there are many holes, 
it is useful to compromise on sending many SACK blocks in every ACK and try to fit the AccECN option there by reducing the number of SACK blocks. But never go below two SACK blocks because of the AccECN option. As the AccECN option is often not included in every ACK, the space hijack is usually only temporary. Signed-off-by: Ilpo Järvinen Signed-off-by: Chia-Yu Chang --- net/ipv4/tcp_output.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 6bee68795b0e..fafca7cd369b 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -981,8 +981,21 @@ static int tcp_options_fit_accecn(struct tcp_out_options *opts, int required, opts->num_accecn_fields--; size -= TCPOLEN_ACCECN_PERFIELD; } - if (opts->num_accecn_fields < required) + if (opts->num_accecn_fields < required) { + if (opts->num_sack_blocks > 2) { + /* Try to fit the option by removing one SACK block */ + opts->num_sack_blocks--; + size = tcp_options_fit_accecn(opts, required, + remaining + + TCPOLEN_SACK_PERBLOCK, + max_combine_saving); + if (opts->options & OPTION_ACCECN) + return size - TCPOLEN_SACK_PERBLOCK; + + opts->num_sack_blocks++; + } return 0; + } opts->options |= OPTION_ACCECN; return size; From patchwork Tue Mar 18 00:27:09 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)" X-Patchwork-Id: 14020103 X-Patchwork-Delegate: kuba@kernel.org Received: from DB3PR0202CU003.outbound.protection.outlook.com (mail-northeuropeazon11011016.outbound.protection.outlook.com [52.101.65.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EA82F155753; Tue, 18 Mar 2025 00:28:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=52.101.65.16 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1742257689; cv=fail; b=uMFPM7iehOESo6LHuVo221y7ipYBdi2jn4D8bz3QvX+bWYeZ+kDVItoQjYQHjO02npoPol75dGEDkPvJZwBhQgKAdX6AUlbtrFAjuShHx2AVsuQr1DvziPj4SetQ2NdhexDnI4ZlFSBVDmBohpeYwn0jpgOhEpTJBi2YivOmulA= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257689; c=relaxed/simple; bh=tKsZlajXLpAOIbyJ8qGKdzctObD1YPtdFcgDVM+hUHU=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=gZ/hGrfulgVrWfiDfDkJ8OAEacVHlzzI8Mi1s8LZbQGtWUKbIyYsK67p6bRuvizx5jUzacW+UdVZqeiFFdUbhjHvxp30QKSmgUaUUb94lTEJ21gS4hPLe+mEHiQOr/OZRAyfMxc+iVdtAdK67J0BWcdGCzmnVsmgo+gysvv7yS8= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com; spf=fail smtp.mailfrom=nokia-bell-labs.com; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b=Jd2enOf2; arc=fail smtp.client-ip=52.101.65.16 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b="Jd2enOf2" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=ZNIJSeDGbCq8sVWz5b1aqht6jIpmQHo6I1DS8G2gxgITgk4moai6OCoteEicujF2x6a/LUs7sSnUctaiZhQVDU2DA2wh5txLHL/moRUDhlT5rbWfZ5ugmV2w4H91nN9PEoXrxAMs4UOho1B8MrOV3dr+ViaOiGwBb9lf8va5h2mCfi9cc9GkgKxK8oZ1aUV82PfU2DyC4AHc3YeABVmvcDO167hFfvienZgXt/1eiuRdBdg1dkdDpNnKJAQ62pj5h+V5yRu1zHCi93SV96kcTo6rtHPBiRwHG6RUM99BUsI3UVSxuNNAe+eh883HRdtD4ofKcAWF08viTkaR8EtN/g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; 
bh=LphzWS9GjpX6oHrIgVfxf1jJOgRAdXuN7feHQH5h6s4=; b=GTk5YGTM9uzBCAgrDVygV9qxnAD80UPqemVCsehFs9lBQ0WoukExW7GiYkmQtelInbPmW31H2+xWuQ+EivuE0KjysScKDn85NW1TzcZ76ru0r207F3qM/9KzNxom6b8biqB2ifT+D2MeVfv8Cb/sJH8nMTo9AiVAaciw3yenfaZv17oU8GLywY1SzmRHJuhFRUQWlem8ACMHd/EvzyErDnbvvIks782H5ibYddCWTJ7U8GAJ/Q1wJQBASrDS6GfavT6rM7z5HTVsaXr5hjXeJteXaEvxzeFUaC5nUL0Dze8gcZC1UPfBDc3qX0kW4rZ/so9MlgmXm7F8A2ldK34D2Q== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 131.228.6.100) smtp.rcpttodomain=nokia-bell-labs.com smtp.mailfrom=nokia-bell-labs.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nokia-bell-labs.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nokia-bell-labs.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=LphzWS9GjpX6oHrIgVfxf1jJOgRAdXuN7feHQH5h6s4=; b=Jd2enOf2Ryf4CA8Nh+KSQ8MTy8b1I9oz1dF36Ug0CKEItIzf6JwTsGTcfNfsIwi+OG9UXzmOHEKerbtejWMDAbR9kHJ10ao8SY/YjwaKw+A9/Yj2jPYr3YVElPTHmXThQD0rm1WJP9FwnuDQ3c4TdJnFbNgbvX6I+uzPerDCIb+6yeG6WrIYlJsTA98mq4Lz50tofJJplFHOk3Whmk8krBRQ9LiueaMNCcAxOr+UfcMTgR/FJ4/mkJBq9aBjVdo55gLwoMig4rhBXsbIsCjAA9byoof9hFvIjhJDUXVbswT/JkAhUnsk0siBmn8guf4I08UW8MBV1u25ZIdaX5rctw== Received: from DB7PR02CA0011.eurprd02.prod.outlook.com (2603:10a6:10:52::24) by AM0PR07MB6305.eurprd07.prod.outlook.com (2603:10a6:20b:15f::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.33; Tue, 18 Mar 2025 00:28:04 +0000 Received: from DB1PEPF00039230.eurprd03.prod.outlook.com (2603:10a6:10:52:cafe::89) by DB7PR02CA0011.outlook.office365.com (2603:10a6:10:52::24) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8534.33 via Frontend Transport; Tue, 18 Mar 2025 00:28:04 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 131.228.6.100) smtp.mailfrom=nokia-bell-labs.com; dkim=none (message not signed) 
header.d=none;dmarc=pass action=none header.from=nokia-bell-labs.com; Received-SPF: Pass (protection.outlook.com: domain of nokia-bell-labs.com designates 131.228.6.100 as permitted sender) receiver=protection.outlook.com; client-ip=131.228.6.100; helo=fr711usmtp2.zeu.alcatel-lucent.com; pr=C Received: from fr711usmtp2.zeu.alcatel-lucent.com (131.228.6.100) by DB1PEPF00039230.mail.protection.outlook.com (10.167.8.103) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.20 via Frontend Transport; Tue, 18 Mar 2025 00:28:04 +0000 Received: from sarah.nbl.nsn-rdnet.net (sarah.nbl.nsn-rdnet.net [10.0.73.150]) by fr711usmtp2.zeu.alcatel-lucent.com (GMO) with ESMTP id 52I0RNBu024935; Tue, 18 Mar 2025 00:28:11 GMT From: chia-yu.chang@nokia-bell-labs.com To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com Cc: Chia-Yu Chang Subject: [PATCH v2 net-next 14/15] tcp: try to avoid safer when ACKs are thinned Date: Tue, 18 Mar 2025 01:27:09 +0100 Message-Id: <20250318002710.29483-15-chia-yu.chang@nokia-bell-labs.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email 
X-MS-TrafficTypeDiagnostic: DB1PEPF00039230:EE_|AM0PR07MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: 428cefdd-286a-4600-1806-08dd65b3bf64 X-LD-Processed: 5d471751-9675-428d-917b-70f44f9630b0,ExtAddr X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|36860700013|82310400026|7416014|376014|921020; X-Microsoft-Antispam-Message-Info: =?utf-8?q?OXPsr8ZxPeVCRALcb4Yx7V1GeDMmQnG?= =?utf-8?q?Z0oZM+BMBNXnrImj6yOgn1w2hohGLZ3rLVu07E3oAhxSvysWgqZ/f0VHp4eLuRI16?= =?utf-8?q?qNIJYRcURmSyTKVfBa3c/22LeRJFKVgG+hF94VleKVd9eKyINRQwwE20cDPUJKt0Y?= =?utf-8?q?+lRrjZEZa4BLlhZUM8HpkPgjLXR+rJCotAYzZoy3ZW5EllUwymoVuivKKfQoedQdZ?= =?utf-8?q?TBskDfU+2dF5nRTI4NkMT/wFKc5tfWtS/9wGA1xf1v9Np28QpWiV73lZudfqCUKSC?= =?utf-8?q?KSFzRQToNN+PDNQgsorws03J/DqlG0o4A5TpnCxIx/oqlEnU7+kqj3X3QYe2hjbyt?= =?utf-8?q?Ua8hWbOsbzTrHckuG2CHt/K8I5pFQBu/HRoGDPeDegsydOu91ndpkF1d3PN9c/OOK?= =?utf-8?q?XjxvYxUxQlhpqxQ5aHngl2OJsB0ldj7ZhmkCYD76A35mjpSovDFyA1scchoV2AHnc?= =?utf-8?q?8hqkZ8LZxi/lnn+qrWCnrLfpF06pxwLo+mMYB+FLifbRoBqInIkptxYFVcqIcfsji?= =?utf-8?q?Y7z8lWeTfJTKNN5mEX55A5NLYmy5MLkxNGi+10tTk70Z5fAFpkJ/Vy5io7eK2pkK4?= =?utf-8?q?FNViWQ1wivWRryTpEt9eO+SyEM6SxyGMkzq7emJMsPmzsoMBkkk48AMYIyqSYDfpe?= =?utf-8?q?Fiww286MD8Ic4YXxHa0ahEkWlyKBGu+bnz+Egs/fDueWbrw9WYY2157WQWcK8onCF?= =?utf-8?q?HRoMZzkdUaWJvQMbvc3ezmM3L2Hv8VkofEfG1tzicSIofkJjyL7gngeGkkdhcMfcn?= =?utf-8?q?B1+5zDHaN03Z0j+iUj8y0s8tEGB7JCsYXQJmRWFd9ylWjnhkZO4nP2Pkr+98sED+L?= =?utf-8?q?uTK75XtwE9z62aDuzj04y5TfQIILiUvln9cfqZNDoUvVBAmDcF9iatZQSCn+ek+YE?= =?utf-8?q?MxfYJLrAe5ffNf0l1AsvS/RL+qymn09Es0G/6Xy8aT1RiUNj5cS8LDZcuqCi8cTjZ?= =?utf-8?q?ZyG8AG8NgyWxpY2/hW1FDnwfewEpSjOpgej+zC34vcIADSpGnrHNEBoTBc9b7Nze3?= =?utf-8?q?2gfzcpqA+h3EF6QkSFt4m2OwEqVx69sSKDmvmEsOH5dz6E4DuwGbsCFoFnXUq7mDZ?= =?utf-8?q?AD2urYTERMNgsNdzfAoxYPgelX5dDAIP/+hSF9Ej/xEco95FDgY/LPjlq8O5mdvXi?= =?utf-8?q?WfPJWbgQ4zaT04L3EIf6R369cUTsJulMhAFo5sZrENSlm9qiL9Vc6gVXCkpwzBoF3?= 
=?utf-8?q?CoYq/B7k9eEN8b3LiuadCWu3t53QMaiZpqIN7uibyFRu5Fqfqu54wkaC4V08dfdnP?= =?utf-8?q?jJW5wsVWjgrH9FS7Myrkj9v80+fNyIf8zZIl+2P2XUTHPfBysv7tGij1eMXdZFfHr?= =?utf-8?q?rbvXXptKyPCDvHefTA2J7ec4dBFUb0C0PFztf0yLMlq4ZDhEIdspB2cEA5sj6sPWv?= =?utf-8?q?2EiPZ1t+pSIrbJMxsqqah2tLCJ6mmVQoCPJm9oloZCnG0rPNkzW+oQRa2JWSAXKl7?= =?utf-8?q?AtvU3Hqqix?= X-Forefront-Antispam-Report: CIP:131.228.6.100;CTRY:FI;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:fr711usmtp2.zeu.alcatel-lucent.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(1800799024)(36860700013)(82310400026)(7416014)(376014)(921020);DIR:OUT;SFP:1101; X-OriginatorOrg: nokia-bell-labs.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Mar 2025 00:28:04.0101 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 428cefdd-286a-4600-1806-08dd65b3bf64 X-MS-Exchange-CrossTenant-Id: 5d471751-9675-428d-917b-70f44f9630b0 X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=5d471751-9675-428d-917b-70f44f9630b0;Ip=[131.228.6.100];Helo=[fr711usmtp2.zeu.alcatel-lucent.com] X-MS-Exchange-CrossTenant-AuthSource: DB1PEPF00039230.eurprd03.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM0PR07MB6305 X-Patchwork-Delegate: kuba@kernel.org From: Ilpo Järvinen Add an EWMA of newly acked pkts. When ACK thinning occurs, select between the safer and the unsafe cep delta in AccECN processing based on it. If the number of packets ACKed per ACK tends to be large, don't conservatively assume ACE field overflow. 
Signed-off-by: Ilpo Järvinen Signed-off-by: Chia-Yu Chang --- include/linux/tcp.h | 1 + net/ipv4/tcp.c | 4 +++- net/ipv4/tcp_input.c | 20 +++++++++++++++++++- 3 files changed, 23 insertions(+), 2 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index ccb5918c8b41..06c0ff87ad4a 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -310,6 +310,7 @@ struct tcp_sock { prev_ecnfield:2,/* ECN bits from the previous segment */ accecn_opt_demand:2,/* Demand AccECN option for n next ACKs */ est_ecnfield:2;/* ECN field for AccECN delivered estimates */ + u16 pkts_acked_ewma;/* Pkts acked EWMA for AccECN cep heuristic */ u64 accecn_opt_tstamp; /* Last AccECN option sent timestamp */ u32 app_limited; /* limited until "delivered" reaches this val */ u32 rcv_wnd; /* Current receiver window */ diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index c4eadf6dd6fb..01532c2b6acd 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3369,6 +3369,7 @@ int tcp_disconnect(struct sock *sk, int flags) tcp_accecn_init_counters(tp); tp->prev_ecnfield = 0; tp->accecn_opt_tstamp = 0; + tp->pkts_acked_ewma = 0; if (icsk->icsk_ca_initialized && icsk->icsk_ca_ops->release) icsk->icsk_ca_ops->release(sk); memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv)); @@ -5080,6 +5081,7 @@ static void __init tcp_struct_check(void) CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered_ecn_bytes); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ce); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ecn_bytes); + CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, pkts_acked_ewma); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, accecn_opt_tstamp); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, app_limited); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_wnd); @@ -5088,7 +5090,7 @@ static void __init tcp_struct_check(void) /* 
32bit arches with 8byte alignment on u64 fields might need padding * before tcp_clock_cache. */ - CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 130 + 6); + CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 132 + 4); /* RX read-write hotpath cache lines */ CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_rx, bytes_received); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 5c71135b43f7..cbcb3a2d4786 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -690,6 +690,10 @@ static void tcp_count_delivered(struct tcp_sock *tp, u32 delivered, tcp_count_delivered_ce(tp, delivered); } +#define PKTS_ACKED_WEIGHT 6 +#define PKTS_ACKED_PREC 6 +#define ACK_COMP_THRESH 4 + /* Returns the ECN CE delta */ static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb, u32 delivered_pkts, u32 delivered_bytes, @@ -709,6 +713,19 @@ static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb, opt_deltas_valid = tcp_accecn_process_option(tp, skb, delivered_bytes, flag); + if (delivered_pkts) { + if (!tp->pkts_acked_ewma) { + tp->pkts_acked_ewma = delivered_pkts << PKTS_ACKED_PREC; + } else { + u32 ewma = tp->pkts_acked_ewma; + + ewma = (((ewma << PKTS_ACKED_WEIGHT) - ewma) + + (delivered_pkts << PKTS_ACKED_PREC)) >> + PKTS_ACKED_WEIGHT; + tp->pkts_acked_ewma = min_t(u32, ewma, 0xFFFFU); + } + } + if (!(flag & FLAG_SLOWPATH)) { /* AccECN counter might overflow on large ACKs */ if (delivered_pkts <= TCP_ACCECN_CEP_ACE_MASK) @@ -757,7 +774,8 @@ static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb, if (d_ceb < safe_delta * tp->mss_cache >> TCP_ACCECN_SAFETY_SHIFT) return delta; - } + } else if (tp->pkts_acked_ewma > (ACK_COMP_THRESH << PKTS_ACKED_PREC)) + return delta; return safe_delta; } From patchwork Tue Mar 18 00:27:10 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Chia-Yu Chang (Nokia)" 
X-Patchwork-Id: 14020104 X-Patchwork-Delegate: kuba@kernel.org Received: from AS8PR03CU001.outbound.protection.outlook.com (mail-westeuropeazon11012009.outbound.protection.outlook.com [52.101.71.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D2FDC17A319; Tue, 18 Mar 2025 00:28:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=52.101.71.9 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257690; cv=fail; b=G43yijSi/pWTe2e8nUZ6TWEv/Nz67uFU0q7yoCigsX3dNY+60JGQxpLW/SW5Ls+LnBl/kRYlGceaodbm3a3JPXz8GC/Q7vuHl6iIz4Kg6H7w9rn8TzSqinRyLsYxm+gVd0SRYcqRjfDXNEb5yVsI+y9rqeaDHOkMh/hx7WSsWaE= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742257690; c=relaxed/simple; bh=TmPK1Vyp2bFVJubcSXhUlXk5Wuf8M2fpIB8ZL5UOptg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=iFS87xzoc67tpWJwXvX34NhHjsd4zF1DFZQMfkxsKOa//xCctkK00gPSLbnJyKgX5+AlumzYG74rg8zMPnBUpJArroIBFKtFvJvBiAQKAe8RK70MVF511+KErBPDZSdKfbdZir+0Nant7n6A//f21Lulyr9awgrWGH0BA3CdiVM= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com; spf=fail smtp.mailfrom=nokia-bell-labs.com; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b=V63bKlJr; arc=fail smtp.client-ip=52.101.71.9 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nokia-bell-labs.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=nokia-bell-labs.com header.i=@nokia-bell-labs.com header.b="V63bKlJr" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=FPYA8vBLc4B1P1BzwycawK848R01ZQ7pn5k3cIxTuB1m9fQVTL16ndcJ3lvDUZAWqaDX1supoCzqkpriojWLOHQRTIz1z0QX3MeuqOCbEO3pV0g4DToM1O9v8ri4FZov3xFLcT0Iglk9ISnjedhOy/Wwu0xCxvFWQ30hNoU1zdu28bwtOESvgUvzJasSg5YAk2Nvz44570NEl1biMAQV+PYSr+Dw19/gqmy7mebMYMg5zpEvSmJLdwVsBPYiZ5rD68PdK+HIi5j1SAyTukMc8WjItYZJWNPG2QLCZDqrnf2PD0++BavwzRG8ZmSLS0ez24aBkRD218fXFujXLBNdOQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=y/4w8JPc5QkyDdKZ1GwKpR0KH4xz9bMjj6Dj9IGETB4=; b=sBmufZU4eM6stMrO/pHiuJiWzuAfk1FqPht6ryV6HqTQrFEWYF8S+AB4k7mS8OpouIRtUTVt8/PhSkn0LCCPty26GChOFfB2nWohzS64MZ19VcKqz2VH7SrhnOCfDvxIWU80ZIBFDJXjbwAYKmVO2MP3jmNOC+XjzMCgwyaEw4TiTvVAzbsDC5BoXrrlrYwqUOmHwL5ZOUFjaUkHs6p2daqNRIo494fRmVw8aCEp6iMVE69B5gim5IAjnjio2cfr3U5msDtrdTU7Lz+aN3DJT+AKw5e1lJ1TlIjh7nZHIWO/kwvwM4F4ot1aZjZXcZvyHyZ99IsE19SWFJFb/xAEHg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 131.228.6.100) smtp.rcpttodomain=nokia-bell-labs.com smtp.mailfrom=nokia-bell-labs.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nokia-bell-labs.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nokia-bell-labs.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=y/4w8JPc5QkyDdKZ1GwKpR0KH4xz9bMjj6Dj9IGETB4=; b=V63bKlJr6mE1P8z2DP7F4qgW0Bqd2eWEC+b1i8qZ6FOJWD8U96DFP6Mfoo/o4s7pnYj5mN4pTE+Lc4hb2PtCvemRwPFynrzHl4yc71YqdrqivOZSHxkfBmAlaPluE/Dwm1tEPe3vM2V/g+xQa1GhO0f7GRVK2RcqFDDqbJH+r8OR9ofzi2HquC2YBwTAnD0wl1U391umOhPO5zQQVd/Akp5GPDKw1KkptC6JtSTKwrHppX9U65hwyn9PP1yBEb2l8sZ6jI0n5wpSrhZktUgtsUGGW2CbsPI4F43Z/ziHWM2PeBiLbpx8OZ1RF/qyX3AXVmaX6PXS0+IqJcRpue6tLA== Received: from AM0PR07CA0025.eurprd07.prod.outlook.com (2603:10a6:208:ac::38) by 
DU2PR07MB8109.eurprd07.prod.outlook.com (2603:10a6:10:238::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.33; Tue, 18 Mar 2025 00:28:05 +0000 Received: from AMS0EPF00000192.eurprd05.prod.outlook.com (2603:10a6:208:ac:cafe::9f) by AM0PR07CA0025.outlook.office365.com (2603:10a6:208:ac::38) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8534.33 via Frontend Transport; Tue, 18 Mar 2025 00:28:05 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 131.228.6.100) smtp.mailfrom=nokia-bell-labs.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nokia-bell-labs.com; Received-SPF: Pass (protection.outlook.com: domain of nokia-bell-labs.com designates 131.228.6.100 as permitted sender) receiver=protection.outlook.com; client-ip=131.228.6.100; helo=fr711usmtp2.zeu.alcatel-lucent.com; pr=C Received: from fr711usmtp2.zeu.alcatel-lucent.com (131.228.6.100) by AMS0EPF00000192.mail.protection.outlook.com (10.167.16.218) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8534.20 via Frontend Transport; Tue, 18 Mar 2025 00:28:05 +0000 Received: from sarah.nbl.nsn-rdnet.net (sarah.nbl.nsn-rdnet.net [10.0.73.150]) by fr711usmtp2.zeu.alcatel-lucent.com (GMO) with ESMTP id 52I0RNBv024935; Tue, 18 Mar 2025 00:28:12 GMT From: chia-yu.chang@nokia-bell-labs.com To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, 
cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com Cc: Chia-Yu Chang Subject: [PATCH v2 net-next 15/15] gro: flushing when CWR is set negatively affects AccECN Date: Tue, 18 Mar 2025 01:27:10 +0100 Message-Id: <20250318002710.29483-16-chia-yu.chang@nokia-bell-labs.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> References: <20250318002710.29483-1-chia-yu.chang@nokia-bell-labs.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: AMS0EPF00000192:EE_|DU2PR07MB8109:EE_ X-MS-Office365-Filtering-Correlation-Id: d8c37557-8b60-4df5-8890-08dd65b3c018 X-LD-Processed: 5d471751-9675-428d-917b-70f44f9630b0,ExtAddr X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|82310400026|1800799024|36860700013|7416014|376014|921020; X-Microsoft-Antispam-Message-Info: =?utf-8?q?ZUSEgto126YzxiWlBgDrod+N5wS0oZ3?= =?utf-8?q?gjq8dJZrrp5upRnl7HNcGzcU+kPBlkb97ZaUlmb/+jPXzzmgBtDVFnoDVhd5dArVA?= =?utf-8?q?YPxivRtg0sV6ZweevyhHBwmO8HQW+Cy5NFcO6vz5/stib/ZG55+uhMsnnRd+qc3qb?= =?utf-8?q?c2qPR3duZRIf/bhLNdll0XxIzz+8xDB3UUP2Ud+BK0ibahWWYmMCFpaKypzJUeu2H?= =?utf-8?q?9WiVQUphWqoEhaPjNblJSpNzGu9b/Og8TgFY/QLBWshFYBWOGBUCALLYaBx1slw2N?= =?utf-8?q?zbV71x11Jon9pfA5Sm/r8KlRWQAPPzI7yME+HOmptDPdljC5p8R78PrkhULYk+wMG?= =?utf-8?q?Glgr3ykx1L6gX60jnyikjzEVP1QZZZR5aCVt/cN+4kpb2z/oGJkz9jRHN5CoUPCSB?= =?utf-8?q?Ezz2JnwB8lGdR64Ho+MrEDM4Fycs4jWv62ta1Iz5igyW6DT5UFru58D4MpeIqxfs5?= =?utf-8?q?imEfn8RKsMZm41sqJ+BALpT1dZZlRYk6adtRQyx9zmw3mR+sQubJIXAngUmyBbg1I?= =?utf-8?q?S3ivMlzARTgcAREaBD+KlTAy75GoAFIV92G/g21kf7jLM8zWCMTclFC/YJzE9IhPN?= =?utf-8?q?64/h0xmAYQGRb8CnQTd6pUXNpt2FRKqVfWm23+Wt5jZQFAUTR91f/acU6QQzVa5ck?= =?utf-8?q?p2qeWIVXhrDA6deuSLwxdSKtOYKgWu3kPOGs3gdilpsdfZTadEAntB/8TS5zSGWxs?= 
=?utf-8?q?UvGQJrIIhpggSfx/XQaeQgp7rtUSb+M3BYh3Z7ZGgUTA4ML3Sj5tbHYijyJqVtTk7?= =?utf-8?q?jeu4Ifrx8aVQJsbzrHF1AByk9bH35pIKmnB7iU4ehjIUoCUB4GSBL/TUCOlwYdqu8?= =?utf-8?q?TnoF2c7B/who1zahhtFmQcjs9vTQr16n8ZXx2Y/b0qNnis/kgFX5d2LhUMX9GnlMy?= =?utf-8?q?GZj3MzoY+WWcMBROGk3ikHFt5c0ROOwebYFy23WH0nQo4vJcPM/+9raH260TM91UF?= =?utf-8?q?hIIpZQj2RXm1DAr4AYycCqhenTRehOC9bQ1CjVPOfh4Y4k/GoFD+hmhc9b8+r1WFK?= =?utf-8?q?/k+qG93hWEozsYwmuTWXuRT6I+wJO2zdnzpwodCNX3kx9yG/VJMbjwV3LAIyfTlGL?= =?utf-8?q?w9V8lTR10Y+A1hmLPS8AdNhnuMToV2GSQMk8FOAL5bJWhzTC41BEB5iNdka35Dn5v?= =?utf-8?q?9baAKXqMDHbgc+3nQwyzKEnqrvTDEXCeup4aBodKdmV3mvr1I34po8lP/lyNefkm5?= =?utf-8?q?hZcoUKpXaPbnWUPU2RedZl3mUZLXZOQObgL0pBmLJ+2M5Tm1913OcknOlOS7IMRIA?= =?utf-8?q?yfPuBF2+d7yj0knu6O0QXAMHdDJjs/ZfDQiRBb4MvIPZ8QXlufocu41BfQN83aOpt?= =?utf-8?q?LpmOSceWFIMqoOu7aPzQpniubou/FLo2ltTD6c+cJob00vVfGyIsRe4V5VGciR6f7?= =?utf-8?q?e8h0dw+P4QBYVVu2OC+CIqQfOmh8tZlIVk3V4+AF8J61Qt7kMYD36U+0r5cew/AHd?= =?utf-8?q?wmYL7Z4/Rn?= X-Forefront-Antispam-Report: CIP:131.228.6.100;CTRY:FI;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:fr711usmtp2.zeu.alcatel-lucent.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(82310400026)(1800799024)(36860700013)(7416014)(376014)(921020);DIR:OUT;SFP:1101; X-OriginatorOrg: nokia-bell-labs.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Mar 2025 00:28:05.2238 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: d8c37557-8b60-4df5-8890-08dd65b3c018 X-MS-Exchange-CrossTenant-Id: 5d471751-9675-428d-917b-70f44f9630b0 X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=5d471751-9675-428d-917b-70f44f9630b0;Ip=[131.228.6.100];Helo=[fr711usmtp2.zeu.alcatel-lucent.com] X-MS-Exchange-CrossTenant-AuthSource: AMS0EPF00000192.eurprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DU2PR07MB8109 X-Patchwork-Delegate: kuba@kernel.org From: Ilpo Järvinen As AccECN may keep CWR bit asserted due 
to a different interpretation of the bit, flushing in GRO because of CWR may effectively disable GRO until the AccECN counter field changes such that the CWR bit becomes 0. No harm is done by not immediately forwarding the CWR'ed segment with RFC 3168 ECN. Signed-off-by: Ilpo Järvinen Signed-off-by: Chia-Yu Chang --- net/ipv4/tcp_offload.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c index 934f777f29d3..fd2fd70f650c 100644 --- a/net/ipv4/tcp_offload.c +++ b/net/ipv4/tcp_offload.c @@ -330,8 +330,7 @@ struct sk_buff *tcp_gro_receive(struct list_head *head, struct sk_buff *skb, goto out_check_final; th2 = tcp_hdr(p); - flush = (__force int)(flags & TCP_FLAG_CWR); - flush |= (__force int)((flags ^ tcp_flag_word(th2)) & + flush = (__force int)((flags ^ tcp_flag_word(th2)) & ~(TCP_FLAG_FIN | TCP_FLAG_PSH)); flush |= (__force int)(th->ack_seq ^ th2->ack_seq); for (i = sizeof(*th); i < thlen; i += 4)