From patchwork Tue Aug 15 19:14:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov X-Patchwork-Id: 13354198 X-Patchwork-Delegate: kuba@kernel.org Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 59B751804B for ; Tue, 15 Aug 2023 19:15:32 +0000 (UTC) Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A4EF0B8 for ; Tue, 15 Aug 2023 12:15:29 -0700 (PDT) Received: by mail-wm1-x32c.google.com with SMTP id 5b1f17b1804b1-3fe4cdb724cso53987145e9.1 for ; Tue, 15 Aug 2023 12:15:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arista.com; s=google; t=1692126928; x=1692731728; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Gl4v14fnWKN5hHK7D99HSHeZgwX9WSWLARpW2GsofK8=; b=d1f3WeRc1i6/ZtkQLtIdfP4NkCbOZEhS3U1Qv02DBKgWhpJIgJ7rpSTb7TehiqPbU0 0YoHYCrnNL0bkaWZBR56bqM6JBJLPfEG87a+Pwhkm0DiKRwSy95jF5REzVtAUwOBAi+Q WQfBPgmrn5cfMwClJRzA0gEvajFzFCoAbDX2u3jvoH6KJzVJq1zzaxrIghywadgVb5J1 /heNze9cvG1tq8+mPALJstbAH8bu/uPERKXwhbvok1HyoIkn9+1XN/7O02k5zgS8tvkb DliP+au4YbL3dhf4QLn5zrCEHs8xAnV6/5Figq4ZFX5siZjh6poXA15pMc3E/YTILABl bBVA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692126928; x=1692731728; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Gl4v14fnWKN5hHK7D99HSHeZgwX9WSWLARpW2GsofK8=; b=UHPA8j8dNcuLe2iLNHYocIPp5QFu8BIIql4KYyZvxuoW3nXDbt+AOF+A/xnDmtyrQo xXBq0IPIN+kutP1YDqn8/TNDtFSpyg//LEZpOLohGSd52HJBXRJ9z4nk6cKui7Bx76ff W8ssKhQx/dVcQgZdCqdAyZEui7x1Os8lbc3IFnwZjba0klfU5Xlwir7rEjZ9ZEvLdSkC ka3XogzK8yo/FWtuufxCt8q8WvakV2NRMHKLCLIaq4LnKjJEro+C98R/3outjicHtcRB BQeXR49t11HkoJS7WIcoPVxYH0ZGrMT3KJvm+TbaZ9rGNuHBEcXM0esuBcLOo+8eKOro Mj4Q== X-Gm-Message-State: AOJu0YxjTzIkq4BDRHojqJSWBE0WvrEc0Cf1wcUhRS8DTvUZFgt50rAY cYW+zQlI+WIPeb+LIgaqqWaYoA== X-Google-Smtp-Source: AGHT+IH1Q9qzIEmPG8O7sbrVFVMriyt8EUHDQK20ytqIMh0PwILB9A5D4oAF/6ugmXzYZvMRpUbsaA== X-Received: by 2002:a7b:c7c9:0:b0:3fb:bc6d:41f1 with SMTP id z9-20020a7bc7c9000000b003fbbc6d41f1mr10838583wmk.17.1692126928225; Tue, 15 Aug 2023 12:15:28 -0700 (PDT) Received: from Mindolluin.ire.aristanetworks.com ([217.173.96.166]) by smtp.gmail.com with ESMTPSA id q9-20020a1ce909000000b003fbbe41fd78sm18779737wmc.10.2023.08.15.12.15.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 15 Aug 2023 12:15:27 -0700 (PDT) From: Dmitry Safonov To: David Ahern , Eric Dumazet , Paolo Abeni , Jakub Kicinski , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Dmitry Safonov , Andy Lutomirski , Ard Biesheuvel , Bob Gilligan , Dan Carpenter , David Laight , Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy , Eric Biggers , "Eric W. Biederman" , Francesco Ruggeri , "Gaillardetz, Dominik" , Herbert Xu , Hideaki YOSHIFUJI , Ivan Delalande , Leonard Crestez , "Nassiri, Mohammad" , Salam Noureddine , Simon Horman , "Tetreault, Francois" , netdev@vger.kernel.org Subject: [PATCH v10 net-next 14/23] net/tcp: Add TCP-AO SNE support Date: Tue, 15 Aug 2023 20:14:43 +0100 Message-ID: <20230815191455.1872316-15-dima@arista.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230815191455.1872316-1-dima@arista.com> References: <20230815191455.1872316-1-dima@arista.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_NONE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: kuba@kernel.org Add Sequence Number Extension (SNE) for TCP-AO. This is needed to protect long-living TCP-AO connections from replaying attacks after sequence number roll-over, see RFC5925 (6.2). Co-developed-by: Francesco Ruggeri Signed-off-by: Francesco Ruggeri Co-developed-by: Salam Noureddine Signed-off-by: Salam Noureddine Signed-off-by: Dmitry Safonov Acked-by: David Ahern --- include/net/tcp_ao.h | 20 ++++++++++++++++++++ net/ipv4/tcp_ao.c | 40 +++++++++++++++++++++++++++++++++------- net/ipv4/tcp_input.c | 28 ++++++++++++++++++++++++++++ net/ipv4/tcp_ipv4.c | 1 + net/ipv4/tcp_minisocks.c | 15 ++++++++++++++- net/ipv4/tcp_output.c | 6 +++++- net/ipv6/tcp_ipv6.c | 1 + 7 files changed, 102 insertions(+), 9 deletions(-) diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h index 9fa8ff89f5e4..d295ea1b6cb7 100644 --- a/include/net/tcp_ao.h +++ b/include/net/tcp_ao.h @@ -95,6 +95,25 @@ struct tcp_ao_info { __unused :31; __be32 lisn; __be32 risn; + /* Sequence Number Extension (SNE) are upper 4 bytes for SEQ, + * that protect TCP-AO connection from replayed old TCP segments. + * See RFC5925 (6.2). + * In order to get correct SNE, there's a helper tcp_ao_compute_sne(). + * It needs SEQ basis to understand whereabouts are lower SEQ numbers. + * According to that basis vector, it can provide incremented SNE + * when SEQ rolls over or provide decremented SNE when there's + * a retransmitted segment from before-rolling over. + * - for request sockets such basis is rcv_isn/snt_isn, which seems + * good enough as it's unexpected to receive 4 Gbytes on reqsk. + * - for full sockets the basis is rcv_nxt/snd_una. snd_una is + * taken instead of snd_nxt as currently it's easier to track + * in tcp_snd_una_update(), rather than updating SNE in all + * WRITE_ONCE(tp->snd_nxt, ...) + * - for time-wait sockets the basis is tw_rcv_nxt/tw_snd_nxt. + * tw_snd_nxt is not expected to change, while tw_rcv_nxt may. + */ + u32 snd_sne; + u32 rcv_sne; atomic_t refcnt; /* Protects twsk destruction */ struct rcu_head rcu; }; @@ -145,6 +164,7 @@ enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, unsigned short int family, const struct request_sock *req, const struct tcp_ao_hdr *aoh); +u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq); struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, const union tcp_ao_addr *addr, int family, int sndid, int rcvid); diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index 88f32b163e0e..21a711bf6921 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -401,6 +401,21 @@ static int tcp_ao_hash_pseudoheader(unsigned short int family, return -EAFNOSUPPORT; } +u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq) +{ + u32 sne = next_sne; + + if (before(seq, next_seq)) { + if (seq > next_seq) + sne--; + } else { + if (seq < next_seq) + sne++; + } + + return sne; +} + /* tcp_ao_hash_sne(struct tcp_sigpool *hp) * @hp - used for hashing * @sne - sne value @@ -639,7 +654,8 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, sisn = htonl(tcp_rsk(req)->rcv_isn); disn = htonl(tcp_rsk(req)->snt_isn); - *sne = 0; + *sne = tcp_ao_compute_sne(0, tcp_rsk(req)->snt_isn, + ntohl(seq)); } else { sisn = th->seq; disn = 0; @@ -671,11 +687,15 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, *keyid = (*key)->rcvid; } else { struct tcp_ao_key *rnext_key; + u32 snd_basis; - if (sk->sk_state == TCP_TIME_WAIT) + if (sk->sk_state == TCP_TIME_WAIT) { ao_info = rcu_dereference(tcp_twsk(sk)->ao_info); - else + snd_basis = tcp_twsk(sk)->tw_snd_nxt; + } else { ao_info = rcu_dereference(tcp_sk(sk)->ao_info); + snd_basis = tcp_sk(sk)->snd_una; + } if (!ao_info) return -ENOENT; @@ -685,7 +705,8 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb, *traffic_key = snd_other_key(*key); rnext_key = READ_ONCE(ao_info->rnext_key); *keyid = rnext_key->rcvid; - *sne = 0; + *sne = tcp_ao_compute_sne(READ_ONCE(ao_info->snd_sne), + snd_basis, ntohl(seq)); } return 0; } @@ -814,7 +835,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, if (unlikely(th->syn && !th->ack)) goto verify_hash; - sne = 0; + sne = tcp_ao_compute_sne(info->rcv_sne, tcp_sk(sk)->rcv_nxt, + ntohl(th->seq)); /* Established socket, traffic key are cached */ traffic_key = rcv_other_key(key); err = tcp_ao_verify_hash(sk, skb, family, info, aoh, key, @@ -849,14 +871,16 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb, if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_NEW_SYN_RECV)) { /* Make the initial syn the likely case here */ if (unlikely(req)) { - sne = 0; + sne = tcp_ao_compute_sne(0, tcp_rsk(req)->rcv_isn, + ntohl(th->seq)); sisn = htonl(tcp_rsk(req)->rcv_isn); disn = htonl(tcp_rsk(req)->snt_isn); } else if (unlikely(th->ack && !th->syn)) { /* Possible syncookie packet */ sisn = htonl(ntohl(th->seq) - 1); disn = htonl(ntohl(th->ack_seq) - 1); - sne = 0; + sne = tcp_ao_compute_sne(0, ntohl(sisn), + ntohl(th->seq)); } else if (unlikely(!th->syn)) { /* no way to figure out initial sisn/disn - drop */ return SKB_DROP_REASON_TCP_FLAGS; @@ -955,6 +979,7 @@ void tcp_ao_connect_init(struct sock *sk) tp->tcp_header_len += tcp_ao_len(key); ao_info->lisn = htonl(tp->write_seq); + ao_info->snd_sne = 0; } else { /* TODO: probably, it should fail to connect() here */ rcu_assign_pointer(tp->ao_info, NULL); @@ -973,6 +998,7 @@ void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb) return; ao->risn = tcp_hdr(skb)->seq; + ao->rcv_sne = 0; hlist_for_each_entry_rcu(key, &ao->head, node) tcp_ao_cache_traffic_keys(sk, ao, key); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index acf9aa9ce0a0..78e7b04fac4f 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3532,9 +3532,18 @@ static inline bool tcp_may_update_window(const struct tcp_sock *tp, static void tcp_snd_una_update(struct tcp_sock *tp, u32 ack) { u32 delta = ack - tp->snd_una; +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao; +#endif sock_owned_by_me((struct sock *)tp); tp->bytes_acked += delta; +#ifdef CONFIG_TCP_AO + ao = rcu_dereference_protected(tp->ao_info, + lockdep_sock_is_held((struct sock *)tp)); + if (ao && ack < tp->snd_una) + ao->snd_sne++; +#endif tp->snd_una = ack; } @@ -3542,9 +3551,18 @@ static void tcp_snd_una_update(struct tcp_sock *tp, u32 ack) static void tcp_rcv_nxt_update(struct tcp_sock *tp, u32 seq) { u32 delta = seq - tp->rcv_nxt; +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao; +#endif sock_owned_by_me((struct sock *)tp); tp->bytes_received += delta; +#ifdef CONFIG_TCP_AO + ao = rcu_dereference_protected(tp->ao_info, + lockdep_sock_is_held((struct sock *)tp)); + if (ao && seq < tp->rcv_nxt) + ao->rcv_sne++; +#endif WRITE_ONCE(tp->rcv_nxt, seq); } @@ -6392,6 +6410,16 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, * simultaneous connect with crossed SYNs. * Particularly, it can be connect to self. */ +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao; + + ao = rcu_dereference_protected(tp->ao_info, + lockdep_sock_is_held(sk)); + if (ao) { + ao->risn = th->seq; + ao->rcv_sne = 0; + } +#endif tcp_set_state(sk, TCP_SYN_RECV); if (tp->rx_opt.saw_tstamp) { diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 883fa4403de7..5d25c974ff61 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1039,6 +1039,7 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb) struct tcp_ao_key *rnext_key; traffic_key = snd_other_key(ao_key); + ao_sne = READ_ONCE(ao_info->snd_sne); rnext_key = READ_ONCE(ao_info->rnext_key); rcv_next = rnext_key->rcvid; } diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index c9360fbcd0b7..7f69bbf8a804 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -51,6 +51,18 @@ tcp_timewait_check_oow_rate_limit(struct inet_timewait_sock *tw, return TCP_TW_SUCCESS; } +static void twsk_rcv_nxt_update(struct tcp_timewait_sock *tcptw, u32 seq) +{ +#ifdef CONFIG_TCP_AO + struct tcp_ao_info *ao; + + ao = rcu_dereference(tcptw->ao_info); + if (unlikely(ao && seq < tcptw->tw_rcv_nxt)) + WRITE_ONCE(ao->rcv_sne, ao->rcv_sne + 1); +#endif + tcptw->tw_rcv_nxt = seq; +} + /* * * Main purpose of TIME-WAIT state is to close connection gracefully, * when one of ends sits in LAST-ACK or CLOSING retransmitting FIN @@ -136,7 +148,8 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, /* FIN arrived, enter true time-wait state. */ tw->tw_substate = TCP_TIME_WAIT; - tcptw->tw_rcv_nxt = TCP_SKB_CB(skb)->end_seq; + twsk_rcv_nxt_update(tcptw, TCP_SKB_CB(skb)->end_seq); + if (tmp_opt.saw_tstamp) { tcptw->tw_ts_recent_stamp = ktime_get_seconds(); tcptw->tw_ts_recent = tmp_opt.rcv_tsval; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 72c0c0746ce4..675582d08f01 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1450,6 +1450,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, u8 *traffic_key; void *tkey_buf = NULL; __be32 disn; + u32 sne; sk_gso_disable(sk); if (unlikely(tcb->tcp_flags & TCPHDR_SYN)) { @@ -1467,9 +1468,12 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, } else { traffic_key = snd_other_key(ao_key); } + sne = tcp_ao_compute_sne(READ_ONCE(ao->snd_sne), + READ_ONCE(tp->snd_una), + ntohl(th->seq)); tp->af_specific->calc_ao_hash(opts.hash_location, ao_key, sk, skb, traffic_key, - opts.hash_location - (u8 *)th, 0); + opts.hash_location - (u8 *)th, sne); kfree(tkey_buf); } #endif diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index fd3402fe77af..1506de3ce5e0 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1180,6 +1180,7 @@ static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb) /* rcv_next switches to our rcv_next */ rnext_key = READ_ONCE(ao_info->rnext_key); rcv_next = rnext_key->rcvid; + ao_sne = READ_ONCE(ao_info->snd_sne); } #endif