From patchwork Mon Oct 23 19:22:06 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dmitry Safonov
X-Patchwork-Id: 13433450
X-Patchwork-Delegate: kuba@kernel.org
From: Dmitry Safonov
To: David Ahern, Eric Dumazet, Paolo Abeni, Jakub Kicinski, "David S. Miller"
Cc: linux-kernel@vger.kernel.org, Dmitry Safonov, Andy Lutomirski, Ard Biesheuvel, Bob Gilligan, Dan Carpenter, David Laight, Dmitry Safonov <0x7f454c46@gmail.com>, Donald Cassidy, Eric Biggers, "Eric W.
Biederman", Francesco Ruggeri, "Gaillardetz, Dominik", Herbert Xu, Hideaki YOSHIFUJI, Ivan Delalande, Leonard Crestez, "Nassiri, Mohammad", Salam Noureddine, Simon Horman, "Tetreault, Francois", netdev@vger.kernel.org
Subject: [PATCH v16 net-next 14/23] net/tcp: Add TCP-AO SNE support
Date: Mon, 23 Oct 2023 20:22:06 +0100
Message-ID: <20231023192217.426455-15-dima@arista.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231023192217.426455-1-dima@arista.com>
References: <20231023192217.426455-1-dima@arista.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

Add Sequence Number Extension (SNE) for TCP-AO.
This is needed to protect long-living TCP-AO connections from replaying
attacks after sequence number roll-over, see RFC5925 (6.2).

Co-developed-by: Francesco Ruggeri
Signed-off-by: Francesco Ruggeri
Co-developed-by: Salam Noureddine
Signed-off-by: Salam Noureddine
Signed-off-by: Dmitry Safonov
Acked-by: David Ahern
---
 include/net/tcp_ao.h     | 22 ++++++++++++++++++-
 net/ipv4/tcp_ao.c        | 46 ++++++++++++++++++++++++++++++++--------
 net/ipv4/tcp_input.c     | 28 ++++++++++++++++++++++++
 net/ipv4/tcp_ipv4.c      |  3 ++-
 net/ipv4/tcp_minisocks.c | 15 ++++++++++++-
 net/ipv6/tcp_ipv6.c      |  3 ++-
 6 files changed, 104 insertions(+), 13 deletions(-)

diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index cfb55bd9411b..0c3516d1b968 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -95,6 +95,25 @@ struct tcp_ao_info {
 				__unused	:31;
 	__be32			lisn;
 	__be32			risn;
+	/* Sequence Number Extension (SNE) are upper 4 bytes for SEQ,
+	 * that protect TCP-AO connection from replayed old TCP segments.
+	 * See RFC5925 (6.2).
+	 * In order to get correct SNE, there's a helper tcp_ao_compute_sne().
+	 * It needs SEQ basis to understand whereabouts are lower SEQ numbers.
+	 * According to that basis vector, it can provide incremented SNE
+	 * when SEQ rolls over or provide decremented SNE when there's
+	 * a retransmitted segment from before-rolling over.
+	 * - for request sockets such basis is rcv_isn/snt_isn, which seems
+	 *   good enough as it's unexpected to receive 4 Gbytes on reqsk.
+	 * - for full sockets the basis is rcv_nxt/snd_una. snd_una is
+	 *   taken instead of snd_nxt as currently it's easier to track
+	 *   in tcp_snd_una_update(), rather than updating SNE in all
+	 *   WRITE_ONCE(tp->snd_nxt, ...)
+	 * - for time-wait sockets the basis is tw_rcv_nxt/tw_snd_nxt.
+	 *   tw_snd_nxt is not expected to change, while tw_rcv_nxt may.
+	 */
+	u32			snd_sne;
+	u32			rcv_sne;
 	refcount_t		refcnt;		/* Protects twsk destruction */
 	struct rcu_head		rcu;
 };
@@ -147,6 +166,7 @@ enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk,
 			const struct sk_buff *skb, unsigned short int family,
 			const struct request_sock *req,
 			const struct tcp_ao_hdr *aoh);
+u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq);
 struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk,
 				    const union tcp_ao_addr *addr,
 				    int family, int sndid, int rcvid);
@@ -156,7 +176,7 @@ int tcp_ao_hash_hdr(unsigned short family, char *ao_hash,
 		    const union tcp_ao_addr *saddr,
 		    const struct tcphdr *th, u32 sne);
 int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
-			 const struct tcp_ao_hdr *aoh, int l3index,
+			 const struct tcp_ao_hdr *aoh, int l3index, u32 seq,
 			 struct tcp_ao_key **key, char **traffic_key,
 			 bool *allocated_traffic_key, u8 *keyid, u32 *sne);
 
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 1097e99a9ad6..7e14bcd4dfd4 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -401,6 +401,21 @@ static int tcp_ao_hash_pseudoheader(unsigned short int family,
 	return -EAFNOSUPPORT;
 }
 
+u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq)
+{
+	u32 sne = next_sne;
+
+	if (before(seq, next_seq)) {
+		if (seq > next_seq)
+			sne--;
+	} else {
+		if (seq < next_seq)
+			sne++;
+	}
+
+	return sne;
+}
+
 /* tcp_ao_hash_sne(struct tcp_sigpool *hp)
  * @hp - used for hashing
  * @sne - sne value
@@ -611,7 +626,7 @@ struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk,
 }
 
 int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
-			 const struct tcp_ao_hdr *aoh, int l3index,
+			 const struct tcp_ao_hdr *aoh, int l3index, u32 seq,
 			 struct tcp_ao_key **key, char **traffic_key,
 			 bool *allocated_traffic_key, u8 *keyid, u32 *sne)
 {
@@ -639,7 +654,7 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
 
 		sisn = htonl(tcp_rsk(req)->rcv_isn);
 		disn = htonl(tcp_rsk(req)->snt_isn);
-		*sne = 0;
+		*sne = tcp_ao_compute_sne(0, tcp_rsk(req)->snt_isn, seq);
 	} else {
 		sisn = th->seq;
 		disn = 0;
@@ -670,11 +685,15 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
 		*keyid = (*key)->rcvid;
 	} else {
 		struct tcp_ao_key *rnext_key;
+		u32 snd_basis;
 
-		if (sk->sk_state == TCP_TIME_WAIT)
+		if (sk->sk_state == TCP_TIME_WAIT) {
 			ao_info = rcu_dereference(tcp_twsk(sk)->ao_info);
-		else
+			snd_basis = tcp_twsk(sk)->tw_snd_nxt;
+		} else {
 			ao_info = rcu_dereference(tcp_sk(sk)->ao_info);
+			snd_basis = tcp_sk(sk)->snd_una;
+		}
 		if (!ao_info)
 			return -ENOENT;
 
@@ -684,7 +703,8 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
 		*traffic_key = snd_other_key(*key);
 		rnext_key = READ_ONCE(ao_info->rnext_key);
 		*keyid = rnext_key->rcvid;
-		*sne = 0;
+		*sne = tcp_ao_compute_sne(READ_ONCE(ao_info->snd_sne),
+					  snd_basis, seq);
 	}
 	return 0;
 }
@@ -698,6 +718,7 @@ int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb,
 	struct tcp_ao_info *ao;
 	void *tkey_buf = NULL;
 	u8 *traffic_key;
+	u32 sne;
 
 	ao = rcu_dereference_protected(tcp_sk(sk)->ao_info,
 				       lockdep_sock_is_held(sk));
@@ -717,8 +738,10 @@ int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb,
 		tp->af_specific->ao_calc_key_sk(key, traffic_key,
 						sk, ao->lisn, disn, true);
 	}
+	sne = tcp_ao_compute_sne(READ_ONCE(ao->snd_sne), READ_ONCE(tp->snd_una),
+				 ntohl(th->seq));
 	tp->af_specific->calc_ao_hash(hash_location, key, sk, skb, traffic_key,
-				      hash_location - (u8 *)th, 0);
+				      hash_location - (u8 *)th, sne);
 	kfree(tkey_buf);
 	return 0;
 }
@@ -846,7 +869,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
 		if (unlikely(th->syn && !th->ack))
 			goto verify_hash;
 
-		sne = 0;
+		sne = tcp_ao_compute_sne(info->rcv_sne, tcp_sk(sk)->rcv_nxt,
+					 ntohl(th->seq));
 		/* Established socket, traffic key are cached */
 		traffic_key = rcv_other_key(key);
 		err = tcp_ao_verify_hash(sk, skb, family, info, aoh, key,
@@ -881,14 +905,16 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
 	if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_NEW_SYN_RECV)) {
 		/* Make the initial syn the likely case here */
 		if (unlikely(req)) {
-			sne = 0;
+			sne = tcp_ao_compute_sne(0, tcp_rsk(req)->rcv_isn,
+						 ntohl(th->seq));
 			sisn = htonl(tcp_rsk(req)->rcv_isn);
 			disn = htonl(tcp_rsk(req)->snt_isn);
 		} else if (unlikely(th->ack && !th->syn)) {
 			/* Possible syncookie packet */
 			sisn = htonl(ntohl(th->seq) - 1);
 			disn = htonl(ntohl(th->ack_seq) - 1);
-			sne = 0;
+			sne = tcp_ao_compute_sne(0, ntohl(sisn),
+						 ntohl(th->seq));
 		} else if (unlikely(!th->syn)) {
 			/* no way to figure out initial sisn/disn - drop */
 			return SKB_DROP_REASON_TCP_FLAGS;
@@ -986,6 +1012,7 @@ void tcp_ao_connect_init(struct sock *sk)
 		tp->tcp_header_len += tcp_ao_len(key);
 
 		ao_info->lisn = htonl(tp->write_seq);
+		ao_info->snd_sne = 0;
 	} else {
 		/* Can't happen: tcp_connect() verifies that there's
 		 * at least one tcp-ao key that matches the remote peer.
@@ -1021,6 +1048,7 @@ void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb)
 		return;
 
 	WRITE_ONCE(ao->risn, tcp_hdr(skb)->seq);
+	ao->rcv_sne = 0;
 
 	hlist_for_each_entry_rcu(key, &ao->head, node)
 		tcp_ao_cache_traffic_keys(sk, ao, key);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 475af9767615..8132577bdfba 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3575,9 +3575,18 @@ static inline bool tcp_may_update_window(const struct tcp_sock *tp,
 static void tcp_snd_una_update(struct tcp_sock *tp, u32 ack)
 {
 	u32 delta = ack - tp->snd_una;
+#ifdef CONFIG_TCP_AO
+	struct tcp_ao_info *ao;
+#endif
 
 	sock_owned_by_me((struct sock *)tp);
 	tp->bytes_acked += delta;
+#ifdef CONFIG_TCP_AO
+	ao = rcu_dereference_protected(tp->ao_info,
+				       lockdep_sock_is_held((struct sock *)tp));
+	if (ao && ack < tp->snd_una)
+		ao->snd_sne++;
+#endif
 	tp->snd_una = ack;
 }
 
@@ -3585,9 +3594,18 @@ static void tcp_snd_una_update(struct tcp_sock *tp, u32 ack)
 static void tcp_rcv_nxt_update(struct tcp_sock *tp, u32 seq)
 {
 	u32 delta = seq - tp->rcv_nxt;
+#ifdef CONFIG_TCP_AO
+	struct tcp_ao_info *ao;
+#endif
 
 	sock_owned_by_me((struct sock *)tp);
 	tp->bytes_received += delta;
+#ifdef CONFIG_TCP_AO
+	ao = rcu_dereference_protected(tp->ao_info,
+				       lockdep_sock_is_held((struct sock *)tp));
+	if (ao && seq < tp->rcv_nxt)
+		ao->rcv_sne++;
+#endif
 	WRITE_ONCE(tp->rcv_nxt, seq);
 }
 
@@ -6455,6 +6473,16 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 		 * simultaneous connect with crossed SYNs.
 		 * Particularly, it can be connect to self.
 		 */
+#ifdef CONFIG_TCP_AO
+		struct tcp_ao_info *ao;
+
+		ao = rcu_dereference_protected(tp->ao_info,
+					       lockdep_sock_is_held(sk));
+		if (ao) {
+			WRITE_ONCE(ao->risn, th->seq);
+			ao->rcv_sne = 0;
+		}
+#endif
 		tcp_set_state(sk, TCP_SYN_RECV);
 
 		if (tp->rx_opt.saw_tstamp) {
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index ece95d5138e1..bdec99707028 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -676,7 +676,7 @@ static bool tcp_v4_ao_sign_reset(const struct sock *sk, struct sk_buff *skb,
 	u8 keyid;
 
 	rcu_read_lock();
-	if (tcp_ao_prepare_reset(sk, skb, aoh, l3index,
+	if (tcp_ao_prepare_reset(sk, skb, aoh, l3index, ntohl(reply->seq),
 				 &key, &traffic_key, &allocated_traffic_key,
 				 &keyid, &ao_sne))
 		goto out;
@@ -1034,6 +1034,7 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb)
 			struct tcp_ao_key *rnext_key;
 
 			key.traffic_key = snd_other_key(key.ao_key);
+			key.sne = READ_ONCE(ao_info->snd_sne);
 			rnext_key = READ_ONCE(ao_info->rnext_key);
 			key.rcv_next = rnext_key->rcvid;
 			key.type = TCP_KEY_AO;
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 8d941a6e066b..a9807eeb311c 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -51,6 +51,18 @@ tcp_timewait_check_oow_rate_limit(struct inet_timewait_sock *tw,
 	return TCP_TW_SUCCESS;
 }
 
+static void twsk_rcv_nxt_update(struct tcp_timewait_sock *tcptw, u32 seq)
+{
+#ifdef CONFIG_TCP_AO
+	struct tcp_ao_info *ao;
+
+	ao = rcu_dereference(tcptw->ao_info);
+	if (unlikely(ao && seq < tcptw->tw_rcv_nxt))
+		WRITE_ONCE(ao->rcv_sne, ao->rcv_sne + 1);
+#endif
+	tcptw->tw_rcv_nxt = seq;
+}
+
 /*
  * * Main purpose of TIME-WAIT state is to close connection gracefully,
  *   when one of ends sits in LAST-ACK or CLOSING retransmitting FIN
@@ -136,7 +148,8 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 
 		/* FIN arrived, enter true time-wait state. */
 		tw->tw_substate = TCP_TIME_WAIT;
-		tcptw->tw_rcv_nxt = TCP_SKB_CB(skb)->end_seq;
+		twsk_rcv_nxt_update(tcptw, TCP_SKB_CB(skb)->end_seq);
+
 		if (tmp_opt.saw_tstamp) {
 			tcptw->tw_ts_recent_stamp = ktime_get_seconds();
 			tcptw->tw_ts_recent = tmp_opt.rcv_tsval;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 9c668bbb4853..97397f57dec1 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1090,7 +1090,7 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb)
 		int l3index;
 
 		l3index = tcp_v6_sdif(skb) ? tcp_v6_iif_l3_slave(skb) : 0;
-		if (tcp_ao_prepare_reset(sk, skb, aoh, l3index,
+		if (tcp_ao_prepare_reset(sk, skb, aoh, l3index, seq,
 					 &key.ao_key, &key.traffic_key,
 					 &allocated_traffic_key,
 					 &key.rcv_next, &key.sne))
@@ -1167,6 +1167,7 @@ static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb)
 			/* rcv_next switches to our rcv_next */
 			rnext_key = READ_ONCE(ao_info->rnext_key);
 			key.rcv_next = rnext_key->rcvid;
+			key.sne = READ_ONCE(ao_info->snd_sne);
 			key.type = TCP_KEY_AO;
 #else
 			if (0) {
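
Illustration only, not part of the patch: a minimal user-space sketch of the comparison that tcp_ao_compute_sne() performs, for readers who want to try the roll-over cases by hand. The names seq_before(), sne_compute() and sne_selfcheck() are hypothetical stand-ins introduced here. seq_before() assumes the usual signed-difference, modulo-2^32 comparison that the kernel's before() helper implements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's before(): true when seq1
 * precedes seq2 in modulo-2^32 sequence space.
 */
static bool seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/* Mirrors the logic of tcp_ao_compute_sne() from the patch:
 * next_sne/next_seq are the basis, seq is the segment being hashed.
 */
static uint32_t sne_compute(uint32_t next_sne, uint32_t next_seq, uint32_t seq)
{
	uint32_t sne = next_sne;

	if (seq_before(seq, next_seq)) {
		/* Older segment that is numerically larger than the basis:
		 * it was sent before the roll-over.
		 */
		if (seq > next_seq)
			sne--;
	} else {
		/* Newer segment that is numerically smaller than the basis:
		 * SEQ has rolled over.
		 */
		if (seq < next_seq)
			sne++;
	}

	return sne;
}

/* A few hand-checked cases around the 2^32 boundary. */
static void sne_selfcheck(void)
{
	/* No roll-over on either side of the basis. */
	assert(sne_compute(5, 1000, 2000) == 5);
	assert(sne_compute(5, 1000, 500) == 5);
	/* Segment just past the roll-over, basis just before it. */
	assert(sne_compute(5, 0xfffffff0u, 0x00000010u) == 6);
	/* Retransmission from before the roll-over, basis just after it. */
	assert(sne_compute(6, 0x00000010u, 0xfffffff0u) == 5);
}
```

Calling sne_selfcheck() from a small main() exercises the four cases. The sketch leaves out everything the kernel code does around the computation (READ_ONCE(), locking, and the choice of basis per socket type).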