From patchwork Mon Apr 3 12:12:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Ehrig X-Patchwork-Id: 13198053 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 081B3C77B62 for ; Mon, 3 Apr 2023 11:13:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232331AbjDCLNW (ORCPT ); Mon, 3 Apr 2023 07:13:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36764 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232261AbjDCLNK (ORCPT ); Mon, 3 Apr 2023 07:13:10 -0400 Received: from mail-wr1-x435.google.com (mail-wr1-x435.google.com [IPv6:2a00:1450:4864:20::435]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4FC5A1A45A for ; Mon, 3 Apr 2023 04:12:39 -0700 (PDT) Received: by mail-wr1-x435.google.com with SMTP id r29so28887925wra.13 for ; Mon, 03 Apr 2023 04:12:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; t=1680520354; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=iUBwfYQK6nNehUVYqNgpHoXy2st+RjBCDaSOraeR1MA=; b=tG9yTZ9q1pBQrfTomOZ97lwSs8wXAOQo9nVQPQUN7de3ask4yNTYIaCDO8aVPxXvf5 06CgPiqLxCes9CxQoWPCBfIKe+YV0/hQU+XfABOspRFFPbN3h7w+S1kLTnHGAhindVOt fKTvFptopgwxRb0FWYK8mMl3s4rEVsEGBgxp0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680520354; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=iUBwfYQK6nNehUVYqNgpHoXy2st+RjBCDaSOraeR1MA=; b=s010Fpi/uy006Jbjb2nAyn+bJB/jHJzbcT1ChQagALdVuzEHNROwQ3AzwWy8PqvToG ex5nn4MdPdQ3ikiZgdyWmZ8Jn7iTI22uMC/eZgxABjdcWAkZf/1RhWwvDisMVptXaL/F U/S6WYkSwmOToO9xTbfqEoTKroTVxuwFh4JWglyo33OKJc7hu4bew+cuD90N4OjftGLt RCIp4jNKPnbmEK6eDvMhXcZxXZ5A1r3vbCEaXBiAWvbBXgsqRqsb+mDEE8PJEQxLhHko WDNdKSlswchp4Dfd8Z3uKfBRLbnuHiQzHfOnzvumCvh8MWPK1du0DdYqjHeSDXJ/BL9I aVpw== X-Gm-Message-State: AAQBX9f6poiQ9IN0HU+OnWJlSJRtJHyqMqu49uWsx5WhlyC9Jo3pp5+U LjvcQoP3/Rt+nTqzWcDF6lSWZZU2t+pLWlucoV6lDw== X-Google-Smtp-Source: AKy350Zyt0jClJcG8KcSL5ZEKon7D9w7+V9dhjpJWj1kjUcjiCcmhmHEI18QhyZtBSgaxekxQwIjAg== X-Received: by 2002:a5d:5445:0:b0:2c9:23c4:8f93 with SMTP id w5-20020a5d5445000000b002c923c48f93mr25203010wrv.57.1680520353618; Mon, 03 Apr 2023 04:12:33 -0700 (PDT) Received: from workstation.ehrig.io (tmo-066-125.customers.d1-online.com. [80.187.66.125]) by smtp.gmail.com with ESMTPSA id y11-20020adffa4b000000b002c7066a6f77sm9505517wrr.31.2023.04.03.04.12.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 03 Apr 2023 04:12:33 -0700 (PDT) From: Christian Ehrig To: bpf@vger.kernel.org Cc: cehrig@cloudflare.com, kernel-team@cloudflare.com, "David S. Miller" , David Ahern , Eric Dumazet , Jakub Kicinski , Paolo Abeni , netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH bpf-next v2 1/3] ipip,ip_tunnel,sit: Add FOU support for externally controlled ipip devices Date: Mon, 3 Apr 2023 14:12:07 +0200 Message-Id: X-Mailer: git-send-email 2.39.2 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Today ipip devices in collect-metadata mode don't allow for sending FOU or GUE encapsulated packets. This patch lifts the restriction by adding a struct ip_tunnel_encap to the tunnel metadata. On the egress path, the members of this struct can be set by the bpf_skb_set_fou_encap kfunc via a BPF tc-hook. Instead of dropping packets wishing to use additional UDP encapsulation, ip_md_tunnel_xmit now evaluates the contents of this struct and adds the corresponding FOU or GUE header. Furthermore, it is making sure that additional header bytes are taken into account for PMTU discovery. On the ingress path, an ipip device in collect-metadata mode will fill this struct and a BPF tc-hook can obtain the information via a call to the bpf_skb_get_fou_encap kfunc. The minor change to ip_tunnel_encap, which now takes a pointer to struct ip_tunnel_encap instead of struct ip_tunnel, allows us to control FOU encap type and parameters on a per packet-level. Signed-off-by: Christian Ehrig --- include/net/ip_tunnels.h | 28 +++++++++++++++------------- net/ipv4/ip_tunnel.c | 22 ++++++++++++++++++++-- net/ipv4/ipip.c | 1 + net/ipv6/sit.c | 2 +- 4 files changed, 37 insertions(+), 16 deletions(-) diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index fca357679816..7912f53caae0 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -57,6 +57,13 @@ struct ip_tunnel_key { __u8 flow_flags; }; +struct ip_tunnel_encap { + u16 type; + u16 flags; + __be16 sport; + __be16 dport; +}; + /* Flags for ip_tunnel_info mode. */ #define IP_TUNNEL_INFO_TX 0x01 /* represents tx tunnel parameters */ #define IP_TUNNEL_INFO_IPV6 0x02 /* key contains IPv6 addresses */ @@ -66,9 +73,9 @@ struct ip_tunnel_key { #define IP_TUNNEL_OPTS_MAX \ GENMASK((sizeof_field(struct ip_tunnel_info, \ options_len) * BITS_PER_BYTE) - 1, 0) - struct ip_tunnel_info { struct ip_tunnel_key key; + struct ip_tunnel_encap encap; #ifdef CONFIG_DST_CACHE struct dst_cache dst_cache; #endif @@ -86,13 +93,6 @@ struct ip_tunnel_6rd_parm { }; #endif -struct ip_tunnel_encap { - u16 type; - u16 flags; - __be16 sport; - __be16 dport; -}; - struct ip_tunnel_prl_entry { struct ip_tunnel_prl_entry __rcu *next; __be32 addr; @@ -293,6 +293,7 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn, __be32 remote, __be32 local, __be32 key); +void ip_tunnel_md_udp_encap(struct sk_buff *skb, struct ip_tunnel_info *info); int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb, const struct tnl_ptk_info *tpi, struct metadata_dst *tun_dst, bool log_ecn_error); @@ -371,22 +372,23 @@ static inline int ip_encap_hlen(struct ip_tunnel_encap *e) return hlen; } -static inline int ip_tunnel_encap(struct sk_buff *skb, struct ip_tunnel *t, +static inline int ip_tunnel_encap(struct sk_buff *skb, + struct ip_tunnel_encap *e, u8 *protocol, struct flowi4 *fl4) { const struct ip_tunnel_encap_ops *ops; int ret = -EINVAL; - if (t->encap.type == TUNNEL_ENCAP_NONE) + if (e->type == TUNNEL_ENCAP_NONE) return 0; - if (t->encap.type >= MAX_IPTUN_ENCAP_OPS) + if (e->type >= MAX_IPTUN_ENCAP_OPS) return -EINVAL; rcu_read_lock(); - ops = rcu_dereference(iptun_encaps[t->encap.type]); + ops = rcu_dereference(iptun_encaps[e->type]); if (likely(ops && ops->build_header)) - ret = ops->build_header(skb, &t->encap, protocol, fl4); + ret = ops->build_header(skb, e, protocol, fl4); rcu_read_unlock(); return ret; diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index de90b09dfe78..add437f710fc 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -359,6 +359,20 @@ static struct ip_tunnel *ip_tunnel_create(struct net *net, return ERR_PTR(err); } +void ip_tunnel_md_udp_encap(struct sk_buff *skb, struct ip_tunnel_info *info) +{ + const struct iphdr *iph = ip_hdr(skb); + const struct udphdr *udph; + + if (iph->protocol != IPPROTO_UDP) + return; + + udph = (struct udphdr *)((__u8 *)iph + (iph->ihl << 2)); + info->encap.sport = udph->source; + info->encap.dport = udph->dest; +} +EXPORT_SYMBOL(ip_tunnel_md_udp_encap); + int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb, const struct tnl_ptk_info *tpi, struct metadata_dst *tun_dst, bool log_ecn_error) @@ -572,7 +586,11 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, tunnel_id_to_key32(key->tun_id), RT_TOS(tos), dev_net(dev), 0, skb->mark, skb_get_hash(skb), key->flow_flags); - if (tunnel->encap.type != TUNNEL_ENCAP_NONE) + + if (!tunnel_hlen) + tunnel_hlen = ip_encap_hlen(&tun_info->encap); + + if (ip_tunnel_encap(skb, &tun_info->encap, &proto, &fl4) < 0) goto tx_error; use_cache = ip_tunnel_dst_cache_usable(skb, tun_info); @@ -732,7 +750,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, dev_net(dev), tunnel->parms.link, tunnel->fwmark, skb_get_hash(skb), 0); - if (ip_tunnel_encap(skb, tunnel, &protocol, &fl4) < 0) + if (ip_tunnel_encap(skb, &tunnel->encap, &protocol, &fl4) < 0) goto tx_error; if (connected && md) { diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c index abea77759b7e..27b8f83c6ea2 100644 --- a/net/ipv4/ipip.c +++ b/net/ipv4/ipip.c @@ -241,6 +241,7 @@ static int ipip_tunnel_rcv(struct sk_buff *skb, u8 ipproto) tun_dst = ip_tun_rx_dst(skb, 0, 0, 0); if (!tun_dst) return 0; + ip_tunnel_md_udp_encap(skb, &tun_dst->u.tun_info); } skb_reset_mac_header(skb); diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c index 70d81bba5093..063560e2cb1a 100644 --- a/net/ipv6/sit.c +++ b/net/ipv6/sit.c @@ -1024,7 +1024,7 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb, ttl = iph6->hop_limit; tos = INET_ECN_encapsulate(tos, ipv6_get_dsfield(iph6)); - if (ip_tunnel_encap(skb, tunnel, &protocol, &fl4) < 0) { + if (ip_tunnel_encap(skb, &tunnel->encap, &protocol, &fl4) < 0) { ip_rt_put(rt); goto tx_error; } From patchwork Mon Apr 3 12:12:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Ehrig X-Patchwork-Id: 13198054 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1AC9BC76196 for ; Mon, 3 Apr 2023 11:13:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232257AbjDCLNX (ORCPT ); Mon, 3 Apr 2023 07:13:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37024 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232279AbjDCLNM (ORCPT ); Mon, 3 Apr 2023 07:13:12 -0400 Received: from mail-wr1-x433.google.com (mail-wr1-x433.google.com [IPv6:2a00:1450:4864:20::433]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C3CB51A948 for ; Mon, 3 Apr 2023 04:12:42 -0700 (PDT) Received: by mail-wr1-x433.google.com with SMTP id d17so28904514wrb.11 for ; Mon, 03 Apr 2023 04:12:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; t=1680520358; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=PZLNuzKtLWlgs6Y6X1IWZZsYIByhDy9cqzVY3QztvD8=; b=jy4FFCj4W1GPd4u8mxyCmg4yJi77U4RIvAPJDqPUF6xM2iipbQVCdxInK0RyZQL1jC YNCS9mjeGD+40sd3I2njZqs0B0FrJL+6HH4/5l5Fhu7r3F+yESzWaovZ5i35jvNq/BcZ RMqREAWTYHZX6CDc5Gq7Wv9BzcwemCDUcYkc0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680520358; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PZLNuzKtLWlgs6Y6X1IWZZsYIByhDy9cqzVY3QztvD8=; b=8F/1mnpnSEYR9Esul2UQERCsJ2xyBqReKkhm3uPRV78NgKkvAZFLqlAORETVcatS47 8YThVebzwrMkpCYC/7cR0DxBu3h7Xnq9GsdE9uRVLs9DVLaE68NQpVzGnlrfkUL5KLBD O1ziTRIqLF1vE6/fzsByYTBOrKTi1Q8PWZ1nGm3BGYXDswHK2hafWXYUeIOu+/BcK/BI SGrsFvnFLkLDD6d3Er1ZIqNKkECYY/VwVxI6AZKRcYGOA3px+EOj/Dm1+zKligMqc5GL qD4qFuXbaqP43f1N3l+WG4mKp0ywwiGowDzfPQJDSfsIm7Lxof4T8sbYmA70lVtqFXA8 Ox7w== X-Gm-Message-State: AAQBX9cAmGug07K1m1Skvs7apIZPfnveZgoV5ENxjo5+zFW3zBZhTTIt /HiSy1PvLalZYcDVhUEC5pC9QdtxpSbk55hV0/UALQ== X-Google-Smtp-Source: AKy350b6PtyTH2NhbmRpAUnOLx+YvBEU55u36HMrmiJvHzsHE0fiO2hXXXIdpHRUgB8aNeHN85j9kg== X-Received: by 2002:a5d:4ecc:0:b0:2d1:9ce9:2b99 with SMTP id s12-20020a5d4ecc000000b002d19ce92b99mr25187273wrv.18.1680520358383; Mon, 03 Apr 2023 04:12:38 -0700 (PDT) Received: from workstation.ehrig.io (tmo-066-125.customers.d1-online.com. [80.187.66.125]) by smtp.gmail.com with ESMTPSA id y11-20020adffa4b000000b002c7066a6f77sm9505517wrr.31.2023.04.03.04.12.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 03 Apr 2023 04:12:37 -0700 (PDT) From: Christian Ehrig To: bpf@vger.kernel.org Cc: cehrig@cloudflare.com, kernel-team@cloudflare.com, kernel test robot , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , David Ahern , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH bpf-next v2 2/3] bpf,fou: Add bpf_skb_{set,get}_fou_encap kfuncs Date: Mon, 3 Apr 2023 14:12:08 +0200 Message-Id: X-Mailer: git-send-email 2.39.2 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Add two new kfuncs that allow a BPF tc-hook, installed on an ipip device in collect-metadata mode, to control FOU encap parameters on a per-packet level. The set of kfuncs is registered with the fou module. The bpf_skb_set_fou_encap kfunc is supposed to be used in tandem and after a successful call to the bpf_skb_set_tunnel_key bpf-helper. UDP source and destination ports can be controlled by passing a struct bpf_fou_encap. A source port of zero will auto-assign a source port. enum bpf_fou_encap_type is used to specify if the egress path should FOU or GUE encap the packet. On the ingress path bpf_skb_get_fou_encap can be used to read UDP source and destination ports from the receiver's point of view and allows for packet multiplexing across different destination ports within a single BPF program and ipip device. Reported-by: kernel test robot Link: https://lore.kernel.org/oe-kbuild-all/202304020425.L8MwfV5h-lkp@intel.com/ Signed-off-by: Christian Ehrig --- include/net/fou.h | 2 + net/ipv4/Makefile | 2 +- net/ipv4/fou_bpf.c | 119 ++++++++++++++++++++++++++++++++++++++++++++ net/ipv4/fou_core.c | 5 ++ 4 files changed, 127 insertions(+), 1 deletion(-) create mode 100644 net/ipv4/fou_bpf.c diff --git a/include/net/fou.h b/include/net/fou.h index 80f56e275b08..824eb4b231fd 100644 --- a/include/net/fou.h +++ b/include/net/fou.h @@ -17,4 +17,6 @@ int __fou_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e, int __gue_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e, u8 *protocol, __be16 *sport, int type); +int register_fou_bpf(void); + #endif diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile index 880277c9fd07..b18ba8ef93ad 100644 --- a/net/ipv4/Makefile +++ b/net/ipv4/Makefile @@ -26,7 +26,7 @@ obj-$(CONFIG_IP_MROUTE) += ipmr.o obj-$(CONFIG_IP_MROUTE_COMMON) += ipmr_base.o obj-$(CONFIG_NET_IPIP) += ipip.o gre-y := gre_demux.o -fou-y := fou_core.o fou_nl.o +fou-y := fou_core.o fou_nl.o fou_bpf.o obj-$(CONFIG_NET_FOU) += fou.o obj-$(CONFIG_NET_IPGRE_DEMUX) += gre.o obj-$(CONFIG_NET_IPGRE) += ip_gre.o diff --git a/net/ipv4/fou_bpf.c b/net/ipv4/fou_bpf.c new file mode 100644 index 000000000000..3760a14b6b57 --- /dev/null +++ b/net/ipv4/fou_bpf.c @@ -0,0 +1,119 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Unstable Fou Helpers for TC-BPF hook + * + * These are called from SCHED_CLS BPF programs. Note that it is + * allowed to break compatibility for these functions since the interface they + * are exposed through to BPF programs is explicitly unstable. + */ + +#include +#include + +#include +#include + +struct bpf_fou_encap { + __be16 sport; + __be16 dport; +}; + +enum bpf_fou_encap_type { + FOU_BPF_ENCAP_FOU, + FOU_BPF_ENCAP_GUE, +}; + +__diag_push(); +__diag_ignore_all("-Wmissing-prototypes", + "Global functions as their definitions will be in BTF"); + +/* bpf_skb_set_fou_encap - Set FOU encap parameters + * + * This function allows for using GUE or FOU encapsulation together with an + * ipip device in collect-metadata mode. + * + * It is meant to be used in BPF tc-hooks and after a call to the + * bpf_skb_set_tunnel_key helper, responsible for setting IP addresses. + * + * Parameters: + * @skb_ctx Pointer to ctx (__sk_buff) in TC program. Cannot be NULL + * @encap Pointer to a `struct bpf_fou_encap` storing UDP src and + * dst ports. If sport is set to 0 the kernel will auto-assign a + * port. This is similar to using `encap-sport auto`. + * Cannot be NULL + * @type Encapsulation type for the packet. Their definitions are + * specified in `enum bpf_fou_encap_type` + */ +__bpf_kfunc int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, + struct bpf_fou_encap *encap, int type) +{ + struct sk_buff *skb = (struct sk_buff *)skb_ctx; + struct ip_tunnel_info *info = skb_tunnel_info(skb); + + if (unlikely(!encap)) + return -EINVAL; + + if (unlikely(!info || !(info->mode & IP_TUNNEL_INFO_TX))) + return -EINVAL; + + switch (type) { + case FOU_BPF_ENCAP_FOU: + info->encap.type = TUNNEL_ENCAP_FOU; + break; + case FOU_BPF_ENCAP_GUE: + info->encap.type = TUNNEL_ENCAP_GUE; + break; + default: + info->encap.type = TUNNEL_ENCAP_NONE; + } + + if (info->key.tun_flags & TUNNEL_CSUM) + info->encap.flags |= TUNNEL_ENCAP_FLAG_CSUM; + + info->encap.sport = encap->sport; + info->encap.dport = encap->dport; + + return 0; +} + +/* bpf_skb_get_fou_encap - Get FOU encap parameters + * + * This function allows for reading encap metadata from a packet received + * on an ipip device in collect-metadata mode. + * + * Parameters: + * @skb_ctx Pointer to ctx (__sk_buff) in TC program. Cannot be NULL + * @encap Pointer to a struct bpf_fou_encap storing UDP source and + * destination port. Cannot be NULL + */ +__bpf_kfunc int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, + struct bpf_fou_encap *encap) +{ + struct sk_buff *skb = (struct sk_buff *)skb_ctx; + struct ip_tunnel_info *info = skb_tunnel_info(skb); + + if (unlikely(!info)) + return -EINVAL; + + encap->sport = info->encap.sport; + encap->dport = info->encap.dport; + + return 0; +} + +__diag_pop() + +BTF_SET8_START(fou_kfunc_set) +BTF_ID_FLAGS(func, bpf_skb_set_fou_encap) +BTF_ID_FLAGS(func, bpf_skb_get_fou_encap) +BTF_SET8_END(fou_kfunc_set) + +static const struct btf_kfunc_id_set fou_bpf_kfunc_set = { + .owner = THIS_MODULE, + .set = &fou_kfunc_set, +}; + +int register_fou_bpf(void) +{ + return register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, + &fou_bpf_kfunc_set); +} diff --git a/net/ipv4/fou_core.c b/net/ipv4/fou_core.c index cafec9b4eee0..0c41076e31ed 100644 --- a/net/ipv4/fou_core.c +++ b/net/ipv4/fou_core.c @@ -1236,10 +1236,15 @@ static int __init fou_init(void) if (ret < 0) goto unregister; + ret = register_fou_bpf(); + if (ret < 0) + goto kfunc_failed; + ret = ip_tunnel_encap_add_fou_ops(); if (ret == 0) return 0; +kfunc_failed: genl_unregister_family(&fou_nl_family); unregister: unregister_pernet_device(&fou_net_ops); From patchwork Mon Apr 3 12:12:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Ehrig X-Patchwork-Id: 13198055 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29293C761A6 for ; Mon, 3 Apr 2023 11:13:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232224AbjDCLN1 (ORCPT ); Mon, 3 Apr 2023 07:13:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36670 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232117AbjDCLNN (ORCPT ); Mon, 3 Apr 2023 07:13:13 -0400 Received: from mail-wr1-x429.google.com (mail-wr1-x429.google.com [IPv6:2a00:1450:4864:20::429]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2BE17FF09 for ; Mon, 3 Apr 2023 04:12:45 -0700 (PDT) Received: by mail-wr1-x429.google.com with SMTP id y14so28948334wrq.4 for ; Mon, 03 Apr 2023 04:12:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; t=1680520363; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=JJGK6+z2A0j3BXRQbw4triQYURy3/oqOBEOPJiYTwI4=; b=tpFFS4UnM3prfzvnWok4lKRG4k267cw8oCzicAYmRl1H6CB6VSOmte/empqxGvcWy7 Moqy9a1QyOZJ+XDWTAYRrDqM8ZZJZSxJG2WHAKEPIcRN2W2/t0mIggTKqDNFn6iAhKUd khD3w7P4NcOnjfp3u7S5/YNSB+jEavbx+NV+Y= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680520363; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=JJGK6+z2A0j3BXRQbw4triQYURy3/oqOBEOPJiYTwI4=; b=HtjL1g2YeNhho6ZTUrdK0NOssm2xGhW9M0/Mj40gjXziOxTiv9XpX5Yeg98RLAIJxA RODOrFlbWHrhPnJY8RvgROhqoM9ARoVhC3ousQISdAAnfG1TnqcbsfnBBMqYNT+/zqfC AXatJk1KQ9hyt2FGB1dFe9v0albc/P5cd08R4VQokHC5ONGsUjPEK0UoOEkedidZyt1u LQhYGEano6qxNhqioOVYYWv1DDON6HhrZB4c6kWTHJHbX7fBZ8BRlb3VfFaZI8Hmr4Ct mk6WC9SkjpfhZCeZ9g8pBGKKIY/dSxJ5FeezmF0gxrhryqvOlHc3N7AHRcr2eF2aevmm b2RA== X-Gm-Message-State: AAQBX9f48HCGDUzMLPco0737MffflRZHFzEEZemzxTN91AGzYS1xtOwD zrQknKd0JKh9Ipo+3Zyryxbr6pGWzpuOYt4Rp3Q1avdT X-Google-Smtp-Source: AKy350bcITxGCnkiMeQi5Vz8Ia9/Zlb9CV3ZkbXUJUt3e/Xz42s2EhmH92FXuuBHO4CBhxxI/xNO2g== X-Received: by 2002:adf:fccc:0:b0:2cf:e868:f789 with SMTP id f12-20020adffccc000000b002cfe868f789mr25909194wrs.48.1680520362592; Mon, 03 Apr 2023 04:12:42 -0700 (PDT) Received: from workstation.ehrig.io (tmo-066-125.customers.d1-online.com. [80.187.66.125]) by smtp.gmail.com with ESMTPSA id y11-20020adffa4b000000b002c7066a6f77sm9505517wrr.31.2023.04.03.04.12.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 03 Apr 2023 04:12:42 -0700 (PDT) From: Christian Ehrig To: bpf@vger.kernel.org Cc: cehrig@cloudflare.com, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Mykola Lysenko , Shuah Khan , Kaixi Fan , Paul Chaignon , Dave Marchevsky , linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH bpf-next v2 3/3] selftests/bpf: Test FOU kfuncs for externally controlled ipip devices Date: Mon, 3 Apr 2023 14:12:09 +0200 Message-Id: <528a824713c1545839d870eaad84d87749a23371.1680520500.git.cehrig@cloudflare.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Add tests for FOU and GUE encapsulation via the bpf_skb_{set,get}_fou_encap kfuncs, using ipip devices in collect-metadata mode. These tests make sure that we can successfully set and obtain FOU and GUE encap parameters using ingress / egress BPF tc-hooks. Signed-off-by: Christian Ehrig --- .../selftests/bpf/progs/test_tunnel_kern.c | 117 ++++++++++++++++++ tools/testing/selftests/bpf/test_tunnel.sh | 81 ++++++++++++ 2 files changed, 198 insertions(+) diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c index 9ab2d55ab7c0..f66af753bbbb 100644 --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c @@ -52,6 +52,21 @@ struct vxlan_metadata { __u32 gbp; }; +struct bpf_fou_encap { + __be16 sport; + __be16 dport; +}; + +enum bpf_fou_encap_type { + FOU_BPF_ENCAP_FOU, + FOU_BPF_ENCAP_GUE, +}; + +int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, + struct bpf_fou_encap *encap, int type) __ksym; +int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, + struct bpf_fou_encap *encap) __ksym; + struct { __uint(type, BPF_MAP_TYPE_ARRAY); __uint(max_entries, 1); @@ -749,6 +764,108 @@ int ipip_get_tunnel(struct __sk_buff *skb) return TC_ACT_OK; } +SEC("tc") +int ipip_gue_set_tunnel(struct __sk_buff *skb) +{ + struct bpf_tunnel_key key = {}; + struct bpf_fou_encap encap = {}; + void *data = (void *)(long)skb->data; + struct iphdr *iph = data; + void *data_end = (void *)(long)skb->data_end; + int ret; + + if (data + sizeof(*iph) > data_end) { + log_err(1); + return TC_ACT_SHOT; + } + + key.tunnel_ttl = 64; + if (iph->protocol == IPPROTO_ICMP) + key.remote_ipv4 = 0xac100164; /* 172.16.1.100 */ + + ret = bpf_skb_set_tunnel_key(skb, &key, sizeof(key), 0); + if (ret < 0) { + log_err(ret); + return TC_ACT_SHOT; + } + + encap.sport = 0; + encap.dport = bpf_htons(5555); + + ret = bpf_skb_set_fou_encap(skb, &encap, FOU_BPF_ENCAP_GUE); + if (ret < 0) { + log_err(ret); + return TC_ACT_SHOT; + } + + return TC_ACT_OK; +} + +SEC("tc") +int ipip_fou_set_tunnel(struct __sk_buff *skb) +{ + struct bpf_tunnel_key key = {}; + struct bpf_fou_encap encap = {}; + void *data = (void *)(long)skb->data; + struct iphdr *iph = data; + void *data_end = (void *)(long)skb->data_end; + int ret; + + if (data + sizeof(*iph) > data_end) { + log_err(1); + return TC_ACT_SHOT; + } + + key.tunnel_ttl = 64; + if (iph->protocol == IPPROTO_ICMP) + key.remote_ipv4 = 0xac100164; /* 172.16.1.100 */ + + ret = bpf_skb_set_tunnel_key(skb, &key, sizeof(key), 0); + if (ret < 0) { + log_err(ret); + return TC_ACT_SHOT; + } + + encap.sport = 0; + encap.dport = bpf_htons(5555); + + ret = bpf_skb_set_fou_encap(skb, &encap, FOU_BPF_ENCAP_FOU); + if (ret < 0) { + log_err(ret); + return TC_ACT_SHOT; + } + + return TC_ACT_OK; +} + +SEC("tc") +int ipip_encap_get_tunnel(struct __sk_buff *skb) +{ + int ret; + struct bpf_tunnel_key key = {}; + struct bpf_fou_encap encap = {}; + + ret = bpf_skb_get_tunnel_key(skb, &key, sizeof(key), 0); + if (ret < 0) { + log_err(ret); + return TC_ACT_SHOT; + } + + ret = bpf_skb_get_fou_encap(skb, &encap); + if (ret < 0) { + log_err(ret); + return TC_ACT_SHOT; + } + + if (bpf_ntohs(encap.dport) != 5555) + return TC_ACT_SHOT; + + bpf_printk("%d remote ip 0x%x, sport %d, dport %d\n", ret, + key.remote_ipv4, bpf_ntohs(encap.sport), + bpf_ntohs(encap.dport)); + return TC_ACT_OK; +} + SEC("tc") int ipip6_set_tunnel(struct __sk_buff *skb) { diff --git a/tools/testing/selftests/bpf/test_tunnel.sh b/tools/testing/selftests/bpf/test_tunnel.sh index 2dec7dbf29a2..f2379414a887 100755 --- a/tools/testing/selftests/bpf/test_tunnel.sh +++ b/tools/testing/selftests/bpf/test_tunnel.sh @@ -212,6 +212,24 @@ add_ipip_tunnel() ip addr add dev $DEV 10.1.1.200/24 } +add_ipip_encap_tunnel() +{ + # at_ns0 namespace + ip netns exec at_ns0 ip fou add port 5555 $IPPROTO + ip netns exec at_ns0 \ + ip link add dev $DEV_NS type $TYPE \ + local 172.16.1.100 remote 172.16.1.200 \ + encap $ENCAP encap-sport auto encap-dport 5555 noencap-csum + ip netns exec at_ns0 ip link set dev $DEV_NS up + ip netns exec at_ns0 ip addr add dev $DEV_NS 10.1.1.100/24 + + # root namespace + ip fou add port 5555 $IPPROTO + ip link add dev $DEV type $TYPE external + ip link set dev $DEV up + ip addr add dev $DEV 10.1.1.200/24 +} + add_ip6tnl_tunnel() { ip netns exec at_ns0 ip addr add ::11/96 dev veth0 @@ -461,6 +479,60 @@ test_ipip() echo -e ${GREEN}"PASS: $TYPE"${NC} } +test_ipip_gue() +{ + TYPE=ipip + DEV_NS=ipip00 + DEV=ipip11 + ret=0 + ENCAP=gue + IPPROTO=$ENCAP + + check $TYPE + config_device + add_ipip_encap_tunnel + ip link set dev veth1 mtu 1500 + attach_bpf $DEV ipip_gue_set_tunnel ipip_encap_get_tunnel + ping $PING_ARG 10.1.1.100 + check_err $? + ip netns exec at_ns0 ping $PING_ARG 10.1.1.200 + check_err $? + cleanup + + if [ $ret -ne 0 ]; then + echo -e ${RED}"FAIL: $TYPE (GUE)"${NC} + return 1 + fi + echo -e ${GREEN}"PASS: $TYPE (GUE)"${NC} +} + +test_ipip_fou() +{ + TYPE=ipip + DEV_NS=ipip00 + DEV=ipip11 + ret=0 + ENCAP=fou + IPPROTO="ipproto 4" + + check $TYPE + config_device + add_ipip_encap_tunnel + ip link set dev veth1 mtu 1500 + attach_bpf $DEV ipip_fou_set_tunnel ipip_encap_get_tunnel + ping $PING_ARG 10.1.1.100 + check_err $? + ip netns exec at_ns0 ping $PING_ARG 10.1.1.200 + check_err $? + cleanup + + if [ $ret -ne 0 ]; then + echo -e ${RED}"FAIL: $TYPE (FOU)"${NC} + return 1 + fi + echo -e ${GREEN}"PASS: $TYPE (FOU)"${NC} +} + test_ipip6() { TYPE=ip6tnl @@ -634,6 +706,7 @@ cleanup() ip xfrm policy delete dir in src 10.1.1.100/32 dst 10.1.1.200/32 2> /dev/null ip xfrm state delete src 172.16.1.100 dst 172.16.1.200 proto esp spi 0x1 2> /dev/null ip xfrm state delete src 172.16.1.200 dst 172.16.1.100 proto esp spi 0x2 2> /dev/null + ip fou del port 5555 gue 2> /dev/null } cleanup_exit() @@ -708,6 +781,14 @@ bpf_tunnel_test() test_ipip errors=$(( $errors + $? )) + echo "Testing IPIP (GUE) tunnel..." + test_ipip_gue + errors=$(( $errors + $? )) + + echo "Testing IPIP (FOU) tunnel..." + test_ipip_fou + errors=$(( $errors + $? )) + echo "Testing IPIP6 tunnel..." test_ipip6 errors=$(( $errors + $? ))