Message ID | 20220911122328.306188-5-shmulik.ladkani@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Series | bpf: Support setting variable-length tunnel options |
On 9/11/22 5:23 AM, Shmulik Ladkani wrote:
> Add geneve test to test_tunnel. The test setup and scheme resembles the
> existing vxlan test.
>
> The test also exercises tunnel option assignment using
> bpf_skb_set_tunnel_opt_dynptr.
>
> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
>
> ---
> v6:
> - Fix missing retcodes in progs/test_tunnel_kern.c
>   spotted by John Fastabend <john.fastabend@gmail.com>
> - Simplify bpf_skb_set_tunnel_opt_dynptr's interface, removing the
>   superfluous 'len' parameter
>   suggested by Andrii Nakryiko <andrii.nakryiko@gmail.com>
> ---
>  .../selftests/bpf/prog_tests/test_tunnel.c    | 108 ++++++++++++++
>  .../selftests/bpf/progs/test_tunnel_kern.c    | 138 ++++++++++++++++++
>  2 files changed, 246 insertions(+)
>
[...]
>
> diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> index b11f6952b0c8..cb901b76a547 100644
> --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> @@ -24,6 +24,20 @@
>
>  #define log_err(__ret) bpf_printk("ERROR line:%d ret:%d\n", __LINE__, __ret)
>
> +#define GENEVE_OPTS_LEN0 12
> +#define GENEVE_OPTS_LEN1 20
> +
> +struct tun_opts_raw {
> +	__u8 data[64];
> +};
> +
> +struct {
> +	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
> +	__uint(max_entries, 1);
> +	__type(key, __u32);
> +	__type(value, struct tun_opts_raw);
> +} geneve_opts SEC(".maps");
> +
>  struct geneve_opt {
>  	__be16 opt_class;
>  	__u8 type;
> @@ -286,6 +300,130 @@ int ip4ip6erspan_get_tunnel(struct __sk_buff *skb)
>  	return TC_ACT_OK;
>  }
>
> +SEC("tc")
> +int geneve_set_tunnel_dst(struct __sk_buff *skb)
> +{
> +	int ret;
> +	struct bpf_tunnel_key key;
> +	struct tun_opts_raw *opts;
> +	struct bpf_dynptr dptr;
> +	__u32 index = 0;
> +	__u32 *local_ip = NULL;
> +
> +	local_ip = bpf_map_lookup_elem(&local_ip_map, &index);
> +	if (!local_ip) {
> +		log_err(-1);
> +		return TC_ACT_SHOT;
> +	}
> +
> +	index = 0;
> +	opts = bpf_map_lookup_elem(&geneve_opts, &index);
> +	if (!opts) {
> +		log_err(-1);
> +		return TC_ACT_SHOT;
> +	}
> +
> +	__builtin_memset(&key, 0x0, sizeof(key));
> +	key.local_ipv4 = 0xac100164; /* 172.16.1.100 */
> +	key.remote_ipv4 = *local_ip;
> +	key.tunnel_id = 2;
> +	key.tunnel_tos = 0;
> +	key.tunnel_ttl = 64;
> +
> +	ret = bpf_skb_set_tunnel_key(skb, &key, sizeof(key),
> +				     BPF_F_ZERO_CSUM_TX);
> +	if (ret < 0) {
> +		log_err(ret);
> +		return TC_ACT_SHOT;
> +	}
> +
> +	/* set empty geneve options (of runtime length) using a dynptr */
> +	__builtin_memset(opts, 0x0, sizeof(*opts));
> +	if (*local_ip % 2)
> +		bpf_dynptr_from_mem(opts, GENEVE_OPTS_LEN1, 0, &dptr);
> +	else
> +		bpf_dynptr_from_mem(opts, GENEVE_OPTS_LEN0, 0, &dptr);
> +	ret = bpf_skb_set_tunnel_opt_dynptr(skb, &dptr);

I think the above example is not good. since it can write as
	if (*local_ip % 2)
		ret = bpf_skb_set_tunnel_opt(skb, opts, GENEVE_OPTS_LEN1);
	else
		ret = bpf_skb_set_tunnel_opt(skb, opts, GENEVE_OPTS_LEN0);

In the commit message of Patch 2, we have

===
For example, we have an ebpf program that gets geneve options on
incoming packets, stores them into a map (using a key representing
the incoming flow), and later needs to assign *same* options to
reply packets (belonging to same flow).
===

It would be great if you can create a test case for the above
use case.

> +	if (ret < 0) {
> +		log_err(ret);
> +		return TC_ACT_SHOT;
> +	}
> +
> +	return TC_ACT_OK;
> +}
> +
[...]
On Mon, 19 Sep 2022 19:58:20 -0700
Yonghong Song <yhs@fb.com> wrote:

> > +	/* set empty geneve options (of runtime length) using a dynptr */
> > +	__builtin_memset(opts, 0x0, sizeof(*opts));
> > +	if (*local_ip % 2)
> > +		bpf_dynptr_from_mem(opts, GENEVE_OPTS_LEN1, 0, &dptr);
> > +	else
> > +		bpf_dynptr_from_mem(opts, GENEVE_OPTS_LEN0, 0, &dptr);
> > +	ret = bpf_skb_set_tunnel_opt_dynptr(skb, &dptr);
>
> I think the above example is not good. since it can write as
> 	if (*local_ip % 2)
> 		ret = bpf_skb_set_tunnel_opt(skb, opts, GENEVE_OPTS_LEN1);
> 	else
> 		ret = bpf_skb_set_tunnel_opt(skb, opts, GENEVE_OPTS_LEN0);
>
> In the commit message of Patch 2, we have
>
> ===
> For example, we have an ebpf program that gets geneve options on
> incoming packets, stores them into a map (using a key representing
> the incoming flow), and later needs to assign *same* options to
> reply packets (belonging to same flow).
> ===
>
> It would be great if you can create a test case for the above
> use case.

Yes, but please note dynptr trim/advance API is still WIP:

  https://lore.kernel.org/bpf/CAJnrk1a53F=LLaU+gdmXGcZBBeUR-anALT3iO6pyHKiZpD0cNw@mail.gmail.com/

However, once we settled on the API for setting variable length tunnel
options from a *dynptr* (and not from raw buffer+len), we can just
exercise 'bpf_skb_set_tunnel_opt_dynptr' regardless the original
usecase (i.e. we can assume dynptrs can be properly mangled).

In any case, I can later amend the test once all dynptr convenience
helpers are accepted.

Best,
Shmulik
On 9/19/22 10:22 PM, Shmulik Ladkani wrote:
> On Mon, 19 Sep 2022 19:58:20 -0700
> Yonghong Song <yhs@fb.com> wrote:
>
>>> +	/* set empty geneve options (of runtime length) using a dynptr */
>>> +	__builtin_memset(opts, 0x0, sizeof(*opts));
>>> +	if (*local_ip % 2)
>>> +		bpf_dynptr_from_mem(opts, GENEVE_OPTS_LEN1, 0, &dptr);
>>> +	else
>>> +		bpf_dynptr_from_mem(opts, GENEVE_OPTS_LEN0, 0, &dptr);
>>> +	ret = bpf_skb_set_tunnel_opt_dynptr(skb, &dptr);
>>
>> I think the above example is not good. since it can write as
>> 	if (*local_ip % 2)
>> 		ret = bpf_skb_set_tunnel_opt(skb, opts, GENEVE_OPTS_LEN1);
>> 	else
>> 		ret = bpf_skb_set_tunnel_opt(skb, opts, GENEVE_OPTS_LEN0);
>>
>> In the commit message of Patch 2, we have
>>
>> ===
>> For example, we have an ebpf program that gets geneve options on
>> incoming packets, stores them into a map (using a key representing
>> the incoming flow), and later needs to assign *same* options to
>> reply packets (belonging to same flow).
>> ===
>>
>> It would be great if you can create a test case for the above
>> use case.
>
> Yes, but please note dynptr trim/advance API is still WIP:
>
>   https://lore.kernel.org/bpf/CAJnrk1a53F=LLaU+gdmXGcZBBeUR-anALT3iO6pyHKiZpD0cNw@mail.gmail.com/
>
> However, once we settled on the API for setting variable length tunnel
> options from a *dynptr* (and not from raw buffer+len), we can just
> exercise 'bpf_skb_set_tunnel_opt_dynptr' regardless the original
> usecase (i.e. we can assume dynptrs can be properly mangled).
>
> In any case, I can later amend the test once all dynptr convenience
> helpers are accepted.

Could you give more details how you could use these additional dynptr
trim/advance APIs in your use case? It would give an overall picture
whether bpf_skb_set_tunnel_opt_dynptr() is useful or not.

W.r.t. your map use case, you could create a map and populate
needed info (geneve options, lens) in user space, and then the bpf
program tries to get such information from the map and then
call bpf_skb_set_tunnel_opt_dynptr(). Maybe this could mimic your
use case?

>
> Best,
> Shmulik
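The scheme suggested above (a map whose value carries both the option bytes and their length, filled on one side and replayed on the other) can be modeled in plain user-space C. This is only a sketch under stated assumptions: it is not a BPF program, it does not call the proposed bpf_skb_set_tunnel_opt_dynptr helper, and `struct flow_opts`, `store_opts`, `load_opts`, `FLOWS_MAX` and the `flow_id` key are hypothetical names introduced for illustration. The 64-byte buffer mirrors `struct tun_opts_raw` from the patch, and the length check reflects that Geneve option space is sized in 4-byte units.

```c
#include <stdint.h>
#include <string.h>

#define OPTS_MAX  64	/* mirrors sizeof(struct tun_opts_raw) in the patch */
#define FLOWS_MAX 16	/* hypothetical table size */

/* Hypothetical map value: option bytes plus their runtime length. */
struct flow_opts {
	uint32_t flow_id;	/* hypothetical flow key */
	uint32_t len;		/* bytes of data[] in use */
	uint8_t data[OPTS_MAX];
	int used;
};

static struct flow_opts table[FLOWS_MAX];

/* Geneve option space is a multiple of 4 bytes; also bound by the buffer. */
static int opts_len_valid(uint32_t len)
{
	return len <= OPTS_MAX && (len % 4) == 0;
}

/* "Incoming" side: remember the options seen on a flow. Returns 0 or -1. */
static int store_opts(uint32_t flow_id, const uint8_t *opts, uint32_t len)
{
	struct flow_opts *slot = &table[flow_id % FLOWS_MAX];

	if (!opts_len_valid(len))
		return -1;
	slot->flow_id = flow_id;
	slot->len = len;
	memcpy(slot->data, opts, len);
	slot->used = 1;
	return 0;
}

/* "Reply" side: fetch the stored options. Returns their length or -1. */
static int load_opts(uint32_t flow_id, uint8_t *out, uint32_t out_size)
{
	const struct flow_opts *slot = &table[flow_id % FLOWS_MAX];

	if (!slot->used || slot->flow_id != flow_id || slot->len > out_size)
		return -1;
	memcpy(out, slot->data, slot->len);
	return (int)slot->len;
}
```

The point of the model is that the length returned by `load_opts` is only known at run time, which is the situation the thread is debating: whether a test built this way would show something the fixed-length `bpf_skb_set_tunnel_opt` calls cannot.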
diff --git a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c
index 852da04ff281..9aae03c720e9 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_tunnel.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_tunnel.c
@@ -87,6 +87,8 @@
 #define VXLAN_TUNL_DEV1 "vxlan11"
 #define IP6VXLAN_TUNL_DEV0 "ip6vxlan00"
 #define IP6VXLAN_TUNL_DEV1 "ip6vxlan11"
+#define GENEVE_TUNL_DEV0 "geneve00"
+#define GENEVE_TUNL_DEV1 "geneve11"
 
 #define PING_ARGS "-i 0.01 -c 3 -w 10 -q"
 
@@ -133,6 +135,38 @@ static void cleanup(void)
 	SYS_NOFAIL("ip rule del to %s table 20 2> /dev/null", IP4_ADDR2_VETH1);
 	SYS_NOFAIL("ip link del %s 2> /dev/null", VXLAN_TUNL_DEV1);
 	SYS_NOFAIL("ip link del %s 2> /dev/null", IP6VXLAN_TUNL_DEV1);
+	SYS_NOFAIL("ip link del %s 2> /dev/null", GENEVE_TUNL_DEV1);
+}
+
+static int add_geneve_tunnel(void)
+{
+	/* at_ns0 namespace */
+	SYS("ip netns exec at_ns0 ip link add dev %s type geneve external",
+	    GENEVE_TUNL_DEV0);
+	SYS("ip netns exec at_ns0 ip link set dev %s address %s up",
+	    GENEVE_TUNL_DEV0, MAC_TUNL_DEV0);
+	SYS("ip netns exec at_ns0 ip addr add dev %s %s/24",
+	    GENEVE_TUNL_DEV0, IP4_ADDR_TUNL_DEV0);
+	SYS("ip netns exec at_ns0 ip neigh add %s lladdr %s dev %s",
+	    IP4_ADDR_TUNL_DEV1, MAC_TUNL_DEV1, GENEVE_TUNL_DEV0);
+
+	/* root namespace */
+	SYS("ip link add dev %s type geneve external", GENEVE_TUNL_DEV1);
+	SYS("ip link set dev %s address %s up", GENEVE_TUNL_DEV1, MAC_TUNL_DEV1);
+	SYS("ip addr add dev %s %s/24", GENEVE_TUNL_DEV1, IP4_ADDR_TUNL_DEV1);
+	SYS("ip neigh add %s lladdr %s dev %s",
+	    IP4_ADDR_TUNL_DEV0, MAC_TUNL_DEV0, GENEVE_TUNL_DEV1);
+
+	return 0;
+fail:
+	return -1;
+}
+
+static void delete_geneve_tunnel(void)
+{
+	SYS_NOFAIL("ip netns exec at_ns0 ip link delete dev %s",
+		   GENEVE_TUNL_DEV0);
+	SYS_NOFAIL("ip link delete dev %s", GENEVE_TUNL_DEV1);
 }
 
 static int add_vxlan_tunnel(void)
@@ -248,6 +282,79 @@ static int attach_tc_prog(struct bpf_tc_hook *hook, int igr_fd, int egr_fd)
 	return 0;
 }
 
+static void test_geneve_tunnel(void)
+{
+	struct test_tunnel_kern *skel = NULL;
+	struct nstoken *nstoken;
+	int local_ip_map_fd = -1;
+	int set_src_prog_fd, get_src_prog_fd;
+	int set_dst_prog_fd;
+	int key = 0, ifindex = -1;
+	uint local_ip;
+	int err;
+	DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook,
+			    .attach_point = BPF_TC_INGRESS);
+
+	/* add genve tunnel */
+	err = add_geneve_tunnel();
+	if (!ASSERT_OK(err, "add geneve tunnel"))
+		goto done;
+
+	/* load and attach bpf prog to tunnel dev tc hook point */
+	skel = test_tunnel_kern__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "test_tunnel_kern__open_and_load"))
+		goto done;
+	ifindex = if_nametoindex(GENEVE_TUNL_DEV1);
+	if (!ASSERT_NEQ(ifindex, 0, "geneve11 ifindex"))
+		goto done;
+	tc_hook.ifindex = ifindex;
+	get_src_prog_fd = bpf_program__fd(skel->progs.geneve_get_tunnel_src);
+	set_src_prog_fd = bpf_program__fd(skel->progs.geneve_set_tunnel_src);
+	if (!ASSERT_GE(get_src_prog_fd, 0, "bpf_program__fd"))
+		goto done;
+	if (!ASSERT_GE(set_src_prog_fd, 0, "bpf_program__fd"))
+		goto done;
+	if (attach_tc_prog(&tc_hook, get_src_prog_fd, set_src_prog_fd))
+		goto done;
+
+	/* load and attach prog set_md to tunnel dev tc hook point at_ns0 */
+	nstoken = open_netns("at_ns0");
+	if (!ASSERT_OK_PTR(nstoken, "setns src"))
+		goto done;
+	ifindex = if_nametoindex(GENEVE_TUNL_DEV0);
+	if (!ASSERT_NEQ(ifindex, 0, "geneve00 ifindex"))
+		goto done;
+	tc_hook.ifindex = ifindex;
+	set_dst_prog_fd = bpf_program__fd(skel->progs.geneve_set_tunnel_dst);
+	if (!ASSERT_GE(set_dst_prog_fd, 0, "bpf_program__fd"))
+		goto done;
+	if (attach_tc_prog(&tc_hook, -1, set_dst_prog_fd))
+		goto done;
+	close_netns(nstoken);
+
+	/* use veth1 ip 1 as tunnel source ip */
+	local_ip_map_fd = bpf_map__fd(skel->maps.local_ip_map);
+	if (!ASSERT_GE(local_ip_map_fd, 0, "bpf_map__fd"))
+		goto done;
+	local_ip = IP4_ADDR1_HEX_VETH1;
+	err = bpf_map_update_elem(local_ip_map_fd, &key, &local_ip, BPF_ANY);
+	if (!ASSERT_OK(err, "update bpf local_ip_map"))
+		goto done;
+
+	/* ping test */
+	err = test_ping(AF_INET, IP4_ADDR_TUNL_DEV0);
+	if (!ASSERT_OK(err, "test_ping"))
+		goto done;
+
+done:
+	/* delete geneve tunnel */
+	delete_geneve_tunnel();
+	if (local_ip_map_fd >= 0)
+		close(local_ip_map_fd);
+	if (skel)
+		test_tunnel_kern__destroy(skel);
+}
+
 static void test_vxlan_tunnel(void)
 {
 	struct test_tunnel_kern *skel = NULL;
@@ -408,6 +515,7 @@ static void *test_tunnel_run_tests(void *arg)
 
 	RUN_TEST(vxlan_tunnel);
 	RUN_TEST(ip6vxlan_tunnel);
+	RUN_TEST(geneve_tunnel);
 
 	cleanup();
 
diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
index b11f6952b0c8..cb901b76a547 100644
--- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
+++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
@@ -24,6 +24,20 @@
 
 #define log_err(__ret) bpf_printk("ERROR line:%d ret:%d\n", __LINE__, __ret)
 
+#define GENEVE_OPTS_LEN0 12
+#define GENEVE_OPTS_LEN1 20
+
+struct tun_opts_raw {
+	__u8 data[64];
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+	__uint(max_entries, 1);
+	__type(key, __u32);
+	__type(value, struct tun_opts_raw);
+} geneve_opts SEC(".maps");
+
 struct geneve_opt {
 	__be16 opt_class;
 	__u8 type;
@@ -286,6 +300,130 @@ int ip4ip6erspan_get_tunnel(struct __sk_buff *skb)
 	return TC_ACT_OK;
 }
 
+SEC("tc")
+int geneve_set_tunnel_dst(struct __sk_buff *skb)
+{
+	int ret;
+	struct bpf_tunnel_key key;
+	struct tun_opts_raw *opts;
+	struct bpf_dynptr dptr;
+	__u32 index = 0;
+	__u32 *local_ip = NULL;
+
+	local_ip = bpf_map_lookup_elem(&local_ip_map, &index);
+	if (!local_ip) {
+		log_err(-1);
+		return TC_ACT_SHOT;
+	}
+
+	index = 0;
+	opts = bpf_map_lookup_elem(&geneve_opts, &index);
+	if (!opts) {
+		log_err(-1);
+		return TC_ACT_SHOT;
+	}
+
+	__builtin_memset(&key, 0x0, sizeof(key));
+	key.local_ipv4 = 0xac100164; /* 172.16.1.100 */
+	key.remote_ipv4 = *local_ip;
+	key.tunnel_id = 2;
+	key.tunnel_tos = 0;
+	key.tunnel_ttl = 64;
+
+	ret = bpf_skb_set_tunnel_key(skb, &key, sizeof(key),
+				     BPF_F_ZERO_CSUM_TX);
+	if (ret < 0) {
+		log_err(ret);
+		return TC_ACT_SHOT;
+	}
+
+	/* set empty geneve options (of runtime length) using a dynptr */
+	__builtin_memset(opts, 0x0, sizeof(*opts));
+	if (*local_ip % 2)
+		bpf_dynptr_from_mem(opts, GENEVE_OPTS_LEN1, 0, &dptr);
+	else
+		bpf_dynptr_from_mem(opts, GENEVE_OPTS_LEN0, 0, &dptr);
+	ret = bpf_skb_set_tunnel_opt_dynptr(skb, &dptr);
+	if (ret < 0) {
+		log_err(ret);
+		return TC_ACT_SHOT;
+	}
+
+	return TC_ACT_OK;
+}
+
+SEC("tc")
+int geneve_set_tunnel_src(struct __sk_buff *skb)
+{
+	int ret;
+	struct bpf_tunnel_key key;
+	__u32 index = 0;
+	__u32 *local_ip = NULL;
+
+	local_ip = bpf_map_lookup_elem(&local_ip_map, &index);
+	if (!local_ip) {
+		log_err(-1);
+		return TC_ACT_SHOT;
+	}
+
+	__builtin_memset(&key, 0x0, sizeof(key));
+	key.local_ipv4 = *local_ip;
+	key.remote_ipv4 = 0xac100164; /* 172.16.1.100 */
+	key.tunnel_id = 2;
+	key.tunnel_tos = 0;
+	key.tunnel_ttl = 64;
+
+	ret = bpf_skb_set_tunnel_key(skb, &key, sizeof(key),
+				     BPF_F_ZERO_CSUM_TX);
+	if (ret < 0) {
+		log_err(ret);
+		return TC_ACT_SHOT;
+	}
+
+	return TC_ACT_OK;
+}
+
+SEC("tc")
+int geneve_get_tunnel_src(struct __sk_buff *skb)
+{
+	int ret;
+	struct bpf_tunnel_key key;
+	struct tun_opts_raw opts;
+	int expected_opts_len;
+	__u32 index = 0;
+	__u32 *local_ip = NULL;
+
+	local_ip = bpf_map_lookup_elem(&local_ip_map, &index);
+	if (!local_ip) {
+		log_err(-1);
+		return TC_ACT_SHOT;
+	}
+
+	ret = bpf_skb_get_tunnel_key(skb, &key, sizeof(key), 0);
+	if (ret < 0) {
+		log_err(ret);
+		return TC_ACT_SHOT;
+	}
+
+	ret = bpf_skb_get_tunnel_opt(skb, &opts, sizeof(opts));
+	if (ret < 0) {
+		log_err(ret);
+		return TC_ACT_SHOT;
+	}
+
+	expected_opts_len = *local_ip % 2 ? GENEVE_OPTS_LEN1 : GENEVE_OPTS_LEN0;
+	if (key.local_ipv4 != *local_ip || ret != expected_opts_len) {
+		bpf_printk("geneve key %d local ip 0x%x remote ip 0x%x opts_len %d\n",
+			   key.tunnel_id, key.local_ipv4,
+			   key.remote_ipv4, ret);
+		bpf_printk("local_ip 0x%x\n", *local_ip);
+		log_err(ret);
+		return TC_ACT_SHOT;
+	}
+
+	return TC_ACT_OK;
+}
+
 SEC("tc")
 int vxlan_set_tunnel_dst(struct __sk_buff *skb)
 {
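The receive-side check in geneve_get_tunnel_src relies on the same parity rule the sender uses to pick the options length, and the patch annotates 0xac100164 as 172.16.1.100. A minimal user-space sketch of both pieces of arithmetic follows. The two length macros are copied from the patch; `expected_opts_len` and `ipv4_to_str` are hypothetical helper names used only for illustration, and the address is treated as a plain 32-bit value read most-significant byte first, as the patch comment does.

```c
#include <stdint.h>
#include <stdio.h>

#define GENEVE_OPTS_LEN0 12
#define GENEVE_OPTS_LEN1 20

/* Same parity rule as the patch: odd value -> LEN1, even value -> LEN0. */
static int expected_opts_len(uint32_t local_ip)
{
	return local_ip % 2 ? GENEVE_OPTS_LEN1 : GENEVE_OPTS_LEN0;
}

/* Hypothetical helper: render a 32-bit IPv4 value as a dotted quad. */
static void ipv4_to_str(uint32_t ip, char *buf, size_t size)
{
	snprintf(buf, size, "%u.%u.%u.%u",
		 (unsigned int)(ip >> 24) & 0xff,
		 (unsigned int)(ip >> 16) & 0xff,
		 (unsigned int)(ip >> 8) & 0xff,
		 (unsigned int)ip & 0xff);
}
```

Both candidate lengths are multiples of 4, so either branch yields a length that fits Geneve's 4-byte option granularity; which branch the selftest actually takes depends on the value of IP4_ADDR1_HEX_VETH1, which is defined outside this patch.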
Add geneve test to test_tunnel. The test setup and scheme resembles the
existing vxlan test.

The test also exercises tunnel option assignment using
bpf_skb_set_tunnel_opt_dynptr.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>

---
v6:
- Fix missing retcodes in progs/test_tunnel_kern.c
  spotted by John Fastabend <john.fastabend@gmail.com>
- Simplify bpf_skb_set_tunnel_opt_dynptr's interface, removing the
  superfluous 'len' parameter
  suggested by Andrii Nakryiko <andrii.nakryiko@gmail.com>
---
 .../selftests/bpf/prog_tests/test_tunnel.c    | 108 ++++++++++++++
 .../selftests/bpf/progs/test_tunnel_kern.c    | 138 ++++++++++++++++++
 2 files changed, 246 insertions(+)