From patchwork Thu Oct 15 15:46:48 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Toke_H=C3=B8iland-J=C3=B8rgensen?=
X-Patchwork-Id: 11839549
X-Patchwork-Delegate: bpf@iogearbox.net
Subject: [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
From: =?utf-8?q?Toke_H=C3=B8iland-J=C3=B8rgensen?=
To: Daniel Borkmann
Cc: David Ahern, netdev@vger.kernel.org, bpf@vger.kernel.org
Date: Thu, 15 Oct 2020 17:46:48 +0200
Message-ID: <160277680864.157904.8719768977907736015.stgit@toke.dk>
In-Reply-To: <160277680746.157904.8726318184090980429.stgit@toke.dk>
References: <160277680746.157904.8726318184090980429.stgit@toke.dk>
User-Agent: StGit/0.23
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-State: RFC

From: Toke Høiland-Jørgensen

Based on the discussion in [0], update the bpf_redirect_neigh() helper to
accept an optional parameter specifying the nexthop information. This makes
it possible to combine bpf_fib_lookup() and bpf_redirect_neigh() without
incurring a duplicate FIB lookup - since the FIB lookup helper will return
the nexthop information even if no neighbour is present, this can simply be
passed on to bpf_redirect_neigh() if bpf_fib_lookup() returns
BPF_FIB_LKUP_RET_NO_NEIGH.
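
As a rough illustration of the argument handling described above, here is a
userspace model in plain C. It is a sketch only: it is not BPF code and not
the kernel implementation. All model_* and MODEL_* names are hypothetical
stand-ins introduced for this example; the constant values are assumed to
mirror AF_INET/AF_INET6 and TC_ACT_SHOT/TC_ACT_REDIRECT on Linux. The
explicit NULL and negative-length guard is a model-only addition standing in
for checks the BPF verifier is expected to perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; values assumed to mirror the Linux definitions. */
#define MODEL_AF_INET		2
#define MODEL_AF_INET6		10
#define MODEL_TC_ACT_SHOT	2
#define MODEL_TC_ACT_REDIRECT	7

/* Models the UAPI struct bpf_redir_neigh added by this patch. */
struct model_redir_neigh {
	uint8_t nh_family;		/* MODEL_AF_INET or MODEL_AF_INET6 */
	union {
		uint32_t ipv4_nh;	/* network byte order */
		uint32_t ipv6_nh[4];	/* network byte order */
	};
};

/* Models the kernel-internal struct bpf_nh_params. */
struct model_nh_params {
	uint8_t nh_family;
	union {
		uint32_t ipv4_nh;
		uint32_t ipv6_nh[4];
	};
};

/* Models the fields of struct bpf_redirect_info that the helper writes. */
struct model_redirect_info {
	uint32_t tgt_index;
	struct model_nh_params nh;
};

/* Same idea as the BUILD_BUG_ON() in the patch: the sizes must match. */
static_assert(sizeof(struct model_redir_neigh) == sizeof(struct model_nh_params),
	      "nexthop structs must have the same size");

int model_redirect_neigh(struct model_redirect_info *ri, uint32_t ifindex,
			 const struct model_redir_neigh *params, int plen,
			 uint64_t flags)
{
	/* Model-only guard; in the kernel the verifier checks params/plen. */
	if (plen < 0 || (plen && !params))
		return MODEL_TC_ACT_SHOT;

	/* A non-zero plen must cover the whole struct; flags must be 0. */
	if ((plen && (size_t)plen < sizeof(*params)) || flags)
		return MODEL_TC_ACT_SHOT;

	ri->tgt_index = ifindex;
	if (plen)
		memcpy(&ri->nh, params, sizeof(ri->nh));
	else
		ri->nh.nh_family = 0;	/* no nexthop: fall back to a FIB lookup */

	return MODEL_TC_ACT_REDIRECT;
}
```

In this model, a caller that saw BPF_FIB_LKUP_RET_NO_NEIGH would fill in
nh_family and the nexthop address from the lookup result and pass
sizeof(params) as plen, while passing NULL and 0 keeps the previous
behaviour of letting the helper do the FIB lookup itself.
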
[0] https://lore.kernel.org/bpf/393e17fc-d187-3a8d-2f0d-a627c7c63fca@iogearbox.net/ Signed-off-by: Toke Høiland-Jørgensen --- include/linux/filter.h | 9 ++ include/uapi/linux/bpf.h | 23 +++++- net/core/filter.c | 152 +++++++++++++++++++++++++--------------- scripts/bpf_helpers_doc.py | 1 tools/include/uapi/linux/bpf.h | 23 +++++- 5 files changed, 143 insertions(+), 65 deletions(-) diff --git a/include/linux/filter.h b/include/linux/filter.h index 20fc24c9779a..ba9de7188cd0 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -607,12 +607,21 @@ struct bpf_skb_data_end { void *data_end; }; +struct bpf_nh_params { + u8 nh_family; + union { + __u32 ipv4_nh; + struct in6_addr ipv6_nh; + }; +}; + struct bpf_redirect_info { u32 flags; u32 tgt_index; void *tgt_value; struct bpf_map *map; u32 kern_flags; + struct bpf_nh_params nh; }; DECLARE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info); diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index bf5a99d803e4..980cc1363be8 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3677,15 +3677,19 @@ union bpf_attr { * Return * The id is returned or 0 in case the id could not be retrieved. * - * long bpf_redirect_neigh(u32 ifindex, u64 flags) + * long bpf_redirect_neigh(u32 ifindex, struct bpf_redir_neigh *params, int plen, u64 flags) * Description * Redirect the packet to another net device of index *ifindex* * and fill in L2 addresses from neighboring subsystem. This helper * is somewhat similar to **bpf_redirect**\ (), except that it * populates L2 addresses as well, meaning, internally, the helper - * performs a FIB lookup based on the skb's networking header to - * get the address of the next hop and then relies on the neighbor - * lookup for the L2 address of the nexthop. + * relies on the neighbor lookup for the L2 address of the nexthop. 
+ * + * The helper will perform a FIB lookup based on the skb's + * networking header to get the address of the next hop, unless + * this is supplied by the caller in the *params* argument. The + * *plen* argument indicates the len of *params* and should be set + * to 0 if *params* is NULL. * * The *flags* argument is reserved and must be 0. The helper is * currently only supported for tc BPF program types, and enabled @@ -4906,6 +4910,17 @@ struct bpf_fib_lookup { __u8 dmac[6]; /* ETH_ALEN */ }; +struct bpf_redir_neigh { + /* network family for lookup (AF_INET, AF_INET6) + */ + __u8 nh_family; + /* network address of nexthop; skips fib lookup to find gateway */ + union { + __be32 ipv4_nh; + __u32 ipv6_nh[4]; /* in6_addr; network order */ + }; +}; + enum bpf_task_fd_type { BPF_FD_TYPE_RAW_TRACEPOINT, /* tp name */ BPF_FD_TYPE_TRACEPOINT, /* tp name */ diff --git a/net/core/filter.c b/net/core/filter.c index c5e2a1c5fd8d..d073031a3a61 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2165,12 +2165,11 @@ static int __bpf_redirect(struct sk_buff *skb, struct net_device *dev, } #if IS_ENABLED(CONFIG_IPV6) -static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb) +static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb, + struct net_device *dev, const struct in6_addr *nexthop) { - struct dst_entry *dst = skb_dst(skb); - struct net_device *dev = dst->dev; u32 hh_len = LL_RESERVED_SPACE(dev); - const struct in6_addr *nexthop; + struct dst_entry *dst = NULL; struct neighbour *neigh; if (dev_xmit_recursion()) { @@ -2196,8 +2195,11 @@ static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb) } rcu_read_lock_bh(); - nexthop = rt6_nexthop(container_of(dst, struct rt6_info, dst), - &ipv6_hdr(skb)->daddr); + if (!nexthop) { + dst = skb_dst(skb); + nexthop = rt6_nexthop(container_of(dst, struct rt6_info, dst), + &ipv6_hdr(skb)->daddr); + } neigh = ip_neigh_gw6(dev, nexthop); if (likely(!IS_ERR(neigh))) { int ret; @@ -2210,36 +2212,46 @@ static 
int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb) return ret; } rcu_read_unlock_bh(); - IP6_INC_STATS(dev_net(dst->dev), - ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES); + if (dst) + IP6_INC_STATS(dev_net(dst->dev), + ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES); out_drop: kfree_skb(skb); return -ENETDOWN; } -static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev) +static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev, + struct bpf_nh_params *nh) { const struct ipv6hdr *ip6h = ipv6_hdr(skb); + struct in6_addr *nexthop = NULL; struct net *net = dev_net(dev); int err, ret = NET_XMIT_DROP; - struct dst_entry *dst; - struct flowi6 fl6 = { - .flowi6_flags = FLOWI_FLAG_ANYSRC, - .flowi6_mark = skb->mark, - .flowlabel = ip6_flowinfo(ip6h), - .flowi6_oif = dev->ifindex, - .flowi6_proto = ip6h->nexthdr, - .daddr = ip6h->daddr, - .saddr = ip6h->saddr, - }; - dst = ipv6_stub->ipv6_dst_lookup_flow(net, NULL, &fl6, NULL); - if (IS_ERR(dst)) - goto out_drop; + if (!nh->nh_family) { + struct dst_entry *dst; + struct flowi6 fl6 = { + .flowi6_flags = FLOWI_FLAG_ANYSRC, + .flowi6_mark = skb->mark, + .flowlabel = ip6_flowinfo(ip6h), + .flowi6_oif = dev->ifindex, + .flowi6_proto = ip6h->nexthdr, + .daddr = ip6h->daddr, + .saddr = ip6h->saddr, + }; + + dst = ipv6_stub->ipv6_dst_lookup_flow(net, NULL, &fl6, NULL); + if (IS_ERR(dst)) + goto out_drop; - skb_dst_set(skb, dst); + skb_dst_set(skb, dst); + } else if (nh->nh_family == AF_INET6) { + nexthop = &nh->ipv6_nh; + } else { + goto out_drop; + } - err = bpf_out_neigh_v6(net, skb); + err = bpf_out_neigh_v6(net, skb, dev, nexthop); if (unlikely(net_xmit_eval(err))) dev->stats.tx_errors++; else @@ -2260,11 +2272,9 @@ static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev) #endif /* CONFIG_IPV6 */ #if IS_ENABLED(CONFIG_INET) -static int bpf_out_neigh_v4(struct net *net, struct sk_buff *skb) +static int bpf_out_neigh_v4(struct net *net, struct sk_buff *skb, + 
struct net_device *dev, struct bpf_nh_params *nh) { - struct dst_entry *dst = skb_dst(skb); - struct rtable *rt = container_of(dst, struct rtable, dst); - struct net_device *dev = dst->dev; u32 hh_len = LL_RESERVED_SPACE(dev); struct neighbour *neigh; bool is_v6gw = false; @@ -2292,7 +2302,20 @@ static int bpf_out_neigh_v4(struct net *net, struct sk_buff *skb) } rcu_read_lock_bh(); - neigh = ip_neigh_for_gw(rt, skb, &is_v6gw); + if (!nh) { + struct dst_entry *dst = skb_dst(skb); + struct rtable *rt = container_of(dst, struct rtable, dst); + + neigh = ip_neigh_for_gw(rt, skb, &is_v6gw); + } else if (nh->nh_family == AF_INET6) { + neigh = ip_neigh_gw6(dev, &nh->ipv6_nh); + is_v6gw = true; + } else if (nh->nh_family == AF_INET) { + neigh = ip_neigh_gw4(dev, nh->ipv4_nh); + } else { + goto out_drop; + } + if (likely(!IS_ERR(neigh))) { int ret; @@ -2309,33 +2332,38 @@ static int bpf_out_neigh_v4(struct net *net, struct sk_buff *skb) return -ENETDOWN; } -static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev) +static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev, + struct bpf_nh_params *nh) { const struct iphdr *ip4h = ip_hdr(skb); struct net *net = dev_net(dev); int err, ret = NET_XMIT_DROP; - struct rtable *rt; - struct flowi4 fl4 = { - .flowi4_flags = FLOWI_FLAG_ANYSRC, - .flowi4_mark = skb->mark, - .flowi4_tos = RT_TOS(ip4h->tos), - .flowi4_oif = dev->ifindex, - .flowi4_proto = ip4h->protocol, - .daddr = ip4h->daddr, - .saddr = ip4h->saddr, - }; - rt = ip_route_output_flow(net, &fl4, NULL); - if (IS_ERR(rt)) - goto out_drop; - if (rt->rt_type != RTN_UNICAST && rt->rt_type != RTN_LOCAL) { - ip_rt_put(rt); - goto out_drop; - } + if (!nh->nh_family) { + struct rtable *rt; + struct flowi4 fl4 = { + .flowi4_flags = FLOWI_FLAG_ANYSRC, + .flowi4_mark = skb->mark, + .flowi4_tos = RT_TOS(ip4h->tos), + .flowi4_oif = dev->ifindex, + .flowi4_proto = ip4h->protocol, + .daddr = ip4h->daddr, + .saddr = ip4h->saddr, + }; + + rt = 
ip_route_output_flow(net, &fl4, NULL); + if (IS_ERR(rt)) + goto out_drop; + if (rt->rt_type != RTN_UNICAST && rt->rt_type != RTN_LOCAL) { + ip_rt_put(rt); + goto out_drop; + } - skb_dst_set(skb, &rt->dst); + skb_dst_set(skb, &rt->dst); + nh = NULL; + } - err = bpf_out_neigh_v4(net, skb); + err = bpf_out_neigh_v4(net, skb, dev, nh); if (unlikely(net_xmit_eval(err))) dev->stats.tx_errors++; else @@ -2355,7 +2383,8 @@ static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev) } #endif /* CONFIG_INET */ -static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev) +static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev, + struct bpf_nh_params *nh) { struct ethhdr *ethh = eth_hdr(skb); @@ -2370,9 +2399,9 @@ static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev) skb_reset_network_header(skb); if (skb->protocol == htons(ETH_P_IP)) - return __bpf_redirect_neigh_v4(skb, dev); + return __bpf_redirect_neigh_v4(skb, dev, nh); else if (skb->protocol == htons(ETH_P_IPV6)) - return __bpf_redirect_neigh_v6(skb, dev); + return __bpf_redirect_neigh_v6(skb, dev, nh); out: kfree_skb(skb); return -ENOTSUPP; @@ -2455,8 +2484,8 @@ int skb_do_redirect(struct sk_buff *skb) return -EAGAIN; } return flags & BPF_F_NEIGH ? 
- __bpf_redirect_neigh(skb, dev) : - __bpf_redirect(skb, dev, flags); + __bpf_redirect_neigh(skb, dev, &ri->nh) : + __bpf_redirect(skb, dev, flags); out_drop: kfree_skb(skb); return -EINVAL; @@ -2504,16 +2533,23 @@ static const struct bpf_func_proto bpf_redirect_peer_proto = { .arg2_type = ARG_ANYTHING, }; -BPF_CALL_2(bpf_redirect_neigh, u32, ifindex, u64, flags) +BPF_CALL_4(bpf_redirect_neigh, u32, ifindex, struct bpf_redir_neigh *, params, + int, plen, u64, flags) { struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); - if (unlikely(flags)) + if (unlikely((plen && plen < sizeof(*params)) || flags)) return TC_ACT_SHOT; ri->flags = BPF_F_NEIGH; ri->tgt_index = ifindex; + BUILD_BUG_ON(sizeof(struct bpf_redir_neigh) != sizeof(struct bpf_nh_params)); + if (plen) + memcpy(&ri->nh, params, sizeof(ri->nh)); + else + ri->nh.nh_family = 0; /* clear previous value */ + return TC_ACT_REDIRECT; } @@ -2522,7 +2558,9 @@ static const struct bpf_func_proto bpf_redirect_neigh_proto = { .gpl_only = false, .ret_type = RET_INTEGER, .arg1_type = ARG_ANYTHING, - .arg2_type = ARG_ANYTHING, + .arg2_type = ARG_PTR_TO_MEM_OR_NULL, + .arg3_type = ARG_CONST_SIZE_OR_ZERO, + .arg4_type = ARG_ANYTHING, }; BPF_CALL_2(bpf_msg_apply_bytes, struct sk_msg *, msg, u32, bytes) diff --git a/scripts/bpf_helpers_doc.py b/scripts/bpf_helpers_doc.py index 7d86fdd190be..6769caae142f 100755 --- a/scripts/bpf_helpers_doc.py +++ b/scripts/bpf_helpers_doc.py @@ -453,6 +453,7 @@ class PrinterHelpers(Printer): 'struct bpf_perf_event_data', 'struct bpf_perf_event_value', 'struct bpf_pidns_info', + 'struct bpf_redir_neigh', 'struct bpf_sk_lookup', 'struct bpf_sock', 'struct bpf_sock_addr', diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index bf5a99d803e4..980cc1363be8 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -3677,15 +3677,19 @@ union bpf_attr { * Return * The id is returned or 0 in case the id could not be retrieved. 
* - * long bpf_redirect_neigh(u32 ifindex, u64 flags) + * long bpf_redirect_neigh(u32 ifindex, struct bpf_redir_neigh *params, int plen, u64 flags) * Description * Redirect the packet to another net device of index *ifindex* * and fill in L2 addresses from neighboring subsystem. This helper * is somewhat similar to **bpf_redirect**\ (), except that it * populates L2 addresses as well, meaning, internally, the helper - * performs a FIB lookup based on the skb's networking header to - * get the address of the next hop and then relies on the neighbor - * lookup for the L2 address of the nexthop. + * relies on the neighbor lookup for the L2 address of the nexthop. + * + * The helper will perform a FIB lookup based on the skb's + * networking header to get the address of the next hop, unless + * this is supplied by the caller in the *params* argument. The + * *plen* argument indicates the len of *params* and should be set + * to 0 if *params* is NULL. * * The *flags* argument is reserved and must be 0. 
The helper is * currently only supported for tc BPF program types, and enabled @@ -4906,6 +4910,17 @@ struct bpf_fib_lookup { __u8 dmac[6]; /* ETH_ALEN */ }; +struct bpf_redir_neigh { + /* network family for lookup (AF_INET, AF_INET6) + */ + __u8 nh_family; + /* network address of nexthop; skips fib lookup to find gateway */ + union { + __be32 ipv4_nh; + __u32 ipv6_nh[4]; /* in6_addr; network order */ + }; +}; + enum bpf_task_fd_type { BPF_FD_TYPE_RAW_TRACEPOINT, /* tp name */ BPF_FD_TYPE_TRACEPOINT, /* tp name */

From patchwork Thu Oct 15 15:46:49 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Toke_H=C3=B8iland-J=C3=B8rgensen?=
X-Patchwork-Id: 11839547
X-Patchwork-Delegate: bpf@iogearbox.net
Subject: [PATCH RFC bpf-next 2/2] selftests: Update test_tc_neigh to use the modified bpf_redirect_neigh()
From: =?utf-8?q?Toke_H=C3=B8iland-J=C3=B8rgensen?=
To: Daniel Borkmann
Cc: David Ahern, netdev@vger.kernel.org, bpf@vger.kernel.org
Date: Thu, 15 Oct 2020 17:46:49 +0200
Message-ID: <160277680973.157904.15451524562795164056.stgit@toke.dk>
In-Reply-To: <160277680746.157904.8726318184090980429.stgit@toke.dk>
References: <160277680746.157904.8726318184090980429.stgit@toke.dk>
User-Agent: StGit/0.23
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-State: RFC

From: Toke Høiland-Jørgensen

This updates the test_tc_neigh selftest to use the new syntax of
bpf_redirect_neigh(). To exercise the helper both with and without the
optional parameter, one forwarding direction is changed to do a
bpf_fib_lookup() followed by a call to bpf_redirect_neigh(), while the
other direction is using the map-based ifindex lookup letting the redirect
helper resolve the nexthop from the FIB.

This also fixes the test_tc_redirect.sh script to work on systems that have
a consolidated dual-stack 'ping' binary instead of separate ping/ping6
versions.
Signed-off-by: Toke Høiland-Jørgensen --- tools/testing/selftests/bpf/progs/test_tc_neigh.c | 83 ++++++++++++++++++--- tools/testing/selftests/bpf/test_tc_redirect.sh | 8 +- 2 files changed, 78 insertions(+), 13 deletions(-) diff --git a/tools/testing/selftests/bpf/progs/test_tc_neigh.c b/tools/testing/selftests/bpf/progs/test_tc_neigh.c index fe182616b112..ba03e603ba9b 100644 --- a/tools/testing/selftests/bpf/progs/test_tc_neigh.c +++ b/tools/testing/selftests/bpf/progs/test_tc_neigh.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include #include +#include #include #include @@ -32,6 +33,9 @@ a.s6_addr32[3] == b.s6_addr32[3]) #endif +#define AF_INET 2 +#define AF_INET6 10 + enum { dev_src, dev_dst, @@ -45,7 +49,8 @@ struct bpf_map_def SEC("maps") ifindex_map = { }; static __always_inline bool is_remote_ep_v4(struct __sk_buff *skb, - __be32 addr) + __be32 addr, + struct bpf_fib_lookup *fib_params) { void *data_end = ctx_ptr(skb->data_end); void *data = ctx_ptr(skb->data); @@ -58,11 +63,26 @@ static __always_inline bool is_remote_ep_v4(struct __sk_buff *skb, if ((void *)(ip4h + 1) > data_end) return false; - return ip4h->daddr == addr; + if (ip4h->daddr != addr) + return false; + + if (fib_params) { + fib_params->family = AF_INET; + fib_params->tos = ip4h->tos; + fib_params->l4_protocol = ip4h->protocol; + fib_params->sport = 0; + fib_params->dport = 0; + fib_params->tot_len = bpf_ntohs(ip4h->tot_len); + fib_params->ipv4_src = ip4h->saddr; + fib_params->ipv4_dst = ip4h->daddr; + } + + return true; } static __always_inline bool is_remote_ep_v6(struct __sk_buff *skb, - struct in6_addr addr) + struct in6_addr addr, + struct bpf_fib_lookup *fib_params) { void *data_end = ctx_ptr(skb->data_end); void *data = ctx_ptr(skb->data); @@ -75,7 +95,24 @@ static __always_inline bool is_remote_ep_v6(struct __sk_buff *skb, if ((void *)(ip6h + 1) > data_end) return false; - return v6_equal(ip6h->daddr, addr); + if (!v6_equal(ip6h->daddr, addr)) + return false; + + if 
(fib_params) { + struct in6_addr *src = (struct in6_addr *)fib_params->ipv6_src; + struct in6_addr *dst = (struct in6_addr *)fib_params->ipv6_dst; + + fib_params->family = AF_INET6; + fib_params->flowinfo = 0; + fib_params->l4_protocol = ip6h->nexthdr; + fib_params->sport = 0; + fib_params->dport = 0; + fib_params->tot_len = bpf_ntohs(ip6h->payload_len); + *src = ip6h->saddr; + *dst = ip6h->daddr; + } + + return true; } static __always_inline int get_dev_ifindex(int which) @@ -99,15 +136,17 @@ SEC("chk_egress") int tc_chk(struct __sk_buff *skb) SEC("dst_ingress") int tc_dst(struct __sk_buff *skb) { + struct bpf_fib_lookup fib_params = { .ifindex = skb->ingress_ifindex }; __u8 zero[ETH_ALEN * 2]; bool redirect = false; + int ret; switch (skb->protocol) { case __bpf_constant_htons(ETH_P_IP): - redirect = is_remote_ep_v4(skb, __bpf_constant_htonl(ip4_src)); + redirect = is_remote_ep_v4(skb, __bpf_constant_htonl(ip4_src), &fib_params); break; case __bpf_constant_htons(ETH_P_IPV6): - redirect = is_remote_ep_v6(skb, (struct in6_addr)ip6_src); + redirect = is_remote_ep_v6(skb, (struct in6_addr)ip6_src, &fib_params); break; } @@ -118,7 +157,31 @@ SEC("dst_ingress") int tc_dst(struct __sk_buff *skb) if (bpf_skb_store_bytes(skb, 0, &zero, sizeof(zero), 0) < 0) return TC_ACT_SHOT; - return bpf_redirect_neigh(get_dev_ifindex(dev_src), 0); + ret = bpf_fib_lookup(skb, &fib_params, sizeof(fib_params), 0); + bpf_printk("bpf_fib_lookup() ret: %d\n", ret); + if (ret == BPF_FIB_LKUP_RET_SUCCESS) { + void *data_end = ctx_ptr(skb->data_end); + struct ethhdr *eth = ctx_ptr(skb->data); + + if (eth + 1 > data_end) + return TC_ACT_SHOT; + + __builtin_memcpy(eth->h_dest, fib_params.dmac, ETH_ALEN); + __builtin_memcpy(eth->h_source, fib_params.smac, ETH_ALEN); + + return bpf_redirect(fib_params.ifindex, 0); + + } else if (ret == BPF_FIB_LKUP_RET_NO_NEIGH) { + struct bpf_redir_neigh nh_params = {}; + + nh_params.nh_family = fib_params.family; + __builtin_memcpy(&nh_params.ipv6_nh, 
&fib_params.ipv6_dst, + sizeof(nh_params.ipv6_nh)); + + return bpf_redirect_neigh(fib_params.ifindex, &nh_params, + sizeof(nh_params), 0); + } + return TC_ACT_SHOT; } SEC("src_ingress") int tc_src(struct __sk_buff *skb) @@ -128,10 +191,10 @@ SEC("src_ingress") int tc_src(struct __sk_buff *skb) switch (skb->protocol) { case __bpf_constant_htons(ETH_P_IP): - redirect = is_remote_ep_v4(skb, __bpf_constant_htonl(ip4_dst)); + redirect = is_remote_ep_v4(skb, __bpf_constant_htonl(ip4_dst), NULL); break; case __bpf_constant_htons(ETH_P_IPV6): - redirect = is_remote_ep_v6(skb, (struct in6_addr)ip6_dst); + redirect = is_remote_ep_v6(skb, (struct in6_addr)ip6_dst, NULL); break; } @@ -142,7 +205,7 @@ SEC("src_ingress") int tc_src(struct __sk_buff *skb) if (bpf_skb_store_bytes(skb, 0, &zero, sizeof(zero), 0) < 0) return TC_ACT_SHOT; - return bpf_redirect_neigh(get_dev_ifindex(dev_dst), 0); + return bpf_redirect_neigh(get_dev_ifindex(dev_dst), NULL, 0, 0); } char __license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/test_tc_redirect.sh b/tools/testing/selftests/bpf/test_tc_redirect.sh index 6d7482562140..09b20f24d018 100755 --- a/tools/testing/selftests/bpf/test_tc_redirect.sh +++ b/tools/testing/selftests/bpf/test_tc_redirect.sh @@ -24,8 +24,7 @@ command -v timeout >/dev/null 2>&1 || \ { echo >&2 "timeout is not available"; exit 1; } command -v ping >/dev/null 2>&1 || \ { echo >&2 "ping is not available"; exit 1; } -command -v ping6 >/dev/null 2>&1 || \ - { echo >&2 "ping6 is not available"; exit 1; } +if command -v ping6 >/dev/null 2>&1; then PING6=ping6; else PING6=ping; fi command -v perl >/dev/null 2>&1 || \ { echo >&2 "perl is not available"; exit 1; } command -v jq >/dev/null 2>&1 || \ @@ -152,7 +151,7 @@ netns_test_connectivity() echo -e "${TEST}: ${GREEN}PASS${NC}" TEST="ICMPv6 connectivity test" - ip netns exec ${NS_SRC} ping6 $PING_ARG ${IP6_DST} + ip netns exec ${NS_SRC} $PING6 $PING_ARG ${IP6_DST} if [ $? 
-ne 0 ]; then echo -e "${TEST}: ${RED}FAIL${NC}" exit 1 @@ -179,6 +178,9 @@ netns_setup_bpf() ip netns exec ${NS_FWD} tc filter add dev veth_dst_fwd ingress bpf da obj $obj sec dst_ingress ip netns exec ${NS_FWD} tc filter add dev veth_dst_fwd egress bpf da obj $obj sec chk_egress + # bpf_fib_lookup() checks if forwarding is enabled + ip netns exec ${NS_FWD} sysctl -w net.ipv4.ip_forward=1 net.ipv6.conf.veth_dst_fwd.forwarding=1 + veth_src=$(ip netns exec ${NS_FWD} cat /sys/class/net/veth_src_fwd/ifindex) veth_dst=$(ip netns exec ${NS_FWD} cat /sys/class/net/veth_dst_fwd/ifindex)