From patchwork Wed Jun 7 19:26:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Borkmann X-Patchwork-Id: 13271179 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9B00A6118; Wed, 7 Jun 2023 19:26:41 +0000 (UTC) Received: from www62.your-server.de (www62.your-server.de [213.133.104.62]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D1E8F1FDC; Wed, 7 Jun 2023 12:26:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=iogearbox.net; s=default2302; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID; bh=Bf6O+kSV0jfWqWpshgMNDyEp5rZhdeH2I3m46DcC2F4=; b=Z53+nfN6Qx0h4dGezI7PHbaiDZ PqsObpFDwV1QE/LbUKWXO9jyeKgZHRo4wfEDzdrbquzdsnQ67QLYULTrE0UppGteKsThyFsClpCZ+ jtTWJNSpCOh1mqmoDPrSA8ukrwulq+l0BoKneLPa+2Z8X4XYf/LH/mNeWEO8l2P3VoFjOFxhW2Fcy YKOPsVn01GZSnZ7p/mnMbAl45riLerHZq69Ry9u2KF/0YYy2gCVhYeN1oY9t/u4P/lBvyPwuEfsrf LINf5dR6yuwA7KfM+I+83T41SCgdI7HICkG3XkcDWqQxDcIPjX1w5Rb+6bLrNAgHC65fI3HOZ63eO j+RvmcFA==; Received: from 49.248.197.178.dynamic.dsl-lte-bonding.zhbmb00p-msn.res.cust.swisscom.ch ([178.197.248.49] helo=localhost) by www62.your-server.de with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1q6yni-000CXU-B2; Wed, 07 Jun 2023 21:26:34 +0200 From: Daniel Borkmann To: ast@kernel.org Cc: andrii@kernel.org, martin.lau@linux.dev, razor@blackwall.org, sdf@google.com, john.fastabend@gmail.com, kuba@kernel.org, dxu@dxuuu.xyz, joe@cilium.io, toke@kernel.org, davem@davemloft.net, bpf@vger.kernel.org, netdev@vger.kernel.org, Daniel Borkmann Subject: [PATCH bpf-next v2 1/7] bpf: Add generic attach/detach/query API for multi-progs Date: Wed, 7 Jun 2023 21:26:19 +0200 Message-Id: <20230607192625.22641-2-daniel@iogearbox.net> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20230607192625.22641-1-daniel@iogearbox.net> References: <20230607192625.22641-1-daniel@iogearbox.net> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Authenticated-Sender: daniel@iogearbox.net X-Virus-Scanned: Clear (ClamAV 0.103.8/26931/Wed Jun 7 09:23:57 2023) X-Spam-Status: No, score=1.2 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_SBL_CSS,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net This adds a generic layer called bpf_mprog which can be reused by different attachment layers to enable multi-program attachment and dependency resolution. In-kernel users of the bpf_mprog don't need to care about the dependency resolution internals, they can just consume it with few API calls. The initial idea of having a generic API sparked out of discussion [0] from an earlier revision of this work where tc's priority was reused and exposed via BPF uapi as a way to coordinate dependencies among tc BPF programs, similar as-is for classic tc BPF. The feedback was that priority provides a bad user experience and is hard to use [1], e.g.: I cannot help but feel that priority logic copy-paste from old tc, netfilter and friends is done because "that's how things were done in the past". [...] Priority gets exposed everywhere in uapi all the way to bpftool when it's right there for users to understand. And that's the main problem with it. The user don't want to and don't need to be aware of it, but uapi forces them to pick the priority. [...] Your cover letter [0] example proves that in real life different service pick the same priority. They simply don't know any better. Priority is an unnecessary magic that apps _have_ to pick, so they just copy-paste and everyone ends up using the same. The course of the discussion showed more and more the need for a generic, reusable API where the "same look and feel" can be applied for various other program types beyond just tc BPF, for example XDP today does not have multi- program support in kernel, but also there was interest around this API for improving management of cgroup program types. Such common multi-program management concept is useful for BPF management daemons or user space BPF applications coordinating about their attachments. Both from Cilium and Meta side [2], we've collected the following requirements for a generic attach/detach/query API for multi-progs which has been implemented as part of this work: - Support prog-based attach/detach and link API - Dependency directives (can also be combined): - BPF_F_{BEFORE,AFTER} with relative_{fd,id} which can be {prog,link,none} - BPF_F_ID flag as {fd,id} toggle - BPF_F_LINK flag as {prog,link} toggle - If relative_{fd,id} is none, then BPF_F_BEFORE will just prepend, and BPF_F_AFTER will just append for the case of attaching - Enforced only at attach time - BPF_F_{FIRST,LAST} - Enforced throughout the bpf_mprog state's lifetime - Admin override possible (e.g. link detach, prog-based BPF_F_REPLACE) - Internal revision counter and optionally being able to pass expected_revision - User space daemon can query current state with revision, and pass it along for attachment to assert current state before doing updates - Query also gets extension for link_ids array and link_attach_flags: - prog_ids are always filled with program IDs - link_ids are filled with link IDs when link was used, otherwise 0 - {prog,link}_attach_flags for holding {prog,link}-specific flags - Must be easy to integrate/reuse for in-kernel users The uapi-side changes needed for supporting bpf_mprog are rather minimal, consisting of the additions of the attachment flags, revision counter, and expanding existing union with relative_{fd,id} member. The bpf_mprog framework consists of an bpf_mprog_entry object which holds an array of bpf_mprog_fp (fast-path structure) and bpf_mprog_cp (control-path structure). Both have been separated, so that fast-path gets efficient packing of bpf_prog pointers for maximum cache efficieny. Also, array has been chosen instead of linked list or other structures to remove unnecessary indirections for a fast point-to-entry in tc for BPF. The bpf_mprog_entry comes as a pair via bpf_mprog_bundle so that in case of updates the peer bpf_mprog_entry is populated and then just swapped which avoids additional allocations that could otherwise fail, for example, in detach case. bpf_mprog_{fp,cp} arrays are currently static, but they could be converted to dynamic allocation if necessary at a point in future. Locking is deferred to the in-kernel user of bpf_mprog, for example, in case of tcx which uses this API in the next patch, it piggy- backs on rtnl. The nitty-gritty details are in the bpf_mprog_{replace,head_tail, add,del} implementation and an extensive test suite for checking all aspects of this API for prog-based attach/detach and link API as BPF selftests in this series. Kudos also to Andrii Nakryiko for API discussions wrt Meta's BPF management daemon. [0] https://lore.kernel.org/bpf/20221004231143.19190-1-daniel@iogearbox.net/ [1] https://lore.kernel.org/bpf/CAADnVQ+gEY3FjCR=+DmjDR4gp5bOYZUFJQXj4agKFHT9CQPZBw@mail.gmail.com [2] http://vger.kernel.org/bpfconf2023_material/tcx_meta_netdev_borkmann.pdf Signed-off-by: Daniel Borkmann --- MAINTAINERS | 1 + include/linux/bpf_mprog.h | 245 +++++++++++++++++ include/uapi/linux/bpf.h | 37 ++- kernel/bpf/Makefile | 2 +- kernel/bpf/mprog.c | 476 +++++++++++++++++++++++++++++++++ tools/include/uapi/linux/bpf.h | 37 ++- 6 files changed, 781 insertions(+), 17 deletions(-) create mode 100644 include/linux/bpf_mprog.h create mode 100644 kernel/bpf/mprog.c diff --git a/MAINTAINERS b/MAINTAINERS index c904dba1733b..754a9eeca0a1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3733,6 +3733,7 @@ F: include/linux/filter.h F: include/linux/tnum.h F: kernel/bpf/core.c F: kernel/bpf/dispatcher.c +F: kernel/bpf/mprog.c F: kernel/bpf/syscall.c F: kernel/bpf/tnum.c F: kernel/bpf/trampoline.c diff --git a/include/linux/bpf_mprog.h b/include/linux/bpf_mprog.h new file mode 100644 index 000000000000..7399181d8e6c --- /dev/null +++ b/include/linux/bpf_mprog.h @@ -0,0 +1,245 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2023 Isovalent */ +#ifndef __BPF_MPROG_H +#define __BPF_MPROG_H + +#include + +#define BPF_MPROG_MAX 64 +#define BPF_MPROG_SWAP 1 +#define BPF_MPROG_FREE 2 + +struct bpf_mprog_fp { + struct bpf_prog *prog; +}; + +struct bpf_mprog_cp { + struct bpf_link *link; + u32 flags; +}; + +struct bpf_mprog_entry { + struct bpf_mprog_fp fp_items[BPF_MPROG_MAX] ____cacheline_aligned; + struct bpf_mprog_cp cp_items[BPF_MPROG_MAX] ____cacheline_aligned; + struct bpf_mprog_bundle *parent; +}; + +struct bpf_mprog_bundle { + struct bpf_mprog_entry a; + struct bpf_mprog_entry b; + struct rcu_head rcu; + struct bpf_prog *ref; + atomic_t revision; +}; + +struct bpf_tuple { + struct bpf_prog *prog; + struct bpf_link *link; +}; + +static inline struct bpf_mprog_entry * +bpf_mprog_peer(const struct bpf_mprog_entry *entry) +{ + if (entry == &entry->parent->a) + return &entry->parent->b; + else + return &entry->parent->a; +} + +#define bpf_mprog_foreach_tuple(entry, fp, cp, t) \ + for (fp = &entry->fp_items[0], cp = &entry->cp_items[0]; \ + ({ \ + t.prog = READ_ONCE(fp->prog); \ + t.link = cp->link; \ + t.prog; \ + }); \ + fp++, cp++) + +#define bpf_mprog_foreach_prog(entry, fp, p) \ + for (fp = &entry->fp_items[0]; \ + (p = READ_ONCE(fp->prog)); \ + fp++) + +static inline struct bpf_mprog_entry *bpf_mprog_create(size_t extra_size) +{ + struct bpf_mprog_bundle *bundle; + + /* Fast-path items are not extensible, must only contain prog pointer! */ + BUILD_BUG_ON(sizeof(bundle->a.fp_items[0]) > sizeof(u64)); + /* Control-path items can be extended w/o affecting fast-path. */ + BUILD_BUG_ON(ARRAY_SIZE(bundle->a.fp_items) != ARRAY_SIZE(bundle->a.cp_items)); + + bundle = kzalloc(sizeof(*bundle) + extra_size, GFP_KERNEL); + if (bundle) { + atomic_set(&bundle->revision, 1); + bundle->a.parent = bundle; + bundle->b.parent = bundle; + return &bundle->a; + } + return NULL; +} + +static inline void bpf_mprog_free(struct bpf_mprog_entry *entry) +{ + kfree_rcu(entry->parent, rcu); +} + +static inline void bpf_mprog_mark_ref(struct bpf_mprog_entry *entry, + struct bpf_prog *prog) +{ + WARN_ON_ONCE(entry->parent->ref); + entry->parent->ref = prog; +} + +static inline u32 bpf_mprog_flags(u32 cur_flags, u32 req_flags, u32 flag) +{ + if (req_flags & flag) + cur_flags |= flag; + else + cur_flags &= ~flag; + return cur_flags; +} + +static inline u32 bpf_mprog_max(void) +{ + return ARRAY_SIZE(((struct bpf_mprog_entry *)NULL)->fp_items) - 1; +} + +static inline struct bpf_prog *bpf_mprog_first(struct bpf_mprog_entry *entry) +{ + return READ_ONCE(entry->fp_items[0].prog); +} + +static inline struct bpf_prog *bpf_mprog_last(struct bpf_mprog_entry *entry) +{ + struct bpf_prog *tmp, *prog = NULL; + struct bpf_mprog_fp *fp; + + bpf_mprog_foreach_prog(entry, fp, tmp) + prog = tmp; + return prog; +} + +static inline bool bpf_mprog_exists(struct bpf_mprog_entry *entry, + struct bpf_prog *prog) +{ + const struct bpf_mprog_fp *fp; + const struct bpf_prog *tmp; + + bpf_mprog_foreach_prog(entry, fp, tmp) { + if (tmp == prog) + return true; + } + return false; +} + +static inline struct bpf_prog *bpf_mprog_first_reg(struct bpf_mprog_entry *entry) +{ + struct bpf_tuple tuple = {}; + struct bpf_mprog_fp *fp; + struct bpf_mprog_cp *cp; + + bpf_mprog_foreach_tuple(entry, fp, cp, tuple) { + if (cp->flags & BPF_F_FIRST) + continue; + return tuple.prog; + } + return NULL; +} + +static inline struct bpf_prog *bpf_mprog_last_reg(struct bpf_mprog_entry *entry) +{ + struct bpf_tuple tuple = {}; + struct bpf_prog *prog = NULL; + struct bpf_mprog_fp *fp; + struct bpf_mprog_cp *cp; + + bpf_mprog_foreach_tuple(entry, fp, cp, tuple) { + if (cp->flags & BPF_F_LAST) + break; + prog = tuple.prog; + } + return prog; +} + +static inline void bpf_mprog_commit(struct bpf_mprog_entry *entry) +{ + do { + atomic_inc(&entry->parent->revision); + } while (atomic_read(&entry->parent->revision) == 0); + synchronize_rcu(); + if (entry->parent->ref) { + bpf_prog_put(entry->parent->ref); + entry->parent->ref = NULL; + } +} + +static inline void bpf_mprog_entry_clear(struct bpf_mprog_entry *entry) +{ + memset(entry->fp_items, 0, sizeof(entry->fp_items)); + memset(entry->cp_items, 0, sizeof(entry->cp_items)); +} + +static inline u64 bpf_mprog_revision(struct bpf_mprog_entry *entry) +{ + return atomic_read(&entry->parent->revision); +} + +static inline void bpf_mprog_read(struct bpf_mprog_entry *entry, u32 which, + struct bpf_mprog_fp **fp_dst, + struct bpf_mprog_cp **cp_dst) +{ + *fp_dst = &entry->fp_items[which]; + *cp_dst = &entry->cp_items[which]; +} + +static inline void bpf_mprog_write(struct bpf_mprog_fp *fp_dst, + struct bpf_mprog_cp *cp_dst, + struct bpf_tuple *tuple, u32 flags) +{ + WRITE_ONCE(fp_dst->prog, tuple->prog); + cp_dst->link = tuple->link; + cp_dst->flags = flags; +} + +static inline void bpf_mprog_copy(struct bpf_mprog_fp *fp_dst, + struct bpf_mprog_cp *cp_dst, + struct bpf_mprog_fp *fp_src, + struct bpf_mprog_cp *cp_src) +{ + WRITE_ONCE(fp_dst->prog, READ_ONCE(fp_src->prog)); + memcpy(cp_dst, cp_src, sizeof(*cp_src)); +} + +static inline void bpf_mprog_copy_range(struct bpf_mprog_entry *peer, + struct bpf_mprog_entry *entry, + u32 idx_peer, u32 idx_entry, u32 num) +{ + memcpy(&peer->fp_items[idx_peer], &entry->fp_items[idx_entry], + num * sizeof(peer->fp_items[0])); + memcpy(&peer->cp_items[idx_peer], &entry->cp_items[idx_entry], + num * sizeof(peer->cp_items[0])); +} + +static inline u32 bpf_mprog_total(struct bpf_mprog_entry *entry) +{ + const struct bpf_mprog_fp *fp; + const struct bpf_prog *tmp; + u32 num = 0; + + bpf_mprog_foreach_prog(entry, fp, tmp) + num++; + return num; +} + +int bpf_mprog_attach(struct bpf_mprog_entry *entry, struct bpf_prog *prog, + struct bpf_link *link, u32 flags, u32 object, + u32 expected_revision); +int bpf_mprog_detach(struct bpf_mprog_entry *entry, struct bpf_prog *prog, + struct bpf_link *link, u32 flags, u32 object, + u32 expected_revision); + +int bpf_mprog_query(const union bpf_attr *attr, union bpf_attr __user *uattr, + struct bpf_mprog_entry *entry); + +#endif /* __BPF_MPROG_H */ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index a7b5e91dd768..207f8a37b327 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1102,7 +1102,14 @@ enum bpf_link_type { */ #define BPF_F_ALLOW_OVERRIDE (1U << 0) #define BPF_F_ALLOW_MULTI (1U << 1) +/* Generic attachment flags. */ #define BPF_F_REPLACE (1U << 2) +#define BPF_F_BEFORE (1U << 3) +#define BPF_F_AFTER (1U << 4) +#define BPF_F_FIRST (1U << 5) +#define BPF_F_LAST (1U << 6) +#define BPF_F_ID (1U << 7) +#define BPF_F_LINK BPF_F_LINK /* 1 << 13 */ /* If BPF_F_STRICT_ALIGNMENT is used in BPF_PROG_LOAD command, the * verifier will perform strict alignment checking as if the kernel @@ -1433,14 +1440,19 @@ union bpf_attr { }; struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */ - __u32 target_fd; /* container object to attach to */ - __u32 attach_bpf_fd; /* eBPF program to attach */ + union { + __u32 target_fd; /* target object to attach to or ... */ + __u32 target_ifindex; /* target ifindex */ + }; + __u32 attach_bpf_fd; __u32 attach_type; __u32 attach_flags; - __u32 replace_bpf_fd; /* previously attached eBPF - * program to replace if - * BPF_F_REPLACE is used - */ + union { + __u32 relative_fd; + __u32 relative_id; + __u32 replace_bpf_fd; + }; + __u32 expected_revision; }; struct { /* anonymous struct used by BPF_PROG_TEST_RUN command */ @@ -1486,16 +1498,25 @@ union bpf_attr { } info; struct { /* anonymous struct used by BPF_PROG_QUERY command */ - __u32 target_fd; /* container object to query */ + union { + __u32 target_fd; /* target object to query or ... */ + __u32 target_ifindex; /* target ifindex */ + }; __u32 attach_type; __u32 query_flags; __u32 attach_flags; __aligned_u64 prog_ids; - __u32 prog_cnt; + union { + __u32 prog_cnt; + __u32 count; + }; + __u32 revision; /* output: per-program attach_flags. * not allowed to be set during effective query. */ __aligned_u64 prog_attach_flags; + __aligned_u64 link_ids; + __aligned_u64 link_attach_flags; } query; struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */ diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index 1d3892168d32..1bea2eb912cd 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -12,7 +12,7 @@ obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o obj-${CONFIG_BPF_LSM} += bpf_inode_storage.o -obj-$(CONFIG_BPF_SYSCALL) += disasm.o +obj-$(CONFIG_BPF_SYSCALL) += disasm.o mprog.o obj-$(CONFIG_BPF_JIT) += trampoline.o obj-$(CONFIG_BPF_SYSCALL) += btf.o memalloc.o obj-$(CONFIG_BPF_JIT) += dispatcher.o diff --git a/kernel/bpf/mprog.c b/kernel/bpf/mprog.c new file mode 100644 index 000000000000..efc3b73f8bf5 --- /dev/null +++ b/kernel/bpf/mprog.c @@ -0,0 +1,476 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Isovalent */ + +#include +#include +#include + +static int bpf_mprog_tuple_relative(struct bpf_tuple *tuple, + u32 object, u32 flags, + enum bpf_prog_type type) +{ + struct bpf_prog *prog; + struct bpf_link *link; + + memset(tuple, 0, sizeof(*tuple)); + if (!(flags & (BPF_F_REPLACE | BPF_F_BEFORE | BPF_F_AFTER))) + return object || (flags & (BPF_F_ID | BPF_F_LINK)) ? + -EINVAL : 0; + if (flags & BPF_F_LINK) { + if (flags & BPF_F_ID) + link = bpf_link_by_id(object); + else + link = bpf_link_get_from_fd(object); + if (IS_ERR(link)) + return PTR_ERR(link); + if (type && link->prog->type != type) { + bpf_link_put(link); + return -EINVAL; + } + tuple->link = link; + tuple->prog = link->prog; + } else { + if (flags & BPF_F_ID) + prog = bpf_prog_by_id(object); + else + prog = bpf_prog_get(object); + if (IS_ERR(prog)) { + if (!object && + !(flags & BPF_F_ID)) + return 0; + return PTR_ERR(prog); + } + if (type && prog->type != type) { + bpf_prog_put(prog); + return -EINVAL; + } + tuple->link = NULL; + tuple->prog = prog; + } + return 0; +} + +static void bpf_mprog_tuple_put(struct bpf_tuple *tuple) +{ + if (tuple->link) + bpf_link_put(tuple->link); + else if (tuple->prog) + bpf_prog_put(tuple->prog); +} + +static int bpf_mprog_replace(struct bpf_mprog_entry *entry, + struct bpf_tuple *ntuple, + struct bpf_tuple *rtuple, u32 rflags) +{ + struct bpf_mprog_fp *fp; + struct bpf_mprog_cp *cp; + struct bpf_prog *oprog; + u32 iflags; + int i; + + if (rflags & (BPF_F_BEFORE | BPF_F_AFTER | BPF_F_LINK)) + return -EINVAL; + if (rtuple->prog != ntuple->prog && + bpf_mprog_exists(entry, ntuple->prog)) + return -EEXIST; + for (i = 0; i < bpf_mprog_max(); i++) { + bpf_mprog_read(entry, i, &fp, &cp); + oprog = READ_ONCE(fp->prog); + if (!oprog) + break; + if (oprog != rtuple->prog) + continue; + if (cp->link != ntuple->link) + return -EBUSY; + iflags = cp->flags; + if ((iflags & BPF_F_FIRST) != + (rflags & BPF_F_FIRST)) { + iflags = bpf_mprog_flags(iflags, rflags, + BPF_F_FIRST); + if ((iflags & BPF_F_FIRST) && + rtuple->prog != bpf_mprog_first(entry)) + return -EACCES; + } + if ((iflags & BPF_F_LAST) != + (rflags & BPF_F_LAST)) { + iflags = bpf_mprog_flags(iflags, rflags, + BPF_F_LAST); + if ((iflags & BPF_F_LAST) && + rtuple->prog != bpf_mprog_last(entry)) + return -EACCES; + } + bpf_mprog_write(fp, cp, ntuple, iflags); + if (!ntuple->link) + bpf_prog_put(oprog); + return 0; + } + return -ENOENT; +} + +static int bpf_mprog_head_tail(struct bpf_mprog_entry *entry, + struct bpf_tuple *ntuple, + struct bpf_tuple *rtuple, u32 aflags) +{ + struct bpf_mprog_entry *peer; + struct bpf_mprog_fp *fp; + struct bpf_mprog_cp *cp; + struct bpf_prog *oprog; + u32 iflags, items; + + if (bpf_mprog_exists(entry, ntuple->prog)) + return -EEXIST; + items = bpf_mprog_total(entry); + peer = bpf_mprog_peer(entry); + bpf_mprog_entry_clear(peer); + if (aflags & BPF_F_FIRST) { + if (aflags & BPF_F_AFTER) + return -EINVAL; + bpf_mprog_read(entry, 0, &fp, &cp); + iflags = cp->flags; + if (iflags & BPF_F_FIRST) + return -EBUSY; + if (aflags & BPF_F_LAST) { + if (aflags & BPF_F_BEFORE) + return -EINVAL; + if (items) + return -EBUSY; + bpf_mprog_read(peer, 0, &fp, &cp); + bpf_mprog_write(fp, cp, ntuple, + BPF_F_FIRST | BPF_F_LAST); + return BPF_MPROG_SWAP; + } + if (aflags & BPF_F_BEFORE) { + oprog = READ_ONCE(fp->prog); + if (oprog != rtuple->prog || + (rtuple->link && + rtuple->link != cp->link)) + return -EBUSY; + } + if (items >= bpf_mprog_max()) + return -ENOSPC; + bpf_mprog_read(peer, 0, &fp, &cp); + bpf_mprog_write(fp, cp, ntuple, BPF_F_FIRST); + bpf_mprog_copy_range(peer, entry, 1, 0, items); + return BPF_MPROG_SWAP; + } + if (aflags & BPF_F_LAST) { + if (aflags & BPF_F_BEFORE) + return -EINVAL; + if (items) { + bpf_mprog_read(entry, items - 1, &fp, &cp); + iflags = cp->flags; + if (iflags & BPF_F_LAST) + return -EBUSY; + if (aflags & BPF_F_AFTER) { + oprog = READ_ONCE(fp->prog); + if (oprog != rtuple->prog || + (rtuple->link && + rtuple->link != cp->link)) + return -EBUSY; + } + if (items >= bpf_mprog_max()) + return -ENOSPC; + } else { + if (aflags & BPF_F_AFTER) + return -EBUSY; + } + bpf_mprog_read(peer, items, &fp, &cp); + bpf_mprog_write(fp, cp, ntuple, BPF_F_LAST); + bpf_mprog_copy_range(peer, entry, 0, 0, items); + return BPF_MPROG_SWAP; + } + return -ENOENT; +} + +static int bpf_mprog_add(struct bpf_mprog_entry *entry, + struct bpf_tuple *ntuple, + struct bpf_tuple *rtuple, u32 aflags) +{ + struct bpf_mprog_fp *fp_dst, *fp_src; + struct bpf_mprog_cp *cp_dst, *cp_src; + struct bpf_mprog_entry *peer; + struct bpf_prog *oprog; + bool found = false; + u32 items; + int i, j; + + items = bpf_mprog_total(entry); + if (items >= bpf_mprog_max()) + return -ENOSPC; + if ((aflags & (BPF_F_BEFORE | BPF_F_AFTER)) == + (BPF_F_BEFORE | BPF_F_AFTER)) + return -EINVAL; + if (bpf_mprog_exists(entry, ntuple->prog)) + return -EEXIST; + if (!rtuple->prog && (aflags & (BPF_F_BEFORE | BPF_F_AFTER))) { + if (!items) + aflags &= ~(BPF_F_AFTER | BPF_F_BEFORE); + if (aflags & BPF_F_BEFORE) + rtuple->prog = bpf_mprog_first_reg(entry); + if (aflags & BPF_F_AFTER) + rtuple->prog = bpf_mprog_last_reg(entry); + if (!rtuple->prog) + aflags &= ~(BPF_F_AFTER | BPF_F_BEFORE); + else + bpf_prog_inc(rtuple->prog); + } + peer = bpf_mprog_peer(entry); + bpf_mprog_entry_clear(peer); + for (i = 0, j = 0; i < bpf_mprog_max(); i++, j++) { + bpf_mprog_read(entry, i, &fp_src, &cp_src); + bpf_mprog_read(peer, j, &fp_dst, &cp_dst); + oprog = READ_ONCE(fp_src->prog); + if (!oprog) { + if (i != j) + break; + if (i > 0) { + bpf_mprog_read(entry, i - 1, + &fp_src, &cp_src); + if (cp_src->flags & BPF_F_LAST) { + if (cp_src->flags & BPF_F_FIRST) + return -EBUSY; + bpf_mprog_copy(fp_dst, cp_dst, + fp_src, cp_src); + bpf_mprog_read(peer, --j, + &fp_dst, &cp_dst); + } + } + bpf_mprog_write(fp_dst, cp_dst, ntuple, 0); + break; + } + if (aflags & (BPF_F_BEFORE | BPF_F_AFTER)) { + if (rtuple->prog != oprog || + (rtuple->link && + rtuple->link != cp_src->link)) + goto next; + found = true; + if (aflags & BPF_F_BEFORE) { + if (cp_src->flags & BPF_F_FIRST) + return -EBUSY; + bpf_mprog_write(fp_dst, cp_dst, ntuple, 0); + bpf_mprog_read(peer, ++j, &fp_dst, &cp_dst); + goto next; + } + if (aflags & BPF_F_AFTER) { + if (cp_src->flags & BPF_F_LAST) + return -EBUSY; + bpf_mprog_copy(fp_dst, cp_dst, + fp_src, cp_src); + bpf_mprog_read(peer, ++j, &fp_dst, &cp_dst); + bpf_mprog_write(fp_dst, cp_dst, ntuple, 0); + continue; + } + } +next: + bpf_mprog_copy(fp_dst, cp_dst, + fp_src, cp_src); + } + if (rtuple->prog && !found) + return -ENOENT; + return BPF_MPROG_SWAP; +} + +static int bpf_mprog_del(struct bpf_mprog_entry *entry, + struct bpf_tuple *dtuple, + struct bpf_tuple *rtuple, u32 dflags) +{ + struct bpf_mprog_fp *fp_dst, *fp_src; + struct bpf_mprog_cp *cp_dst, *cp_src; + struct bpf_mprog_entry *peer; + struct bpf_prog *oprog; + bool found = false; + int i, j, ret; + + if (dflags & BPF_F_REPLACE) + return -EINVAL; + if (dflags & BPF_F_FIRST) { + oprog = bpf_mprog_first(entry); + if (dtuple->prog && + dtuple->prog != oprog) + return -ENOENT; + dtuple->prog = oprog; + } + if (dflags & BPF_F_LAST) { + oprog = bpf_mprog_last(entry); + if (dtuple->prog && + dtuple->prog != oprog) + return -ENOENT; + dtuple->prog = oprog; + } + if (!rtuple->prog && (dflags & (BPF_F_BEFORE | BPF_F_AFTER))) { + if (dtuple->prog) + return -EINVAL; + if (dflags & BPF_F_BEFORE) + dtuple->prog = bpf_mprog_first_reg(entry); + if (dflags & BPF_F_AFTER) + dtuple->prog = bpf_mprog_last_reg(entry); + if (dtuple->prog) + dflags &= ~(BPF_F_AFTER | BPF_F_BEFORE); + } + for (i = 0; i < bpf_mprog_max(); i++) { + bpf_mprog_read(entry, i, &fp_src, &cp_src); + oprog = READ_ONCE(fp_src->prog); + if (!oprog) + break; + if (dflags & (BPF_F_BEFORE | BPF_F_AFTER)) { + if (rtuple->prog != oprog || + (rtuple->link && + rtuple->link != cp_src->link)) + continue; + found = true; + if (dflags & BPF_F_BEFORE) { + if (!i) + return -ENOENT; + bpf_mprog_read(entry, i - 1, + &fp_src, &cp_src); + oprog = READ_ONCE(fp_src->prog); + if (dtuple->prog && + dtuple->prog != oprog) + return -ENOENT; + dtuple->prog = oprog; + break; + } + if (dflags & BPF_F_AFTER) { + bpf_mprog_read(entry, i + 1, + &fp_src, &cp_src); + oprog = READ_ONCE(fp_src->prog); + if (dtuple->prog && + dtuple->prog != oprog) + return -ENOENT; + dtuple->prog = oprog; + break; + } + } + } + if (!dtuple->prog || (rtuple->prog && !found)) + return -ENOENT; + peer = bpf_mprog_peer(entry); + bpf_mprog_entry_clear(peer); + ret = -ENOENT; + for (i = 0, j = 0; i < bpf_mprog_max(); i++) { + bpf_mprog_read(entry, i, &fp_src, &cp_src); + bpf_mprog_read(peer, j, &fp_dst, &cp_dst); + oprog = READ_ONCE(fp_src->prog); + if (!oprog) + break; + if (oprog != dtuple->prog) { + bpf_mprog_copy(fp_dst, cp_dst, + fp_src, cp_src); + j++; + } else { + if (cp_src->link != dtuple->link) + return -EBUSY; + if (!cp_src->link) + bpf_mprog_mark_ref(entry, dtuple->prog); + ret = BPF_MPROG_SWAP; + } + } + if (!bpf_mprog_total(peer)) + ret = BPF_MPROG_FREE; + return ret; +} + +int bpf_mprog_attach(struct bpf_mprog_entry *entry, struct bpf_prog *prog, + struct bpf_link *link, u32 flags, u32 object, + u32 expected_revision) +{ + struct bpf_tuple rtuple, ntuple = { + .prog = prog, + .link = link, + }; + int ret; + + if (expected_revision && + expected_revision != bpf_mprog_revision(entry)) + return -ESTALE; + ret = bpf_mprog_tuple_relative(&rtuple, object, flags, prog->type); + if (ret) + return ret; + if (flags & BPF_F_REPLACE) + ret = bpf_mprog_replace(entry, &ntuple, &rtuple, flags); + else if (flags & (BPF_F_FIRST | BPF_F_LAST)) + ret = bpf_mprog_head_tail(entry, &ntuple, &rtuple, flags); + else + ret = bpf_mprog_add(entry, &ntuple, &rtuple, flags); + bpf_mprog_tuple_put(&rtuple); + return ret; +} + +int bpf_mprog_detach(struct bpf_mprog_entry *entry, struct bpf_prog *prog, + struct bpf_link *link, u32 flags, u32 object, + u32 expected_revision) +{ + struct bpf_tuple rtuple, dtuple = { + .prog = prog, + .link = link, + }; + int ret; + + if (expected_revision && + expected_revision != bpf_mprog_revision(entry)) + return -ESTALE; + ret = bpf_mprog_tuple_relative(&rtuple, object, flags, + prog ? prog->type : + BPF_PROG_TYPE_UNSPEC); + if (ret) + return ret; + ret = bpf_mprog_del(entry, &dtuple, &rtuple, flags); + bpf_mprog_tuple_put(&rtuple); + return ret; +} + +int bpf_mprog_query(const union bpf_attr *attr, union bpf_attr __user *uattr, + struct bpf_mprog_entry *entry) +{ + u32 i, id, flags = 0, count, revision; + u32 __user *uprog_id, *uprog_af; + u32 __user *ulink_id, *ulink_af; + struct bpf_mprog_fp *fp; + struct bpf_mprog_cp *cp; + struct bpf_prog *prog; + int ret = 0; + + if (attr->query.query_flags || attr->query.attach_flags) + return -EINVAL; + revision = bpf_mprog_revision(entry); + count = bpf_mprog_total(entry); + if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags))) + return -EFAULT; + if (copy_to_user(&uattr->query.revision, &revision, sizeof(revision))) + return -EFAULT; + if (copy_to_user(&uattr->query.count, &count, sizeof(count))) + return -EFAULT; + uprog_id = u64_to_user_ptr(attr->query.prog_ids); + if (attr->query.count == 0 || !uprog_id || !count) + return 0; + if (attr->query.count < count) { + count = attr->query.count; + ret = -ENOSPC; + } + uprog_af = u64_to_user_ptr(attr->query.prog_attach_flags); + ulink_id = u64_to_user_ptr(attr->query.link_ids); + ulink_af = u64_to_user_ptr(attr->query.link_attach_flags); + for (i = 0; i < ARRAY_SIZE(entry->fp_items); i++) { + bpf_mprog_read(entry, i, &fp, &cp); + prog = READ_ONCE(fp->prog); + if (!prog) + break; + id = prog->aux->id; + if (copy_to_user(uprog_id + i, &id, sizeof(id))) + return -EFAULT; + id = cp->link ? cp->link->id : 0; + if (ulink_id && + copy_to_user(ulink_id + i, &id, sizeof(id))) + return -EFAULT; + flags = cp->flags; + if (uprog_af && !id && + copy_to_user(uprog_af + i, &flags, sizeof(flags))) + return -EFAULT; + if (ulink_af && id && + copy_to_user(ulink_af + i, &flags, sizeof(flags))) + return -EFAULT; + if (i + 1 == count) + break; + } + return ret; +} diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index a7b5e91dd768..207f8a37b327 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1102,7 +1102,14 @@ enum bpf_link_type { */ #define BPF_F_ALLOW_OVERRIDE (1U << 0) #define BPF_F_ALLOW_MULTI (1U << 1) +/* Generic attachment flags. */ #define BPF_F_REPLACE (1U << 2) +#define BPF_F_BEFORE (1U << 3) +#define BPF_F_AFTER (1U << 4) +#define BPF_F_FIRST (1U << 5) +#define BPF_F_LAST (1U << 6) +#define BPF_F_ID (1U << 7) +#define BPF_F_LINK BPF_F_LINK /* 1 << 13 */ /* If BPF_F_STRICT_ALIGNMENT is used in BPF_PROG_LOAD command, the * verifier will perform strict alignment checking as if the kernel @@ -1433,14 +1440,19 @@ union bpf_attr { }; struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */ - __u32 target_fd; /* container object to attach to */ - __u32 attach_bpf_fd; /* eBPF program to attach */ + union { + __u32 target_fd; /* target object to attach to or ... */ + __u32 target_ifindex; /* target ifindex */ + }; + __u32 attach_bpf_fd; __u32 attach_type; __u32 attach_flags; - __u32 replace_bpf_fd; /* previously attached eBPF - * program to replace if - * BPF_F_REPLACE is used - */ + union { + __u32 relative_fd; + __u32 relative_id; + __u32 replace_bpf_fd; + }; + __u32 expected_revision; }; struct { /* anonymous struct used by BPF_PROG_TEST_RUN command */ @@ -1486,16 +1498,25 @@ union bpf_attr { } info; struct { /* anonymous struct used by BPF_PROG_QUERY command */ - __u32 target_fd; /* container object to query */ + union { + __u32 target_fd; /* target object to query or ... */ + __u32 target_ifindex; /* target ifindex */ + }; __u32 attach_type; __u32 query_flags; __u32 attach_flags; __aligned_u64 prog_ids; - __u32 prog_cnt; + union { + __u32 prog_cnt; + __u32 count; + }; + __u32 revision; /* output: per-program attach_flags. * not allowed to be set during effective query. */ __aligned_u64 prog_attach_flags; + __aligned_u64 link_ids; + __aligned_u64 link_attach_flags; } query; struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */ From patchwork Wed Jun 7 19:26:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Borkmann X-Patchwork-Id: 13271181 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 550473AE57; Wed, 7 Jun 2023 19:26:43 +0000 (UTC) Received: from www62.your-server.de (www62.your-server.de [213.133.104.62]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EDA571FE0; Wed, 7 Jun 2023 12:26:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=iogearbox.net; s=default2302; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID; bh=oL4royuqeCY3p2YQmB9C5nGwCQb4/He0yVmj+WXzAaU=; b=jB9+MHlAbH0fPALM/5uNKNACGI WWzzsSduALXWJFWdyZCClm7JH8qpZyttuS8+Yaa06TzvNQf+ac1SMWpZvzbGxgzLvzg1PuPrAtXhf qwmtens/iWtj6chLLOILEzsY1X3F3PyHbTP7E5Y4kW8KylRenewD7XqfGyDQo/6FJr0jmQranBoBw 5eMnQLf0Rhoqbq/TQ2LQxIynvQT1Y8kYl/FmbpahTPY5R8eOwznf2YlYTuSsbdHkscFiOmA2LCm3H yHV15W9jvKAn0hiFThzR65mab6s7IfamkW9MJoe2heBfhXtGihI5+9dUpSE8fHJ5SBVbLqz6zO2bT 8/9NrpQg==; Received: from 49.248.197.178.dynamic.dsl-lte-bonding.zhbmb00p-msn.res.cust.swisscom.ch ([178.197.248.49] helo=localhost) by www62.your-server.de with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1q6ynj-000CXe-3r; Wed, 07 Jun 2023 21:26:35 +0200 From: Daniel Borkmann To: ast@kernel.org Cc: andrii@kernel.org, martin.lau@linux.dev, razor@blackwall.org, sdf@google.com, john.fastabend@gmail.com, kuba@kernel.org, dxu@dxuuu.xyz, joe@cilium.io, toke@kernel.org, davem@davemloft.net, bpf@vger.kernel.org, netdev@vger.kernel.org, Daniel Borkmann Subject: [PATCH bpf-next v2 2/7] bpf: Add fd-based tcx multi-prog infra with link support Date: Wed, 7 Jun 2023 21:26:20 +0200 Message-Id: <20230607192625.22641-3-daniel@iogearbox.net> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20230607192625.22641-1-daniel@iogearbox.net> References: <20230607192625.22641-1-daniel@iogearbox.net> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Authenticated-Sender: daniel@iogearbox.net X-Virus-Scanned: Clear (ClamAV 0.103.8/26931/Wed Jun 7 09:23:57 2023) X-Spam-Status: No, score=1.2 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_SBL_CSS,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net This work refactors and adds a lightweight extension ("tcx") to the tc BPF ingress and egress data path side for allowing BPF program management based on fds via bpf() syscall through the newly added generic multi-prog API. The main goal behind this work which we also presented at LPC [0] last year and a recent update at LSF/MM/BPF this year [3] is to support long-awaited BPF link functionality for tc BPF programs, which allows for a model of safe ownership and program detachment. Given the rise in tc BPF users in cloud native environments, this becomes necessary to avoid hard to debug incidents either through stale leftover programs or 3rd party applications accidentally stepping on each others toes. As a recap, a BPF link represents the attachment of a BPF program to a BPF hook point. The BPF link holds a single reference to keep BPF program alive. Moreover, hook points do not reference a BPF link, only the application's fd or pinning does. A BPF link holds meta-data specific to attachment and implements operations for link creation, (atomic) BPF program update, detachment and introspection. The motivation for BPF links for tc BPF programs is multi-fold, for example: - From Meta: "It's especially important for applications that are deployed fleet-wide and that don't "control" hosts they are deployed to. If such application crashes and no one notices and does anything about that, BPF program will keep running draining resources or even just, say, dropping packets. We at FB had outages due to such permanent BPF attachment semantics. With fd-based BPF link we are getting a framework, which allows safe, auto-detachable behavior by default, unless application explicitly opts in by pinning the BPF link." [1] - From Cilium-side the tc BPF programs we attach to host-facing veth devices and phys devices build the core datapath for Kubernetes Pods, and they implement forwarding, load-balancing, policy, EDT-management, etc, within BPF. Currently there is no concept of 'safe' ownership, e.g. we've recently experienced hard-to-debug issues in a user's staging environment where another Kubernetes application using tc BPF attached to the same prio/handle of cls_bpf, accidentally wiping all Cilium-based BPF programs from underneath it. The goal is to establish a clear/safe ownership model via links which cannot accidentally be overridden. [0,2] BPF links for tc can co-exist with non-link attachments, and the semantics are in line also with XDP links: BPF links cannot replace other BPF links, BPF links cannot replace non-BPF links, non-BPF links cannot replace BPF links and lastly only non-BPF links can replace non-BPF links. In case of Cilium, this would solve mentioned issue of safe ownership model as 3rd party applications would not be able to accidentally wipe Cilium programs, even if they are not BPF link aware. Earlier attempts [4] have tried to integrate BPF links into core tc machinery to solve cls_bpf, which has been intrusive to the generic tc kernel API with extensions only specific to cls_bpf and suboptimal/complex since cls_bpf could be wiped from the qdisc also. Locking a tc BPF program in place this way, is getting into layering hacks given the two object models are vastly different. We instead implemented the tcx (tc 'express') layer which is an fd-based tc BPF attach API, so that the BPF link implementation blends in naturally similar to other link types which are fd-based and without the need for changing core tc internal APIs. BPF programs for tc can then be successively migrated from classic cls_bpf to the new tc BPF link without needing to change the program's source code, just the BPF loader mechanics for attaching is sufficient. For the current tc framework, there is no change in behavior with this change and neither does this change touch on tc core kernel APIs. The gist of this patch is that the ingress and egress hook have a lightweight, qdisc-less extension for BPF to attach its tc BPF programs, in other words, a minimal entry point for tc BPF. The name tcx has been suggested from discussion of earlier revisions of this work as a good fit, and to more easily differ between the classic cls_bpf attachment and the fd-based one. For the ingress and egress tcx points, the device holds a cache-friendly array with program pointers which is separated from control plane (slow-path) data. Earlier versions of this work used priority to determine ordering and expression of dependencies similar as with classic tc, but it was challenged that for something more future-proof a better user experience is required. Hence this resulted in the design and development of the generic attach/detach/query API for multi-progs. See prior patch with its discussion on the API design. tcx is the first user and later we plan to integrate also others, for example, one candidate is multi-prog support for XDP which would benefit and have the same 'look and feel' from API perspective. The goal with tcx is to have maximum compatibility to existing tc BPF programs, so they don't need to be rewritten specifically. Compatibility to call into classic tcf_classify() is also provided in order to allow successive migration or both to cleanly co-exist where needed given its all one logical tc layer. tcx supports the simplified return codes TCX_NEXT which is non-terminating (go to next program) and terminating ones with TCX_PASS, TCX_DROP, TCX_REDIRECT. The fd-based API is behind a static key, so that when unused the code is also not entered. The struct tcx_entry's program array is currently static, but could be made dynamic if necessary at a point in future. The a/b pair swap design has been chosen so that for detachment there are no allocations which otherwise could fail. The work has been tested with tc-testing selftest suite which all passes, as well as the tc BPF tests from the BPF CI, and also with Cilium's L4LB. Kudos also to Nikolay Aleksandrov and Martin Lau for in-depth early reviews of this work. [0] https://lpc.events/event/16/contributions/1353/ [1] https://lore.kernel.org/bpf/CAEf4BzbokCJN33Nw_kg82sO=xppXnKWEncGTWCTB9vGCmLB6pw@mail.gmail.com/ [2] https://colocatedeventseu2023.sched.com/event/1Jo6O/tales-from-an-ebpf-programs-murder-mystery-hemanth-malla-guillaume-fournier-datadog [3] http://vger.kernel.org/bpfconf2023_material/tcx_meta_netdev_borkmann.pdf [4] https://lore.kernel.org/bpf/20210604063116.234316-1-memxor@gmail.com/ Signed-off-by: Daniel Borkmann --- MAINTAINERS | 4 +- include/linux/netdevice.h | 15 +- include/linux/skbuff.h | 4 +- include/net/sch_generic.h | 2 +- include/net/tcx.h | 157 +++++++++++++++ include/uapi/linux/bpf.h | 35 +++- kernel/bpf/Kconfig | 1 + kernel/bpf/Makefile | 1 + kernel/bpf/syscall.c | 95 +++++++-- kernel/bpf/tcx.c | 347 +++++++++++++++++++++++++++++++++ net/Kconfig | 5 + net/core/dev.c | 267 +++++++++++++++---------- net/core/filter.c | 4 +- net/sched/Kconfig | 4 +- net/sched/sch_ingress.c | 45 ++++- tools/include/uapi/linux/bpf.h | 35 +++- 16 files changed, 877 insertions(+), 144 deletions(-) create mode 100644 include/net/tcx.h create mode 100644 kernel/bpf/tcx.c diff --git a/MAINTAINERS b/MAINTAINERS index 754a9eeca0a1..7a0d0b0c5a5e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3827,13 +3827,15 @@ L: netdev@vger.kernel.org S: Maintained F: kernel/bpf/bpf_struct* -BPF [NETWORKING] (tc BPF, sock_addr) +BPF [NETWORKING] (tcx & tc BPF, sock_addr) M: Martin KaFai Lau M: Daniel Borkmann R: John Fastabend L: bpf@vger.kernel.org L: netdev@vger.kernel.org S: Maintained +F: include/net/tcx.h +F: kernel/bpf/tcx.c F: net/core/filter.c F: net/sched/act_bpf.c F: net/sched/cls_bpf.c diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 08fbd4622ccf..fd4281d1cdbb 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1927,8 +1927,7 @@ enum netdev_ml_priv_type { * * @rx_handler: handler for received packets * @rx_handler_data: XXX: need comments on this one - * @miniq_ingress: ingress/clsact qdisc specific data for - * ingress processing + * @tcx_ingress: BPF & clsact qdisc specific data for ingress processing * @ingress_queue: XXX: need comments on this one * @nf_hooks_ingress: netfilter hooks executed for ingress packets * @broadcast: hw bcast address @@ -1949,8 +1948,7 @@ enum netdev_ml_priv_type { * @xps_maps: all CPUs/RXQs maps for XPS device * * @xps_maps: XXX: need comments on this one - * @miniq_egress: clsact qdisc specific data for - * egress processing + * @tcx_egress: BPF & clsact qdisc specific data for egress processing * @nf_hooks_egress: netfilter hooks executed for egress packets * @qdisc_hash: qdisc hash table * @watchdog_timeo: Represents the timeout that is used by @@ -2249,9 +2247,8 @@ struct net_device { unsigned int gro_ipv4_max_size; rx_handler_func_t __rcu *rx_handler; void __rcu *rx_handler_data; - -#ifdef CONFIG_NET_CLS_ACT - struct mini_Qdisc __rcu *miniq_ingress; +#ifdef CONFIG_NET_XGRESS + struct bpf_mprog_entry __rcu *tcx_ingress; #endif struct netdev_queue __rcu *ingress_queue; #ifdef CONFIG_NETFILTER_INGRESS @@ -2279,8 +2276,8 @@ struct net_device { #ifdef CONFIG_XPS struct xps_dev_maps __rcu *xps_maps[XPS_MAPS_MAX]; #endif -#ifdef CONFIG_NET_CLS_ACT - struct mini_Qdisc __rcu *miniq_egress; +#ifdef CONFIG_NET_XGRESS + struct bpf_mprog_entry __rcu *tcx_egress; #endif #ifdef CONFIG_NETFILTER_EGRESS struct nf_hook_entries __rcu *nf_hooks_egress; diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 5951904413ab..48c3e307f057 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -943,7 +943,7 @@ struct sk_buff { __u8 __mono_tc_offset[0]; /* public: */ __u8 mono_delivery_time:1; /* See SKB_MONO_DELIVERY_TIME_MASK */ -#ifdef CONFIG_NET_CLS_ACT +#ifdef CONFIG_NET_XGRESS __u8 tc_at_ingress:1; /* See TC_AT_INGRESS_MASK */ __u8 tc_skip_classify:1; #endif @@ -992,7 +992,7 @@ struct sk_buff { __u8 csum_not_inet:1; #endif -#ifdef CONFIG_NET_SCHED +#if defined(CONFIG_NET_SCHED) || defined(CONFIG_NET_XGRESS) __u16 tc_index; /* traffic control index */ #endif diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index fab5ba3e61b7..0ade5d1a72b2 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -695,7 +695,7 @@ int skb_do_redirect(struct sk_buff *); static inline bool skb_at_tc_ingress(const struct sk_buff *skb) { -#ifdef CONFIG_NET_CLS_ACT +#ifdef CONFIG_NET_XGRESS return skb->tc_at_ingress; #else return false; diff --git a/include/net/tcx.h b/include/net/tcx.h new file mode 100644 index 000000000000..27885ecedff9 --- /dev/null +++ b/include/net/tcx.h @@ -0,0 +1,157 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2023 Isovalent */ +#ifndef __NET_TCX_H +#define __NET_TCX_H + +#include +#include + +#include + +struct mini_Qdisc; + +struct tcx_entry { + struct bpf_mprog_bundle bundle; + struct mini_Qdisc __rcu *miniq; +}; + +struct tcx_link { + struct bpf_link link; + struct net_device *dev; + u32 location; + u32 flags; +}; + +static inline struct tcx_link *tcx_link(struct bpf_link *link) +{ + return container_of(link, struct tcx_link, link); +} + +static inline const struct tcx_link *tcx_link_const(const struct bpf_link *link) +{ + return tcx_link((struct bpf_link *)link); +} + +static inline void tcx_set_ingress(struct sk_buff *skb, bool ingress) +{ +#ifdef CONFIG_NET_XGRESS + skb->tc_at_ingress = ingress; +#endif +} + +#ifdef CONFIG_NET_XGRESS +void tcx_inc(void); +void tcx_dec(void); + +static inline struct tcx_entry *tcx_entry(struct bpf_mprog_entry *entry) +{ + return container_of(entry->parent, struct tcx_entry, bundle); +} + +static inline void +tcx_entry_update(struct net_device *dev, struct bpf_mprog_entry *entry, bool ingress) +{ + ASSERT_RTNL(); + if (ingress) + rcu_assign_pointer(dev->tcx_ingress, entry); + else + rcu_assign_pointer(dev->tcx_egress, entry); +} + +static inline struct bpf_mprog_entry * +dev_tcx_entry_fetch(struct net_device *dev, bool ingress) +{ + ASSERT_RTNL(); + if (ingress) + return rcu_dereference_rtnl(dev->tcx_ingress); + else + return rcu_dereference_rtnl(dev->tcx_egress); +} + +static inline struct bpf_mprog_entry * +dev_tcx_entry_fetch_or_create(struct net_device *dev, bool ingress, bool *created) +{ + struct bpf_mprog_entry *entry = dev_tcx_entry_fetch(dev, ingress); + + *created = false; + if (!entry) { + entry = bpf_mprog_create(sizeof_field(struct tcx_entry, + miniq)); + if (!entry) + return NULL; + *created = true; + } + return entry; +} + +static inline void tcx_skeys_inc(bool ingress) +{ + tcx_inc(); + if (ingress) + net_inc_ingress_queue(); + else + net_inc_egress_queue(); +} + +static inline void tcx_skeys_dec(bool ingress) +{ + if (ingress) + net_dec_ingress_queue(); + else + net_dec_egress_queue(); + tcx_dec(); +} + +static inline enum tcx_action_base tcx_action_code(struct sk_buff *skb, int code) +{ + switch (code) { + case TCX_PASS: + skb->tc_index = qdisc_skb_cb(skb)->tc_classid; + fallthrough; + case TCX_DROP: + case TCX_REDIRECT: + return code; + case TCX_NEXT: + default: + return TCX_NEXT; + } +} +#endif /* CONFIG_NET_XGRESS */ + +#if defined(CONFIG_NET_XGRESS) && defined(CONFIG_BPF_SYSCALL) +int tcx_prog_attach(const union bpf_attr *attr, struct bpf_prog *prog); +int tcx_link_attach(const union bpf_attr *attr, struct bpf_prog *prog); +int tcx_prog_detach(const union bpf_attr *attr, struct bpf_prog *prog); +int tcx_prog_query(const union bpf_attr *attr, + union bpf_attr __user *uattr); +void dev_tcx_uninstall(struct net_device *dev); +#else +static inline int tcx_prog_attach(const union bpf_attr *attr, + struct bpf_prog *prog) +{ + return -EINVAL; +} + +static inline int tcx_link_attach(const union bpf_attr *attr, + struct bpf_prog *prog) +{ + return -EINVAL; +} + +static inline int tcx_prog_detach(const union bpf_attr *attr, + struct bpf_prog *prog) +{ + return -EINVAL; +} + +static inline int tcx_prog_query(const union bpf_attr *attr, + union bpf_attr __user *uattr) +{ + return -EINVAL; +} + +static inline void dev_tcx_uninstall(struct net_device *dev) +{ +} +#endif /* CONFIG_NET_XGRESS && CONFIG_BPF_SYSCALL */ +#endif /* __NET_TCX_H */ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 207f8a37b327..e7584e24bc83 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1035,6 +1035,8 @@ enum bpf_attach_type { BPF_TRACE_KPROBE_MULTI, BPF_LSM_CGROUP, BPF_STRUCT_OPS, + BPF_TCX_INGRESS, + BPF_TCX_EGRESS, __MAX_BPF_ATTACH_TYPE }; @@ -1052,7 +1054,7 @@ enum bpf_link_type { BPF_LINK_TYPE_KPROBE_MULTI = 8, BPF_LINK_TYPE_STRUCT_OPS = 9, BPF_LINK_TYPE_NETFILTER = 10, - + BPF_LINK_TYPE_TCX = 11, MAX_BPF_LINK_TYPE, }; @@ -1559,13 +1561,13 @@ union bpf_attr { __u32 map_fd; /* struct_ops to attach */ }; union { - __u32 target_fd; /* object to attach to */ - __u32 target_ifindex; /* target ifindex */ + __u32 target_fd; /* target object to attach to or ... */ + __u32 target_ifindex; /* target ifindex */ }; __u32 attach_type; /* attach type */ __u32 flags; /* extra flags */ union { - __u32 target_btf_id; /* btf_id of target to attach to */ + __u32 target_btf_id; /* btf_id of target to attach to */ struct { __aligned_u64 iter_info; /* extra bpf_iter_link_info */ __u32 iter_info_len; /* iter_info length */ @@ -1599,6 +1601,13 @@ union bpf_attr { __s32 priority; __u32 flags; } netfilter; + struct { + union { + __u32 relative_fd; + __u32 relative_id; + }; + __u32 expected_revision; + } tcx; }; } link_create; @@ -6207,6 +6216,19 @@ struct bpf_sock_tuple { }; }; +/* (Simplified) user return codes for tcx prog type. + * A valid tcx program must return one of these defined values. All other + * return codes are reserved for future use. Must remain compatible with + * their TC_ACT_* counter-parts. For compatibility in behavior, unknown + * return codes are mapped to TCX_NEXT. + */ +enum tcx_action_base { + TCX_NEXT = -1, + TCX_PASS = 0, + TCX_DROP = 2, + TCX_REDIRECT = 7, +}; + struct bpf_xdp_sock { __u32 queue_id; }; @@ -6459,6 +6481,11 @@ struct bpf_link_info { __s32 priority; __u32 flags; } netfilter; + struct { + __u32 ifindex; + __u32 attach_type; + __u32 flags; + } tcx; }; } __attribute__((aligned(8))); diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig index 2dfe1079f772..6a906ff93006 100644 --- a/kernel/bpf/Kconfig +++ b/kernel/bpf/Kconfig @@ -31,6 +31,7 @@ config BPF_SYSCALL select TASKS_TRACE_RCU select BINARY_PRINTF select NET_SOCK_MSG if NET + select NET_XGRESS if NET select PAGE_POOL if NET default n help diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index 1bea2eb912cd..f526b7573e97 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -21,6 +21,7 @@ obj-$(CONFIG_BPF_SYSCALL) += devmap.o obj-$(CONFIG_BPF_SYSCALL) += cpumap.o obj-$(CONFIG_BPF_SYSCALL) += offload.o obj-$(CONFIG_BPF_SYSCALL) += net_namespace.o +obj-$(CONFIG_BPF_SYSCALL) += tcx.o endif ifeq ($(CONFIG_PERF_EVENTS),y) obj-$(CONFIG_BPF_SYSCALL) += stackmap.o diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 92a57efc77de..e2c219d053f4 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -37,6 +37,8 @@ #include #include +#include + #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \ (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \ (map)->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS) @@ -3522,31 +3524,57 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type) return BPF_PROG_TYPE_XDP; case BPF_LSM_CGROUP: return BPF_PROG_TYPE_LSM; + case BPF_TCX_INGRESS: + case BPF_TCX_EGRESS: + return BPF_PROG_TYPE_SCHED_CLS; default: return BPF_PROG_TYPE_UNSPEC; } } -#define BPF_PROG_ATTACH_LAST_FIELD replace_bpf_fd +#define BPF_PROG_ATTACH_LAST_FIELD expected_revision + +#define BPF_F_ATTACH_MASK_BASE \ + (BPF_F_ALLOW_OVERRIDE | \ + BPF_F_ALLOW_MULTI | \ + BPF_F_REPLACE) + +#define BPF_F_ATTACH_MASK_MPROG \ + (BPF_F_REPLACE | \ + BPF_F_BEFORE | \ + BPF_F_AFTER | \ + BPF_F_FIRST | \ + BPF_F_LAST | \ + BPF_F_ID | \ + BPF_F_LINK) -#define BPF_F_ATTACH_MASK \ - (BPF_F_ALLOW_OVERRIDE | BPF_F_ALLOW_MULTI | BPF_F_REPLACE) +static bool bpf_supports_mprog(enum bpf_prog_type ptype) +{ + switch (ptype) { + case BPF_PROG_TYPE_SCHED_CLS: + return true; + default: + return false; + } +} static int bpf_prog_attach(const union bpf_attr *attr) { enum bpf_prog_type ptype; struct bpf_prog *prog; + u32 mask; int ret; if (CHECK_ATTR(BPF_PROG_ATTACH)) return -EINVAL; - if (attr->attach_flags & ~BPF_F_ATTACH_MASK) - return -EINVAL; - ptype = attach_type_to_prog_type(attr->attach_type); if (ptype == BPF_PROG_TYPE_UNSPEC) return -EINVAL; + mask = bpf_supports_mprog(ptype) ? + BPF_F_ATTACH_MASK_MPROG : BPF_F_ATTACH_MASK_BASE; + if (attr->attach_flags & ~mask) + return -EINVAL; prog = bpf_prog_get_type(attr->attach_bpf_fd, ptype); if (IS_ERR(prog)) @@ -3582,6 +3610,9 @@ static int bpf_prog_attach(const union bpf_attr *attr) else ret = cgroup_bpf_prog_attach(attr, ptype, prog); break; + case BPF_PROG_TYPE_SCHED_CLS: + ret = tcx_prog_attach(attr, prog); + break; default: ret = -EINVAL; } @@ -3591,25 +3622,42 @@ static int bpf_prog_attach(const union bpf_attr *attr) return ret; } -#define BPF_PROG_DETACH_LAST_FIELD attach_type +#define BPF_PROG_DETACH_LAST_FIELD expected_revision static int bpf_prog_detach(const union bpf_attr *attr) { + struct bpf_prog *prog = NULL; enum bpf_prog_type ptype; + int ret; if (CHECK_ATTR(BPF_PROG_DETACH)) return -EINVAL; ptype = attach_type_to_prog_type(attr->attach_type); + if (bpf_supports_mprog(ptype)) { + if (ptype == BPF_PROG_TYPE_UNSPEC) + return -EINVAL; + if (attr->attach_flags & ~BPF_F_ATTACH_MASK_MPROG) + return -EINVAL; + prog = bpf_prog_get_type(attr->attach_bpf_fd, ptype); + if (IS_ERR(prog)) { + if ((int)attr->attach_bpf_fd > 0) + return PTR_ERR(prog); + prog = NULL; + } + } switch (ptype) { case BPF_PROG_TYPE_SK_MSG: case BPF_PROG_TYPE_SK_SKB: - return sock_map_prog_detach(attr, ptype); + ret = sock_map_prog_detach(attr, ptype); + break; case BPF_PROG_TYPE_LIRC_MODE2: - return lirc_prog_detach(attr); + ret = lirc_prog_detach(attr); + break; case BPF_PROG_TYPE_FLOW_DISSECTOR: - return netns_bpf_prog_detach(attr, ptype); + ret = netns_bpf_prog_detach(attr, ptype); + break; case BPF_PROG_TYPE_CGROUP_DEVICE: case BPF_PROG_TYPE_CGROUP_SKB: case BPF_PROG_TYPE_CGROUP_SOCK: @@ -3618,13 +3666,21 @@ static int bpf_prog_detach(const union bpf_attr *attr) case BPF_PROG_TYPE_CGROUP_SYSCTL: case BPF_PROG_TYPE_SOCK_OPS: case BPF_PROG_TYPE_LSM: - return cgroup_bpf_prog_detach(attr, ptype); + ret = cgroup_bpf_prog_detach(attr, ptype); + break; + case BPF_PROG_TYPE_SCHED_CLS: + ret = tcx_prog_detach(attr, prog); + break; default: - return -EINVAL; + ret = -EINVAL; } + + if (prog) + bpf_prog_put(prog); + return ret; } -#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags +#define BPF_PROG_QUERY_LAST_FIELD query.link_attach_flags static int bpf_prog_query(const union bpf_attr *attr, union bpf_attr __user *uattr) @@ -3672,6 +3728,9 @@ static int bpf_prog_query(const union bpf_attr *attr, case BPF_SK_MSG_VERDICT: case BPF_SK_SKB_VERDICT: return sock_map_bpf_prog_query(attr, uattr); + case BPF_TCX_INGRESS: + case BPF_TCX_EGRESS: + return tcx_prog_query(attr, uattr); default: return -EINVAL; } @@ -4629,6 +4688,13 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr) goto out; } break; + case BPF_PROG_TYPE_SCHED_CLS: + if (attr->link_create.attach_type != BPF_TCX_INGRESS && + attr->link_create.attach_type != BPF_TCX_EGRESS) { + ret = -EINVAL; + goto out; + } + break; default: ptype = attach_type_to_prog_type(attr->link_create.attach_type); if (ptype == BPF_PROG_TYPE_UNSPEC || ptype != prog->type) { @@ -4680,6 +4746,9 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr) case BPF_PROG_TYPE_XDP: ret = bpf_xdp_link_attach(attr, prog); break; + case BPF_PROG_TYPE_SCHED_CLS: + ret = tcx_link_attach(attr, prog); + break; case BPF_PROG_TYPE_NETFILTER: ret = bpf_nf_link_attach(attr, prog); break; diff --git a/kernel/bpf/tcx.c b/kernel/bpf/tcx.c new file mode 100644 index 000000000000..d3d23b4ed4f0 --- /dev/null +++ b/kernel/bpf/tcx.c @@ -0,0 +1,347 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Isovalent */ + +#include +#include +#include + +#include + +int tcx_prog_attach(const union bpf_attr *attr, struct bpf_prog *prog) +{ + bool created, ingress = attr->attach_type == BPF_TCX_INGRESS; + struct net *net = current->nsproxy->net_ns; + struct bpf_mprog_entry *entry; + struct net_device *dev; + int ret; + + rtnl_lock(); + dev = __dev_get_by_index(net, attr->target_ifindex); + if (!dev) { + ret = -ENODEV; + goto out; + } + entry = dev_tcx_entry_fetch_or_create(dev, ingress, &created); + if (!entry) { + ret = -ENOMEM; + goto out; + } + ret = bpf_mprog_attach(entry, prog, NULL, attr->attach_flags, + attr->relative_fd, attr->expected_revision); + if (ret >= 0) { + if (ret == BPF_MPROG_SWAP) + tcx_entry_update(dev, bpf_mprog_peer(entry), ingress); + bpf_mprog_commit(entry); + tcx_skeys_inc(ingress); + ret = 0; + } else if (created) { + bpf_mprog_free(entry); + } +out: + rtnl_unlock(); + return ret; +} + +static bool tcx_release_entry(struct bpf_mprog_entry *entry, int code) +{ + return code == BPF_MPROG_FREE && !tcx_entry(entry)->miniq; +} + +int tcx_prog_detach(const union bpf_attr *attr, struct bpf_prog *prog) +{ + bool tcx_release, ingress = attr->attach_type == BPF_TCX_INGRESS; + struct net *net = current->nsproxy->net_ns; + struct bpf_mprog_entry *entry, *peer; + struct net_device *dev; + int ret; + + rtnl_lock(); + dev = __dev_get_by_index(net, attr->target_ifindex); + if (!dev) { + ret = -ENODEV; + goto out; + } + entry = dev_tcx_entry_fetch(dev, ingress); + if (!entry) { + ret = -ENOENT; + goto out; + } + ret = bpf_mprog_detach(entry, prog, NULL, attr->attach_flags, + attr->relative_fd, attr->expected_revision); + if (ret >= 0) { + tcx_release = tcx_release_entry(entry, ret); + peer = tcx_release ? NULL : bpf_mprog_peer(entry); + if (ret == BPF_MPROG_SWAP || ret == BPF_MPROG_FREE) + tcx_entry_update(dev, peer, ingress); + bpf_mprog_commit(entry); + tcx_skeys_dec(ingress); + if (tcx_release) + bpf_mprog_free(entry); + ret = 0; + } +out: + rtnl_unlock(); + return ret; +} + +static void tcx_uninstall(struct net_device *dev, bool ingress) +{ + struct bpf_tuple tuple = {}; + struct bpf_mprog_entry *entry; + struct bpf_mprog_fp *fp; + struct bpf_mprog_cp *cp; + + entry = dev_tcx_entry_fetch(dev, ingress); + if (!entry) + return; + tcx_entry_update(dev, NULL, ingress); + bpf_mprog_commit(entry); + bpf_mprog_foreach_tuple(entry, fp, cp, tuple) { + if (tuple.link) + tcx_link(tuple.link)->dev = NULL; + else + bpf_prog_put(tuple.prog); + tcx_skeys_dec(ingress); + } + WARN_ON_ONCE(tcx_entry(entry)->miniq); + bpf_mprog_free(entry); +} + +void dev_tcx_uninstall(struct net_device *dev) +{ + ASSERT_RTNL(); + tcx_uninstall(dev, true); + tcx_uninstall(dev, false); +} + +int tcx_prog_query(const union bpf_attr *attr, union bpf_attr __user *uattr) +{ + bool ingress = attr->query.attach_type == BPF_TCX_INGRESS; + struct net *net = current->nsproxy->net_ns; + struct bpf_mprog_entry *entry; + struct net_device *dev; + int ret; + + rtnl_lock(); + dev = __dev_get_by_index(net, attr->query.target_ifindex); + if (!dev) { + ret = -ENODEV; + goto out; + } + entry = dev_tcx_entry_fetch(dev, ingress); + if (!entry) { + ret = -ENOENT; + goto out; + } + ret = bpf_mprog_query(attr, uattr, entry); +out: + rtnl_unlock(); + return ret; +} + +static int tcx_link_prog_attach(struct bpf_link *l, u32 flags, u32 object, + u32 expected_revision) +{ + struct tcx_link *link = tcx_link(l); + bool created, ingress = link->location == BPF_TCX_INGRESS; + struct net_device *dev = link->dev; + struct bpf_mprog_entry *entry; + int ret; + + ASSERT_RTNL(); + entry = dev_tcx_entry_fetch_or_create(dev, ingress, &created); + if (!entry) + return -ENOMEM; + ret = bpf_mprog_attach(entry, l->prog, l, flags, object, + expected_revision); + if (ret >= 0) { + if (ret == BPF_MPROG_SWAP) + tcx_entry_update(dev, bpf_mprog_peer(entry), ingress); + bpf_mprog_commit(entry); + tcx_skeys_inc(ingress); + ret = 0; + } else if (created) { + bpf_mprog_free(entry); + } + return ret; +} + +static void tcx_link_release(struct bpf_link *l) +{ + struct tcx_link *link = tcx_link(l); + bool tcx_release, ingress = link->location == BPF_TCX_INGRESS; + struct bpf_mprog_entry *entry, *peer; + struct net_device *dev; + int ret = 0; + + rtnl_lock(); + dev = link->dev; + if (!dev) + goto out; + entry = dev_tcx_entry_fetch(dev, ingress); + if (!entry) { + ret = -ENOENT; + goto out; + } + ret = bpf_mprog_detach(entry, l->prog, l, link->flags, 0, 0); + if (ret >= 0) { + tcx_release = tcx_release_entry(entry, ret); + peer = tcx_release ? NULL : bpf_mprog_peer(entry); + if (ret == BPF_MPROG_SWAP || ret == BPF_MPROG_FREE) + tcx_entry_update(dev, peer, ingress); + bpf_mprog_commit(entry); + tcx_skeys_dec(ingress); + if (tcx_release) + bpf_mprog_free(entry); + link->dev = NULL; + ret = 0; + } +out: + WARN_ON_ONCE(ret); + rtnl_unlock(); +} + +static int tcx_link_update(struct bpf_link *l, struct bpf_prog *nprog, + struct bpf_prog *oprog) +{ + struct tcx_link *link = tcx_link(l); + bool ingress = link->location == BPF_TCX_INGRESS; + struct net_device *dev = link->dev; + struct bpf_mprog_entry *entry; + int ret = 0; + + rtnl_lock(); + if (!link->dev) { + ret = -ENOLINK; + goto out; + } + if (oprog && l->prog != oprog) { + ret = -EPERM; + goto out; + } + oprog = l->prog; + if (oprog == nprog) { + bpf_prog_put(nprog); + goto out; + } + entry = dev_tcx_entry_fetch(dev, ingress); + if (!entry) { + ret = -ENOENT; + goto out; + } + ret = bpf_mprog_attach(entry, nprog, l, + BPF_F_REPLACE | BPF_F_ID | link->flags, + l->prog->aux->id, 0); + if (ret >= 0) { + if (ret == BPF_MPROG_SWAP) + tcx_entry_update(dev, bpf_mprog_peer(entry), ingress); + bpf_mprog_commit(entry); + tcx_skeys_inc(ingress); + oprog = xchg(&l->prog, nprog); + bpf_prog_put(oprog); + ret = 0; + } +out: + rtnl_unlock(); + return ret; +} + +static void tcx_link_dealloc(struct bpf_link *l) +{ + kfree(tcx_link(l)); +} + +static void tcx_link_fdinfo(const struct bpf_link *l, struct seq_file *seq) +{ + const struct tcx_link *link = tcx_link_const(l); + u32 ifindex = 0; + + rtnl_lock(); + if (link->dev) + ifindex = link->dev->ifindex; + rtnl_unlock(); + + seq_printf(seq, "ifindex:\t%u\n", ifindex); + seq_printf(seq, "attach_type:\t%u (%s)\n", + link->location, + link->location == BPF_TCX_INGRESS ? "ingress" : "egress"); + seq_printf(seq, "flags:\t%u\n", link->flags); +} + +static int tcx_link_fill_info(const struct bpf_link *l, + struct bpf_link_info *info) +{ + const struct tcx_link *link = tcx_link_const(l); + u32 ifindex = 0; + + rtnl_lock(); + if (link->dev) + ifindex = link->dev->ifindex; + rtnl_unlock(); + + info->tcx.ifindex = ifindex; + info->tcx.attach_type = link->location; + info->tcx.flags = link->flags; + return 0; +} + +static int tcx_link_detach(struct bpf_link *l) +{ + tcx_link_release(l); + return 0; +} + +static const struct bpf_link_ops tcx_link_lops = { + .release = tcx_link_release, + .detach = tcx_link_detach, + .dealloc = tcx_link_dealloc, + .update_prog = tcx_link_update, + .show_fdinfo = tcx_link_fdinfo, + .fill_link_info = tcx_link_fill_info, +}; + +int tcx_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) +{ + struct net *net = current->nsproxy->net_ns; + struct bpf_link_primer link_primer; + struct net_device *dev; + struct tcx_link *link; + int fd, err; + + dev = dev_get_by_index(net, attr->link_create.target_ifindex); + if (!dev) + return -EINVAL; + link = kzalloc(sizeof(*link), GFP_USER); + if (!link) { + err = -ENOMEM; + goto out_put; + } + + bpf_link_init(&link->link, BPF_LINK_TYPE_TCX, &tcx_link_lops, prog); + link->location = attr->link_create.attach_type; + link->flags = attr->link_create.flags & (BPF_F_FIRST | BPF_F_LAST); + link->dev = dev; + + err = bpf_link_prime(&link->link, &link_primer); + if (err) { + kfree(link); + goto out_put; + } + rtnl_lock(); + err = tcx_link_prog_attach(&link->link, attr->link_create.flags, + attr->link_create.tcx.relative_fd, + attr->link_create.tcx.expected_revision); + if (!err) + fd = bpf_link_settle(&link_primer); + rtnl_unlock(); + if (err) { + link->dev = NULL; + bpf_link_cleanup(&link_primer); + goto out_put; + } + dev_put(dev); + return fd; +out_put: + dev_put(dev); + return err; +} diff --git a/net/Kconfig b/net/Kconfig index 2fb25b534df5..d532ec33f1fe 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -52,6 +52,11 @@ config NET_INGRESS config NET_EGRESS bool +config NET_XGRESS + select NET_INGRESS + select NET_EGRESS + bool + config NET_REDIRECT bool diff --git a/net/core/dev.c b/net/core/dev.c index 3393c2f3dbe8..95c7e3189884 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -107,6 +107,7 @@ #include #include #include +#include #include #include #include @@ -154,7 +155,6 @@ #include "dev.h" #include "net-sysfs.h" - static DEFINE_SPINLOCK(ptype_lock); struct list_head ptype_base[PTYPE_HASH_SIZE] __read_mostly; struct list_head ptype_all __read_mostly; /* Taps */ @@ -3923,69 +3923,200 @@ int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *skb) EXPORT_SYMBOL(dev_loopback_xmit); #ifdef CONFIG_NET_EGRESS -static struct sk_buff * -sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev) +static struct netdev_queue * +netdev_tx_queue_mapping(struct net_device *dev, struct sk_buff *skb) +{ + int qm = skb_get_queue_mapping(skb); + + return netdev_get_tx_queue(dev, netdev_cap_txqueue(dev, qm)); +} + +static bool netdev_xmit_txqueue_skipped(void) { + return __this_cpu_read(softnet_data.xmit.skip_txqueue); +} + +void netdev_xmit_skip_txqueue(bool skip) +{ + __this_cpu_write(softnet_data.xmit.skip_txqueue, skip); +} +EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue); +#endif /* CONFIG_NET_EGRESS */ + +#ifdef CONFIG_NET_XGRESS +static int tc_run(struct tcx_entry *entry, struct sk_buff *skb) +{ + int ret = TC_ACT_UNSPEC; #ifdef CONFIG_NET_CLS_ACT - struct mini_Qdisc *miniq = rcu_dereference_bh(dev->miniq_egress); - struct tcf_result cl_res; + struct mini_Qdisc *miniq = rcu_dereference_bh(entry->miniq); + struct tcf_result res; if (!miniq) - return skb; + return ret; - /* qdisc_skb_cb(skb)->pkt_len was already set by the caller. */ tc_skb_cb(skb)->mru = 0; tc_skb_cb(skb)->post_ct = false; - mini_qdisc_bstats_cpu_update(miniq, skb); - switch (tcf_classify(skb, miniq->block, miniq->filter_list, &cl_res, false)) { + mini_qdisc_bstats_cpu_update(miniq, skb); + ret = tcf_classify(skb, miniq->block, miniq->filter_list, &res, false); + /* Only tcf related quirks below. */ + switch (ret) { + case TC_ACT_SHOT: + mini_qdisc_qstats_cpu_drop(miniq); + break; case TC_ACT_OK: case TC_ACT_RECLASSIFY: - skb->tc_index = TC_H_MIN(cl_res.classid); + skb->tc_index = TC_H_MIN(res.classid); break; + } +#endif /* CONFIG_NET_CLS_ACT */ + return ret; +} + +static DEFINE_STATIC_KEY_FALSE(tcx_needed_key); + +void tcx_inc(void) +{ + static_branch_inc(&tcx_needed_key); +} +EXPORT_SYMBOL_GPL(tcx_inc); + +void tcx_dec(void) +{ + static_branch_dec(&tcx_needed_key); +} +EXPORT_SYMBOL_GPL(tcx_dec); + +static __always_inline enum tcx_action_base +tcx_run(const struct bpf_mprog_entry *entry, struct sk_buff *skb, + const bool needs_mac) +{ + const struct bpf_mprog_fp *fp; + const struct bpf_prog *prog; + int ret = TCX_NEXT; + + if (needs_mac) + __skb_push(skb, skb->mac_len); + bpf_mprog_foreach_prog(entry, fp, prog) { + bpf_compute_data_pointers(skb); + ret = bpf_prog_run(prog, skb); + if (ret != TCX_NEXT) + break; + } + if (needs_mac) + __skb_pull(skb, skb->mac_len); + return tcx_action_code(skb, ret); +} + +static __always_inline struct sk_buff * +sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret, + struct net_device *orig_dev, bool *another) +{ + struct bpf_mprog_entry *entry = rcu_dereference_bh(skb->dev->tcx_ingress); + int sch_ret; + + if (!entry) + return skb; + if (*pt_prev) { + *ret = deliver_skb(skb, *pt_prev, orig_dev); + *pt_prev = NULL; + } + + qdisc_skb_cb(skb)->pkt_len = skb->len; + tcx_set_ingress(skb, true); + + if (static_branch_unlikely(&tcx_needed_key)) { + sch_ret = tcx_run(entry, skb, true); + if (sch_ret != TC_ACT_UNSPEC) + goto ingress_verdict; + } + sch_ret = tc_run(container_of(entry->parent, struct tcx_entry, bundle), skb); +ingress_verdict: + switch (sch_ret) { + case TC_ACT_REDIRECT: + /* skb_mac_header check was done by BPF, so we can safely + * push the L2 header back before redirecting to another + * netdev. + */ + __skb_push(skb, skb->mac_len); + if (skb_do_redirect(skb) == -EAGAIN) { + __skb_pull(skb, skb->mac_len); + *another = true; + break; + } + *ret = NET_RX_SUCCESS; + return NULL; case TC_ACT_SHOT: - mini_qdisc_qstats_cpu_drop(miniq); - *ret = NET_XMIT_DROP; - kfree_skb_reason(skb, SKB_DROP_REASON_TC_EGRESS); + kfree_skb_reason(skb, SKB_DROP_REASON_TC_INGRESS); + *ret = NET_RX_DROP; return NULL; + /* used by tc_run */ case TC_ACT_STOLEN: case TC_ACT_QUEUED: case TC_ACT_TRAP: - *ret = NET_XMIT_SUCCESS; consume_skb(skb); + fallthrough; + case TC_ACT_CONSUMED: + *ret = NET_RX_SUCCESS; return NULL; + } + + return skb; +} + +static __always_inline struct sk_buff * +sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev) +{ + struct bpf_mprog_entry *entry = rcu_dereference_bh(dev->tcx_egress); + int sch_ret; + + if (!entry) + return skb; + + /* qdisc_skb_cb(skb)->pkt_len & tcx_set_ingress() was + * already set by the caller. + */ + if (static_branch_unlikely(&tcx_needed_key)) { + sch_ret = tcx_run(entry, skb, false); + if (sch_ret != TC_ACT_UNSPEC) + goto egress_verdict; + } + sch_ret = tc_run(container_of(entry->parent, struct tcx_entry, bundle), skb); +egress_verdict: + switch (sch_ret) { case TC_ACT_REDIRECT: /* No need to push/pop skb's mac_header here on egress! */ skb_do_redirect(skb); *ret = NET_XMIT_SUCCESS; return NULL; - default: - break; + case TC_ACT_SHOT: + kfree_skb_reason(skb, SKB_DROP_REASON_TC_EGRESS); + *ret = NET_XMIT_DROP; + return NULL; + /* used by tc_run */ + case TC_ACT_STOLEN: + case TC_ACT_QUEUED: + case TC_ACT_TRAP: + *ret = NET_XMIT_SUCCESS; + return NULL; } -#endif /* CONFIG_NET_CLS_ACT */ return skb; } - -static struct netdev_queue * -netdev_tx_queue_mapping(struct net_device *dev, struct sk_buff *skb) -{ - int qm = skb_get_queue_mapping(skb); - - return netdev_get_tx_queue(dev, netdev_cap_txqueue(dev, qm)); -} - -static bool netdev_xmit_txqueue_skipped(void) +#else +static __always_inline struct sk_buff * +sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret, + struct net_device *orig_dev, bool *another) { - return __this_cpu_read(softnet_data.xmit.skip_txqueue); + return skb; } -void netdev_xmit_skip_txqueue(bool skip) +static __always_inline struct sk_buff * +sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev) { - __this_cpu_write(softnet_data.xmit.skip_txqueue, skip); + return skb; } -EXPORT_SYMBOL_GPL(netdev_xmit_skip_txqueue); -#endif /* CONFIG_NET_EGRESS */ +#endif /* CONFIG_NET_XGRESS */ #ifdef CONFIG_XPS static int __get_xps_queue_idx(struct net_device *dev, struct sk_buff *skb, @@ -4169,9 +4300,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev) skb_update_prio(skb); qdisc_pkt_len_init(skb); -#ifdef CONFIG_NET_CLS_ACT - skb->tc_at_ingress = 0; -#endif + tcx_set_ingress(skb, false); #ifdef CONFIG_NET_EGRESS if (static_branch_unlikely(&egress_needed_key)) { if (nf_hook_egress_active()) { @@ -5103,72 +5232,6 @@ int (*br_fdb_test_addr_hook)(struct net_device *dev, EXPORT_SYMBOL_GPL(br_fdb_test_addr_hook); #endif -static inline struct sk_buff * -sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret, - struct net_device *orig_dev, bool *another) -{ -#ifdef CONFIG_NET_CLS_ACT - struct mini_Qdisc *miniq = rcu_dereference_bh(skb->dev->miniq_ingress); - struct tcf_result cl_res; - - /* If there's at least one ingress present somewhere (so - * we get here via enabled static key), remaining devices - * that are not configured with an ingress qdisc will bail - * out here. - */ - if (!miniq) - return skb; - - if (*pt_prev) { - *ret = deliver_skb(skb, *pt_prev, orig_dev); - *pt_prev = NULL; - } - - qdisc_skb_cb(skb)->pkt_len = skb->len; - tc_skb_cb(skb)->mru = 0; - tc_skb_cb(skb)->post_ct = false; - skb->tc_at_ingress = 1; - mini_qdisc_bstats_cpu_update(miniq, skb); - - switch (tcf_classify(skb, miniq->block, miniq->filter_list, &cl_res, false)) { - case TC_ACT_OK: - case TC_ACT_RECLASSIFY: - skb->tc_index = TC_H_MIN(cl_res.classid); - break; - case TC_ACT_SHOT: - mini_qdisc_qstats_cpu_drop(miniq); - kfree_skb_reason(skb, SKB_DROP_REASON_TC_INGRESS); - *ret = NET_RX_DROP; - return NULL; - case TC_ACT_STOLEN: - case TC_ACT_QUEUED: - case TC_ACT_TRAP: - consume_skb(skb); - *ret = NET_RX_SUCCESS; - return NULL; - case TC_ACT_REDIRECT: - /* skb_mac_header check was done by cls/act_bpf, so - * we can safely push the L2 header back before - * redirecting to another netdev - */ - __skb_push(skb, skb->mac_len); - if (skb_do_redirect(skb) == -EAGAIN) { - __skb_pull(skb, skb->mac_len); - *another = true; - break; - } - *ret = NET_RX_SUCCESS; - return NULL; - case TC_ACT_CONSUMED: - *ret = NET_RX_SUCCESS; - return NULL; - default: - break; - } -#endif /* CONFIG_NET_CLS_ACT */ - return skb; -} - /** * netdev_is_rx_handler_busy - check if receive handler is registered * @dev: device to check @@ -10873,7 +10936,7 @@ void unregister_netdevice_many_notify(struct list_head *head, /* Shutdown queueing discipline. */ dev_shutdown(dev); - + dev_tcx_uninstall(dev); dev_xdp_uninstall(dev); bpf_dev_bound_netdev_unregister(dev); diff --git a/net/core/filter.c b/net/core/filter.c index d25d52854c21..1ff9a0988ea6 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -9233,7 +9233,7 @@ static struct bpf_insn *bpf_convert_tstamp_read(const struct bpf_prog *prog, __u8 value_reg = si->dst_reg; __u8 skb_reg = si->src_reg; -#ifdef CONFIG_NET_CLS_ACT +#ifdef CONFIG_NET_XGRESS /* If the tstamp_type is read, * the bpf prog is aware the tstamp could have delivery time. * Thus, read skb->tstamp as is if tstamp_type_access is true. @@ -9267,7 +9267,7 @@ static struct bpf_insn *bpf_convert_tstamp_write(const struct bpf_prog *prog, __u8 value_reg = si->src_reg; __u8 skb_reg = si->dst_reg; -#ifdef CONFIG_NET_CLS_ACT +#ifdef CONFIG_NET_XGRESS /* If the tstamp_type is read, * the bpf prog is aware the tstamp could have delivery time. * Thus, write skb->tstamp as is if tstamp_type_access is true. diff --git a/net/sched/Kconfig b/net/sched/Kconfig index 4b95cb1ac435..470c70deffe2 100644 --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -347,8 +347,7 @@ config NET_SCH_FQ_PIE config NET_SCH_INGRESS tristate "Ingress/classifier-action Qdisc" depends on NET_CLS_ACT - select NET_INGRESS - select NET_EGRESS + select NET_XGRESS help Say Y here if you want to use classifiers for incoming and/or outgoing packets. This qdisc doesn't do anything else besides running classifiers, @@ -679,6 +678,7 @@ config NET_EMATCH_IPT config NET_CLS_ACT bool "Actions" select NET_CLS + select NET_XGRESS help Say Y here if you want to use traffic control actions. Actions get attached to classifiers and are invoked after a successful diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c index 84838128b9c5..4af1360f537e 100644 --- a/net/sched/sch_ingress.c +++ b/net/sched/sch_ingress.c @@ -13,6 +13,7 @@ #include #include #include +#include struct ingress_sched_data { struct tcf_block *block; @@ -78,11 +79,18 @@ static int ingress_init(struct Qdisc *sch, struct nlattr *opt, { struct ingress_sched_data *q = qdisc_priv(sch); struct net_device *dev = qdisc_dev(sch); + struct bpf_mprog_entry *entry; + bool created; int err; net_inc_ingress_queue(); - mini_qdisc_pair_init(&q->miniqp, sch, &dev->miniq_ingress); + entry = dev_tcx_entry_fetch_or_create(dev, true, &created); + if (!entry) + return -ENOMEM; + mini_qdisc_pair_init(&q->miniqp, sch, &tcx_entry(entry)->miniq); + if (created) + tcx_entry_update(dev, entry, true); q->block_info.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS; q->block_info.chain_head_change = clsact_chain_head_change; @@ -93,15 +101,20 @@ static int ingress_init(struct Qdisc *sch, struct nlattr *opt, return err; mini_qdisc_pair_block_init(&q->miniqp, q->block); - return 0; } static void ingress_destroy(struct Qdisc *sch) { struct ingress_sched_data *q = qdisc_priv(sch); + struct net_device *dev = qdisc_dev(sch); + struct bpf_mprog_entry *entry = rtnl_dereference(dev->tcx_ingress); tcf_block_put_ext(q->block, sch, &q->block_info); + if (entry && !bpf_mprog_total(entry)) { + tcx_entry_update(dev, NULL, true); + bpf_mprog_free(entry); + } net_dec_ingress_queue(); } @@ -217,12 +230,19 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt, { struct clsact_sched_data *q = qdisc_priv(sch); struct net_device *dev = qdisc_dev(sch); + struct bpf_mprog_entry *entry; + bool created; int err; net_inc_ingress_queue(); net_inc_egress_queue(); - mini_qdisc_pair_init(&q->miniqp_ingress, sch, &dev->miniq_ingress); + entry = dev_tcx_entry_fetch_or_create(dev, true, &created); + if (!entry) + return -ENOMEM; + mini_qdisc_pair_init(&q->miniqp_ingress, sch, &tcx_entry(entry)->miniq); + if (created) + tcx_entry_update(dev, entry, true); q->ingress_block_info.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS; q->ingress_block_info.chain_head_change = clsact_chain_head_change; @@ -235,7 +255,12 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt, mini_qdisc_pair_block_init(&q->miniqp_ingress, q->ingress_block); - mini_qdisc_pair_init(&q->miniqp_egress, sch, &dev->miniq_egress); + entry = dev_tcx_entry_fetch_or_create(dev, false, &created); + if (!entry) + return -ENOMEM; + mini_qdisc_pair_init(&q->miniqp_egress, sch, &tcx_entry(entry)->miniq); + if (created) + tcx_entry_update(dev, entry, false); q->egress_block_info.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_EGRESS; q->egress_block_info.chain_head_change = clsact_chain_head_change; @@ -247,9 +272,21 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt, static void clsact_destroy(struct Qdisc *sch) { struct clsact_sched_data *q = qdisc_priv(sch); + struct net_device *dev = qdisc_dev(sch); + struct bpf_mprog_entry *ingress_entry = rtnl_dereference(dev->tcx_ingress); + struct bpf_mprog_entry *egress_entry = rtnl_dereference(dev->tcx_egress); tcf_block_put_ext(q->egress_block, sch, &q->egress_block_info); + if (egress_entry && !bpf_mprog_total(egress_entry)) { + tcx_entry_update(dev, NULL, false); + bpf_mprog_free(egress_entry); + } + tcf_block_put_ext(q->ingress_block, sch, &q->ingress_block_info); + if (ingress_entry && !bpf_mprog_total(ingress_entry)) { + tcx_entry_update(dev, NULL, true); + bpf_mprog_free(ingress_entry); + } net_dec_ingress_queue(); net_dec_egress_queue(); diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 207f8a37b327..e7584e24bc83 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1035,6 +1035,8 @@ enum bpf_attach_type { BPF_TRACE_KPROBE_MULTI, BPF_LSM_CGROUP, BPF_STRUCT_OPS, + BPF_TCX_INGRESS, + BPF_TCX_EGRESS, __MAX_BPF_ATTACH_TYPE }; @@ -1052,7 +1054,7 @@ enum bpf_link_type { BPF_LINK_TYPE_KPROBE_MULTI = 8, BPF_LINK_TYPE_STRUCT_OPS = 9, BPF_LINK_TYPE_NETFILTER = 10, - + BPF_LINK_TYPE_TCX = 11, MAX_BPF_LINK_TYPE, }; @@ -1559,13 +1561,13 @@ union bpf_attr { __u32 map_fd; /* struct_ops to attach */ }; union { - __u32 target_fd; /* object to attach to */ - __u32 target_ifindex; /* target ifindex */ + __u32 target_fd; /* target object to attach to or ... */ + __u32 target_ifindex; /* target ifindex */ }; __u32 attach_type; /* attach type */ __u32 flags; /* extra flags */ union { - __u32 target_btf_id; /* btf_id of target to attach to */ + __u32 target_btf_id; /* btf_id of target to attach to */ struct { __aligned_u64 iter_info; /* extra bpf_iter_link_info */ __u32 iter_info_len; /* iter_info length */ @@ -1599,6 +1601,13 @@ union bpf_attr { __s32 priority; __u32 flags; } netfilter; + struct { + union { + __u32 relative_fd; + __u32 relative_id; + }; + __u32 expected_revision; + } tcx; }; } link_create; @@ -6207,6 +6216,19 @@ struct bpf_sock_tuple { }; }; +/* (Simplified) user return codes for tcx prog type. + * A valid tcx program must return one of these defined values. All other + * return codes are reserved for future use. Must remain compatible with + * their TC_ACT_* counter-parts. For compatibility in behavior, unknown + * return codes are mapped to TCX_NEXT. + */ +enum tcx_action_base { + TCX_NEXT = -1, + TCX_PASS = 0, + TCX_DROP = 2, + TCX_REDIRECT = 7, +}; + struct bpf_xdp_sock { __u32 queue_id; }; @@ -6459,6 +6481,11 @@ struct bpf_link_info { __s32 priority; __u32 flags; } netfilter; + struct { + __u32 ifindex; + __u32 attach_type; + __u32 flags; + } tcx; }; } __attribute__((aligned(8))); From patchwork Wed Jun 7 19:26:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Borkmann X-Patchwork-Id: 13271177 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C055639234; Wed, 7 Jun 2023 19:26:41 +0000 (UTC) Received: from www62.your-server.de (www62.your-server.de [213.133.104.62]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1E8E61FD5; Wed, 7 Jun 2023 12:26:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=iogearbox.net; s=default2302; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID; bh=9bLt1bZoja3A6VdG8QxGXZX21kSsesIkv3TXCJcc660=; b=CnUb9WZpmBb2Z2v9m9FT2TFQZC A/ho/MPvlPVwupOw/zj1mkpWcQ1s1ngjAyS0biPCjWw1FQwFxJ329TQ/EGyC+Qq1NWeQwmYBeXvOg 6IpRmoE3eQeKCRlh7l7UtqmdnM2hwTpDiry+GZHkw4gheBT0crshzzeWqHhDyz+a1MkdNis5nnk4X wEYNxm+wtoZnH2O4btl1gwkpBGgN51FbxaNhBWIN/eZokFhBjfO+F5tSIlM8ctzCLSbDn35ZYig/y zhGD+1V0o8E3KqncS8sJRa8uXTJUsq7+tLX35Q+IciSyIGRQsd3J5H0K78ytPCAGqmW8PRE7Ck4o3 Ph+M8iIg==; Received: from 49.248.197.178.dynamic.dsl-lte-bonding.zhbmb00p-msn.res.cust.swisscom.ch ([178.197.248.49] helo=localhost) by www62.your-server.de with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1q6ynk-000CYB-8D; Wed, 07 Jun 2023 21:26:36 +0200 From: Daniel Borkmann To: ast@kernel.org Cc: andrii@kernel.org, martin.lau@linux.dev, razor@blackwall.org, sdf@google.com, john.fastabend@gmail.com, kuba@kernel.org, dxu@dxuuu.xyz, joe@cilium.io, toke@kernel.org, davem@davemloft.net, bpf@vger.kernel.org, netdev@vger.kernel.org, Daniel Borkmann Subject: [PATCH bpf-next v2 3/7] libbpf: Add opts-based attach/detach/query API for tcx Date: Wed, 7 Jun 2023 21:26:21 +0200 Message-Id: <20230607192625.22641-4-daniel@iogearbox.net> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20230607192625.22641-1-daniel@iogearbox.net> References: <20230607192625.22641-1-daniel@iogearbox.net> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Authenticated-Sender: daniel@iogearbox.net X-Virus-Scanned: Clear (ClamAV 0.103.8/26931/Wed Jun 7 09:23:57 2023) X-Spam-Status: No, score=1.2 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_SBL_CSS,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net Extend libbpf attach opts and add a new detach opts API so this can be used to add/remove fd-based tcx BPF programs. The old-style bpf_prog_detach and bpf_prog_detach2 APIs are refactored to reuse the detach opts internally. The bpf_prog_query_opts API got extended to be able to handle the new link_ids, link_attach_flags and revision fields. For concrete usage examples, see the extensive selftests that have been developed as part of this series. Signed-off-by: Daniel Borkmann --- tools/lib/bpf/bpf.c | 78 ++++++++++++++++++++++------------------ tools/lib/bpf/bpf.h | 54 +++++++++++++++++++++------- tools/lib/bpf/libbpf.c | 6 ++++ tools/lib/bpf/libbpf.map | 1 + 4 files changed, 91 insertions(+), 48 deletions(-) diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index ed86b37d8024..a3d1b7ebe224 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -629,11 +629,21 @@ int bpf_prog_attach(int prog_fd, int target_fd, enum bpf_attach_type type, return bpf_prog_attach_opts(prog_fd, target_fd, type, &opts); } -int bpf_prog_attach_opts(int prog_fd, int target_fd, - enum bpf_attach_type type, - const struct bpf_prog_attach_opts *opts) +int bpf_prog_detach(int target_fd, enum bpf_attach_type type) +{ + return bpf_prog_detach_opts(0, target_fd, type, NULL); +} + +int bpf_prog_detach2(int prog_fd, int target_fd, enum bpf_attach_type type) { - const size_t attr_sz = offsetofend(union bpf_attr, replace_bpf_fd); + return bpf_prog_detach_opts(prog_fd, target_fd, type, NULL); +} + +int bpf_prog_attach_opts(int prog_fd, int target, + enum bpf_attach_type type, + const struct bpf_prog_attach_opts *opts) +{ + const size_t attr_sz = offsetofend(union bpf_attr, expected_revision); union bpf_attr attr; int ret; @@ -641,40 +651,35 @@ int bpf_prog_attach_opts(int prog_fd, int target_fd, return libbpf_err(-EINVAL); memset(&attr, 0, attr_sz); - attr.target_fd = target_fd; - attr.attach_bpf_fd = prog_fd; - attr.attach_type = type; - attr.attach_flags = OPTS_GET(opts, flags, 0); - attr.replace_bpf_fd = OPTS_GET(opts, replace_prog_fd, 0); + attr.target_fd = target; + attr.attach_bpf_fd = prog_fd; + attr.attach_type = type; + attr.attach_flags = OPTS_GET(opts, flags, 0); + attr.replace_bpf_fd = OPTS_GET(opts, relative_fd, 0); + attr.expected_revision = OPTS_GET(opts, expected_revision, 0); ret = sys_bpf(BPF_PROG_ATTACH, &attr, attr_sz); return libbpf_err_errno(ret); } -int bpf_prog_detach(int target_fd, enum bpf_attach_type type) +int bpf_prog_detach_opts(int prog_fd, int target, + enum bpf_attach_type type, + const struct bpf_prog_detach_opts *opts) { - const size_t attr_sz = offsetofend(union bpf_attr, replace_bpf_fd); + const size_t attr_sz = offsetofend(union bpf_attr, expected_revision); union bpf_attr attr; int ret; - memset(&attr, 0, attr_sz); - attr.target_fd = target_fd; - attr.attach_type = type; - - ret = sys_bpf(BPF_PROG_DETACH, &attr, attr_sz); - return libbpf_err_errno(ret); -} - -int bpf_prog_detach2(int prog_fd, int target_fd, enum bpf_attach_type type) -{ - const size_t attr_sz = offsetofend(union bpf_attr, replace_bpf_fd); - union bpf_attr attr; - int ret; + if (!OPTS_VALID(opts, bpf_prog_detach_opts)) + return libbpf_err(-EINVAL); memset(&attr, 0, attr_sz); - attr.target_fd = target_fd; - attr.attach_bpf_fd = prog_fd; - attr.attach_type = type; + attr.target_fd = target; + attr.attach_bpf_fd = prog_fd; + attr.attach_type = type; + attr.attach_flags = OPTS_GET(opts, flags, 0); + attr.replace_bpf_fd = OPTS_GET(opts, relative_fd, 0); + attr.expected_revision = OPTS_GET(opts, expected_revision, 0); ret = sys_bpf(BPF_PROG_DETACH, &attr, attr_sz); return libbpf_err_errno(ret); @@ -833,7 +838,7 @@ int bpf_iter_create(int link_fd) return libbpf_err_errno(fd); } -int bpf_prog_query_opts(int target_fd, +int bpf_prog_query_opts(int target, enum bpf_attach_type type, struct bpf_prog_query_opts *opts) { @@ -846,17 +851,20 @@ int bpf_prog_query_opts(int target_fd, memset(&attr, 0, attr_sz); - attr.query.target_fd = target_fd; - attr.query.attach_type = type; - attr.query.query_flags = OPTS_GET(opts, query_flags, 0); - attr.query.prog_cnt = OPTS_GET(opts, prog_cnt, 0); - attr.query.prog_ids = ptr_to_u64(OPTS_GET(opts, prog_ids, NULL)); - attr.query.prog_attach_flags = ptr_to_u64(OPTS_GET(opts, prog_attach_flags, NULL)); + attr.query.target_fd = target; + attr.query.attach_type = type; + attr.query.query_flags = OPTS_GET(opts, query_flags, 0); + attr.query.count = OPTS_GET(opts, count, 0); + attr.query.prog_ids = ptr_to_u64(OPTS_GET(opts, prog_ids, NULL)); + attr.query.prog_attach_flags = ptr_to_u64(OPTS_GET(opts, prog_attach_flags, NULL)); + attr.query.link_ids = ptr_to_u64(OPTS_GET(opts, link_ids, NULL)); + attr.query.link_attach_flags = ptr_to_u64(OPTS_GET(opts, link_attach_flags, NULL)); ret = sys_bpf(BPF_PROG_QUERY, &attr, attr_sz); OPTS_SET(opts, attach_flags, attr.query.attach_flags); - OPTS_SET(opts, prog_cnt, attr.query.prog_cnt); + OPTS_SET(opts, revision, attr.query.revision); + OPTS_SET(opts, count, attr.query.count); return libbpf_err_errno(ret); } diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 9aa0ee473754..480c584a6f7f 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -312,22 +312,43 @@ LIBBPF_API int bpf_obj_get(const char *pathname); LIBBPF_API int bpf_obj_get_opts(const char *pathname, const struct bpf_obj_get_opts *opts); -struct bpf_prog_attach_opts { - size_t sz; /* size of this struct for forward/backward compatibility */ - unsigned int flags; - int replace_prog_fd; -}; -#define bpf_prog_attach_opts__last_field replace_prog_fd - LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd, enum bpf_attach_type type, unsigned int flags); -LIBBPF_API int bpf_prog_attach_opts(int prog_fd, int attachable_fd, - enum bpf_attach_type type, - const struct bpf_prog_attach_opts *opts); LIBBPF_API int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type); LIBBPF_API int bpf_prog_detach2(int prog_fd, int attachable_fd, enum bpf_attach_type type); +struct bpf_prog_attach_opts { + size_t sz; /* size of this struct for forward/backward compatibility */ + __u32 flags; + union { + int replace_prog_fd; + int replace_fd; + int relative_fd; + __u32 relative_id; + }; + __u32 expected_revision; +}; +#define bpf_prog_attach_opts__last_field expected_revision + +struct bpf_prog_detach_opts { + size_t sz; /* size of this struct for forward/backward compatibility */ + __u32 flags; + union { + int relative_fd; + __u32 relative_id; + }; + __u32 expected_revision; +}; +#define bpf_prog_detach_opts__last_field expected_revision + +LIBBPF_API int bpf_prog_attach_opts(int prog_fd, int target, + enum bpf_attach_type type, + const struct bpf_prog_attach_opts *opts); +LIBBPF_API int bpf_prog_detach_opts(int prog_fd, int target, + enum bpf_attach_type type, + const struct bpf_prog_detach_opts *opts); + union bpf_iter_link_info; /* defined in up-to-date linux/bpf.h */ struct bpf_link_create_opts { size_t sz; /* size of this struct for forward/backward compatibility */ @@ -489,14 +510,21 @@ struct bpf_prog_query_opts { __u32 query_flags; __u32 attach_flags; /* output argument */ __u32 *prog_ids; - __u32 prog_cnt; /* input+output argument */ + union { + __u32 prog_cnt; /* input+output argument */ + __u32 count; + }; __u32 *prog_attach_flags; + __u32 *link_ids; + __u32 *link_attach_flags; + __u32 revision; }; -#define bpf_prog_query_opts__last_field prog_attach_flags +#define bpf_prog_query_opts__last_field revision -LIBBPF_API int bpf_prog_query_opts(int target_fd, +LIBBPF_API int bpf_prog_query_opts(int target, enum bpf_attach_type type, struct bpf_prog_query_opts *opts); + LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags, __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt); diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 47632606b06d..b89127471c6a 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -117,6 +117,8 @@ static const char * const attach_type_name[] = { [BPF_PERF_EVENT] = "perf_event", [BPF_TRACE_KPROBE_MULTI] = "trace_kprobe_multi", [BPF_STRUCT_OPS] = "struct_ops", + [BPF_TCX_INGRESS] = "tcx_ingress", + [BPF_TCX_EGRESS] = "tcx_egress", }; static const char * const link_type_name[] = { @@ -8669,6 +8671,10 @@ static const struct bpf_sec_def section_defs[] = { SEC_DEF("kretsyscall+", KPROBE, 0, SEC_NONE, attach_ksyscall), SEC_DEF("usdt+", KPROBE, 0, SEC_NONE, attach_usdt), SEC_DEF("tc", SCHED_CLS, 0, SEC_NONE), + SEC_DEF("tc/ingress", SCHED_CLS, BPF_TCX_INGRESS, SEC_ATTACHABLE_OPT), + SEC_DEF("tc/egress", SCHED_CLS, BPF_TCX_EGRESS, SEC_ATTACHABLE_OPT), + SEC_DEF("tcx/ingress", SCHED_CLS, BPF_TCX_INGRESS, SEC_ATTACHABLE_OPT), + SEC_DEF("tcx/egress", SCHED_CLS, BPF_TCX_EGRESS, SEC_ATTACHABLE_OPT), SEC_DEF("classifier", SCHED_CLS, 0, SEC_NONE), SEC_DEF("action", SCHED_ACT, 0, SEC_NONE), SEC_DEF("tracepoint+", TRACEPOINT, 0, SEC_NONE, attach_tp), diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 7521a2fb7626..a29b90e9713c 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -395,4 +395,5 @@ LIBBPF_1.2.0 { LIBBPF_1.3.0 { global: bpf_obj_pin_opts; + bpf_prog_detach_opts; } LIBBPF_1.2.0; From patchwork Wed Jun 7 19:26:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Borkmann X-Patchwork-Id: 13271178 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D433C39240; Wed, 7 Jun 2023 19:26:41 +0000 (UTC) Received: from www62.your-server.de (www62.your-server.de [213.133.104.62]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BBACF1FED; Wed, 7 Jun 2023 12:26:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=iogearbox.net; s=default2302; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID; bh=Zv6v0Ud4Iqtw/hZL8+oK2LwXWr90kM8vVAzAu9mdXKA=; b=UaR5c526P8qi/w++8oNHUn4HQ+ aNmep+Unw0dvccW5ivpaPvh+U1S/KMBIvACCAORFy+htJaKZQsiBj/atcZHY9YO9L37vkClDVPyVJ INY6R9dCghVgCBKDxxsIsTPImx5sJlZlo83ccQje9ne4nuxwhDY9CSbiC8hphc+AULl6nVaLIYr3h 9EZk9+mzGOTPfO7oLSJ9OWUJybNnXQ+er7mmsz+f2Up9aon5AnZ05GcZoxRXTlZMN72od1NMHYJXc y4caG7MitKMNKA33lDFL5QLDaDITLzt7V3ZtfPrIbQCyuh+dktoLsJrabLwsGoqtpTbpNAWAOePnW iIQWmxGQ==; Received: from 49.248.197.178.dynamic.dsl-lte-bonding.zhbmb00p-msn.res.cust.swisscom.ch ([178.197.248.49] helo=localhost) by www62.your-server.de with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1q6ynl-000CYN-0R; Wed, 07 Jun 2023 21:26:37 +0200 From: Daniel Borkmann To: ast@kernel.org Cc: andrii@kernel.org, martin.lau@linux.dev, razor@blackwall.org, sdf@google.com, john.fastabend@gmail.com, kuba@kernel.org, dxu@dxuuu.xyz, joe@cilium.io, toke@kernel.org, davem@davemloft.net, bpf@vger.kernel.org, netdev@vger.kernel.org, Daniel Borkmann Subject: [PATCH bpf-next v2 4/7] libbpf: Add link-based API for tcx Date: Wed, 7 Jun 2023 21:26:22 +0200 Message-Id: <20230607192625.22641-5-daniel@iogearbox.net> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20230607192625.22641-1-daniel@iogearbox.net> References: <20230607192625.22641-1-daniel@iogearbox.net> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Authenticated-Sender: daniel@iogearbox.net X-Virus-Scanned: Clear (ClamAV 0.103.8/26931/Wed Jun 7 09:23:57 2023) X-Spam-Status: No, score=1.2 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_SBL_CSS,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net Implement tcx BPF link support for libbpf. The bpf_program__attach_fd_opts() API has been refactored slightly in order to pass bpf_link_create_opts pointer as input. A new bpf_program__attach_tcx_opts() has been added on top of this which allows for passing all relevant data via extensible struct bpf_tcx_opts. The program sections tcx/ingress and tcx/egress correspond to the hook locations for tc ingress and egress, respectively. For concrete usage examples, see the extensive selftests that have been developed as part of this series. Signed-off-by: Daniel Borkmann --- tools/lib/bpf/bpf.c | 5 +++++ tools/lib/bpf/bpf.h | 7 +++++++ tools/lib/bpf/libbpf.c | 44 +++++++++++++++++++++++++++++++++++----- tools/lib/bpf/libbpf.h | 17 ++++++++++++++++ tools/lib/bpf/libbpf.map | 1 + 5 files changed, 69 insertions(+), 5 deletions(-) diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index a3d1b7ebe224..c340d3cbc6bd 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -746,6 +746,11 @@ int bpf_link_create(int prog_fd, int target_fd, if (!OPTS_ZEROED(opts, tracing)) return libbpf_err(-EINVAL); break; + case BPF_TCX_INGRESS: + case BPF_TCX_EGRESS: + attr.link_create.tcx.relative_fd = OPTS_GET(opts, tcx.relative_fd, 0); + attr.link_create.tcx.expected_revision = OPTS_GET(opts, tcx.expected_revision, 0); + break; default: if (!OPTS_ZEROED(opts, flags)) return libbpf_err(-EINVAL); diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 480c584a6f7f..12591516dca0 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -370,6 +370,13 @@ struct bpf_link_create_opts { struct { __u64 cookie; } tracing; + struct { + union { + __u32 relative_fd; + __u32 relative_id; + }; + __u32 expected_revision; + } tcx; }; size_t :0; }; diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index b89127471c6a..d7b6ff49f02e 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -133,6 +133,7 @@ static const char * const link_type_name[] = { [BPF_LINK_TYPE_KPROBE_MULTI] = "kprobe_multi", [BPF_LINK_TYPE_STRUCT_OPS] = "struct_ops", [BPF_LINK_TYPE_NETFILTER] = "netfilter", + [BPF_LINK_TYPE_TCX] = "tcx", }; static const char * const map_type_name[] = { @@ -11685,11 +11686,10 @@ static int attach_lsm(const struct bpf_program *prog, long cookie, struct bpf_li } static struct bpf_link * -bpf_program__attach_fd(const struct bpf_program *prog, int target_fd, int btf_id, - const char *target_name) +bpf_program__attach_fd_opts(const struct bpf_program *prog, + const struct bpf_link_create_opts *opts, + int target_fd, const char *target_name) { - DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts, - .target_btf_id = btf_id); enum bpf_attach_type attach_type; char errmsg[STRERR_BUFSIZE]; struct bpf_link *link; @@ -11707,7 +11707,7 @@ bpf_program__attach_fd(const struct bpf_program *prog, int target_fd, int btf_id link->detach = &bpf_link__detach_fd; attach_type = bpf_program__expected_attach_type(prog); - link_fd = bpf_link_create(prog_fd, target_fd, attach_type, &opts); + link_fd = bpf_link_create(prog_fd, target_fd, attach_type, opts); if (link_fd < 0) { link_fd = -errno; free(link); @@ -11720,6 +11720,17 @@ bpf_program__attach_fd(const struct bpf_program *prog, int target_fd, int btf_id return link; } +static struct bpf_link * +bpf_program__attach_fd(const struct bpf_program *prog, int target_fd, int btf_id, + const char *target_name) +{ + LIBBPF_OPTS(bpf_link_create_opts, opts, + .target_btf_id = btf_id, + ); + + return bpf_program__attach_fd_opts(prog, &opts, target_fd, target_name); +} + struct bpf_link * bpf_program__attach_cgroup(const struct bpf_program *prog, int cgroup_fd) { @@ -11738,6 +11749,29 @@ struct bpf_link *bpf_program__attach_xdp(const struct bpf_program *prog, int ifi return bpf_program__attach_fd(prog, ifindex, 0, "xdp"); } +struct bpf_link * +bpf_program__attach_tcx_opts(const struct bpf_program *prog, + const struct bpf_tcx_opts *opts) +{ + LIBBPF_OPTS(bpf_link_create_opts, link_create_opts); + int ifindex = OPTS_GET(opts, ifindex, 0); + + if (!OPTS_VALID(opts, bpf_tcx_opts)) + return libbpf_err_ptr(-EINVAL); + if (!ifindex) { + pr_warn("prog '%s': target netdevice ifindex cannot be zero\n", + prog->name); + return libbpf_err_ptr(-EINVAL); + } + + link_create_opts.tcx.expected_revision = OPTS_GET(opts, expected_revision, 0); + link_create_opts.tcx.relative_fd = OPTS_GET(opts, relative_fd, 0); + link_create_opts.flags = OPTS_GET(opts, flags, 0); + + /* target_fd/target_ifindex use the same field in LINK_CREATE */ + return bpf_program__attach_fd_opts(prog, &link_create_opts, ifindex, "tc"); +} + struct bpf_link *bpf_program__attach_freplace(const struct bpf_program *prog, int target_fd, const char *attach_func_name) diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 754da73c643b..8ffba0f67c60 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -718,6 +718,23 @@ LIBBPF_API struct bpf_link * bpf_program__attach_freplace(const struct bpf_program *prog, int target_fd, const char *attach_func_name); +struct bpf_tcx_opts { + /* size of this struct, for forward/backward compatibility */ + size_t sz; + int ifindex; + __u32 flags; + union { + __u32 relative_fd; + __u32 relative_id; + }; + __u32 expected_revision; +}; +#define bpf_tcx_opts__last_field expected_revision + +LIBBPF_API struct bpf_link * +bpf_program__attach_tcx_opts(const struct bpf_program *prog, + const struct bpf_tcx_opts *opts); + struct bpf_map; LIBBPF_API struct bpf_link *bpf_map__attach_struct_ops(const struct bpf_map *map); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index a29b90e9713c..f66b714512c2 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -396,4 +396,5 @@ LIBBPF_1.3.0 { global: bpf_obj_pin_opts; bpf_prog_detach_opts; + bpf_program__attach_tcx_opts; } LIBBPF_1.2.0; From patchwork Wed Jun 7 19:26:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Borkmann X-Patchwork-Id: 13271180 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B32DC3AE57; Wed, 7 Jun 2023 19:26:42 +0000 (UTC) Received: from www62.your-server.de (www62.your-server.de [213.133.104.62]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A090E1FE5; Wed, 7 Jun 2023 12:26:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=iogearbox.net; s=default2302; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID; bh=wImN2gNVIaAxBr996GcZPrEDvIGGjR73g61EtbQB69U=; b=LCNOuM4DxTMnT32nkJASTTlz+P 9YxDZpTbYNdT3lGocJol9k+EG8p5fxcTcjWqNl14fAUZO/13xUtv9Py4r09TQdKr5q+vbNJIhrjcD vG+uCHb8QUs7dphlqojp50MbQmFhdChk0Exfnb2hQdnn2awWltfji42bJOwg4nJO2HS15mZfS8kaU Tqh57W/Jr3M7UGuv9ZwAnJcoIH+XMkKleHow/9ffinLa23RAG233cZMrDaz4L1lyL2TZGLQTDVI65 VYfut8Sk5pwmAR3rlH5OdYbcWBiLQmpwm8nljbvgdB3RqkRiII60/9IGyEMUaoWwCxsx0s9oTSCC4 NuLtavRA==; Received: from 49.248.197.178.dynamic.dsl-lte-bonding.zhbmb00p-msn.res.cust.swisscom.ch ([178.197.248.49] helo=localhost) by www62.your-server.de with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1q6ynl-000CYc-N1; Wed, 07 Jun 2023 21:26:37 +0200 From: Daniel Borkmann To: ast@kernel.org Cc: andrii@kernel.org, martin.lau@linux.dev, razor@blackwall.org, sdf@google.com, john.fastabend@gmail.com, kuba@kernel.org, dxu@dxuuu.xyz, joe@cilium.io, toke@kernel.org, davem@davemloft.net, bpf@vger.kernel.org, netdev@vger.kernel.org, Daniel Borkmann Subject: [PATCH bpf-next v2 5/7] bpftool: Extend net dump with tcx progs Date: Wed, 7 Jun 2023 21:26:23 +0200 Message-Id: <20230607192625.22641-6-daniel@iogearbox.net> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20230607192625.22641-1-daniel@iogearbox.net> References: <20230607192625.22641-1-daniel@iogearbox.net> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Authenticated-Sender: daniel@iogearbox.net X-Virus-Scanned: Clear (ClamAV 0.103.8/26931/Wed Jun 7 09:23:57 2023) X-Spam-Status: No, score=1.2 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_SBL_CSS,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net Add support to dump fd-based attach types via bpftool. This includes both the tc BPF link and attach ops programs. Dumped information contain the attach location, function entry name, program ID and link ID when applicable. Example with tc BPF link: # ./bpftool net xdp: tc: bond0(4) bpf/ingress cil_from_netdev prog id 784 link id 10 bond0(4) bpf/egress cil_to_netdev prog id 804 link id 11 flow_dissector: netfilter: Example with tc BPF attach ops: # ./bpftool net xdp: tc: bond0(4) bpf/ingress cil_from_netdev prog id 654 bond0(4) bpf/egress cil_to_netdev prog id 672 flow_dissector: netfilter: Signed-off-by: Daniel Borkmann --- tools/bpf/bpftool/net.c | 92 +++++++++++++++++++++++++++++++++++++++-- 1 file changed, 88 insertions(+), 4 deletions(-) diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c index 26a49965bf71..b23b346b3ae9 100644 --- a/tools/bpf/bpftool/net.c +++ b/tools/bpf/bpftool/net.c @@ -76,6 +76,11 @@ static const char * const attach_type_strings[] = { [NET_ATTACH_TYPE_XDP_OFFLOAD] = "xdpoffload", }; +static const char * const attach_loc_strings[] = { + [BPF_TCX_INGRESS] = "bpf/ingress", + [BPF_TCX_EGRESS] = "bpf/egress", +}; + const size_t net_attach_type_size = ARRAY_SIZE(attach_type_strings); static enum net_attach_type parse_attach_type(const char *str) @@ -422,8 +427,86 @@ static int dump_filter_nlmsg(void *cookie, void *msg, struct nlattr **tb) filter_info->devname, filter_info->ifindex); } -static int show_dev_tc_bpf(int sock, unsigned int nl_pid, - struct ip_devname_ifindex *dev) +static const char *flags_strings(__u32 flags) +{ + if (flags == (BPF_F_FIRST | BPF_F_LAST)) + return json_output ? "first,last" : " first last"; + if (flags & BPF_F_FIRST) + return json_output ? "first" : " first"; + if (flags & BPF_F_LAST) + return json_output ? "last" : " last"; + return json_output ? "none" : ""; +} + +static int __show_dev_tc_bpf_name(__u32 id, char *name, size_t len) +{ + struct bpf_prog_info info = {}; + __u32 ilen = sizeof(info); + int fd, ret; + + fd = bpf_prog_get_fd_by_id(id); + if (fd < 0) + return fd; + ret = bpf_obj_get_info_by_fd(fd, &info, &ilen); + if (ret < 0) + goto out; + ret = -ENOENT; + if (info.name[0]) { + get_prog_full_name(&info, fd, name, len); + ret = 0; + } +out: + close(fd); + return ret; +} + +static void __show_dev_tc_bpf(const struct ip_devname_ifindex *dev, + const enum bpf_attach_type loc) +{ + __u32 prog_flags[64] = {}, link_flags[64] = {}, i; + __u32 prog_ids[64] = {}, link_ids[64] = {}; + LIBBPF_OPTS(bpf_prog_query_opts, optq); + char prog_name[MAX_PROG_FULL_NAME]; + int ret; + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + optq.link_ids = link_ids; + optq.link_attach_flags = link_flags; + optq.count = ARRAY_SIZE(prog_ids); + + ret = bpf_prog_query_opts(dev->ifindex, loc, &optq); + if (ret) + return; + for (i = 0; i < optq.count; i++) { + NET_START_OBJECT; + NET_DUMP_STR("devname", "%s", dev->devname); + NET_DUMP_UINT("ifindex", "(%u)", dev->ifindex); + NET_DUMP_STR("kind", " %s", attach_loc_strings[loc]); + ret = __show_dev_tc_bpf_name(prog_ids[i], prog_name, + sizeof(prog_name)); + if (!ret) + NET_DUMP_STR("name", " %s", prog_name); + NET_DUMP_UINT("prog_id", " prog id %u", prog_ids[i]); + if (prog_flags[i]) + NET_DUMP_STR("prog_flags", "%s", flags_strings(prog_flags[i])); + if (link_ids[i]) + NET_DUMP_UINT("link_id", " link id %u", + link_ids[i]); + if (link_flags[i]) + NET_DUMP_STR("link_flags", "%s", flags_strings(link_flags[i])); + NET_END_OBJECT_FINAL; + } +} + +static void show_dev_tc_bpf(struct ip_devname_ifindex *dev) +{ + __show_dev_tc_bpf(dev, BPF_TCX_INGRESS); + __show_dev_tc_bpf(dev, BPF_TCX_EGRESS); +} + +static int show_dev_tc_bpf_classic(int sock, unsigned int nl_pid, + struct ip_devname_ifindex *dev) { struct bpf_filter_t filter_info; struct bpf_tcinfo_t tcinfo; @@ -790,8 +873,9 @@ static int do_show(int argc, char **argv) if (!ret) { NET_START_ARRAY("tc", "%s:\n"); for (i = 0; i < dev_array.used_len; i++) { - ret = show_dev_tc_bpf(sock, nl_pid, - &dev_array.devices[i]); + show_dev_tc_bpf(&dev_array.devices[i]); + ret = show_dev_tc_bpf_classic(sock, nl_pid, + &dev_array.devices[i]); if (ret) break; } From patchwork Wed Jun 7 19:26:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Borkmann X-Patchwork-Id: 13271182 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1BBC83AE57; Wed, 7 Jun 2023 19:26:46 +0000 (UTC) Received: from www62.your-server.de (www62.your-server.de [213.133.104.62]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 41A5C1FDE; Wed, 7 Jun 2023 12:26:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=iogearbox.net; s=default2302; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID; bh=y6I+EKAod5YlEaCvByoHgFnYaL9T050pxKm0cfFgGKA=; b=BQxQPeXZDjJEWuKr7+4Hi6KdG8 Iq6Wot+6Ds3B8jVul2ed5bK1qlkZgOu8qzSiG/SFwQD5qT68AxVwSMsNNlNsNnBaG5UJxhGLr6bKV UbSPXV1KeI1lSIS6uoIFLdli9HYjZ7a2wWJxC3JUgpyHvc70aqU1YO8/HMsPBv+LPm3tQff3y+Gxt LRysXnDsbCWi9DBlFCm2DbiL+3qzygP13Qkht2yNd0wYyJo5CA/k9ngdp6mj+mov7YLCG+zjG2gxx aCU+VL8upiRGezY7DZ7IyIUn8lQ04UNmdFNTgr7AgCBmPyzsFDpwplQAu48QEWU7Lov4aZvlv06S3 PIYbXITA==; Received: from 49.248.197.178.dynamic.dsl-lte-bonding.zhbmb00p-msn.res.cust.swisscom.ch ([178.197.248.49] helo=localhost) by www62.your-server.de with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1q6ynm-000CYk-Dv; Wed, 07 Jun 2023 21:26:38 +0200 From: Daniel Borkmann To: ast@kernel.org Cc: andrii@kernel.org, martin.lau@linux.dev, razor@blackwall.org, sdf@google.com, john.fastabend@gmail.com, kuba@kernel.org, dxu@dxuuu.xyz, joe@cilium.io, toke@kernel.org, davem@davemloft.net, bpf@vger.kernel.org, netdev@vger.kernel.org, Daniel Borkmann Subject: [PATCH bpf-next v2 6/7] selftests/bpf: Add mprog API tests for BPF tcx opts Date: Wed, 7 Jun 2023 21:26:24 +0200 Message-Id: <20230607192625.22641-7-daniel@iogearbox.net> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20230607192625.22641-1-daniel@iogearbox.net> References: <20230607192625.22641-1-daniel@iogearbox.net> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Authenticated-Sender: daniel@iogearbox.net X-Virus-Scanned: Clear (ClamAV 0.103.8/26931/Wed Jun 7 09:23:57 2023) X-Spam-Status: No, score=1.2 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_SBL_CSS,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net Add a big batch of test coverage to assert all aspects of the tcx opts attach, detach and query API: # ./vmtest.sh -- ./test_progs -t tc_opts [...] #237 tc_opts_after:OK #238 tc_opts_append:OK #239 tc_opts_basic:OK #240 tc_opts_before:OK #241 tc_opts_both:OK #242 tc_opts_chain_classic:OK #243 tc_opts_demixed:OK #244 tc_opts_detach:OK #245 tc_opts_detach_after:OK #246 tc_opts_detach_before:OK #247 tc_opts_dev_cleanup:OK #248 tc_opts_first:OK #249 tc_opts_invalid:OK #250 tc_opts_last:OK #251 tc_opts_mixed:OK #252 tc_opts_prepend:OK #253 tc_opts_replace:OK #254 tc_opts_revision:OK Summary: 18/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann --- .../selftests/bpf/prog_tests/tc_helpers.h | 72 + .../selftests/bpf/prog_tests/tc_opts.c | 2698 +++++++++++++++++ .../selftests/bpf/progs/test_tc_link.c | 40 + 3 files changed, 2810 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/tc_helpers.h create mode 100644 tools/testing/selftests/bpf/prog_tests/tc_opts.c create mode 100644 tools/testing/selftests/bpf/progs/test_tc_link.c diff --git a/tools/testing/selftests/bpf/prog_tests/tc_helpers.h b/tools/testing/selftests/bpf/prog_tests/tc_helpers.h new file mode 100644 index 000000000000..6c93215be8a3 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/tc_helpers.h @@ -0,0 +1,72 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2023 Isovalent */ +#ifndef TC_HELPERS +#define TC_HELPERS +#include + +static inline __u32 id_from_prog_fd(int fd) +{ + struct bpf_prog_info prog_info = {}; + __u32 prog_info_len = sizeof(prog_info); + int err; + + err = bpf_obj_get_info_by_fd(fd, &prog_info, &prog_info_len); + if (!ASSERT_OK(err, "id_from_prog_fd")) + return 0; + + ASSERT_NEQ(prog_info.id, 0, "prog_info.id"); + return prog_info.id; +} + +static inline __u32 id_from_link_fd(int fd) +{ + struct bpf_link_info link_info = {}; + __u32 link_info_len = sizeof(link_info); + int err; + + err = bpf_link_get_info_by_fd(fd, &link_info, &link_info_len); + if (!ASSERT_OK(err, "id_from_link_fd")) + return 0; + + ASSERT_NEQ(link_info.id, 0, "link_info.id"); + return link_info.id; +} + +static inline __u32 ifindex_from_link_fd(int fd) +{ + struct bpf_link_info link_info = {}; + __u32 link_info_len = sizeof(link_info); + int err; + + err = bpf_link_get_info_by_fd(fd, &link_info, &link_info_len); + if (!ASSERT_OK(err, "id_from_link_fd")) + return 0; + + return link_info.tcx.ifindex; +} + +static inline void __assert_mprog_count(int target, int expected, bool miniq, int ifindex) +{ + __u32 count = 0, attach_flags = 0; + int err; + + err = bpf_prog_query(ifindex, target, 0, &attach_flags, + NULL, &count); + ASSERT_EQ(count, expected, "count"); + if (!expected && !miniq) + ASSERT_EQ(err, -ENOENT, "prog_query"); + else + ASSERT_EQ(err, 0, "prog_query"); +} + +static inline void assert_mprog_count(int target, int expected) +{ + __assert_mprog_count(target, expected, false, loopback); +} + +static inline void assert_mprog_count_ifindex(int ifindex, int target, int expected) +{ + __assert_mprog_count(target, expected, false, ifindex); +} + +#endif /* TC_HELPERS */ diff --git a/tools/testing/selftests/bpf/prog_tests/tc_opts.c b/tools/testing/selftests/bpf/prog_tests/tc_opts.c new file mode 100644 index 000000000000..273521ca364a --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/tc_opts.c @@ -0,0 +1,2698 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Isovalent */ +#include +#include +#include + +#define loopback 1 +#define ping_cmd "ping -q -c1 -w1 127.0.0.1 > /dev/null" + +#include "test_tc_link.skel.h" +#include "tc_helpers.h" + +/* Test: + * + * Basic test which attaches a prog to ingress/egress, validates + * that the prog got attached, runs traffic through the programs, + * validates that traffic has been seen, and detaches everything + * again. Programs are attached without special flags. + */ +void serial_test_tc_opts_basic(void) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, id1, id2; + struct test_tc_link *skel; + __u32 prog_ids[2]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + + assert_mprog_count(BPF_TCX_INGRESS, 0); + assert_mprog_count(BPF_TCX_EGRESS, 0); + + ASSERT_EQ(skel->bss->seen_tc1, false, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, false, "seen_tc2"); + + err = bpf_prog_attach_opts(fd1, loopback, BPF_TCX_INGRESS, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(BPF_TCX_INGRESS, 1); + assert_mprog_count(BPF_TCX_EGRESS, 0); + + optq.prog_ids = prog_ids; + + memset(prog_ids, 0, sizeof(prog_ids)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, BPF_TCX_INGRESS, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_in; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 2, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, false, "seen_tc2"); + + err = bpf_prog_attach_opts(fd2, loopback, BPF_TCX_EGRESS, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_in; + + assert_mprog_count(BPF_TCX_INGRESS, 1); + assert_mprog_count(BPF_TCX_EGRESS, 1); + + memset(prog_ids, 0, sizeof(prog_ids)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, BPF_TCX_EGRESS, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_eg; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 2, "revision"); + ASSERT_EQ(optq.prog_ids[0], id2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + +cleanup_eg: + err = bpf_prog_detach_opts(fd2, loopback, BPF_TCX_EGRESS, &optd); + ASSERT_OK(err, "prog_detach_eg"); + + assert_mprog_count(BPF_TCX_INGRESS, 1); + assert_mprog_count(BPF_TCX_EGRESS, 0); + +cleanup_in: + err = bpf_prog_detach_opts(fd1, loopback, BPF_TCX_INGRESS, &optd); + ASSERT_OK(err, "prog_detach_in"); + + assert_mprog_count(BPF_TCX_INGRESS, 0); + assert_mprog_count(BPF_TCX_EGRESS, 0); + +cleanup: + test_tc_link__destroy(skel); +} + +static void test_tc_opts_first_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, id1, id2; + struct test_tc_link *skel; + __u32 prog_ids[3]; + __u32 prog_flags[3]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + + assert_mprog_count(target, 0); + + opta.flags = BPF_F_FIRST; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 2, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], BPF_F_FIRST, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + opta.flags = BPF_F_FIRST; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_BEFORE; + opta.relative_fd = fd1; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_BEFORE | BPF_F_ID; + opta.relative_id = id1; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = 0; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target; + + assert_mprog_count(target, 2); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], BPF_F_FIRST, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + optd.flags = 0; + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + if (!ASSERT_OK(err, "prog_detach")) + goto cleanup_target; + + assert_mprog_count(target, 1); + + opta.flags = BPF_F_LAST; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target; + + assert_mprog_count(target, 2); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], BPF_F_FIRST, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], BPF_F_LAST, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); +cleanup_target2: + optd.flags = 0; + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 1); +cleanup_target: + optd.flags = BPF_F_FIRST; + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test which attaches a prog to ingress/egress with first flag set, + * validates that the prog got attached, other attach attempts for + * this position should fail. Regular attach attempts or with last + * flag set should succeed. Detach everything again. + */ +void serial_test_tc_opts_first(void) +{ + test_tc_opts_first_target(BPF_TCX_INGRESS); + test_tc_opts_first_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_last_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, id1, id2; + struct test_tc_link *skel; + __u32 prog_ids[3]; + __u32 prog_flags[3]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + + assert_mprog_count(target, 0); + + opta.flags = BPF_F_LAST; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 2, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], BPF_F_LAST, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + opta.flags = BPF_F_LAST; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_AFTER; + opta.relative_fd = fd1; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_AFTER | BPF_F_ID; + opta.relative_id = id1; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = 0; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target; + + assert_mprog_count(target, 2); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], id2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], BPF_F_LAST, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + optd.flags = 0; + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + if (!ASSERT_OK(err, "prog_detach")) + goto cleanup_target; + + assert_mprog_count(target, 1); + + opta.flags = BPF_F_FIRST; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target; + + assert_mprog_count(target, 2); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], id2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], BPF_F_FIRST, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], BPF_F_LAST, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); +cleanup_target2: + optd.flags = 0; + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 1); + +cleanup_target: + optd.flags = BPF_F_LAST; + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test which attaches a prog to ingress/egress with last flag set, + * validates that the prog got attached, other attach attempts for + * this position should fail. Regular attach attempts or with first + * flag set should succeed. Detach everything again. + */ +void serial_test_tc_opts_last(void) +{ + test_tc_opts_last_target(BPF_TCX_INGRESS); + test_tc_opts_last_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_both_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, id1, id2, detach_fd; + struct test_tc_link *skel; + __u32 prog_ids[3]; + __u32 prog_flags[3]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + + assert_mprog_count(target, 0); + + detach_fd = fd1; + + opta.flags = BPF_F_FIRST | BPF_F_LAST; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 2, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], BPF_F_FIRST | BPF_F_LAST, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + opta.flags = BPF_F_LAST; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_FIRST; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_AFTER; + opta.relative_fd = fd1; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_AFTER | BPF_F_ID; + opta.relative_id = id1; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_BEFORE; + opta.relative_fd = fd1; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_BEFORE | BPF_F_ID; + opta.relative_id = id1; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = 0; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_FIRST | BPF_F_LAST; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_FIRST | BPF_F_LAST | BPF_F_REPLACE; + opta.replace_fd = fd1; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target; + + assert_mprog_count(target, 1); + + detach_fd = fd2; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], id2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], BPF_F_FIRST | BPF_F_LAST, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + +cleanup_target: + optd.flags = BPF_F_FIRST | BPF_F_LAST; + err = bpf_prog_detach_opts(detach_fd, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test which attaches a prog to ingress/egress with first and last + * flag set, validates that the prog got attached, other attach + * attempts should fail. Replace should work. Detach everything + * again. + */ +void serial_test_tc_opts_both(void) +{ + test_tc_opts_both_target(BPF_TCX_INGRESS); + test_tc_opts_both_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_before_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, fd3, fd4, id1, id2, id3, id4; + struct test_tc_link *skel; + __u32 prog_ids[5]; + __u32 prog_flags[5]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + fd3 = bpf_program__fd(skel->progs.tc3); + fd4 = bpf_program__fd(skel->progs.tc4); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + id3 = id_from_prog_fd(fd3); + id4 = id_from_prog_fd(fd4); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + ASSERT_NEQ(id3, id4, "prog_ids_3_4"); + ASSERT_NEQ(id2, id3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target; + + assert_mprog_count(target, 2); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, false, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, false, "seen_tc4"); + + opta.flags = BPF_F_BEFORE; + opta.relative_fd = fd2; + err = bpf_prog_attach_opts(fd3, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target2; + + opta.flags = BPF_F_BEFORE | BPF_F_ID; + opta.relative_id = id1; + err = bpf_prog_attach_opts(fd4, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target3; + + assert_mprog_count(target, 4); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target4; + + ASSERT_EQ(optq.count, 4, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], id4, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], id3, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], id2, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + ASSERT_EQ(optq.prog_ids[4], 0, "prog_ids[4]"); + ASSERT_EQ(optq.prog_attach_flags[4], 0, "prog_flags[4]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, true, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, true, "seen_tc4"); + +cleanup_target4: + err = bpf_prog_detach_opts(fd4, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 3); +cleanup_target3: + err = bpf_prog_detach_opts(fd3, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 2); +cleanup_target2: + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 1); +cleanup_target: + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test which attaches a prog to ingress/egress with before flag + * set, validates that the prog got attached in the right location. + * The first test inserts in the middle, then we insert to the front. + * Detach everything again. + */ +void serial_test_tc_opts_before(void) +{ + test_tc_opts_before_target(BPF_TCX_INGRESS); + test_tc_opts_before_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_after_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, fd3, fd4, id1, id2, id3, id4; + struct test_tc_link *skel; + __u32 prog_ids[5]; + __u32 prog_flags[5]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + fd3 = bpf_program__fd(skel->progs.tc3); + fd4 = bpf_program__fd(skel->progs.tc4); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + id3 = id_from_prog_fd(fd3); + id4 = id_from_prog_fd(fd4); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + ASSERT_NEQ(id3, id4, "prog_ids_3_4"); + ASSERT_NEQ(id2, id3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target; + + assert_mprog_count(target, 2); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, false, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, false, "seen_tc4"); + + opta.flags = BPF_F_AFTER; + opta.relative_fd = fd1; + err = bpf_prog_attach_opts(fd3, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target2; + + opta.flags = BPF_F_AFTER | BPF_F_ID; + opta.relative_id = id2; + err = bpf_prog_attach_opts(fd4, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target3; + + assert_mprog_count(target, 4); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target4; + + ASSERT_EQ(optq.count, 4, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id3, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], id2, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], id4, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + ASSERT_EQ(optq.prog_ids[4], 0, "prog_ids[4]"); + ASSERT_EQ(optq.prog_attach_flags[4], 0, "prog_flags[4]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, true, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, true, "seen_tc4"); + +cleanup_target4: + err = bpf_prog_detach_opts(fd4, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 3); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target3; + + ASSERT_EQ(optq.count, 3, "count"); + ASSERT_EQ(optq.revision, 6, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id3, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], id2, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], 0, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + +cleanup_target3: + err = bpf_prog_detach_opts(fd3, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 2); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 7, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + +cleanup_target2: + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 1); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 8, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + +cleanup_target: + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test which attaches a prog to ingress/egress with after flag + * set, validates that the prog got attached in the right location. + * The first test inserts in the middle, then we insert to the end. + * Detach everything again. + */ +void serial_test_tc_opts_after(void) +{ + test_tc_opts_after_target(BPF_TCX_INGRESS); + test_tc_opts_after_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_revision_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, id1, id2; + struct test_tc_link *skel; + __u32 prog_ids[3]; + __u32 prog_flags[3]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + + assert_mprog_count(target, 0); + + opta.flags = 0; + opta.expected_revision = 1; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + opta.flags = 0; + opta.expected_revision = 1; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, -ESTALE, "prog_attach")) + goto cleanup_target; + + assert_mprog_count(target, 1); + + opta.flags = 0; + opta.expected_revision = 2; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target; + + assert_mprog_count(target, 2); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + + optd.expected_revision = 2; + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_EQ(err, -ESTALE, "prog_detach"); + assert_mprog_count(target, 2); + +cleanup_target2: + optd.expected_revision = 3; + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 1); + +cleanup_target: + optd.expected_revision = 0; + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test which attaches a prog to ingress/egress with revision count + * set, validates that the prog got attached and validate that + * when the count mismatches that the operation bails out. Detach + * everything again. + */ +void serial_test_tc_opts_revision(void) +{ + test_tc_opts_revision_target(BPF_TCX_INGRESS); + test_tc_opts_revision_target(BPF_TCX_EGRESS); +} + +static void test_tc_chain_classic(int target, bool chain_tc_old) +{ + LIBBPF_OPTS(bpf_tc_opts, tc_opts, .handle = 1, .priority = 1); + LIBBPF_OPTS(bpf_tc_hook, tc_hook, .ifindex = loopback); + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + bool hook_created = false, tc_attached = false; + __u32 fd1, fd2, fd3, id1, id2, id3; + struct test_tc_link *skel; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + fd3 = bpf_program__fd(skel->progs.tc3); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + id3 = id_from_prog_fd(fd3); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + ASSERT_NEQ(id2, id3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + if (chain_tc_old) { + tc_hook.attach_point = target == BPF_TCX_INGRESS ? + BPF_TC_INGRESS : BPF_TC_EGRESS; + err = bpf_tc_hook_create(&tc_hook); + if (err == 0) + hook_created = true; + err = err == -EEXIST ? 0 : err; + if (!ASSERT_OK(err, "bpf_tc_hook_create")) + goto cleanup; + + tc_opts.prog_fd = fd3; + err = bpf_tc_attach(&tc_hook, &tc_opts); + if (!ASSERT_OK(err, "bpf_tc_attach")) + goto cleanup; + tc_attached = true; + } + + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_detach; + + assert_mprog_count(target, 2); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, chain_tc_old, "seen_tc3"); + + skel->bss->seen_tc1 = false; + skel->bss->seen_tc2 = false; + skel->bss->seen_tc3 = false; + + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + if (!ASSERT_OK(err, "prog_detach")) + goto cleanup_detach; + + assert_mprog_count(target, 1); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, false, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, chain_tc_old, "seen_tc3"); + +cleanup_detach: + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + if (!ASSERT_OK(err, "prog_detach")) + goto cleanup; + + __assert_mprog_count(target, 0, chain_tc_old, loopback); +cleanup: + if (tc_attached) { + tc_opts.flags = tc_opts.prog_fd = tc_opts.prog_id = 0; + err = bpf_tc_detach(&tc_hook, &tc_opts); + ASSERT_OK(err, "bpf_tc_detach"); + } + if (hook_created) { + tc_hook.attach_point = BPF_TC_INGRESS | BPF_TC_EGRESS; + bpf_tc_hook_destroy(&tc_hook); + } + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attaches two progs to ingress/egress through the new + * API and one prog via classic API. Traffic runs through and it + * validates that the program has been executed. One of the two + * progs gets removed and test is rerun again. Detach everything + * at the end. + */ +void serial_test_tc_opts_chain_classic(void) +{ + test_tc_chain_classic(BPF_TCX_INGRESS, false); + test_tc_chain_classic(BPF_TCX_EGRESS, false); + test_tc_chain_classic(BPF_TCX_INGRESS, true); + test_tc_chain_classic(BPF_TCX_EGRESS, true); +} + +static void test_tc_opts_replace_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, fd3, id1, id2, id3, detach_fd; + struct test_tc_link *skel; + __u32 prog_ids[4]; + __u32 prog_flags[4]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + fd3 = bpf_program__fd(skel->progs.tc3); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + id3 = id_from_prog_fd(fd3); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + ASSERT_NEQ(id2, id3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + opta.flags = 0; + opta.expected_revision = 1; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + opta.flags = BPF_F_BEFORE | BPF_F_FIRST | BPF_F_ID; + opta.relative_id = id1; + opta.expected_revision = 2; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target; + + detach_fd = fd2; + + assert_mprog_count(target, 2); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], id2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], BPF_F_FIRST, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, false, "seen_tc3"); + + skel->bss->seen_tc1 = false; + skel->bss->seen_tc2 = false; + skel->bss->seen_tc3 = false; + + opta.flags = BPF_F_REPLACE; + opta.replace_fd = fd2; + opta.expected_revision = 3; + err = bpf_prog_attach_opts(fd3, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target2; + + detach_fd = fd3; + + assert_mprog_count(target, 2); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 4, "revision"); + ASSERT_EQ(optq.prog_ids[0], id3, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, false, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, true, "seen_tc3"); + + skel->bss->seen_tc1 = false; + skel->bss->seen_tc2 = false; + skel->bss->seen_tc3 = false; + + opta.flags = BPF_F_FIRST | BPF_F_REPLACE; + opta.replace_fd = fd3; + opta.expected_revision = 4; + err = bpf_prog_attach_opts(fd3, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target2; + + assert_mprog_count(target, 2); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], id3, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], BPF_F_FIRST, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, false, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, true, "seen_tc3"); + + opta.flags = BPF_F_LAST | BPF_F_REPLACE; + opta.replace_fd = fd3; + opta.expected_revision = 5; + err = bpf_prog_attach_opts(fd3, loopback, target, &opta); + ASSERT_EQ(err, -EACCES, "prog_attach"); + assert_mprog_count(target, 2); + + opta.flags = BPF_F_FIRST | BPF_F_LAST | BPF_F_REPLACE; + opta.replace_fd = fd3; + opta.expected_revision = 5; + err = bpf_prog_attach_opts(fd3, loopback, target, &opta); + ASSERT_EQ(err, -EACCES, "prog_attach"); + assert_mprog_count(target, 2); + + optd.flags = BPF_F_FIRST | BPF_F_BEFORE | BPF_F_ID; + optd.relative_id = id1; + optd.expected_revision = 5; +cleanup_target2: + err = bpf_prog_detach_opts(detach_fd, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 1); + +cleanup_target: + optd.flags = 0; + optd.relative_id = 0; + optd.expected_revision = 0; + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test which attaches a prog to ingress/egress and validates + * replacement in combination with various flags. Similar for + * later detachment. + */ +void serial_test_tc_opts_replace(void) +{ + test_tc_opts_replace_target(BPF_TCX_INGRESS); + test_tc_opts_replace_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_invalid_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + __u32 fd1, fd2, id1, id2; + struct test_tc_link *skel; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + assert_mprog_count(target, 0); + + opta.flags = BPF_F_BEFORE | BPF_F_AFTER; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -EINVAL, "prog_attach"); + assert_mprog_count(target, 0); + + opta.flags = BPF_F_BEFORE | BPF_F_ID; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -ENOENT, "prog_attach"); + assert_mprog_count(target, 0); + + opta.flags = BPF_F_AFTER | BPF_F_ID; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -ENOENT, "prog_attach"); + assert_mprog_count(target, 0); + + opta.flags = BPF_F_LAST | BPF_F_BEFORE; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -EINVAL, "prog_attach"); + assert_mprog_count(target, 0); + + opta.flags = BPF_F_FIRST | BPF_F_AFTER; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -EINVAL, "prog_attach"); + assert_mprog_count(target, 0); + + opta.flags = BPF_F_FIRST | BPF_F_LAST; + opta.relative_fd = fd2; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -EINVAL, "prog_attach"); + assert_mprog_count(target, 0); + + opta.flags = 0; + opta.relative_fd = fd2; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -EINVAL, "prog_attach"); + assert_mprog_count(target, 0); + + opta.flags = BPF_F_BEFORE | BPF_F_AFTER; + opta.relative_fd = fd2; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -EINVAL, "prog_attach"); + assert_mprog_count(target, 0); + + opta.flags = BPF_F_ID; + opta.relative_id = id2; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -EINVAL, "prog_attach"); + assert_mprog_count(target, 0); + + opta.flags = BPF_F_BEFORE; + opta.relative_fd = fd1; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -ENOENT, "prog_attach"); + assert_mprog_count(target, 0); + + opta.flags = BPF_F_AFTER; + opta.relative_fd = fd1; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -ENOENT, "prog_attach"); + assert_mprog_count(target, 0); + + opta.flags = 0; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + opta.flags = 0; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -EEXIST, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_LAST; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -EEXIST, "prog_attach"); + assert_mprog_count(target, 1); + + opta.flags = BPF_F_FIRST; + opta.relative_fd = 0; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + ASSERT_EQ(err, -EEXIST, "prog_attach"); + assert_mprog_count(target, 1); + + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test invalid flag combinations when attaching/detaching a + * program. + */ +void serial_test_tc_opts_invalid(void) +{ + test_tc_opts_invalid_target(BPF_TCX_INGRESS); + test_tc_opts_invalid_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_prepend_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, fd3, fd4, id1, id2, id3, id4; + struct test_tc_link *skel; + __u32 prog_ids[5]; + __u32 prog_flags[5]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + fd3 = bpf_program__fd(skel->progs.tc3); + fd4 = bpf_program__fd(skel->progs.tc4); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + id3 = id_from_prog_fd(fd3); + id4 = id_from_prog_fd(fd4); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + ASSERT_NEQ(id3, id4, "prog_ids_3_4"); + ASSERT_NEQ(id2, id3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + opta.flags = BPF_F_BEFORE; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target; + + assert_mprog_count(target, 2); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], id2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, false, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, false, "seen_tc4"); + + opta.flags = BPF_F_FIRST; + err = bpf_prog_attach_opts(fd3, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target2; + + opta.flags = BPF_F_BEFORE; + err = bpf_prog_attach_opts(fd4, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target3; + + assert_mprog_count(target, 4); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target4; + + ASSERT_EQ(optq.count, 4, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], id3, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], BPF_F_FIRST, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id4, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], id2, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], id1, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + ASSERT_EQ(optq.prog_ids[4], 0, "prog_ids[4]"); + ASSERT_EQ(optq.prog_attach_flags[4], 0, "prog_flags[4]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, true, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, true, "seen_tc4"); + +cleanup_target4: + err = bpf_prog_detach_opts(fd4, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 3); +cleanup_target3: + err = bpf_prog_detach_opts(fd3, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 2); +cleanup_target2: + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 1); +cleanup_target: + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test which attaches a prog to ingress/egress with before flag + * set and no fd/id, validates prepend behavior that the prog got + * attached in the right location. Detaches everything. + */ +void serial_test_tc_opts_prepend(void) +{ + test_tc_opts_prepend_target(BPF_TCX_INGRESS); + test_tc_opts_prepend_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_append_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, fd3, fd4, id1, id2, id3, id4; + struct test_tc_link *skel; + __u32 prog_ids[5]; + __u32 prog_flags[5]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + fd3 = bpf_program__fd(skel->progs.tc3); + fd4 = bpf_program__fd(skel->progs.tc4); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + id3 = id_from_prog_fd(fd3); + id4 = id_from_prog_fd(fd4); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + ASSERT_NEQ(id3, id4, "prog_ids_3_4"); + ASSERT_NEQ(id2, id3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + opta.flags = BPF_F_AFTER; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target; + + assert_mprog_count(target, 2); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target2; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, false, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, false, "seen_tc4"); + + opta.flags = BPF_F_LAST; + err = bpf_prog_attach_opts(fd3, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target2; + + opta.flags = BPF_F_AFTER; + err = bpf_prog_attach_opts(fd4, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup_target3; + + assert_mprog_count(target, 4); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup_target4; + + ASSERT_EQ(optq.count, 4, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], id4, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], id3, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], BPF_F_LAST, "prog_flags[3]"); + ASSERT_EQ(optq.prog_ids[4], 0, "prog_ids[4]"); + ASSERT_EQ(optq.prog_attach_flags[4], 0, "prog_flags[4]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, true, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, true, "seen_tc4"); + +cleanup_target4: + err = bpf_prog_detach_opts(fd4, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 3); +cleanup_target3: + err = bpf_prog_detach_opts(fd3, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 2); +cleanup_target2: + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 1); +cleanup_target: + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test which attaches a prog to ingress/egress with after flag + * set and no fd/id, validates append behavior that the prog got + * attached in the right location. Detaches everything. + */ +void serial_test_tc_opts_append(void) +{ + test_tc_opts_append_target(BPF_TCX_INGRESS); + test_tc_opts_append_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_dev_cleanup_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, fd3, fd4, id1, id2, id3, id4; + struct test_tc_link *skel; + int err, ifindex; + + ASSERT_OK(system("ip link add dev tcx_opts1 type veth peer name tcx_opts2"), "add veth"); + ifindex = if_nametoindex("tcx_opts1"); + ASSERT_NEQ(ifindex, 0, "non_zero_ifindex"); + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + fd3 = bpf_program__fd(skel->progs.tc3); + fd4 = bpf_program__fd(skel->progs.tc4); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + id3 = id_from_prog_fd(fd3); + id4 = id_from_prog_fd(fd4); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + ASSERT_NEQ(id3, id4, "prog_ids_3_4"); + ASSERT_NEQ(id2, id3, "prog_ids_2_3"); + + assert_mprog_count_ifindex(ifindex, target, 0); + + err = bpf_prog_attach_opts(fd1, ifindex, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + assert_mprog_count_ifindex(ifindex, target, 1); + + err = bpf_prog_attach_opts(fd2, ifindex, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup1; + assert_mprog_count_ifindex(ifindex, target, 2); + + err = bpf_prog_attach_opts(fd3, ifindex, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup2; + assert_mprog_count_ifindex(ifindex, target, 3); + + err = bpf_prog_attach_opts(fd4, ifindex, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup3; + assert_mprog_count_ifindex(ifindex, target, 4); + + ASSERT_OK(system("ip link del dev tcx_opts1"), "del veth"); + ASSERT_EQ(if_nametoindex("tcx_opts1"), 0, "dev1_removed"); + ASSERT_EQ(if_nametoindex("tcx_opts2"), 0, "dev2_removed"); + return; +cleanup3: + err = bpf_prog_detach_opts(fd3, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count_ifindex(ifindex, target, 2); +cleanup2: + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count_ifindex(ifindex, target, 1); +cleanup1: + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count_ifindex(ifindex, target, 0); +cleanup: + test_tc_link__destroy(skel); + + ASSERT_OK(system("ip link del dev tcx_opts1"), "del veth"); + ASSERT_EQ(if_nametoindex("tcx_opts1"), 0, "dev1_removed"); + ASSERT_EQ(if_nametoindex("tcx_opts2"), 0, "dev2_removed"); +} + +/* Test: + * + * Test which attaches progs to ingress/egress on a newly created + * device. Removes the device with attached programs. + */ +void serial_test_tc_opts_dev_cleanup(void) +{ + test_tc_opts_dev_cleanup_target(BPF_TCX_INGRESS); + test_tc_opts_dev_cleanup_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_mixed_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 pid1, pid2, pid3, pid4, lid2, lid4; + __u32 prog_flags[4], link_flags[4]; + __u32 prog_ids[4], link_ids[4]; + struct test_tc_link *skel; + struct bpf_link *link; + int err, detach_fd; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc3, target), + 0, "tc3_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc4, target), + 0, "tc4_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + pid3 = id_from_prog_fd(bpf_program__fd(skel->progs.tc3)); + pid4 = id_from_prog_fd(bpf_program__fd(skel->progs.tc4)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + ASSERT_NEQ(pid3, pid4, "prog_ids_3_4"); + ASSERT_NEQ(pid2, pid3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + err = bpf_prog_attach_opts(bpf_program__fd(skel->progs.tc1), + loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + detach_fd = bpf_program__fd(skel->progs.tc1); + + assert_mprog_count(target, 1); + + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup1; + skel->links.tc2 = link; + + lid2 = id_from_link_fd(bpf_link__fd(skel->links.tc2)); + + assert_mprog_count(target, 2); + + opta.flags = BPF_F_REPLACE; + opta.replace_fd = bpf_program__fd(skel->progs.tc1); + err = bpf_prog_attach_opts(bpf_program__fd(skel->progs.tc2), + loopback, target, &opta); + ASSERT_EQ(err, -EEXIST, "prog_attach"); + + assert_mprog_count(target, 2); + + opta.flags = BPF_F_REPLACE; + opta.replace_fd = bpf_program__fd(skel->progs.tc2); + err = bpf_prog_attach_opts(bpf_program__fd(skel->progs.tc1), + loopback, target, &opta); + ASSERT_EQ(err, -EEXIST, "prog_attach"); + + assert_mprog_count(target, 2); + + opta.flags = BPF_F_REPLACE; + opta.replace_fd = bpf_program__fd(skel->progs.tc2); + err = bpf_prog_attach_opts(bpf_program__fd(skel->progs.tc3), + loopback, target, &opta); + ASSERT_EQ(err, -EBUSY, "prog_attach"); + + assert_mprog_count(target, 2); + + opta.flags = BPF_F_REPLACE; + opta.replace_fd = bpf_program__fd(skel->progs.tc1); + err = bpf_prog_attach_opts(bpf_program__fd(skel->progs.tc3), + loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup1; + detach_fd = bpf_program__fd(skel->progs.tc3); + + assert_mprog_count(target, 2); + + link = bpf_program__attach_tcx_opts(skel->progs.tc4, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup1; + skel->links.tc4 = link; + + lid4 = id_from_link_fd(bpf_link__fd(skel->links.tc4)); + + assert_mprog_count(target, 3); + + opta.flags = BPF_F_REPLACE; + opta.replace_fd = bpf_program__fd(skel->progs.tc4); + err = bpf_prog_attach_opts(bpf_program__fd(skel->progs.tc2), + loopback, target, &opta); + ASSERT_EQ(err, -EEXIST, "prog_attach"); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + optq.link_ids = link_ids; + optq.link_attach_flags = link_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup1; + + ASSERT_EQ(optq.count, 3, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid3, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], 0, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid2, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], pid4, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], lid4, "link_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "link_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], 0, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + ASSERT_EQ(optq.link_ids[3], 0, "link_ids[3]"); + ASSERT_EQ(optq.link_attach_flags[3], 0, "link_flags[3]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); +cleanup1: + err = bpf_prog_detach_opts(detach_fd, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 2); +cleanup: + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attache a link and attempts to replace/delete via opts + * for ingress/egress. Ensures that the link is unaffected. + */ +void serial_test_tc_opts_mixed(void) +{ + test_tc_opts_mixed_target(BPF_TCX_INGRESS); + test_tc_opts_mixed_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_demixed_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + struct test_tc_link *skel; + struct bpf_link *link; + __u32 pid1, pid2; + int err; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + + assert_mprog_count(target, 0); + + err = bpf_prog_attach_opts(bpf_program__fd(skel->progs.tc1), + loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup1; + skel->links.tc2 = link; + + assert_mprog_count(target, 2); + + optd.flags = BPF_F_AFTER; + err = bpf_prog_detach_opts(0, loopback, target, &optd); + ASSERT_EQ(err, -EBUSY, "prog_detach"); + + assert_mprog_count(target, 2); + + optd.flags = BPF_F_BEFORE; + err = bpf_prog_detach_opts(0, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 1); + goto cleanup; +cleanup1: + err = bpf_prog_detach_opts(bpf_program__fd(skel->progs.tc1), + loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 2); +cleanup: + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attaches progs to ingress/egress, validates that the progs + * got attached in the right location, and removes them with before/after + * detach flag and empty detach prog. Validates that link cannot be removed + * this way. + */ +void serial_test_tc_opts_demixed(void) +{ + test_tc_opts_demixed_target(BPF_TCX_INGRESS); + test_tc_opts_demixed_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_detach_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, fd3, fd4, id1, id2, id3, id4; + struct test_tc_link *skel; + __u32 prog_ids[5]; + __u32 prog_flags[5]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + fd3 = bpf_program__fd(skel->progs.tc3); + fd4 = bpf_program__fd(skel->progs.tc4); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + id3 = id_from_prog_fd(fd3); + id4 = id_from_prog_fd(fd4); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + ASSERT_NEQ(id3, id4, "prog_ids_3_4"); + ASSERT_NEQ(id2, id3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup1; + + assert_mprog_count(target, 2); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd3, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup2; + + assert_mprog_count(target, 3); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd4, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup3; + + assert_mprog_count(target, 4); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup4; + + ASSERT_EQ(optq.count, 4, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], id3, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], id4, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + ASSERT_EQ(optq.prog_ids[4], 0, "prog_ids[4]"); + ASSERT_EQ(optq.prog_attach_flags[4], 0, "prog_flags[4]"); + + optd.flags = BPF_F_BEFORE; + err = bpf_prog_detach_opts(0, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 3); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup4; + + ASSERT_EQ(optq.count, 3, "count"); + ASSERT_EQ(optq.revision, 6, "revision"); + ASSERT_EQ(optq.prog_ids[0], id2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id3, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], id4, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], 0, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + + optd.flags = BPF_F_AFTER; + err = bpf_prog_detach_opts(0, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 2); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup4; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 7, "revision"); + ASSERT_EQ(optq.prog_ids[0], id2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id3, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + optd.flags = 0; + err = bpf_prog_detach_opts(fd3, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 1); + + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 0); + + optd.flags = BPF_F_BEFORE; + err = bpf_prog_detach_opts(0, loopback, target, &optd); + ASSERT_EQ(err, -ENOENT, "prog_detach"); + + optd.flags = BPF_F_AFTER; + err = bpf_prog_detach_opts(0, loopback, target, &optd); + ASSERT_EQ(err, -ENOENT, "prog_detach"); + goto cleanup; +cleanup4: + err = bpf_prog_detach_opts(fd4, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 3); +cleanup3: + err = bpf_prog_detach_opts(fd3, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 2); +cleanup2: + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 1); +cleanup1: + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test which attaches progs to ingress/egress, validates that the progs + * got attached in the right location, and removes them with before/after + * detach flag and empty detach prog. Valides that head/tail gets removed. + */ +void serial_test_tc_opts_detach(void) +{ + test_tc_opts_detach_target(BPF_TCX_INGRESS); + test_tc_opts_detach_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_detach_before_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, fd3, fd4, id1, id2, id3, id4; + struct test_tc_link *skel; + __u32 prog_ids[5]; + __u32 prog_flags[5]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + fd3 = bpf_program__fd(skel->progs.tc3); + fd4 = bpf_program__fd(skel->progs.tc4); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + id3 = id_from_prog_fd(fd3); + id4 = id_from_prog_fd(fd4); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + ASSERT_NEQ(id3, id4, "prog_ids_3_4"); + ASSERT_NEQ(id2, id3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup1; + + assert_mprog_count(target, 2); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd3, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup2; + + assert_mprog_count(target, 3); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd4, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup3; + + assert_mprog_count(target, 4); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup4; + + ASSERT_EQ(optq.count, 4, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], id3, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], id4, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + ASSERT_EQ(optq.prog_ids[4], 0, "prog_ids[4]"); + ASSERT_EQ(optq.prog_attach_flags[4], 0, "prog_flags[4]"); + + optd.flags = BPF_F_BEFORE; + optd.relative_fd = fd2; + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 3); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup4; + + ASSERT_EQ(optq.count, 3, "count"); + ASSERT_EQ(optq.revision, 6, "revision"); + ASSERT_EQ(optq.prog_ids[0], id2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id3, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], id4, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], 0, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + + optd.flags = BPF_F_BEFORE; + optd.relative_fd = fd2; + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_EQ(err, -ENOENT, "prog_detach"); + assert_mprog_count(target, 3); + + optd.flags = BPF_F_BEFORE; + optd.relative_fd = fd4; + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_EQ(err, -ENOENT, "prog_detach"); + assert_mprog_count(target, 3); + + optd.flags = BPF_F_BEFORE; + optd.relative_fd = fd1; + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_EQ(err, -ENOENT, "prog_detach"); + assert_mprog_count(target, 3); + + optd.flags = BPF_F_BEFORE; + optd.relative_fd = fd3; + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 2); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup4; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 7, "revision"); + ASSERT_EQ(optq.prog_ids[0], id3, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id4, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + optd.flags = BPF_F_BEFORE; + optd.relative_fd = fd4; + err = bpf_prog_detach_opts(0, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 1); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup4; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 8, "revision"); + ASSERT_EQ(optq.prog_ids[0], id4, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + + optd.flags = 0; + optd.relative_fd = 0; + err = bpf_prog_detach_opts(fd4, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 0); + goto cleanup; +cleanup4: + err = bpf_prog_detach_opts(fd4, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 3); +cleanup3: + err = bpf_prog_detach_opts(fd3, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 2); +cleanup2: + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 1); +cleanup1: + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test which attaches progs to ingress/egress, validates that the progs + * got attached in the right location, and removes them with before + * detach flag and non-empty detach prog. Validates that the right ones + * got removed. + */ +void serial_test_tc_opts_detach_before(void) +{ + test_tc_opts_detach_before_target(BPF_TCX_INGRESS); + test_tc_opts_detach_before_target(BPF_TCX_EGRESS); +} + +static void test_tc_opts_detach_after_target(int target) +{ + LIBBPF_OPTS(bpf_prog_attach_opts, opta); + LIBBPF_OPTS(bpf_prog_detach_opts, optd); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 fd1, fd2, fd3, fd4, id1, id2, id3, id4; + struct test_tc_link *skel; + __u32 prog_ids[5]; + __u32 prog_flags[5]; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + fd1 = bpf_program__fd(skel->progs.tc1); + fd2 = bpf_program__fd(skel->progs.tc2); + fd3 = bpf_program__fd(skel->progs.tc3); + fd4 = bpf_program__fd(skel->progs.tc4); + + id1 = id_from_prog_fd(fd1); + id2 = id_from_prog_fd(fd2); + id3 = id_from_prog_fd(fd3); + id4 = id_from_prog_fd(fd4); + + ASSERT_NEQ(id1, id2, "prog_ids_1_2"); + ASSERT_NEQ(id3, id4, "prog_ids_3_4"); + ASSERT_NEQ(id2, id3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd1, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup; + + assert_mprog_count(target, 1); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd2, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup1; + + assert_mprog_count(target, 2); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd3, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup2; + + assert_mprog_count(target, 3); + + opta.flags = 0; + err = bpf_prog_attach_opts(fd4, loopback, target, &opta); + if (!ASSERT_EQ(err, 0, "prog_attach")) + goto cleanup3; + + assert_mprog_count(target, 4); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup4; + + ASSERT_EQ(optq.count, 4, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], id3, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], id4, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + ASSERT_EQ(optq.prog_ids[4], 0, "prog_ids[4]"); + ASSERT_EQ(optq.prog_attach_flags[4], 0, "prog_flags[4]"); + + optd.flags = BPF_F_AFTER; + optd.relative_fd = fd1; + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 3); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup4; + + ASSERT_EQ(optq.count, 3, "count"); + ASSERT_EQ(optq.revision, 6, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id3, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], id4, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], 0, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + + optd.flags = BPF_F_AFTER; + optd.relative_fd = fd1; + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_EQ(err, -ENOENT, "prog_detach"); + assert_mprog_count(target, 3); + + optd.flags = BPF_F_AFTER; + optd.relative_fd = fd4; + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_EQ(err, -ENOENT, "prog_detach"); + assert_mprog_count(target, 3); + + optd.flags = BPF_F_AFTER; + optd.relative_fd = fd3; + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_EQ(err, -ENOENT, "prog_detach"); + assert_mprog_count(target, 3); + + optd.flags = BPF_F_AFTER; + optd.relative_fd = fd1; + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_EQ(err, -ENOENT, "prog_detach"); + assert_mprog_count(target, 3); + + optd.flags = BPF_F_AFTER; + optd.relative_fd = fd1; + err = bpf_prog_detach_opts(fd3, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 2); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup4; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 7, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], id4, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + + optd.flags = BPF_F_AFTER; + optd.relative_fd = fd1; + err = bpf_prog_detach_opts(0, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 1); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup4; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 8, "revision"); + ASSERT_EQ(optq.prog_ids[0], id1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + + optd.flags = 0; + optd.relative_fd = 0; + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + + assert_mprog_count(target, 0); + goto cleanup; +cleanup4: + err = bpf_prog_detach_opts(fd4, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 3); +cleanup3: + err = bpf_prog_detach_opts(fd3, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 2); +cleanup2: + err = bpf_prog_detach_opts(fd2, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 1); +cleanup1: + err = bpf_prog_detach_opts(fd1, loopback, target, &optd); + ASSERT_OK(err, "prog_detach"); + assert_mprog_count(target, 0); +cleanup: + test_tc_link__destroy(skel); +} + +/* Test: + * + * Test which attaches progs to ingress/egress, validates that the progs + * got attached in the right location, and removes them with after + * detach flag and non-empty detach prog. Validates that the right ones + * got removed. + */ +void serial_test_tc_opts_detach_after(void) +{ + test_tc_opts_detach_after_target(BPF_TCX_INGRESS); + test_tc_opts_detach_after_target(BPF_TCX_EGRESS); +} diff --git a/tools/testing/selftests/bpf/progs/test_tc_link.c b/tools/testing/selftests/bpf/progs/test_tc_link.c new file mode 100644 index 000000000000..ed1fd0e9cee9 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_tc_link.c @@ -0,0 +1,40 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Isovalent */ +#include +#include +#include + +char LICENSE[] SEC("license") = "GPL"; + +bool seen_tc1; +bool seen_tc2; +bool seen_tc3; +bool seen_tc4; + +SEC("tc/ingress") +int tc1(struct __sk_buff *skb) +{ + seen_tc1 = true; + return TCX_NEXT; +} + +SEC("tc/egress") +int tc2(struct __sk_buff *skb) +{ + seen_tc2 = true; + return TCX_NEXT; +} + +SEC("tc/egress") +int tc3(struct __sk_buff *skb) +{ + seen_tc3 = true; + return TCX_NEXT; +} + +SEC("tc/egress") +int tc4(struct __sk_buff *skb) +{ + seen_tc4 = true; + return TCX_NEXT; +} From patchwork Wed Jun 7 19:26:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Borkmann X-Patchwork-Id: 13271183 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0BEB63B8A7; Wed, 7 Jun 2023 19:26:47 +0000 (UTC) Received: from www62.your-server.de (www62.your-server.de [213.133.104.62]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 09EB61FEB; Wed, 7 Jun 2023 12:26:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=iogearbox.net; s=default2302; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID; bh=PdOvVtzQeAQdvHeU/so2C0SnBqFVApzPW5TfKcf0nsE=; b=ijcp8UXXQnQb9vsCzbt8hmgeYe b9T72ecbnPbVPgbidbMBsaDuZ4xGsQCkDJYrTy6z2rui7lUx8Rjp8L5dEFUHYt5MB6M/cVaOgrQA1 nMD3pRC0ErOWkw8XOCCOYBw2tkT5WyWjo/Vtz5z9oPUp5jcudXFUkOrcPKZGM8KYl48m8Ct7qUJEH 5ONB2ykl59GvY79ywlXtmEzccbE77TkpxNRps8WxbS8iy0sGRdWXuDgdhor29BBJcFTRS/RvLV8zh xsVE8/u3H6vaKT99FKuECfWOhLYSvsUgpWpG/MO8+xrbwQSENpfpoXguGLKHTPsAYDGGY/RbN6RfA rRJAtcsA==; Received: from 49.248.197.178.dynamic.dsl-lte-bonding.zhbmb00p-msn.res.cust.swisscom.ch ([178.197.248.49] helo=localhost) by www62.your-server.de with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1q6ynn-000CYz-6E; Wed, 07 Jun 2023 21:26:39 +0200 From: Daniel Borkmann To: ast@kernel.org Cc: andrii@kernel.org, martin.lau@linux.dev, razor@blackwall.org, sdf@google.com, john.fastabend@gmail.com, kuba@kernel.org, dxu@dxuuu.xyz, joe@cilium.io, toke@kernel.org, davem@davemloft.net, bpf@vger.kernel.org, netdev@vger.kernel.org, Daniel Borkmann Subject: [PATCH bpf-next v2 7/7] selftests/bpf: Add mprog API tests for BPF tcx links Date: Wed, 7 Jun 2023 21:26:25 +0200 Message-Id: <20230607192625.22641-8-daniel@iogearbox.net> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20230607192625.22641-1-daniel@iogearbox.net> References: <20230607192625.22641-1-daniel@iogearbox.net> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Authenticated-Sender: daniel@iogearbox.net X-Virus-Scanned: Clear (ClamAV 0.103.8/26931/Wed Jun 7 09:23:57 2023) X-Spam-Status: No, score=1.2 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_SBL_CSS,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net Add a big batch of test coverage to assert all aspects of the tcx link API: # ./vmtest.sh -- ./test_progs -t tc_links [...] #224 tc_links_after:OK #225 tc_links_append:OK #226 tc_links_basic:OK #227 tc_links_before:OK #228 tc_links_both:OK #229 tc_links_chain_classic:OK #230 tc_links_dev_cleanup:OK #231 tc_links_first:OK #232 tc_links_invalid:OK #233 tc_links_last:OK #234 tc_links_prepend:OK #235 tc_links_replace:OK #236 tc_links_revision:OK Summary: 13/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann --- .../selftests/bpf/prog_tests/tc_links.c | 2279 +++++++++++++++++ 1 file changed, 2279 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/tc_links.c diff --git a/tools/testing/selftests/bpf/prog_tests/tc_links.c b/tools/testing/selftests/bpf/prog_tests/tc_links.c new file mode 100644 index 000000000000..98039db6ccaf --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/tc_links.c @@ -0,0 +1,2279 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Isovalent */ +#include +#include +#include + +#define loopback 1 +#define ping_cmd "ping -q -c1 -w1 127.0.0.1 > /dev/null" + +#include "test_tc_link.skel.h" +#include "tc_helpers.h" + +/* Test: + * + * Basic test which attaches a link to ingress/egress, validates + * that the link got attached, runs traffic through the programs, + * validates that traffic has been seen, and detaches everything + * again. Programs are attached without special flags. + */ +void serial_test_tc_links_basic(void) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 prog_ids[2], link_ids[2]; + __u32 pid1, pid2, lid1, lid2; + struct test_tc_link *skel; + struct bpf_link *link; + int err; + + skel = test_tc_link__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + + assert_mprog_count(BPF_TCX_INGRESS, 0); + assert_mprog_count(BPF_TCX_EGRESS, 0); + + ASSERT_EQ(skel->bss->seen_tc1, false, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, false, "seen_tc2"); + + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + + lid1 = id_from_link_fd(bpf_link__fd(skel->links.tc1)); + + assert_mprog_count(BPF_TCX_INGRESS, 1); + assert_mprog_count(BPF_TCX_EGRESS, 0); + + optq.prog_ids = prog_ids; + optq.link_ids = link_ids; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(link_ids, 0, sizeof(link_ids)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, BPF_TCX_INGRESS, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 2, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.link_ids[1], 0, "link_ids[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, false, "seen_tc2"); + + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + lid2 = id_from_link_fd(bpf_link__fd(skel->links.tc2)); + ASSERT_NEQ(lid1, lid2, "link_ids_1_2"); + + assert_mprog_count(BPF_TCX_INGRESS, 1); + assert_mprog_count(BPF_TCX_EGRESS, 1); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(link_ids, 0, sizeof(link_ids)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, BPF_TCX_EGRESS, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 2, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid2, "prog_ids[0]"); + ASSERT_EQ(optq.link_ids[0], lid2, "link_ids[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.link_ids[1], 0, "link_ids[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); +cleanup: + test_tc_link__destroy(skel); + + assert_mprog_count(BPF_TCX_INGRESS, 0); + assert_mprog_count(BPF_TCX_EGRESS, 0); +} + +static void test_tc_links_first_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 prog_flags[3], link_flags[3]; + __u32 prog_ids[3], link_ids[3]; + __u32 pid1, pid2, lid1, lid2; + struct test_tc_link *skel; + struct bpf_link *link; + int err; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + + assert_mprog_count(target, 0); + + optl.flags = BPF_F_FIRST; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + + assert_mprog_count(target, 1); + + lid1 = id_from_link_fd(bpf_link__fd(skel->links.tc1)); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + optq.link_ids = link_ids; + optq.link_attach_flags = link_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 2, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], BPF_F_FIRST, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], 0, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + optl.flags = BPF_F_FIRST; + optl.relative_fd = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_BEFORE | BPF_F_LINK; + optl.relative_fd = bpf_link__fd(skel->links.tc1); + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_BEFORE | BPF_F_ID | BPF_F_LINK; + optl.relative_id = lid1; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = 0; + optl.relative_fd = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + assert_mprog_count(target, 2); + + lid2 = id_from_link_fd(bpf_link__fd(skel->links.tc2)); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], BPF_F_FIRST, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid2, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + bpf_link__destroy(skel->links.tc2); + skel->links.tc2 = NULL; + assert_mprog_count(target, 1); + + optl.flags = BPF_F_LAST; + optl.relative_fd = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + assert_mprog_count(target, 2); + + lid2 = id_from_link_fd(bpf_link__fd(skel->links.tc2)); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], BPF_F_FIRST, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid2, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], BPF_F_LAST, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); +cleanup: + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attaches a prog to ingress/egress with first flag set, + * validates that the prog got attached, other attach attempts for + * this position should fail. Regular attach attempts or with last + * flag set should succeed. Detach everything again. + */ +void serial_test_tc_links_first(void) +{ + test_tc_links_first_target(BPF_TCX_INGRESS); + test_tc_links_first_target(BPF_TCX_EGRESS); +} + +static void test_tc_links_last_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 prog_flags[3], link_flags[3]; + __u32 prog_ids[3], link_ids[3]; + __u32 pid1, pid2, lid1, lid2; + struct test_tc_link *skel; + struct bpf_link *link; + int err; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + + assert_mprog_count(target, 0); + + optl.flags = BPF_F_LAST; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + + assert_mprog_count(target, 1); + + lid1 = id_from_link_fd(bpf_link__fd(skel->links.tc1)); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + optq.link_ids = link_ids; + optq.link_attach_flags = link_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 2, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], BPF_F_LAST, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], 0, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + optl.flags = BPF_F_LAST; + optl.relative_fd = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_AFTER | BPF_F_LINK; + optl.relative_fd = bpf_link__fd(skel->links.tc1); + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_AFTER | BPF_F_ID | BPF_F_LINK; + optl.relative_id = lid1; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = 0; + optl.relative_fd = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + assert_mprog_count(target, 2); + + lid2 = id_from_link_fd(bpf_link__fd(skel->links.tc2)); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid2, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid1, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], BPF_F_LAST, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + bpf_link__destroy(skel->links.tc2); + skel->links.tc2 = NULL; + assert_mprog_count(target, 1); + + optl.flags = BPF_F_FIRST; + optl.relative_fd = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + assert_mprog_count(target, 2); + + lid2 = id_from_link_fd(bpf_link__fd(skel->links.tc2)); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid2, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], BPF_F_FIRST, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid1, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], BPF_F_LAST, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); +cleanup: + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attaches a link to ingress/egress with last flag set, + * validates that the prog got attached, other attach attempts for + * this position should fail. Regular attach attempts or with first + * flag set should succeed. Detach everything again. + */ +void serial_test_tc_links_last(void) +{ + test_tc_links_last_target(BPF_TCX_INGRESS); + test_tc_links_last_target(BPF_TCX_EGRESS); +} + +static void test_tc_links_both_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 prog_flags[3], link_flags[3]; + __u32 prog_ids[3], link_ids[3]; + __u32 pid1, pid2, lid1; + struct test_tc_link *skel; + struct bpf_link *link; + int err; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + + assert_mprog_count(target, 0); + + optl.flags = BPF_F_FIRST | BPF_F_LAST; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + + assert_mprog_count(target, 1); + + lid1 = id_from_link_fd(bpf_link__fd(skel->links.tc1)); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + optq.link_ids = link_ids; + optq.link_attach_flags = link_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 2, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], BPF_F_FIRST | BPF_F_LAST, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], 0, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + optl.flags = BPF_F_LAST; + optl.relative_fd = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_FIRST; + optl.relative_fd = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_AFTER | BPF_F_LINK; + optl.relative_fd = bpf_link__fd(skel->links.tc1); + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_AFTER; + optl.relative_fd = bpf_program__fd(skel->progs.tc1); + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_AFTER | BPF_F_ID; + optl.relative_id = pid1; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_AFTER | BPF_F_ID | BPF_F_LINK; + optl.relative_id = lid1; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_BEFORE; + optl.relative_fd = bpf_program__fd(skel->progs.tc1); + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_BEFORE | BPF_F_ID; + optl.relative_id = pid1; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_BEFORE | BPF_F_ID | BPF_F_LINK; + optl.relative_id = lid1; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = 0; + optl.relative_id = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_FIRST | BPF_F_LAST; + optl.relative_id = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_FIRST | BPF_F_LAST | BPF_F_REPLACE; + optl.relative_id = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 2, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], BPF_F_FIRST | BPF_F_LAST, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], 0, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + err = bpf_link__update_program(skel->links.tc1, skel->progs.tc2); + if (!ASSERT_OK(err, "link_update")) + goto cleanup; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], BPF_F_FIRST | BPF_F_LAST, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], 0, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); +cleanup: + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attaches a link to ingress/egress with first and last + * flag set, validates that the link got attached, other attach + * attempts should fail. Link update should work. Detach everything + * again. + */ +void serial_test_tc_links_both(void) +{ + test_tc_links_both_target(BPF_TCX_INGRESS); + test_tc_links_both_target(BPF_TCX_EGRESS); +} + +static void test_tc_links_before_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 prog_flags[5], link_flags[5]; + __u32 prog_ids[5], link_ids[5]; + __u32 pid1, pid2, pid3, pid4; + __u32 lid1, lid2, lid3, lid4; + struct test_tc_link *skel; + struct bpf_link *link; + int err; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc3, target), + 0, "tc3_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc4, target), + 0, "tc4_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + pid3 = id_from_prog_fd(bpf_program__fd(skel->progs.tc3)); + pid4 = id_from_prog_fd(bpf_program__fd(skel->progs.tc4)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + ASSERT_NEQ(pid3, pid4, "prog_ids_3_4"); + ASSERT_NEQ(pid2, pid3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + + lid1 = id_from_link_fd(bpf_link__fd(skel->links.tc1)); + + assert_mprog_count(target, 1); + + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + lid2 = id_from_link_fd(bpf_link__fd(skel->links.tc2)); + + assert_mprog_count(target, 2); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + optq.link_ids = link_ids; + optq.link_attach_flags = link_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid2, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], 0, "link_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "link_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, false, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, false, "seen_tc4"); + + skel->bss->seen_tc1 = false; + skel->bss->seen_tc2 = false; + + optl.flags = BPF_F_BEFORE; + optl.relative_fd = bpf_program__fd(skel->progs.tc2); + link = bpf_program__attach_tcx_opts(skel->progs.tc3, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc3 = link; + + lid3 = id_from_link_fd(bpf_link__fd(skel->links.tc3)); + + optl.flags = BPF_F_BEFORE | BPF_F_ID | BPF_F_LINK; + optl.relative_id = lid1; + link = bpf_program__attach_tcx_opts(skel->progs.tc4, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc4 = link; + + lid4 = id_from_link_fd(bpf_link__fd(skel->links.tc4)); + + assert_mprog_count(target, 4); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 4, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid4, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid4, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid1, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], pid3, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], lid3, "link_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "link_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], pid2, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + ASSERT_EQ(optq.link_ids[3], lid2, "link_ids[3]"); + ASSERT_EQ(optq.link_attach_flags[3], 0, "link_flags[3]"); + ASSERT_EQ(optq.prog_ids[4], 0, "prog_ids[4]"); + ASSERT_EQ(optq.prog_attach_flags[4], 0, "prog_flags[4]"); + ASSERT_EQ(optq.link_ids[4], 0, "link_ids[4]"); + ASSERT_EQ(optq.link_attach_flags[4], 0, "link_flags[4]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, true, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, true, "seen_tc4"); +cleanup: + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attaches a link to ingress/egress with before flag + * set, validates that the link got attached in the right location. + * The first test inserts in the middle, then we insert to the front. + * Detach everything again. + */ +void serial_test_tc_links_before(void) +{ + test_tc_links_before_target(BPF_TCX_INGRESS); + test_tc_links_before_target(BPF_TCX_EGRESS); +} + +static void test_tc_links_after_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 prog_flags[5], link_flags[5]; + __u32 prog_ids[5], link_ids[5]; + __u32 pid1, pid2, pid3, pid4; + __u32 lid1, lid2, lid3, lid4; + struct test_tc_link *skel; + struct bpf_link *link; + int err; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc3, target), + 0, "tc3_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc4, target), + 0, "tc4_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + pid3 = id_from_prog_fd(bpf_program__fd(skel->progs.tc3)); + pid4 = id_from_prog_fd(bpf_program__fd(skel->progs.tc4)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + ASSERT_NEQ(pid3, pid4, "prog_ids_3_4"); + ASSERT_NEQ(pid2, pid3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + + lid1 = id_from_link_fd(bpf_link__fd(skel->links.tc1)); + + assert_mprog_count(target, 1); + + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + lid2 = id_from_link_fd(bpf_link__fd(skel->links.tc2)); + + assert_mprog_count(target, 2); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + optq.link_ids = link_ids; + optq.link_attach_flags = link_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid2, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], 0, "link_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "link_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, false, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, false, "seen_tc4"); + + skel->bss->seen_tc1 = false; + skel->bss->seen_tc2 = false; + + optl.flags = BPF_F_AFTER; + optl.relative_fd = bpf_program__fd(skel->progs.tc1); + link = bpf_program__attach_tcx_opts(skel->progs.tc3, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc3 = link; + + lid3 = id_from_link_fd(bpf_link__fd(skel->links.tc3)); + + optl.flags = BPF_F_AFTER | BPF_F_LINK; + optl.relative_fd = bpf_link__fd(skel->links.tc2); + link = bpf_program__attach_tcx_opts(skel->progs.tc4, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc4 = link; + + lid4 = id_from_link_fd(bpf_link__fd(skel->links.tc4)); + + assert_mprog_count(target, 4); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 4, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid3, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid3, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], pid2, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], lid2, "link_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "link_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], pid4, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + ASSERT_EQ(optq.link_ids[3], lid4, "link_ids[3]"); + ASSERT_EQ(optq.link_attach_flags[3], 0, "link_flags[3]"); + ASSERT_EQ(optq.prog_ids[4], 0, "prog_ids[4]"); + ASSERT_EQ(optq.prog_attach_flags[4], 0, "prog_flags[4]"); + ASSERT_EQ(optq.link_ids[4], 0, "link_ids[4]"); + ASSERT_EQ(optq.link_attach_flags[4], 0, "link_flags[4]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, true, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, true, "seen_tc4"); +cleanup: + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attaches a link to ingress/egress with after flag + * set, validates that the link got attached in the right location. + * The first test inserts in the middle, then we insert to the end. + * Detach everything again. + */ +void serial_test_tc_links_after(void) +{ + test_tc_links_after_target(BPF_TCX_INGRESS); + test_tc_links_after_target(BPF_TCX_EGRESS); +} + +static void test_tc_links_revision_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 prog_flags[3], link_flags[3]; + __u32 prog_ids[3], link_ids[3]; + __u32 pid1, pid2, lid1, lid2; + struct test_tc_link *skel; + struct bpf_link *link; + int err; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + + assert_mprog_count(target, 0); + + optl.flags = 0; + optl.expected_revision = 1; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + + lid1 = id_from_link_fd(bpf_link__fd(skel->links.tc1)); + + assert_mprog_count(target, 1); + + optl.flags = 0; + optl.expected_revision = 1; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = 0; + optl.expected_revision = 2; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + lid2 = id_from_link_fd(bpf_link__fd(skel->links.tc2)); + + assert_mprog_count(target, 2); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + optq.link_ids = link_ids; + optq.link_attach_flags = link_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid2, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "prog_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); +cleanup: + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attaches a link to ingress/egress with revision count + * set, validates that the link got attached and validates that + * when the count mismatches that the operation bails out. Detach + * everything again. + */ +void serial_test_tc_links_revision(void) +{ + test_tc_links_revision_target(BPF_TCX_INGRESS); + test_tc_links_revision_target(BPF_TCX_EGRESS); +} + +static void test_tc_chain_classic(int target, bool chain_tc_old) +{ + LIBBPF_OPTS(bpf_tc_opts, tc_opts, .handle = 1, .priority = 1); + LIBBPF_OPTS(bpf_tc_hook, tc_hook, .ifindex = loopback); + bool hook_created = false, tc_attached = false; + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + __u32 pid1, pid2, pid3; + struct test_tc_link *skel; + struct bpf_link *link; + int err; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + pid3 = id_from_prog_fd(bpf_program__fd(skel->progs.tc3)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + ASSERT_NEQ(pid2, pid3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + if (chain_tc_old) { + tc_hook.attach_point = target == BPF_TCX_INGRESS ? + BPF_TC_INGRESS : BPF_TC_EGRESS; + err = bpf_tc_hook_create(&tc_hook); + if (err == 0) + hook_created = true; + err = err == -EEXIST ? 0 : err; + if (!ASSERT_OK(err, "bpf_tc_hook_create")) + goto cleanup; + + tc_opts.prog_fd = bpf_program__fd(skel->progs.tc3); + err = bpf_tc_attach(&tc_hook, &tc_opts); + if (!ASSERT_OK(err, "bpf_tc_attach")) + goto cleanup; + tc_attached = true; + } + + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + assert_mprog_count(target, 2); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, chain_tc_old, "seen_tc3"); + + skel->bss->seen_tc1 = false; + skel->bss->seen_tc2 = false; + skel->bss->seen_tc3 = false; + + err = bpf_link__detach(skel->links.tc2); + if (!ASSERT_OK(err, "prog_detach")) + goto cleanup; + + assert_mprog_count(target, 1); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, false, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, chain_tc_old, "seen_tc3"); +cleanup: + if (tc_attached) { + tc_opts.flags = tc_opts.prog_fd = tc_opts.prog_id = 0; + err = bpf_tc_detach(&tc_hook, &tc_opts); + ASSERT_OK(err, "bpf_tc_detach"); + } + if (hook_created) { + tc_hook.attach_point = BPF_TC_INGRESS | BPF_TC_EGRESS; + bpf_tc_hook_destroy(&tc_hook); + } + assert_mprog_count(target, 1); + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attaches two links to ingress/egress through the new + * API and one prog via classic API. Traffic runs through and it + * validates that the program has been executed. One of the two + * links gets removed and test is rerun again. Detach everything + * at the end. + */ +void serial_test_tc_links_chain_classic(void) +{ + test_tc_chain_classic(BPF_TCX_INGRESS, false); + test_tc_chain_classic(BPF_TCX_EGRESS, false); + test_tc_chain_classic(BPF_TCX_INGRESS, true); + test_tc_chain_classic(BPF_TCX_EGRESS, true); +} + +static void test_tc_links_replace_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 pid1, pid2, pid3, lid1, lid2; + __u32 prog_flags[4], link_flags[4]; + __u32 prog_ids[4], link_ids[4]; + struct test_tc_link *skel; + struct bpf_link *link; + int err; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc3, target), + 0, "tc3_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + pid3 = id_from_prog_fd(bpf_program__fd(skel->progs.tc3)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + ASSERT_NEQ(pid2, pid3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + optl.flags = 0; + optl.expected_revision = 1; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + + lid1 = id_from_link_fd(bpf_link__fd(skel->links.tc1)); + + assert_mprog_count(target, 1); + + optl.flags = BPF_F_BEFORE | BPF_F_FIRST | BPF_F_ID; + optl.relative_id = pid1; + optl.expected_revision = 2; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + lid2 = id_from_link_fd(bpf_link__fd(skel->links.tc2)); + + assert_mprog_count(target, 2); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + optq.link_ids = link_ids; + optq.link_attach_flags = link_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid2, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], BPF_F_FIRST, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid1, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], 0, "link_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "link_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, false, "seen_tc3"); + + skel->bss->seen_tc1 = false; + skel->bss->seen_tc2 = false; + skel->bss->seen_tc3 = false; + + optl.flags = BPF_F_REPLACE; + optl.relative_fd = bpf_program__fd(skel->progs.tc2); + optl.expected_revision = 3; + link = bpf_program__attach_tcx_opts(skel->progs.tc3, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 2); + + optl.flags = BPF_F_REPLACE | BPF_F_LINK; + optl.relative_fd = bpf_link__fd(skel->links.tc2); + optl.expected_revision = 3; + link = bpf_program__attach_tcx_opts(skel->progs.tc3, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 2); + + optl.flags = BPF_F_REPLACE | BPF_F_LINK | BPF_F_FIRST | BPF_F_ID; + optl.relative_id = lid2; + optl.expected_revision = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc3, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 2); + + err = bpf_link__update_program(skel->links.tc2, skel->progs.tc3); + if (!ASSERT_OK(err, "link_update")) + goto cleanup; + + assert_mprog_count(target, 2); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 4, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid3, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid2, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], BPF_F_FIRST, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid1, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], 0, "link_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "link_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, false, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, true, "seen_tc3"); + + skel->bss->seen_tc1 = false; + skel->bss->seen_tc2 = false; + skel->bss->seen_tc3 = false; + + err = bpf_link__detach(skel->links.tc2); + if (!ASSERT_OK(err, "link_detach")) + goto cleanup; + + assert_mprog_count(target, 1); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], 0, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, false, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, false, "seen_tc3"); + + skel->bss->seen_tc1 = false; + skel->bss->seen_tc2 = false; + skel->bss->seen_tc3 = false; + + err = bpf_link__update_program(skel->links.tc1, skel->progs.tc1); + if (!ASSERT_OK(err, "link_update_self")) + goto cleanup; + + assert_mprog_count(target, 1); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 1, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], 0, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], 0, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, false, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, false, "seen_tc3"); +cleanup: + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attaches a link to ingress/egress and validates + * replacement in combination with various ops/flags. Similar + * for later detachment. + */ +void serial_test_tc_links_replace(void) +{ + test_tc_links_replace_target(BPF_TCX_INGRESS); + test_tc_links_replace_target(BPF_TCX_EGRESS); +} + +static void test_tc_links_invalid_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 pid1, pid2, lid1; + struct test_tc_link *skel; + struct bpf_link *link; + int err; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + + assert_mprog_count(target, 0); + + optl.flags = BPF_F_BEFORE | BPF_F_AFTER; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_BEFORE | BPF_F_ID; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_AFTER | BPF_F_ID; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_ID; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_LAST | BPF_F_BEFORE; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_FIRST | BPF_F_AFTER; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_FIRST | BPF_F_LAST; + optl.relative_fd = bpf_program__fd(skel->progs.tc2); + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_FIRST | BPF_F_LAST | BPF_F_LINK; + optl.relative_fd = bpf_program__fd(skel->progs.tc2); + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_FIRST | BPF_F_LAST | BPF_F_LINK; + optl.relative_fd = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = 0; + optl.relative_fd = bpf_program__fd(skel->progs.tc2); + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_BEFORE | BPF_F_AFTER; + optl.relative_fd = bpf_program__fd(skel->progs.tc2); + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_BEFORE; + optl.relative_fd = bpf_program__fd(skel->progs.tc1); + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_ID; + optl.relative_id = pid2; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_BEFORE; + optl.relative_fd = bpf_program__fd(skel->progs.tc1); + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_BEFORE | BPF_F_LINK; + optl.relative_fd = bpf_program__fd(skel->progs.tc1); + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_AFTER; + optl.relative_fd = bpf_program__fd(skel->progs.tc1); + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = BPF_F_AFTER | BPF_F_LINK; + optl.relative_fd = bpf_program__fd(skel->progs.tc1); + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 0); + + optl.flags = 0; + optl.relative_fd = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + + lid1 = id_from_link_fd(bpf_link__fd(skel->links.tc1)); + + assert_mprog_count(target, 1); + + optl.flags = BPF_F_AFTER | BPF_F_LINK; + optl.relative_fd = bpf_program__fd(skel->progs.tc1); + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_BEFORE | BPF_F_LINK | BPF_F_ID; + optl.relative_id = ~0; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_BEFORE | BPF_F_LINK | BPF_F_ID; + optl.relative_id = lid1; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_BEFORE | BPF_F_ID; + optl.relative_id = pid1; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_ERR_PTR(link, "link_attach_should_fail")) { + bpf_link__destroy(link); + goto cleanup; + } + assert_mprog_count(target, 1); + + optl.flags = BPF_F_BEFORE | BPF_F_LINK | BPF_F_ID; + optl.relative_id = lid1; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + assert_mprog_count(target, 2); +cleanup: + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test invalid flag combinations when attaching/detaching a + * link. + */ +void serial_test_tc_links_invalid(void) +{ + test_tc_links_invalid_target(BPF_TCX_INGRESS); + test_tc_links_invalid_target(BPF_TCX_EGRESS); +} + +static void test_tc_links_prepend_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 prog_flags[5], link_flags[5]; + __u32 prog_ids[5], link_ids[5]; + __u32 pid1, pid2, pid3, pid4; + __u32 lid1, lid2, lid3, lid4; + struct test_tc_link *skel; + struct bpf_link *link; + int err; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc3, target), + 0, "tc3_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc4, target), + 0, "tc4_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + pid3 = id_from_prog_fd(bpf_program__fd(skel->progs.tc3)); + pid4 = id_from_prog_fd(bpf_program__fd(skel->progs.tc4)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + ASSERT_NEQ(pid3, pid4, "prog_ids_3_4"); + ASSERT_NEQ(pid2, pid3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + optl.flags = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + + lid1 = id_from_link_fd(bpf_link__fd(skel->links.tc1)); + + assert_mprog_count(target, 1); + + optl.flags = BPF_F_BEFORE; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + lid2 = id_from_link_fd(bpf_link__fd(skel->links.tc2)); + + assert_mprog_count(target, 2); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + optq.link_ids = link_ids; + optq.link_attach_flags = link_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid2, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid2, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid1, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid1, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], 0, "link_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "link_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, false, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, false, "seen_tc4"); + + skel->bss->seen_tc1 = false; + skel->bss->seen_tc2 = false; + + optl.flags = BPF_F_FIRST; + link = bpf_program__attach_tcx_opts(skel->progs.tc3, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc3 = link; + + lid3 = id_from_link_fd(bpf_link__fd(skel->links.tc3)); + + optl.flags = BPF_F_BEFORE; + link = bpf_program__attach_tcx_opts(skel->progs.tc4, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc4 = link; + + lid4 = id_from_link_fd(bpf_link__fd(skel->links.tc4)); + + assert_mprog_count(target, 4); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 4, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid3, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid3, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], BPF_F_FIRST, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid4, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid4, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], pid2, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], lid2, "link_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "link_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], pid1, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + ASSERT_EQ(optq.link_ids[3], lid1, "link_ids[3]"); + ASSERT_EQ(optq.link_attach_flags[3], 0, "link_flags[3]"); + ASSERT_EQ(optq.prog_ids[4], 0, "prog_ids[4]"); + ASSERT_EQ(optq.prog_attach_flags[4], 0, "prog_flags[4]"); + ASSERT_EQ(optq.link_ids[4], 0, "link_ids[4]"); + ASSERT_EQ(optq.link_attach_flags[4], 0, "link_flags[4]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, true, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, true, "seen_tc4"); +cleanup: + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attaches a link to ingress/egress with before flag + * set and no fd/id, validates prepend behavior that the link got + * attached in the right location. Detaches everything. + */ +void serial_test_tc_links_prepend(void) +{ + test_tc_links_prepend_target(BPF_TCX_INGRESS); + test_tc_links_prepend_target(BPF_TCX_EGRESS); +} + +static void test_tc_links_append_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl, + .ifindex = loopback, + ); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 prog_flags[5], link_flags[5]; + __u32 prog_ids[5], link_ids[5]; + __u32 pid1, pid2, pid3, pid4; + __u32 lid1, lid2, lid3, lid4; + struct test_tc_link *skel; + struct bpf_link *link; + int err; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc3, target), + 0, "tc3_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc4, target), + 0, "tc4_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + pid3 = id_from_prog_fd(bpf_program__fd(skel->progs.tc3)); + pid4 = id_from_prog_fd(bpf_program__fd(skel->progs.tc4)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + ASSERT_NEQ(pid3, pid4, "prog_ids_3_4"); + ASSERT_NEQ(pid2, pid3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + optl.flags = 0; + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + + lid1 = id_from_link_fd(bpf_link__fd(skel->links.tc1)); + + assert_mprog_count(target, 1); + + optl.flags = BPF_F_AFTER; + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + + lid2 = id_from_link_fd(bpf_link__fd(skel->links.tc2)); + + assert_mprog_count(target, 2); + + optq.prog_ids = prog_ids; + optq.prog_attach_flags = prog_flags; + optq.link_ids = link_ids; + optq.link_attach_flags = link_flags; + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 2, "count"); + ASSERT_EQ(optq.revision, 3, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid2, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], 0, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], 0, "link_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "link_flags[2]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, false, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, false, "seen_tc4"); + + skel->bss->seen_tc1 = false; + skel->bss->seen_tc2 = false; + + optl.flags = BPF_F_LAST; + link = bpf_program__attach_tcx_opts(skel->progs.tc3, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc3 = link; + + lid3 = id_from_link_fd(bpf_link__fd(skel->links.tc3)); + + optl.flags = BPF_F_AFTER; + link = bpf_program__attach_tcx_opts(skel->progs.tc4, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc4 = link; + + lid4 = id_from_link_fd(bpf_link__fd(skel->links.tc4)); + + assert_mprog_count(target, 4); + + memset(prog_ids, 0, sizeof(prog_ids)); + memset(prog_flags, 0, sizeof(prog_flags)); + memset(link_ids, 0, sizeof(link_ids)); + memset(link_flags, 0, sizeof(link_flags)); + optq.count = ARRAY_SIZE(prog_ids); + + err = bpf_prog_query_opts(loopback, target, &optq); + if (!ASSERT_OK(err, "prog_query")) + goto cleanup; + + ASSERT_EQ(optq.count, 4, "count"); + ASSERT_EQ(optq.revision, 5, "revision"); + ASSERT_EQ(optq.prog_ids[0], pid1, "prog_ids[0]"); + ASSERT_EQ(optq.prog_attach_flags[0], 0, "prog_flags[0]"); + ASSERT_EQ(optq.link_ids[0], lid1, "link_ids[0]"); + ASSERT_EQ(optq.link_attach_flags[0], 0, "link_flags[0]"); + ASSERT_EQ(optq.prog_ids[1], pid2, "prog_ids[1]"); + ASSERT_EQ(optq.prog_attach_flags[1], 0, "prog_flags[1]"); + ASSERT_EQ(optq.link_ids[1], lid2, "link_ids[1]"); + ASSERT_EQ(optq.link_attach_flags[1], 0, "link_flags[1]"); + ASSERT_EQ(optq.prog_ids[2], pid4, "prog_ids[2]"); + ASSERT_EQ(optq.prog_attach_flags[2], 0, "prog_flags[2]"); + ASSERT_EQ(optq.link_ids[2], lid4, "link_ids[2]"); + ASSERT_EQ(optq.link_attach_flags[2], 0, "link_flags[2]"); + ASSERT_EQ(optq.prog_ids[3], pid3, "prog_ids[3]"); + ASSERT_EQ(optq.prog_attach_flags[3], 0, "prog_flags[3]"); + ASSERT_EQ(optq.link_ids[3], lid3, "link_ids[3]"); + ASSERT_EQ(optq.link_attach_flags[3], BPF_F_LAST, "link_flags[3]"); + ASSERT_EQ(optq.prog_ids[4], 0, "prog_ids[4]"); + ASSERT_EQ(optq.prog_attach_flags[4], 0, "prog_flags[4]"); + ASSERT_EQ(optq.link_ids[4], 0, "link_ids[4]"); + ASSERT_EQ(optq.link_attach_flags[4], 0, "link_flags[4]"); + + ASSERT_OK(system(ping_cmd), ping_cmd); + + ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1"); + ASSERT_EQ(skel->bss->seen_tc2, true, "seen_tc2"); + ASSERT_EQ(skel->bss->seen_tc3, true, "seen_tc3"); + ASSERT_EQ(skel->bss->seen_tc4, true, "seen_tc4"); +cleanup: + test_tc_link__destroy(skel); + assert_mprog_count(target, 0); +} + +/* Test: + * + * Test which attaches a link to ingress/egress with after flag + * set and no fd/id, validates append behavior that the link got + * attached in the right location. Detaches everything. + */ +void serial_test_tc_links_append(void) +{ + test_tc_links_append_target(BPF_TCX_INGRESS); + test_tc_links_append_target(BPF_TCX_EGRESS); +} + +static void test_tc_links_dev_cleanup_target(int target) +{ + LIBBPF_OPTS(bpf_tcx_opts, optl); + LIBBPF_OPTS(bpf_prog_query_opts, optq); + __u32 pid1, pid2, pid3, pid4; + struct test_tc_link *skel; + struct bpf_link *link; + int err, ifindex; + + ASSERT_OK(system("ip link add dev tcx_opts1 type veth peer name tcx_opts2"), "add veth"); + ifindex = if_nametoindex("tcx_opts1"); + ASSERT_NEQ(ifindex, 0, "non_zero_ifindex"); + optl.ifindex = ifindex; + + skel = test_tc_link__open(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + goto cleanup; + + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1, target), + 0, "tc1_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc2, target), + 0, "tc2_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc3, target), + 0, "tc3_attach_type"); + ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc4, target), + 0, "tc4_attach_type"); + + err = test_tc_link__load(skel); + if (!ASSERT_OK(err, "skel_load")) + goto cleanup; + + pid1 = id_from_prog_fd(bpf_program__fd(skel->progs.tc1)); + pid2 = id_from_prog_fd(bpf_program__fd(skel->progs.tc2)); + pid3 = id_from_prog_fd(bpf_program__fd(skel->progs.tc3)); + pid4 = id_from_prog_fd(bpf_program__fd(skel->progs.tc4)); + ASSERT_NEQ(pid1, pid2, "prog_ids_1_2"); + ASSERT_NEQ(pid3, pid4, "prog_ids_3_4"); + ASSERT_NEQ(pid2, pid3, "prog_ids_2_3"); + + assert_mprog_count(target, 0); + + link = bpf_program__attach_tcx_opts(skel->progs.tc1, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc1 = link; + assert_mprog_count_ifindex(ifindex, target, 1); + + link = bpf_program__attach_tcx_opts(skel->progs.tc2, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc2 = link; + assert_mprog_count_ifindex(ifindex, target, 2); + + link = bpf_program__attach_tcx_opts(skel->progs.tc3, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc3 = link; + assert_mprog_count_ifindex(ifindex, target, 3); + + link = bpf_program__attach_tcx_opts(skel->progs.tc4, &optl); + if (!ASSERT_OK_PTR(link, "link_attach")) + goto cleanup; + skel->links.tc4 = link; + assert_mprog_count_ifindex(ifindex, target, 4); + + ASSERT_OK(system("ip link del dev tcx_opts1"), "del veth"); + ASSERT_EQ(if_nametoindex("tcx_opts1"), 0, "dev1_removed"); + ASSERT_EQ(if_nametoindex("tcx_opts2"), 0, "dev2_removed"); + + ASSERT_EQ(ifindex_from_link_fd(bpf_link__fd(skel->links.tc1)), 0, "tc1_ifindex"); + ASSERT_EQ(ifindex_from_link_fd(bpf_link__fd(skel->links.tc2)), 0, "tc2_ifindex"); + ASSERT_EQ(ifindex_from_link_fd(bpf_link__fd(skel->links.tc3)), 0, "tc3_ifindex"); + ASSERT_EQ(ifindex_from_link_fd(bpf_link__fd(skel->links.tc4)), 0, "tc4_ifindex"); + + test_tc_link__destroy(skel); + return; +cleanup: + test_tc_link__destroy(skel); + + ASSERT_OK(system("ip link del dev tcx_opts1"), "del veth"); + ASSERT_EQ(if_nametoindex("tcx_opts1"), 0, "dev1_removed"); + ASSERT_EQ(if_nametoindex("tcx_opts2"), 0, "dev2_removed"); +} + +/* Test: + * + * Test which attaches links to ingress/egress on a newly created + * device. Removes the device with attached links. Check links are + * indetached state. + */ +void serial_test_tc_links_dev_cleanup(void) +{ + test_tc_links_dev_cleanup_target(BPF_TCX_INGRESS); + test_tc_links_dev_cleanup_target(BPF_TCX_EGRESS); +}