From patchwork Tue Dec 13 02:35:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071650 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1EB88C4332F for ; Tue, 13 Dec 2022 02:36:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234128AbiLMCgM (ORCPT ); Mon, 12 Dec 2022 21:36:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45540 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234116AbiLMCgK (ORCPT ); Mon, 12 Dec 2022 21:36:10 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B37E51A83A for ; Mon, 12 Dec 2022 18:36:09 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id l10-20020a170902f68a00b00189d1728848so11875832plg.2 for ; Mon, 12 Dec 2022 18:36:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=B1MU/P2BiLvl4DKV0cW5ZfQOjZsyBYnRPN2mipW5caI=; b=emobRxwiVGi40jGlmJ1WyeLeIDyY7r7EBZCYOJHtIB6ev+9ozAiVIOnWIH2lkIrPa/ Lgp72l3NkwdKny0QyJHWXzfrUoXDv0+gw56qItGwCHf7Rj4B/rlyzxbJPx/CXywB61hA axeUnQkZYQkdDHpOtc6ml7svoT1Pwvo9EwFGZUQXJxV1aYcRhP7VmA8DmLN1mOriL9zD 1LaRYln4AbJCn4NyPoceWcErb7X31JJueNFpNumdvf9BC9Ke2HBn7NTovKGW7snopgUs AX7Pi5oy7MiA1Zl7V+nI89f+5vYc4p6FnYSx33dn3SADuhbSxA/aDTY7FSrN92RewO7c C+uQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=B1MU/P2BiLvl4DKV0cW5ZfQOjZsyBYnRPN2mipW5caI=; b=hXakZKivHWSpOaTYuI5vr4SXYnEflbGpKHCoVzGCsc+oQx/f7r4G0dq3T1wQSAbO0I Fnbz+9vWQuik9e5b3GXi6gq9uIPbzjs0ZLlhO4T/RzSz6pz8ME76E1BmgiLp0m3bzIcu d05KfF/liIBI9iFG8ZSwt4PX/QkCt2L6X39Nq/56Z6E+TyGiouOlmcys+BJlLYkKWnBG YHyrBxlbWLaZvkQymZLKMmmSMALWvQTLToj3V9avjRz7ydcrIPsXuapMNYtHlDiPyLsJ oqbShg/3YMYK7m8tPDWOVMUSihi2I9nQoLmqjfd/jRseV3Y5W9xzzEu7aL1Oc2c7BQ6Z rsYg== X-Gm-Message-State: ANoB5pkHs1Ro/qLNreuJijsbq0Ue5Ae2Adzbxq/MjFTm99JQkHoQTChG nevlgSuuq7oA1WNq2cB6mtJBPlE= X-Google-Smtp-Source: AA0mqf6/AkpE1jTjiTXbmukoaL8S0mTA3M64Qh2Hei+vJVWPYqe42Rr4lWaUA9tIXsWcKj/GDMmyzGU= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:902:e194:b0:189:56ab:ab68 with SMTP id y20-20020a170902e19400b0018956abab68mr70587525pla.144.1670898969258; Mon, 12 Dec 2022 18:36:09 -0800 (PST) Date: Mon, 12 Dec 2022 18:35:51 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-2-sdf@google.com> Subject: [PATCH bpf-next v4 01/15] bpf: Document XDP RX metadata From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Document all current use-cases and assumptions. Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev --- Documentation/bpf/xdp-rx-metadata.rst | 90 +++++++++++++++++++++++++++ 1 file changed, 90 insertions(+) create mode 100644 Documentation/bpf/xdp-rx-metadata.rst diff --git a/Documentation/bpf/xdp-rx-metadata.rst b/Documentation/bpf/xdp-rx-metadata.rst new file mode 100644 index 000000000000..498eae718275 --- /dev/null +++ b/Documentation/bpf/xdp-rx-metadata.rst @@ -0,0 +1,90 @@ +=============== +XDP RX Metadata +=============== + +XDP programs support creating and passing custom metadata via +``bpf_xdp_adjust_meta``. This metadata can be consumed by the following +entities: + +1. ``AF_XDP`` consumer. +2. Kernel core stack via ``XDP_PASS``. +3. Another device via ``bpf_redirect_map``. +4. Other BPF programs via ``bpf_tail_call``. + +General Design +============== + +XDP has access to a set of kfuncs to manipulate the metadata. Every +device driver implements these kfuncs. The set of kfuncs is +declared in ``include/net/xdp.h`` via ``XDP_METADATA_KFUNC_xxx``. + +Currently, the following kfuncs are supported. In the future, as more +metadata is supported, this set will grow: + +- ``bpf_xdp_metadata_rx_timestamp_supported`` returns true/false to + indicate whether the device supports RX timestamps +- ``bpf_xdp_metadata_rx_timestamp`` returns packet RX timestamp +- ``bpf_xdp_metadata_rx_hash_supported`` returns true/false to + indicate whether the device supports RX hash +- ``bpf_xdp_metadata_rx_hash`` returns packet RX hash + +Within the XDP frame, the metadata layout is as follows:: + + +----------+-----------------+------+ + | headroom | custom metadata | data | + +----------+-----------------+------+ + ^ ^ + | | + xdp_buff->data_meta xdp_buff->data + +AF_XDP +====== + +``AF_XDP`` use-case implies that there is a contract between the BPF program +that redirects XDP frames into the ``XSK`` and the final consumer. +Thus the BPF program manually allocates a fixed number of +bytes out of metadata via ``bpf_xdp_adjust_meta`` and calls a subset +of kfuncs to populate it. User-space ``XSK`` consumer, looks +at ``xsk_umem__get_data() - METADATA_SIZE`` to locate its metadata. + +Here is the ``AF_XDP`` consumer layout (note missing ``data_meta`` pointer):: + + +----------+-----------------+------+ + | headroom | custom metadata | data | + +----------+-----------------+------+ + ^ + | + rx_desc->address + +XDP_PASS +======== + +This is the path where the packets processed by the XDP program are passed +into the kernel. The kernel creates ``skb`` out of the ``xdp_buff`` contents. +Currently, every driver has a custom kernel code to parse the descriptors and +populate ``skb`` metadata when doing this ``xdp_buff->skb`` conversion. +In the future, we'd like to support a case where XDP program can override +some of that metadata. + +The plan of record is to make this path similar to ``bpf_redirect_map`` +so the program can control which metadata is passed to the skb layer. + +bpf_redirect_map +================ + +``bpf_redirect_map`` can redirect the frame to a different device. +In this case we don't know ahead of time whether that final consumer +will further redirect to an ``XSK`` or pass it to the kernel via ``XDP_PASS``. +Additionally, the final consumer doesn't have access to the original +hardware descriptor and can't access any of the original metadata. + +For this use-case, only custom metadata is currently supported. If +the frame is eventually passed to the kernel, the skb created from such +a frame won't have any skb metadata. The ``XSK`` consumer will only +have access to the custom metadata. + +bpf_tail_call +============= + +No special handling here. Tail-called program operates on the same context +as the original one. From patchwork Tue Dec 13 02:35:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071651 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CBD57C4332F for ; Tue, 13 Dec 2022 02:36:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234192AbiLMCgP (ORCPT ); Mon, 12 Dec 2022 21:36:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45550 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234116AbiLMCgM (ORCPT ); Mon, 12 Dec 2022 21:36:12 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 277CD1A04A for ; Mon, 12 Dec 2022 18:36:11 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id 7-20020a631547000000b00478959ba320so8756991pgv.19 for ; Mon, 12 Dec 2022 18:36:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ai+FqjynFynWRqmeHFVgu3l5Qbdymb5liGjo6244Fc0=; b=pyz0DfLZSwI0lZZhWEZWldfWHvnHHw6bGaGEr1BlBkP7B1Em0bqRV50HDt6kyyg6ls UQg8AYFmpAePBm8oGnSwP7cg03nnMNPQBudx9CVowwOlHi36xujjnGBgAToWYxy5+L+b Rm6w3t1Pri08tqCedAuA5wFVGWf3m+FPirUeWqZbziNL2xaTUjd3bnWlrSDkpM+5vHf3 q8p5NYBzYktd+IwPqpC3i9F5SLHwQkiTwiWpi5uzgvmlcdS6iOQv8pO07vfMHD6T9IXO X/vvwdM9gqC99wKV3mVPbYqTTTxD/P6C0eliO0ncoTXRNgjMKBJErezRjvxuBYWm+1MP 0iYg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ai+FqjynFynWRqmeHFVgu3l5Qbdymb5liGjo6244Fc0=; b=XejwwNmDLCWLa6yL9cMsInit6kDBqz6blVgJ9HF5KOUPPt8WWPI7ujOTbGODSiaFvB 0qJNiXpKTXVHfVCjBHX3wHtOILO1LBGXy0rayOA/2/+3M3b4xR5D8fvbUirBBNfiM+Qj X5WqksTtima1veInZYgGsgygC7te7WlZon+RvirTOreDnecnlxFKYD+f1aXR/V/fVova 5Tdqc22GqcXyY1hYWCnPEwasJWTpqvAXCmupB0v1D93QJQMPpmf/9A9DOgNAgxuUlrmy P8p4V/DoeVHUwOT3P/B8vXKL4wRwE+q/qB8XkcFJDOraxG+BBl7U3gdpaPpHw8EObOkV x21Q== X-Gm-Message-State: ANoB5pm+Ju6igBeSpo2Y1MQeOIqsg4X1+SzpbxMKkfPM8llC+DaiMqwg xgI3AWK6BwV2l1f7DvdUZH5qIDE= X-Google-Smtp-Source: AA0mqf7EkVrWd1jxMMcle0ZHlRAB5x9XPAWprvM8StpY0y6W24JdI/7twrpblfTqnkXwzxW6kQnfy08= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:902:cf0e:b0:189:e577:c833 with SMTP id i14-20020a170902cf0e00b00189e577c833mr15198479plg.118.1670898970652; Mon, 12 Dec 2022 18:36:10 -0800 (PST) Date: Mon, 12 Dec 2022 18:35:52 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-3-sdf@google.com> Subject: [PATCH bpf-next v4 02/15] bpf: Rename bpf_{prog,map}_is_dev_bound to is_offloaded From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, Jakub Kicinski , David Ahern , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net BPF offloading infra will be reused to implement bound-but-not-offloaded bpf programs. Rename existing helpers for clarity. No functional changes. Reviewed-by: Jakub Kicinski Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev --- include/linux/bpf.h | 8 ++++---- kernel/bpf/core.c | 4 ++-- kernel/bpf/offload.c | 4 ++-- kernel/bpf/syscall.c | 22 +++++++++++----------- kernel/bpf/verifier.c | 18 +++++++++--------- net/core/dev.c | 4 ++-- net/core/filter.c | 2 +- 7 files changed, 31 insertions(+), 31 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 3de24cfb7a3d..f8d3a93703f3 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2481,12 +2481,12 @@ void unpriv_ebpf_notify(int new_state); #if defined(CONFIG_NET) && defined(CONFIG_BPF_SYSCALL) int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr); -static inline bool bpf_prog_is_dev_bound(const struct bpf_prog_aux *aux) +static inline bool bpf_prog_is_offloaded(const struct bpf_prog_aux *aux) { return aux->offload_requested; } -static inline bool bpf_map_is_dev_bound(struct bpf_map *map) +static inline bool bpf_map_is_offloaded(struct bpf_map *map) { return unlikely(map->ops == &bpf_map_offload_ops); } @@ -2513,12 +2513,12 @@ static inline int bpf_prog_offload_init(struct bpf_prog *prog, return -EOPNOTSUPP; } -static inline bool bpf_prog_is_dev_bound(struct bpf_prog_aux *aux) +static inline bool bpf_prog_is_offloaded(struct bpf_prog_aux *aux) { return false; } -static inline bool bpf_map_is_dev_bound(struct bpf_map *map) +static inline bool bpf_map_is_offloaded(struct bpf_map *map) { return false; } diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 2e57fc839a5c..641ab412ad7e 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2183,7 +2183,7 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err) * valid program, which in this case would simply not * be JITed, but falls back to the interpreter. */ - if (!bpf_prog_is_dev_bound(fp->aux)) { + if (!bpf_prog_is_offloaded(fp->aux)) { *err = bpf_prog_alloc_jited_linfo(fp); if (*err) return fp; @@ -2554,7 +2554,7 @@ static void bpf_prog_free_deferred(struct work_struct *work) #endif bpf_free_used_maps(aux); bpf_free_used_btfs(aux); - if (bpf_prog_is_dev_bound(aux)) + if (bpf_prog_is_offloaded(aux)) bpf_prog_offload_destroy(aux->prog); #ifdef CONFIG_PERF_EVENTS if (aux->prog->has_callchain_buf) diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c index 13e4efc971e6..f5769a8ecbee 100644 --- a/kernel/bpf/offload.c +++ b/kernel/bpf/offload.c @@ -549,7 +549,7 @@ static bool __bpf_offload_dev_match(struct bpf_prog *prog, struct bpf_offload_netdev *ondev1, *ondev2; struct bpf_prog_offload *offload; - if (!bpf_prog_is_dev_bound(prog->aux)) + if (!bpf_prog_is_offloaded(prog->aux)) return false; offload = prog->aux->offload; @@ -581,7 +581,7 @@ bool bpf_offload_prog_map_match(struct bpf_prog *prog, struct bpf_map *map) struct bpf_offloaded_map *offmap; bool ret; - if (!bpf_map_is_dev_bound(map)) + if (!bpf_map_is_offloaded(map)) return bpf_map_offload_neutral(map); offmap = map_to_offmap(map); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 35972afb6850..13bc96035116 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -181,7 +181,7 @@ static int bpf_map_update_value(struct bpf_map *map, struct file *map_file, int err; /* Need to create a kthread, thus must support schedule */ - if (bpf_map_is_dev_bound(map)) { + if (bpf_map_is_offloaded(map)) { return bpf_map_offload_update_elem(map, key, value, flags); } else if (map->map_type == BPF_MAP_TYPE_CPUMAP || map->map_type == BPF_MAP_TYPE_STRUCT_OPS) { @@ -238,7 +238,7 @@ static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value, void *ptr; int err; - if (bpf_map_is_dev_bound(map)) + if (bpf_map_is_offloaded(map)) return bpf_map_offload_lookup_elem(map, key, value); bpf_disable_instrumentation(); @@ -1483,7 +1483,7 @@ static int map_delete_elem(union bpf_attr *attr, bpfptr_t uattr) goto err_put; } - if (bpf_map_is_dev_bound(map)) { + if (bpf_map_is_offloaded(map)) { err = bpf_map_offload_delete_elem(map, key); goto out; } else if (IS_FD_PROG_ARRAY(map) || @@ -1547,7 +1547,7 @@ static int map_get_next_key(union bpf_attr *attr) if (!next_key) goto free_key; - if (bpf_map_is_dev_bound(map)) { + if (bpf_map_is_offloaded(map)) { err = bpf_map_offload_get_next_key(map, key, next_key); goto out; } @@ -1605,7 +1605,7 @@ int generic_map_delete_batch(struct bpf_map *map, map->key_size)) break; - if (bpf_map_is_dev_bound(map)) { + if (bpf_map_is_offloaded(map)) { err = bpf_map_offload_delete_elem(map, key); break; } @@ -1851,7 +1851,7 @@ static int map_lookup_and_delete_elem(union bpf_attr *attr) map->map_type == BPF_MAP_TYPE_PERCPU_HASH || map->map_type == BPF_MAP_TYPE_LRU_HASH || map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) { - if (!bpf_map_is_dev_bound(map)) { + if (!bpf_map_is_offloaded(map)) { bpf_disable_instrumentation(); rcu_read_lock(); err = map->ops->map_lookup_and_delete_elem(map, key, value, attr->flags); @@ -1944,7 +1944,7 @@ static int find_prog_type(enum bpf_prog_type type, struct bpf_prog *prog) if (!ops) return -EINVAL; - if (!bpf_prog_is_dev_bound(prog->aux)) + if (!bpf_prog_is_offloaded(prog->aux)) prog->aux->ops = ops; else prog->aux->ops = &bpf_offload_prog_ops; @@ -2255,7 +2255,7 @@ bool bpf_prog_get_ok(struct bpf_prog *prog, if (prog->type != *attach_type) return false; - if (bpf_prog_is_dev_bound(prog->aux) && !attach_drv) + if (bpf_prog_is_offloaded(prog->aux) && !attach_drv) return false; return true; @@ -2598,7 +2598,7 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr) atomic64_set(&prog->aux->refcnt, 1); prog->gpl_compatible = is_gpl ? 1 : 0; - if (bpf_prog_is_dev_bound(prog->aux)) { + if (bpf_prog_is_offloaded(prog->aux)) { err = bpf_prog_offload_init(prog, attr); if (err) goto free_prog_sec; @@ -3997,7 +3997,7 @@ static int bpf_prog_get_info_by_fd(struct file *file, return -EFAULT; } - if (bpf_prog_is_dev_bound(prog->aux)) { + if (bpf_prog_is_offloaded(prog->aux)) { err = bpf_prog_offload_info_fill(&info, prog); if (err) return err; @@ -4225,7 +4225,7 @@ static int bpf_map_get_info_by_fd(struct file *file, } info.btf_vmlinux_value_type_id = map->btf_vmlinux_value_type_id; - if (bpf_map_is_dev_bound(map)) { + if (bpf_map_is_offloaded(map)) { err = bpf_map_offload_info_fill(&info, map); if (err) return err; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index a5255a0dcbb6..203d8cfeda70 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -13806,7 +13806,7 @@ static int do_check(struct bpf_verifier_env *env) env->prev_log_len = env->log.len_used; } - if (bpf_prog_is_dev_bound(env->prog->aux)) { + if (bpf_prog_is_offloaded(env->prog->aux)) { err = bpf_prog_offload_verify_insn(env, env->insn_idx, env->prev_insn_idx); if (err) @@ -14286,7 +14286,7 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env, } } - if ((bpf_prog_is_dev_bound(prog->aux) || bpf_map_is_dev_bound(map)) && + if ((bpf_prog_is_offloaded(prog->aux) || bpf_map_is_offloaded(map)) && !bpf_offload_prog_map_match(prog, map)) { verbose(env, "offload device mismatch between prog and map\n"); return -EINVAL; @@ -14767,7 +14767,7 @@ static int verifier_remove_insns(struct bpf_verifier_env *env, u32 off, u32 cnt) unsigned int orig_prog_len = env->prog->len; int err; - if (bpf_prog_is_dev_bound(env->prog->aux)) + if (bpf_prog_is_offloaded(env->prog->aux)) bpf_prog_offload_remove_insns(env, off, cnt); err = bpf_remove_insns(env->prog, off, cnt); @@ -14848,7 +14848,7 @@ static void opt_hard_wire_dead_code_branches(struct bpf_verifier_env *env) else continue; - if (bpf_prog_is_dev_bound(env->prog->aux)) + if (bpf_prog_is_offloaded(env->prog->aux)) bpf_prog_offload_replace_insn(env, i, &ja); memcpy(insn, &ja, sizeof(ja)); @@ -15035,7 +15035,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) } } - if (bpf_prog_is_dev_bound(env->prog->aux)) + if (bpf_prog_is_offloaded(env->prog->aux)) return 0; insn = env->prog->insnsi + delta; @@ -15435,7 +15435,7 @@ static int fixup_call_args(struct bpf_verifier_env *env) int err = 0; if (env->prog->jit_requested && - !bpf_prog_is_dev_bound(env->prog->aux)) { + !bpf_prog_is_offloaded(env->prog->aux)) { err = jit_subprogs(env); if (err == 0) return 0; @@ -16922,7 +16922,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr) if (ret < 0) goto skip_full_check; - if (bpf_prog_is_dev_bound(env->prog->aux)) { + if (bpf_prog_is_offloaded(env->prog->aux)) { ret = bpf_prog_offload_verifier_prep(env->prog); if (ret) goto skip_full_check; @@ -16935,7 +16935,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr) ret = do_check_subprogs(env); ret = ret ?: do_check_main(env); - if (ret == 0 && bpf_prog_is_dev_bound(env->prog->aux)) + if (ret == 0 && bpf_prog_is_offloaded(env->prog->aux)) ret = bpf_prog_offload_finalize(env); skip_full_check: @@ -16970,7 +16970,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr) /* do 32-bit optimization after insn patching has done so those patched * insns could be handled correctly. */ - if (ret == 0 && !bpf_prog_is_dev_bound(env->prog->aux)) { + if (ret == 0 && !bpf_prog_is_offloaded(env->prog->aux)) { ret = opt_subreg_zext_lo32_rnd_hi32(env, attr); env->prog->aux->verifier_zext = bpf_jit_needs_zext() ? !ret : false; diff --git a/net/core/dev.c b/net/core/dev.c index 7627c475d991..5d51999cba30 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -9224,8 +9224,8 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack NL_SET_ERR_MSG(extack, "Native and generic XDP can't be active at the same time"); return -EEXIST; } - if (!offload && bpf_prog_is_dev_bound(new_prog->aux)) { - NL_SET_ERR_MSG(extack, "Using device-bound program without HW_MODE flag is not supported"); + if (!offload && bpf_prog_is_offloaded(new_prog->aux)) { + NL_SET_ERR_MSG(extack, "Using offloaded program without HW_MODE flag is not supported"); return -EINVAL; } if (new_prog->expected_attach_type == BPF_XDP_DEVMAP) { diff --git a/net/core/filter.c b/net/core/filter.c index 929358677183..ac1fa0ef6f35 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -8728,7 +8728,7 @@ static bool xdp_is_valid_access(int off, int size, } if (type == BPF_WRITE) { - if (bpf_prog_is_dev_bound(prog->aux)) { + if (bpf_prog_is_offloaded(prog->aux)) { switch (off) { case offsetof(struct xdp_md, rx_queue_index): return __is_valid_xdp_access(off, size); From patchwork Tue Dec 13 02:35:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071652 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 266A6C00145 for ; Tue, 13 Dec 2022 02:36:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233781AbiLMCgg (ORCPT ); Mon, 12 Dec 2022 21:36:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45550 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234130AbiLMCgP (ORCPT ); Mon, 12 Dec 2022 21:36:15 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A78241AF31 for ; Mon, 12 Dec 2022 18:36:12 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id 7-20020a631547000000b00478959ba320so8757033pgv.19 for ; Mon, 12 Dec 2022 18:36:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=P0AA5h8Ik63MZ6KrfvJ4IJRwQNCDa9Gzm9cthJyMxv8=; b=WtDM+4e/11lfGkaeaFpm0mhsi8L0ReKqluev9Lz7e9VoA/eUVD32+R2zSazg3pdMz7 UOrVCUssKOdMNBAQla+1Sht1gMAtfBbGyVWwztOhAIqFqewQUwQ+UY7bw33ZUkRp466O woYgikqHVcvPZEdWeRg+tv2tqW6ABTLbglkaCjOfyhrCVvLn/sjCJfSL+deCXGqXIQTu 8l+yeoKI5wmbE5x2IFZP8vq1kf7QqEPKum94XvF7v66TT01zAiVX+zkzaNiKAekLa0Yu YT5qV5eH1HQEPlPLyz8bFIejbkYFheDy9j9TGInglNn8Z1+VCjBzRgBIFPup4hgpVhG5 BKFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=P0AA5h8Ik63MZ6KrfvJ4IJRwQNCDa9Gzm9cthJyMxv8=; b=eWODD9YyKa32xPFtKrii4STpaX2mCv+mc+CFqLLwxvwk1SF8hplAeyteHtgmVLtIhf dTeR0eB2Azk41OzuSdcmzBzMb9d/+TDIZ/l/zyWyp/eBJFYuXntQft4avOCP2BrXsEBL ZbB3azZ6KWKFD+yc+QfHBF/8CNzNoY7b1SirC9w3j96G1ilkJWjU0ZiHCUt4WnTOIV/F 0pi/2xI7IRwGBeEeHZEqc0a8aRp/SHRZ7/AKts3VhT016cGTAYBx0+Pn7Fm1xgBjM8ru lY+EQJz4Edt3g7et2ekHahJPWbZQtaSjv7GjQGdFb6RRovO0hLFkPZMQdEVK0izdS3Zk 9m5Q== X-Gm-Message-State: ANoB5plzUPqVlpgxz4Cw+1Ce+OA5W4Z9WSDSS3lnj+tES9Cop+7mVw/E RstyMC92MVuPg1thDg1rd/eDzik= X-Google-Smtp-Source: AA0mqf7nCqN6EnU/l7THmMeMve70AzFdZopeRncWoDUWqVebYHIvl6lId4G0Uh+oSMC12K1OdTIpPGA= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:902:ca92:b0:189:a2d4:7f5 with SMTP id v18-20020a170902ca9200b00189a2d407f5mr42517471pld.23.1670898972154; Mon, 12 Dec 2022 18:36:12 -0800 (PST) Date: Mon, 12 Dec 2022 18:35:53 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-4-sdf@google.com> Subject: [PATCH bpf-next v4 03/15] bpf: Introduce device-bound XDP programs From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net New flag BPF_F_XDP_DEV_BOUND_ONLY plus all the infra to have a way to associate a netdev with a BPF program at load time. Some existing 'offloaded' routines are renamed to 'dev_bound' for consistency with the rest. Also moved a bunch of code around to avoid forward declarations. netdevsim checks are dropped in favor of generic check in dev_xdp_attach. Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev --- drivers/net/netdevsim/bpf.c | 4 - include/linux/bpf.h | 24 ++- include/uapi/linux/bpf.h | 5 + kernel/bpf/core.c | 4 +- kernel/bpf/offload.c | 293 ++++++++++++++++++++------------- kernel/bpf/syscall.c | 9 +- net/core/dev.c | 5 + tools/include/uapi/linux/bpf.h | 5 + 8 files changed, 218 insertions(+), 131 deletions(-) diff --git a/drivers/net/netdevsim/bpf.c b/drivers/net/netdevsim/bpf.c index 50854265864d..f60eb97e3a62 100644 --- a/drivers/net/netdevsim/bpf.c +++ b/drivers/net/netdevsim/bpf.c @@ -315,10 +315,6 @@ nsim_setup_prog_hw_checks(struct netdevsim *ns, struct netdev_bpf *bpf) NSIM_EA(bpf->extack, "xdpoffload of non-bound program"); return -EINVAL; } - if (!bpf_offload_dev_match(bpf->prog, ns->netdev)) { - NSIM_EA(bpf->extack, "program bound to different dev"); - return -EINVAL; - } state = bpf->prog->aux->offload->dev_priv; if (WARN_ON(strcmp(state->state, "xlated"))) { diff --git a/include/linux/bpf.h b/include/linux/bpf.h index f8d3a93703f3..ca22e8b8bd82 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1261,7 +1261,8 @@ struct bpf_prog_aux { enum bpf_prog_type saved_dst_prog_type; enum bpf_attach_type saved_dst_attach_type; bool verifier_zext; /* Zero extensions has been inserted by verifier. */ - bool offload_requested; + bool dev_bound; /* Program is bound to the netdev. */ + bool offload_requested; /* Program is bound and offloaded to the netdev. */ bool attach_btf_trace; /* true if attaching to BTF-enabled raw tp */ bool func_proto_unreliable; bool sleepable; @@ -2451,7 +2452,7 @@ void __bpf_free_used_maps(struct bpf_prog_aux *aux, bool bpf_prog_get_ok(struct bpf_prog *, enum bpf_prog_type *, bool); int bpf_prog_offload_compile(struct bpf_prog *prog); -void bpf_prog_offload_destroy(struct bpf_prog *prog); +void bpf_prog_dev_bound_destroy(struct bpf_prog *prog); int bpf_prog_offload_info_fill(struct bpf_prog_info *info, struct bpf_prog *prog); @@ -2479,7 +2480,13 @@ bool bpf_offload_dev_match(struct bpf_prog *prog, struct net_device *netdev); void unpriv_ebpf_notify(int new_state); #if defined(CONFIG_NET) && defined(CONFIG_BPF_SYSCALL) -int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr); +int bpf_prog_dev_bound_init(struct bpf_prog *prog, union bpf_attr *attr); +void bpf_dev_bound_netdev_unregister(struct net_device *dev); + +static inline bool bpf_prog_is_dev_bound(const struct bpf_prog_aux *aux) +{ + return aux->dev_bound; +} static inline bool bpf_prog_is_offloaded(const struct bpf_prog_aux *aux) { @@ -2507,12 +2514,21 @@ void sock_map_unhash(struct sock *sk); void sock_map_destroy(struct sock *sk); void sock_map_close(struct sock *sk, long timeout); #else -static inline int bpf_prog_offload_init(struct bpf_prog *prog, +static inline int bpf_prog_dev_bound_init(struct bpf_prog *prog, union bpf_attr *attr) { return -EOPNOTSUPP; } +static inline void bpf_dev_bound_netdev_unregister(struct net_device *dev) +{ +} + +static inline bool bpf_prog_is_dev_bound(const struct bpf_prog_aux *aux) +{ + return false; +} + static inline bool bpf_prog_is_offloaded(struct bpf_prog_aux *aux) { return false; diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 464ca3f01fe7..fa28603a48e7 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1156,6 +1156,11 @@ enum bpf_link_type { */ #define BPF_F_XDP_HAS_FRAGS (1U << 5) +/* If BPF_F_XDP_DEV_BOUND_ONLY is used in BPF_PROG_LOAD command, the loaded + * program becomes device-bound but can access XDP metadata. + */ +#define BPF_F_XDP_DEV_BOUND_ONLY (1U << 6) + /* link_create.kprobe_multi.flags used in LINK_CREATE command for * BPF_TRACE_KPROBE_MULTI attach type to create return probe. */ diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 641ab412ad7e..d434a994ee04 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2554,8 +2554,8 @@ static void bpf_prog_free_deferred(struct work_struct *work) #endif bpf_free_used_maps(aux); bpf_free_used_btfs(aux); - if (bpf_prog_is_offloaded(aux)) - bpf_prog_offload_destroy(aux->prog); + if (bpf_prog_is_dev_bound(aux)) + bpf_prog_dev_bound_destroy(aux->prog); #ifdef CONFIG_PERF_EVENTS if (aux->prog->has_callchain_buf) put_callchain_buffers(); diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c index f5769a8ecbee..f714c941f8ea 100644 --- a/kernel/bpf/offload.c +++ b/kernel/bpf/offload.c @@ -41,7 +41,7 @@ struct bpf_offload_dev { struct bpf_offload_netdev { struct rhash_head l; struct net_device *netdev; - struct bpf_offload_dev *offdev; + struct bpf_offload_dev *offdev; /* NULL when bound-only */ struct list_head progs; struct list_head maps; struct list_head offdev_netdevs; @@ -56,7 +56,6 @@ static const struct rhashtable_params offdevs_params = { }; static struct rhashtable offdevs; -static bool offdevs_inited; static int bpf_dev_offload_check(struct net_device *netdev) { @@ -72,12 +71,124 @@ bpf_offload_find_netdev(struct net_device *netdev) { lockdep_assert_held(&bpf_devs_lock); - if (!offdevs_inited) - return NULL; return rhashtable_lookup_fast(&offdevs, &netdev, offdevs_params); } -int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr) +static int __bpf_offload_dev_netdev_register(struct bpf_offload_dev *offdev, + struct net_device *netdev) +{ + struct bpf_offload_netdev *ondev; + int err; + + ondev = kzalloc(sizeof(*ondev), GFP_KERNEL); + if (!ondev) + return -ENOMEM; + + ondev->netdev = netdev; + ondev->offdev = offdev; + INIT_LIST_HEAD(&ondev->progs); + INIT_LIST_HEAD(&ondev->maps); + + err = rhashtable_insert_fast(&offdevs, &ondev->l, offdevs_params); + if (err) { + netdev_warn(netdev, "failed to register for BPF offload\n"); + goto err_unlock_free; + } + + if (offdev) + list_add(&ondev->offdev_netdevs, &offdev->netdevs); + return 0; + +err_unlock_free: + up_write(&bpf_devs_lock); + kfree(ondev); + return err; +} + +static void __bpf_prog_dev_bound_destroy(struct bpf_prog *prog) +{ + struct bpf_prog_offload *offload = prog->aux->offload; + + if (offload->dev_state) + offload->offdev->ops->destroy(prog); + + /* Make sure BPF_PROG_GET_NEXT_ID can't find this dead program */ + bpf_prog_free_id(prog, true); + + list_del_init(&offload->offloads); + kfree(offload); + prog->aux->offload = NULL; +} + +static int bpf_map_offload_ndo(struct bpf_offloaded_map *offmap, + enum bpf_netdev_command cmd) +{ + struct netdev_bpf data = {}; + struct net_device *netdev; + + ASSERT_RTNL(); + + data.command = cmd; + data.offmap = offmap; + /* Caller must make sure netdev is valid */ + netdev = offmap->netdev; + + return netdev->netdev_ops->ndo_bpf(netdev, &data); +} + +static void __bpf_map_offload_destroy(struct bpf_offloaded_map *offmap) +{ + WARN_ON(bpf_map_offload_ndo(offmap, BPF_OFFLOAD_MAP_FREE)); + /* Make sure BPF_MAP_GET_NEXT_ID can't find this dead map */ + bpf_map_free_id(&offmap->map, true); + list_del_init(&offmap->offloads); + offmap->netdev = NULL; +} + +static void __bpf_offload_dev_netdev_unregister(struct bpf_offload_dev *offdev, + struct net_device *netdev) +{ + struct bpf_offload_netdev *ondev, *altdev = NULL; + struct bpf_offloaded_map *offmap, *mtmp; + struct bpf_prog_offload *offload, *ptmp; + + ASSERT_RTNL(); + + ondev = rhashtable_lookup_fast(&offdevs, &netdev, offdevs_params); + if (WARN_ON(!ondev)) + return; + + WARN_ON(rhashtable_remove_fast(&offdevs, &ondev->l, offdevs_params)); + + /* Try to move the objects to another netdev of the device */ + if (offdev) { + list_del(&ondev->offdev_netdevs); + altdev = list_first_entry_or_null(&offdev->netdevs, + struct bpf_offload_netdev, + offdev_netdevs); + } + + if (altdev) { + list_for_each_entry(offload, &ondev->progs, offloads) + offload->netdev = altdev->netdev; + list_splice_init(&ondev->progs, &altdev->progs); + + list_for_each_entry(offmap, &ondev->maps, offloads) + offmap->netdev = altdev->netdev; + list_splice_init(&ondev->maps, &altdev->maps); + } else { + list_for_each_entry_safe(offload, ptmp, &ondev->progs, offloads) + __bpf_prog_dev_bound_destroy(offload->prog); + list_for_each_entry_safe(offmap, mtmp, &ondev->maps, offloads) + __bpf_map_offload_destroy(offmap); + } + + WARN_ON(!list_empty(&ondev->progs)); + WARN_ON(!list_empty(&ondev->maps)); + kfree(ondev); +} + +int bpf_prog_dev_bound_init(struct bpf_prog *prog, union bpf_attr *attr) { struct bpf_offload_netdev *ondev; struct bpf_prog_offload *offload; @@ -87,7 +198,7 @@ int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr) attr->prog_type != BPF_PROG_TYPE_XDP) return -EINVAL; - if (attr->prog_flags) + if (attr->prog_flags & ~BPF_F_XDP_DEV_BOUND_ONLY) return -EINVAL; offload = kzalloc(sizeof(*offload), GFP_USER); @@ -102,11 +213,25 @@ int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr) if (err) goto err_maybe_put; + prog->aux->offload_requested = !(attr->prog_flags & BPF_F_XDP_DEV_BOUND_ONLY); + down_write(&bpf_devs_lock); ondev = bpf_offload_find_netdev(offload->netdev); if (!ondev) { - err = -EINVAL; - goto err_unlock; + if (!bpf_prog_is_offloaded(prog->aux)) { + /* When only binding to the device, explicitly + * create an entry in the hashtable. See related + * bpf_dev_bound_try_remove_netdev. + */ + err = __bpf_offload_dev_netdev_register(NULL, offload->netdev); + if (err) + goto err_unlock; + ondev = bpf_offload_find_netdev(offload->netdev); + } + if (!ondev) { + err = -EINVAL; + goto err_unlock; + } } offload->offdev = ondev->offdev; prog->aux->offload = offload; @@ -209,27 +334,28 @@ bpf_prog_offload_remove_insns(struct bpf_verifier_env *env, u32 off, u32 cnt) up_read(&bpf_devs_lock); } -static void __bpf_prog_offload_destroy(struct bpf_prog *prog) +static void bpf_dev_bound_try_remove_netdev(struct net_device *dev) { - struct bpf_prog_offload *offload = prog->aux->offload; - - if (offload->dev_state) - offload->offdev->ops->destroy(prog); + struct bpf_offload_netdev *ondev; - /* Make sure BPF_PROG_GET_NEXT_ID can't find this dead program */ - bpf_prog_free_id(prog, true); + if (!dev) + return; - list_del_init(&offload->offloads); - kfree(offload); - prog->aux->offload = NULL; + ondev = bpf_offload_find_netdev(dev); + if (ondev && !ondev->offdev && list_empty(&ondev->progs)) + __bpf_offload_dev_netdev_unregister(NULL, dev); } -void bpf_prog_offload_destroy(struct bpf_prog *prog) +void bpf_prog_dev_bound_destroy(struct bpf_prog *prog) { + rtnl_lock(); down_write(&bpf_devs_lock); - if (prog->aux->offload) - __bpf_prog_offload_destroy(prog); + if (prog->aux->offload) { + bpf_dev_bound_try_remove_netdev(prog->aux->offload->netdev); + __bpf_prog_dev_bound_destroy(prog); + } up_write(&bpf_devs_lock); + rtnl_unlock(); } static int bpf_prog_offload_translate(struct bpf_prog *prog) @@ -343,22 +469,6 @@ int bpf_prog_offload_info_fill(struct bpf_prog_info *info, const struct bpf_prog_ops bpf_offload_prog_ops = { }; -static int bpf_map_offload_ndo(struct bpf_offloaded_map *offmap, - enum bpf_netdev_command cmd) -{ - struct netdev_bpf data = {}; - struct net_device *netdev; - - ASSERT_RTNL(); - - data.command = cmd; - data.offmap = offmap; - /* Caller must make sure netdev is valid */ - netdev = offmap->netdev; - - return netdev->netdev_ops->ndo_bpf(netdev, &data); -} - struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr) { struct net *net = current->nsproxy->net_ns; @@ -408,15 +518,6 @@ struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr) return ERR_PTR(err); } -static void __bpf_map_offload_destroy(struct bpf_offloaded_map *offmap) -{ - WARN_ON(bpf_map_offload_ndo(offmap, BPF_OFFLOAD_MAP_FREE)); - /* Make sure BPF_MAP_GET_NEXT_ID can't find this dead map */ - bpf_map_free_id(&offmap->map, true); - list_del_init(&offmap->offloads); - offmap->netdev = NULL; -} - void bpf_map_offload_map_free(struct bpf_map *map) { struct bpf_offloaded_map *offmap = map_to_offmap(map); @@ -549,7 +650,7 @@ static bool __bpf_offload_dev_match(struct bpf_prog *prog, struct bpf_offload_netdev *ondev1, *ondev2; struct bpf_prog_offload *offload; - if (!bpf_prog_is_offloaded(prog->aux)) + if (!bpf_prog_is_dev_bound(prog->aux)) return false; offload = prog->aux->offload; @@ -595,32 +696,11 @@ bool bpf_offload_prog_map_match(struct bpf_prog *prog, struct bpf_map *map) int bpf_offload_dev_netdev_register(struct bpf_offload_dev *offdev, struct net_device *netdev) { - struct bpf_offload_netdev *ondev; int err; - ondev = kzalloc(sizeof(*ondev), GFP_KERNEL); - if (!ondev) - return -ENOMEM; - - ondev->netdev = netdev; - ondev->offdev = offdev; - INIT_LIST_HEAD(&ondev->progs); - INIT_LIST_HEAD(&ondev->maps); - down_write(&bpf_devs_lock); - err = rhashtable_insert_fast(&offdevs, &ondev->l, offdevs_params); - if (err) { - netdev_warn(netdev, "failed to register for BPF offload\n"); - goto err_unlock_free; - } - - list_add(&ondev->offdev_netdevs, &offdev->netdevs); - up_write(&bpf_devs_lock); - return 0; - -err_unlock_free: + err = __bpf_offload_dev_netdev_register(offdev, netdev); up_write(&bpf_devs_lock); - kfree(ondev); return err; } EXPORT_SYMBOL_GPL(bpf_offload_dev_netdev_register); @@ -628,43 +708,8 @@ EXPORT_SYMBOL_GPL(bpf_offload_dev_netdev_register); void bpf_offload_dev_netdev_unregister(struct bpf_offload_dev *offdev, struct net_device *netdev) { - struct bpf_offload_netdev *ondev, *altdev; - struct bpf_offloaded_map *offmap, *mtmp; - struct bpf_prog_offload *offload, *ptmp; - - ASSERT_RTNL(); - down_write(&bpf_devs_lock); - ondev = rhashtable_lookup_fast(&offdevs, &netdev, offdevs_params); - if (WARN_ON(!ondev)) - goto unlock; - - WARN_ON(rhashtable_remove_fast(&offdevs, &ondev->l, offdevs_params)); - list_del(&ondev->offdev_netdevs); - - /* Try to move the objects to another netdev of the device */ - altdev = list_first_entry_or_null(&offdev->netdevs, - struct bpf_offload_netdev, - offdev_netdevs); - if (altdev) { - list_for_each_entry(offload, &ondev->progs, offloads) - offload->netdev = altdev->netdev; - list_splice_init(&ondev->progs, &altdev->progs); - - list_for_each_entry(offmap, &ondev->maps, offloads) - offmap->netdev = altdev->netdev; - list_splice_init(&ondev->maps, &altdev->maps); - } else { - list_for_each_entry_safe(offload, ptmp, &ondev->progs, offloads) - __bpf_prog_offload_destroy(offload->prog); - list_for_each_entry_safe(offmap, mtmp, &ondev->maps, offloads) - __bpf_map_offload_destroy(offmap); - } - - WARN_ON(!list_empty(&ondev->progs)); - WARN_ON(!list_empty(&ondev->maps)); - kfree(ondev); -unlock: + __bpf_offload_dev_netdev_unregister(offdev, netdev); up_write(&bpf_devs_lock); } EXPORT_SYMBOL_GPL(bpf_offload_dev_netdev_unregister); @@ -673,18 +718,6 @@ struct bpf_offload_dev * bpf_offload_dev_create(const struct bpf_prog_offload_ops *ops, void *priv) { struct bpf_offload_dev *offdev; - int err; - - down_write(&bpf_devs_lock); - if (!offdevs_inited) { - err = rhashtable_init(&offdevs, &offdevs_params); - if (err) { - up_write(&bpf_devs_lock); - return ERR_PTR(err); - } - offdevs_inited = true; - } - up_write(&bpf_devs_lock); offdev = kzalloc(sizeof(*offdev), GFP_KERNEL); if (!offdev) @@ -710,3 +743,29 @@ void *bpf_offload_dev_priv(struct bpf_offload_dev *offdev) return offdev->priv; } EXPORT_SYMBOL_GPL(bpf_offload_dev_priv); + +void bpf_dev_bound_netdev_unregister(struct net_device *dev) +{ + struct bpf_offload_netdev *ondev; + + ASSERT_RTNL(); + + down_write(&bpf_devs_lock); + ondev = bpf_offload_find_netdev(dev); + if (ondev && !ondev->offdev) + __bpf_offload_dev_netdev_unregister(NULL, ondev->netdev); + up_write(&bpf_devs_lock); +} + +static int __init bpf_offload_init(void) +{ + int err; + + down_write(&bpf_devs_lock); + err = rhashtable_init(&offdevs, &offdevs_params); + up_write(&bpf_devs_lock); + + return err; +} + +late_initcall(bpf_offload_init); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 13bc96035116..11c558be4992 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2491,7 +2491,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr) BPF_F_TEST_STATE_FREQ | BPF_F_SLEEPABLE | BPF_F_TEST_RND_HI32 | - BPF_F_XDP_HAS_FRAGS)) + BPF_F_XDP_HAS_FRAGS | + BPF_F_XDP_DEV_BOUND_ONLY)) return -EINVAL; if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && @@ -2575,7 +2576,7 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr) prog->aux->attach_btf = attach_btf; prog->aux->attach_btf_id = attr->attach_btf_id; prog->aux->dst_prog = dst_prog; - prog->aux->offload_requested = !!attr->prog_ifindex; + prog->aux->dev_bound = !!attr->prog_ifindex; prog->aux->sleepable = attr->prog_flags & BPF_F_SLEEPABLE; prog->aux->xdp_has_frags = attr->prog_flags & BPF_F_XDP_HAS_FRAGS; @@ -2598,8 +2599,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr) atomic64_set(&prog->aux->refcnt, 1); prog->gpl_compatible = is_gpl ? 1 : 0; - if (bpf_prog_is_offloaded(prog->aux)) { - err = bpf_prog_offload_init(prog, attr); + if (bpf_prog_is_dev_bound(prog->aux)) { + err = bpf_prog_dev_bound_init(prog, attr); if (err) goto free_prog_sec; } diff --git a/net/core/dev.c b/net/core/dev.c index 5d51999cba30..194f8116aad4 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -9228,6 +9228,10 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack NL_SET_ERR_MSG(extack, "Using offloaded program without HW_MODE flag is not supported"); return -EINVAL; } + if (bpf_prog_is_dev_bound(new_prog->aux) && !bpf_offload_dev_match(new_prog, dev)) { + NL_SET_ERR_MSG(extack, "Program bound to different device"); + return -EINVAL; + } if (new_prog->expected_attach_type == BPF_XDP_DEVMAP) { NL_SET_ERR_MSG(extack, "BPF_XDP_DEVMAP programs can not be attached to a device"); return -EINVAL; @@ -10813,6 +10817,7 @@ void unregister_netdevice_many_notify(struct list_head *head, /* Shutdown queueing discipline. */ dev_shutdown(dev); + bpf_dev_bound_netdev_unregister(dev); dev_xdp_uninstall(dev); netdev_offload_xstats_disable_all(dev); diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 464ca3f01fe7..fa28603a48e7 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1156,6 +1156,11 @@ enum bpf_link_type { */ #define BPF_F_XDP_HAS_FRAGS (1U << 5) +/* If BPF_F_XDP_DEV_BOUND_ONLY is used in BPF_PROG_LOAD command, the loaded + * program becomes device-bound but can access XDP metadata. + */ +#define BPF_F_XDP_DEV_BOUND_ONLY (1U << 6) + /* link_create.kprobe_multi.flags used in LINK_CREATE command for * BPF_TRACE_KPROBE_MULTI attach type to create return probe. */ From patchwork Tue Dec 13 02:35:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071653 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B9463C4167B for ; Tue, 13 Dec 2022 02:36:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234182AbiLMCgl (ORCPT ); Mon, 12 Dec 2022 21:36:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46076 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234197AbiLMCgb (ORCPT ); Mon, 12 Dec 2022 21:36:31 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 876311D0C6 for ; Mon, 12 Dec 2022 18:36:14 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id m13-20020a170902f64d00b001899a70c8f1so12023217plg.14 for ; Mon, 12 Dec 2022 18:36:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=qvvIC7YrDQMRyZvpvAP4n+tRFkpRWKJ/6Yf2M75Qa3M=; b=AQmLlkOwM+PjHCIjUllzOz2prpi9ZfYy0ptb4bI8FYikZWpBw7yL+pHvng3TiXDbhe cySGOjVaG7GOwiT2Nj69f/P0anH/WtlywuCRrh04mzmvu+oktD7mtxTzcXqoQ2soStuG wNf+fquL3Xqkhz+hgq83Ro9e8LHrYbh8OidCzG8XtlYy9pKxZRFa4dpCRdhB/+7keB/4 zID1bPp3pfHzyzY0sXJ9tdfleUNc2BIyBL7KsGm1dsNGjPCqfUlLah4Z7IvITv8nElkK XbK1NFJSntAPIH5Tk82Fzs7qjgGY6+NNeIxik6H8hULF/LprVWN0/v1JhdQRcaOF7EOq bdQA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=qvvIC7YrDQMRyZvpvAP4n+tRFkpRWKJ/6Yf2M75Qa3M=; b=53dUatsjgVR7rRLL7xHuciNzYJ48lACtq62ry1ziZp6m55nlilgffiOvHQpRHtw2Vc U78RL5664IKrqPlz/qhSx2AMj9vI1ykXsC0FHSfB3tCZJiivp+v6bhJrzmXHH3McTSOU LUgLl7DnHE4zjJ0sMNzhDSkmJ+/8r37pcE9LXLFQm1UDyOP+eUwSzNObbkhidgvGTV1K dInsJkzVH0Dn26QstQz54WypHpWgpmDwdDNSw/X82185Bw9P3Ykafo/G/4l27yskAHUy DKD1FNMxDHSZRcwBdQK8WAG2aGvQ07nbXqJlMK27UOOb66O03aadLVKjcfIAm5vWy/sD lvAg== X-Gm-Message-State: ANoB5pnoPIrpbHzgenaBwsO1+LIjNsMVmrGFZ68xcUuIDickiWBt57qP 2e8CWmZSklc/xP6VaK5gm6kSO7U= X-Google-Smtp-Source: AA0mqf6ciZA8rokylYS3W9hhZ7SyYnvGkNX6Ju2z5ynQu8qDYkxxS60UW4sBaBHfC6+Ktw2lOmUXGUY= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a62:1dcd:0:b0:576:4b38:c08e with SMTP id d196-20020a621dcd000000b005764b38c08emr31155877pfd.37.1670898973905; Mon, 12 Dec 2022 18:36:13 -0800 (PST) Date: Mon, 12 Dec 2022 18:35:54 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-5-sdf@google.com> Subject: [PATCH bpf-next v4 04/15] selftests/bpf: Update expected test_offload.py messages From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Generic check has a different error message, update the selftest. Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev --- tools/testing/selftests/bpf/test_offload.py | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/bpf/test_offload.py b/tools/testing/selftests/bpf/test_offload.py index 7cb1bc05e5cf..40cba8d368d9 100755 --- a/tools/testing/selftests/bpf/test_offload.py +++ b/tools/testing/selftests/bpf/test_offload.py @@ -1039,7 +1039,7 @@ netns = [] offload = bpf_pinned("/sys/fs/bpf/offload") ret, _, err = sim.set_xdp(offload, "drv", fail=False, include_stderr=True) fail(ret == 0, "attached offloaded XDP program to drv") - check_extack(err, "Using device-bound program without HW_MODE flag is not supported.", args) + check_extack(err, "Using offloaded program without HW_MODE flag is not supported.", args) rm("/sys/fs/bpf/offload") sim.wait_for_flush() @@ -1088,12 +1088,12 @@ netns = [] ret, _, err = sim.set_xdp(pinned, "offload", fail=False, include_stderr=True) fail(ret == 0, "Pinned program loaded for a different device accepted") - check_extack_nsim(err, "program bound to different dev.", args) + check_extack(err, "Program bound to different device.", args) simdev2.remove() ret, _, err = sim.set_xdp(pinned, "offload", fail=False, include_stderr=True) fail(ret == 0, "Pinned program loaded for a removed device accepted") - check_extack_nsim(err, "xdpoffload of non-bound program.", args) + check_extack(err, "Program bound to different device.", args) rm(pin_file) bpftool_prog_list_wait(expected=0) @@ -1334,12 +1334,12 @@ netns = [] ret, _, err = simA.set_xdp(progB, "offload", force=True, JSON=False, fail=False, include_stderr=True) fail(ret == 0, "cross-ASIC program allowed") - check_extack_nsim(err, "program bound to different dev.", args) + check_extack(err, "Program bound to different device.", args) for d in simdevB.nsims: ret, _, err = d.set_xdp(progA, "offload", force=True, JSON=False, fail=False, include_stderr=True) fail(ret == 0, "cross-ASIC program allowed") - check_extack_nsim(err, "program bound to different dev.", args) + check_extack(err, "Program bound to different device.", args) start_test("Test multi-dev ASIC cross-dev map reuse...") From patchwork Tue Dec 13 02:35:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071654 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 67B2DC4167B for ; Tue, 13 Dec 2022 02:36:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234290AbiLMCgw (ORCPT ); Mon, 12 Dec 2022 21:36:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45550 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234236AbiLMCgc (ORCPT ); Mon, 12 Dec 2022 21:36:32 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0B1241D32F for ; Mon, 12 Dec 2022 18:36:16 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id i4-20020a17090332c400b0018f82951826so4178585plr.20 for ; Mon, 12 Dec 2022 18:36:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=aaTD9DtSYMCbnO/EY+KiLmSkv7hwjtULTL1SptPB+YU=; b=pDZsPSvSjd/n4M0AH7Acv2WSn6IQ33MkGH994y1WNR9qiRyj89tkuXUkX6MyVx6OY3 d1uU5DK55+8gyIJiP6eJxDi+0ndu5fqKLBDfIgO/ZYJlgtz1R7p/QIWzQcZecm4bcZ0U Y+LE6Fe75+llrau+QdCBL9DbCfsnXl6IY0p5WzKlNiRy0Ulmu2UcmgIzukQgTFKZgVrL N0cuCWq0adlyk7CKe7sUxOUes7F0tVojL4SjwcDjfNYxyFwhKVKaS2yJ2b/WfvdoEv6h 3s3KDoB0fG203GkgjQV19iEQb5eXfQTDEpCBY8rNGSbQ9PPfzRYWaL4zn2jwMdK7kG6c 2Q1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=aaTD9DtSYMCbnO/EY+KiLmSkv7hwjtULTL1SptPB+YU=; b=tQ64COTrn7fOhxw00MBPfGSGLTRSlQmN6WtVA3nOyJ6FsAMbzd2Ke7sy+OroREm78T EDa4JnsLy+KsknXCKzm0ifHjEYZLNVdVLZz+IQRVCDRrhZIlI2iz6L3PYcJoxRDlU11n quq4Oy2/7jrzkdZ7uZM0VPEa3Ggz5gbwobYQ6OhY4fElSqx4u9eznrEev1piSVr1sWIF sC2fuCrTyA7X45zyRNRd+MeWPBuS2ctG2a/GSZjzfEwFAUMF+tZ8JYj55nL7lXzH2ya1 t3qa0E+CbMii+BGLEq92oae12Kxs3tgqBSp1TqLvNshZTClSRGMiiDtf3ecZFNGMO/Zk 7kbg== X-Gm-Message-State: ANoB5pnCmlM27Fi3qviaJgwzn+kCFO2OseDKzZD19Uui+m94ZWjoyOpi Y5Uyg+hVeQalW3bBqJDnUKIj8BM= X-Google-Smtp-Source: AA0mqf7D8B2okxUV5pMM2kVVp/R2bGRBum5pVPQA7csDCjwkUZOjcG3qElwHceei/SWgc4DJYX6K6mY= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a05:6a00:2483:b0:56c:12c0:aaf1 with SMTP id c3-20020a056a00248300b0056c12c0aaf1mr80175603pfv.50.1670898975555; Mon, 12 Dec 2022 18:36:15 -0800 (PST) Date: Mon, 12 Dec 2022 18:35:55 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-6-sdf@google.com> Subject: [PATCH bpf-next v4 05/15] bpf: XDP metadata RX kfuncs From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Define a new kfunc set (xdp_metadata_kfunc_ids) which implements all possible XDP metatada kfuncs. Not all devices have to implement them. If kfunc is not supported by the target device, the default implementation is called instead. The verifier, at load time, replaces a call to the generic kfunc with a call to the per-device one. Per-device kfunc pointers are stored in separate struct xdp_metadata_ops. Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev --- include/linux/bpf.h | 2 ++ include/linux/netdevice.h | 7 +++++++ include/net/xdp.h | 25 ++++++++++++++++++++++ kernel/bpf/core.c | 7 +++++++ kernel/bpf/offload.c | 23 ++++++++++++++++++++ kernel/bpf/verifier.c | 29 +++++++++++++++++++++++++- net/core/xdp.c | 44 +++++++++++++++++++++++++++++++++++++++ 7 files changed, 136 insertions(+), 1 deletion(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index ca22e8b8bd82..de6279725f41 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2477,6 +2477,8 @@ void bpf_offload_dev_netdev_unregister(struct bpf_offload_dev *offdev, struct net_device *netdev); bool bpf_offload_dev_match(struct bpf_prog *prog, struct net_device *netdev); +void *bpf_dev_bound_resolve_kfunc(struct bpf_prog *prog, u32 func_id); + void unpriv_ebpf_notify(int new_state); #if defined(CONFIG_NET) && defined(CONFIG_BPF_SYSCALL) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 5aa35c58c342..63786091c60d 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -74,6 +74,7 @@ struct udp_tunnel_nic_info; struct udp_tunnel_nic; struct bpf_prog; struct xdp_buff; +struct xdp_md; void synchronize_net(void); void netdev_set_default_ethtool_ops(struct net_device *dev, @@ -1613,6 +1614,11 @@ struct net_device_ops { bool cycles); }; +struct xdp_metadata_ops { + int (*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp); + int (*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash); +}; + /** * enum netdev_priv_flags - &struct net_device priv_flags * @@ -2044,6 +2050,7 @@ struct net_device { unsigned int flags; unsigned long long priv_flags; const struct net_device_ops *netdev_ops; + const struct xdp_metadata_ops *xdp_metadata_ops; int ifindex; unsigned short gflags; unsigned short hard_header_len; diff --git a/include/net/xdp.h b/include/net/xdp.h index 55dbc68bfffc..152c3a9c1127 100644 --- a/include/net/xdp.h +++ b/include/net/xdp.h @@ -409,4 +409,29 @@ void xdp_attachment_setup(struct xdp_attachment_info *info, #define DEV_MAP_BULK_SIZE XDP_BULK_QUEUE_SIZE +#define XDP_METADATA_KFUNC_xxx \ + XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_TIMESTAMP, \ + bpf_xdp_metadata_rx_timestamp) \ + XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_HASH, \ + bpf_xdp_metadata_rx_hash) \ + +enum { +#define XDP_METADATA_KFUNC(name, str) name, +XDP_METADATA_KFUNC_xxx +#undef XDP_METADATA_KFUNC +MAX_XDP_METADATA_KFUNC, +}; + +struct xdp_md; +int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp); +int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash); + +#ifdef CONFIG_NET +u32 xdp_metadata_kfunc_id(int id); +bool xdp_is_metadata_kfunc_id(u32 btf_id); +#else +static inline u32 xdp_metadata_kfunc_id(int id) { return 0; } +static inline bool xdp_is_metadata_kfunc_id(u32 btf_id) { return false; } +#endif + #endif /* __LINUX_NET_XDP_H__ */ diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index d434a994ee04..c3e501e3e39c 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2097,6 +2097,13 @@ bool bpf_prog_map_compatible(struct bpf_map *map, if (fp->kprobe_override) return false; + /* When tail-calling from a non-dev-bound program to a dev-bound one, + * XDP metadata helpers should be disabled. Until it's implemented, + * prohibit adding dev-bound programs to tail-call maps. + */ + if (bpf_prog_is_dev_bound(fp->aux)) + return false; + spin_lock(&map->owner.lock); if (!map->owner.type) { /* There's no owner yet where we could check for diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c index f714c941f8ea..3b6c9023f24d 100644 --- a/kernel/bpf/offload.c +++ b/kernel/bpf/offload.c @@ -757,6 +757,29 @@ void bpf_dev_bound_netdev_unregister(struct net_device *dev) up_write(&bpf_devs_lock); } +void *bpf_dev_bound_resolve_kfunc(struct bpf_prog *prog, u32 func_id) +{ + const struct xdp_metadata_ops *ops; + void *p = NULL; + + down_read(&bpf_devs_lock); + if (!prog->aux->offload || !prog->aux->offload->netdev) + goto out; + + ops = prog->aux->offload->netdev->xdp_metadata_ops; + if (!ops) + goto out; + + if (func_id == xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_TIMESTAMP)) + p = ops->xmo_rx_timestamp; + else if (func_id == xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_HASH)) + p = ops->xmo_rx_hash; +out: + up_read(&bpf_devs_lock); + + return p; +} + static int __init bpf_offload_init(void) { int err; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 203d8cfeda70..e61fe0472b9b 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -15479,12 +15479,35 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, struct bpf_insn *insn_buf, int insn_idx, int *cnt) { const struct bpf_kfunc_desc *desc; + void *xdp_kfunc; if (!insn->imm) { verbose(env, "invalid kernel function call not eliminated in verifier pass\n"); return -EINVAL; } + *cnt = 0; + + if (xdp_is_metadata_kfunc_id(insn->imm)) { + if (!bpf_prog_is_dev_bound(env->prog->aux)) { + verbose(env, "metadata kfuncs require device-bound program\n"); + return -EINVAL; + } + + if (bpf_prog_is_offloaded(env->prog->aux)) { + verbose(env, "metadata kfuncs can't be offloaded\n"); + return -EINVAL; + } + + xdp_kfunc = bpf_dev_bound_resolve_kfunc(env->prog, insn->imm); + if (xdp_kfunc) { + insn->imm = BPF_CALL_IMM(xdp_kfunc); + return 0; + } + + /* fallback to default kfunc when not supported by netdev */ + } + /* insn->imm has the btf func_id. Replace it with * an address (relative to __bpf_call_base). */ @@ -15495,7 +15518,6 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, return -EFAULT; } - *cnt = 0; insn->imm = desc->imm; if (insn->off) return 0; @@ -16502,6 +16524,11 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, if (tgt_prog) { struct bpf_prog_aux *aux = tgt_prog->aux; + if (bpf_prog_is_dev_bound(tgt_prog->aux)) { + bpf_log(log, "Replacing device-bound programs not supported\n"); + return -EINVAL; + } + for (i = 0; i < aux->func_info_cnt; i++) if (aux->func_info[i].type_id == btf_id) { subprog = i; diff --git a/net/core/xdp.c b/net/core/xdp.c index 844c9d99dc0e..b0d4080249d7 100644 --- a/net/core/xdp.c +++ b/net/core/xdp.c @@ -4,6 +4,7 @@ * Copyright (c) 2017 Jesper Dangaard Brouer, Red Hat Inc. */ #include +#include #include #include #include @@ -709,3 +710,46 @@ struct xdp_frame *xdpf_clone(struct xdp_frame *xdpf) return nxdpf; } + +noinline int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp) +{ + return -EOPNOTSUPP; +} + +noinline int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash) +{ + return -EOPNOTSUPP; +} + +BTF_SET8_START(xdp_metadata_kfunc_ids) +#define XDP_METADATA_KFUNC(name, str) BTF_ID_FLAGS(func, str, 0) +XDP_METADATA_KFUNC_xxx +#undef XDP_METADATA_KFUNC +BTF_SET8_END(xdp_metadata_kfunc_ids) + +static const struct btf_kfunc_id_set xdp_metadata_kfunc_set = { + .owner = THIS_MODULE, + .set = &xdp_metadata_kfunc_ids, +}; + +BTF_ID_LIST(xdp_metadata_kfunc_ids_unsorted) +#define XDP_METADATA_KFUNC(name, str) BTF_ID(func, str) +XDP_METADATA_KFUNC_xxx +#undef XDP_METADATA_KFUNC + +u32 xdp_metadata_kfunc_id(int id) +{ + /* xdp_metadata_kfunc_ids is sorted and can't be used */ + return xdp_metadata_kfunc_ids_unsorted[id]; +} + +bool xdp_is_metadata_kfunc_id(u32 btf_id) +{ + return btf_id_set8_contains(&xdp_metadata_kfunc_ids, btf_id); +} + +static int __init xdp_metadata_init(void) +{ + return register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &xdp_metadata_kfunc_set); +} +late_initcall(xdp_metadata_init); From patchwork Tue Dec 13 02:35:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071664 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A5A6CC4332F for ; Tue, 13 Dec 2022 02:36:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234265AbiLMCgy (ORCPT ); Mon, 12 Dec 2022 21:36:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46120 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234145AbiLMCgd (ORCPT ); Mon, 12 Dec 2022 21:36:33 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EE5C21DDF3 for ; Mon, 12 Dec 2022 18:36:17 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id r7-20020a25c107000000b006ff55ac0ee7so15099064ybf.15 for ; Mon, 12 Dec 2022 18:36:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=hHrbQKzCmMkP479Z1RRO0/pYKwPfq1SR7Hgtp7/VwOs=; b=iSFgiSo/fvsyyLVOyUFMSwI6mkJf5wRVolbeG55XCn1EGMIMSfLaCO65nyrY6mccV5 k+8NeKzWmdvwr80mBhm9HBcFnHLAKSHgJd7KcRWjPiT1F6e8JTUryriDHwgEURb7BULO tLh8rG+znedhLY5sBaqQRmcLFL5/Wieh4Q8szEWd3lohTgZtnj9ave2k0own4WA73QE9 lRn7joKrJ88Z2rPHpeb6vDMhYVmrGoWXKMT/003qNMyWT9W5TP2A1R/GL/FhjNDjs2Ja 56ARUHpdm3hA6J6pVM30A8hG1PrV6467Dms6xVXj1Cmbf+F4z6ML1iOPZV2YL+gDBVzj aSYw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=hHrbQKzCmMkP479Z1RRO0/pYKwPfq1SR7Hgtp7/VwOs=; b=NhBva/Zdc9il/WelmI0ri7pxIIJS3QxUww8Z4n5hMf00T7B4yRmLwIZVaCeZbWekHi yhJAklhGluzhLZ1aiCbYxpxUPx+9wIV9NNO/25M7NiyFISwXt4QfxPR5IuF7c/arL86f FiNul1n51HlQFsoTFwemXjyUMR6Fd+CvOzHsQTxnShPYA2DU32/e12U7yM5UWB6QX/5f xg/0hAttnkNqElhQopVMRnW+ZZn0lZCHxf5EyyrOkrv992ZVDn28afSVmFYA1B3w+I42 EmoY6d7Rg+LF2VYi9uD7see6AXO4JA2rzAGUC721V86A+tDOZhdWMgzP1td4kBWSLQBu rEXA== X-Gm-Message-State: ANoB5pkUEmFB/2ISVXtmxgnFxC3vNVgxgJ7M/DkDC4l4SkQbbkUEFXTV jkq2aVX+mXLV6rpo7pX+l8RWxwIRkEhGJIKeYbNZTd5BzgvY7fya+ID+RbLBdLR1bg6+32z8hKl DsdPs0ihrF68/O8QuZGgb77TTaI9/Y+vMXjTnP6NkptD0qFbrBQ== X-Google-Smtp-Source: AA0mqf6E5S4pD2N0gkQvq2JrCPEjvGxNLzC1eAspPRyHTxmJwXIRG6Z4Wm3by1o5O/YVA3JuPAqfBgE= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a25:b741:0:b0:703:6335:5f5d with SMTP id e1-20020a25b741000000b0070363355f5dmr12419814ybm.592.1670898977090; Mon, 12 Dec 2022 18:36:17 -0800 (PST) Date: Mon, 12 Dec 2022 18:35:56 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-7-sdf@google.com> Subject: [PATCH bpf-next v4 06/15] bpf: Support consuming XDP HW metadata from fext programs From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, " =?utf-8?q?Toke_H=C3=B8iland-J=C3=B8rgensen?= " Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Toke Høiland-Jørgensen Instead of rejecting the attaching of PROG_TYPE_EXT programs to XDP programs that consume HW metadata, implement support for propagating the offload information. The extension program doesn't need to set a flag or ifindex, it these will just be propagated from the target by the verifier. We need to create a separate offload object for the extension program, though, since it can be reattached to a different program later (which means we can't just inhering the offload information from the target). An additional check is added on attach that the new target is compatible with the offload information in the extension prog. Signed-off-by: Toke Høiland-Jørgensen Signed-off-by: Stanislav Fomichev --- include/linux/bpf.h | 7 ++++++ kernel/bpf/offload.c | 52 ++++++++++++++++++++++++++++--------------- kernel/bpf/syscall.c | 8 +++++++ kernel/bpf/verifier.c | 19 +++++++++++----- 4 files changed, 62 insertions(+), 24 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index de6279725f41..ed288d18bf8d 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2482,6 +2482,7 @@ void *bpf_dev_bound_resolve_kfunc(struct bpf_prog *prog, u32 func_id); void unpriv_ebpf_notify(int new_state); #if defined(CONFIG_NET) && defined(CONFIG_BPF_SYSCALL) +int __bpf_prog_dev_bound_init(struct bpf_prog *prog, struct net_device *netdev); int bpf_prog_dev_bound_init(struct bpf_prog *prog, union bpf_attr *attr); void bpf_dev_bound_netdev_unregister(struct net_device *dev); @@ -2516,6 +2517,12 @@ void sock_map_unhash(struct sock *sk); void sock_map_destroy(struct sock *sk); void sock_map_close(struct sock *sk, long timeout); #else +static inline int __bpf_prog_dev_bound_init(struct bpf_prog *prog, + struct net_device *netdev) +{ + return -EOPNOTSUPP; +} + static inline int bpf_prog_dev_bound_init(struct bpf_prog *prog, union bpf_attr *attr) { diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c index 3b6c9023f24d..0646dea1b0e1 100644 --- a/kernel/bpf/offload.c +++ b/kernel/bpf/offload.c @@ -188,17 +188,13 @@ static void __bpf_offload_dev_netdev_unregister(struct bpf_offload_dev *offdev, kfree(ondev); } -int bpf_prog_dev_bound_init(struct bpf_prog *prog, union bpf_attr *attr) +int __bpf_prog_dev_bound_init(struct bpf_prog *prog, struct net_device *netdev) { struct bpf_offload_netdev *ondev; struct bpf_prog_offload *offload; int err; - if (attr->prog_type != BPF_PROG_TYPE_SCHED_CLS && - attr->prog_type != BPF_PROG_TYPE_XDP) - return -EINVAL; - - if (attr->prog_flags & ~BPF_F_XDP_DEV_BOUND_ONLY) + if (!netdev) return -EINVAL; offload = kzalloc(sizeof(*offload), GFP_USER); @@ -206,14 +202,7 @@ int bpf_prog_dev_bound_init(struct bpf_prog *prog, union bpf_attr *attr) return -ENOMEM; offload->prog = prog; - - offload->netdev = dev_get_by_index(current->nsproxy->net_ns, - attr->prog_ifindex); - err = bpf_dev_offload_check(offload->netdev); - if (err) - goto err_maybe_put; - - prog->aux->offload_requested = !(attr->prog_flags & BPF_F_XDP_DEV_BOUND_ONLY); + offload->netdev = netdev; down_write(&bpf_devs_lock); ondev = bpf_offload_find_netdev(offload->netdev); @@ -236,19 +225,46 @@ int bpf_prog_dev_bound_init(struct bpf_prog *prog, union bpf_attr *attr) offload->offdev = ondev->offdev; prog->aux->offload = offload; list_add_tail(&offload->offloads, &ondev->progs); - dev_put(offload->netdev); up_write(&bpf_devs_lock); return 0; err_unlock: up_write(&bpf_devs_lock); -err_maybe_put: - if (offload->netdev) - dev_put(offload->netdev); kfree(offload); return err; } +int bpf_prog_dev_bound_init(struct bpf_prog *prog, union bpf_attr *attr) +{ + struct net_device *netdev; + int err; + + if (attr->prog_type != BPF_PROG_TYPE_SCHED_CLS && + attr->prog_type != BPF_PROG_TYPE_XDP) + return -EINVAL; + + if (attr->prog_flags & ~BPF_F_XDP_DEV_BOUND_ONLY) + return -EINVAL; + + netdev = dev_get_by_index(current->nsproxy->net_ns, attr->prog_ifindex); + if (!netdev) + return -EINVAL; + + err = bpf_dev_offload_check(netdev); + if (err) + goto out; + + prog->aux->offload_requested = !(attr->prog_flags & BPF_F_XDP_DEV_BOUND_ONLY); + + err = __bpf_prog_dev_bound_init(prog, netdev); + if (err) + goto out; + +out: + dev_put(netdev); + return err; +} + int bpf_prog_offload_verifier_prep(struct bpf_prog *prog) { struct bpf_prog_offload *offload; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 11c558be4992..8686475f0dbe 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -3021,6 +3021,14 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog, goto out_put_prog; } + if (bpf_prog_is_dev_bound(prog->aux) && + (bpf_prog_is_offloaded(tgt_prog->aux) || + !bpf_prog_is_dev_bound(tgt_prog->aux) || + !bpf_offload_dev_match(prog, tgt_prog->aux->offload->netdev))) { + err = -EINVAL; + goto out_put_prog; + } + key = bpf_trampoline_compute_key(tgt_prog, NULL, btf_id); } diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index e61fe0472b9b..5c6d6d61e57a 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -16524,11 +16524,6 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, if (tgt_prog) { struct bpf_prog_aux *aux = tgt_prog->aux; - if (bpf_prog_is_dev_bound(tgt_prog->aux)) { - bpf_log(log, "Replacing device-bound programs not supported\n"); - return -EINVAL; - } - for (i = 0; i < aux->func_info_cnt; i++) if (aux->func_info[i].type_id == btf_id) { subprog = i; @@ -16789,10 +16784,22 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) if (tgt_prog && prog->type == BPF_PROG_TYPE_EXT) { /* to make freplace equivalent to their targets, they need to * inherit env->ops and expected_attach_type for the rest of the - * verification + * verification; we also need to propagate the prog offload data + * for resolving kfuncs. */ env->ops = bpf_verifier_ops[tgt_prog->type]; prog->expected_attach_type = tgt_prog->expected_attach_type; + + if (bpf_prog_is_dev_bound(tgt_prog->aux)) { + if (bpf_prog_is_offloaded(tgt_prog->aux)) + return -EINVAL; + + prog->aux->dev_bound = true; + ret = __bpf_prog_dev_bound_init(prog, + tgt_prog->aux->offload->netdev); + if (ret) + return ret; + } } /* store info about the attachment target that will be used later */ From patchwork Tue Dec 13 02:35:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071655 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0123C4332F for ; Tue, 13 Dec 2022 02:37:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234159AbiLMChD (ORCPT ); Mon, 12 Dec 2022 21:37:03 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46180 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234130AbiLMCgl (ORCPT ); Mon, 12 Dec 2022 21:36:41 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3F4D01D667 for ; Mon, 12 Dec 2022 18:36:19 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id p7-20020a631e47000000b0047691854a86so8775952pgm.16 for ; Mon, 12 Dec 2022 18:36:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=eLbwZVoIdeiDs0e+MLPZphHSP+35iiBlddtDTyUBuKk=; b=HD1MOfHpAUeKeZUiKl7ft3cCAhG6gcbH6aYPbn2SYHFYvLgM0lOjJx45p9ykMmU/fo BJmswx/iZ+hLw4swgjIsMSa9g5IsEC0GtGDoLd77oOE8hJeq5dBUVQYzp1/NSCTrD+aA NXp53yD5+eFcUjlArd1lwPvA9GWfqCIb1wfTzaDzZldDJjS/csD03aZxiHi42tlhiThX JRER6fuxoh9jdUGYSUmT5P8WpOLIYqhpw8jOFTINm2bzYhUOMtBn6qoklDCH12WF7G1j uBA6PD9/3w49nnRM07PVDgoVG5t4yK2HkKeU7Lmk7LPhyqgUW5t4S7SBjjVAc9Ks2C7z LNeg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=eLbwZVoIdeiDs0e+MLPZphHSP+35iiBlddtDTyUBuKk=; b=0wma7YNGyw57LT5P8zYAzQrYK61JT1p8VdA7awPdCwHlJHNjgqpCvx8fQk9Pxrw7xG XwThFf1mMT5PKV1EtDdjm+CVQfPZUxp011ZvB8E7ClKVkD8H7Rh+VjIEwNZoRuSO/P3m 0wPbO+ySU9QruujEb1Bmd5hv0VVSA2+PRNvQYIMdfGVtnKU4wP9OLGB0wLJ+k3m1eWC0 oEZWogWkzkci9xQ844mwOVRJagBVf2262J9Uxt8iHyhNvr5/WrtIyUYktQCab0wJmfd+ jH8we7WyBos/TBrdJsV7nex6bFgZQ8K4DlqJqw8fiZ1bYEmZf14Ii/c2MVg/HbwZw4DR CcLw== X-Gm-Message-State: ANoB5pmxihuIIH/LIvJexSr9nyHu22FkNvddi16Jyy8gRMnX45e95+Bb g6fbEDO/2sLiQaMrr3sp62/C9PI= X-Google-Smtp-Source: AA0mqf4k5ngczsRoDw3/3Jh2eFmWYlX+gnprhjopMh5shB56zpX69jblxVyDjJQxQPXMIFKYE5NCsxY= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a05:6a00:21c2:b0:56b:bba4:650a with SMTP id t2-20020a056a0021c200b0056bbba4650amr79648605pfj.4.1670898978719; Mon, 12 Dec 2022 18:36:18 -0800 (PST) Date: Mon, 12 Dec 2022 18:35:57 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-8-sdf@google.com> Subject: [PATCH bpf-next v4 07/15] veth: Introduce veth_xdp_buff wrapper for xdp_buff From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net No functional changes. Boilerplate to allow stuffing more data after xdp_buff. Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev --- drivers/net/veth.c | 56 +++++++++++++++++++++++++--------------------- 1 file changed, 31 insertions(+), 25 deletions(-) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index ac7c0653695f..04ffd8cb2945 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -116,6 +116,10 @@ static struct { { "peer_ifindex" }, }; +struct veth_xdp_buff { + struct xdp_buff xdp; +}; + static int veth_get_link_ksettings(struct net_device *dev, struct ethtool_link_ksettings *cmd) { @@ -592,23 +596,24 @@ static struct xdp_frame *veth_xdp_rcv_one(struct veth_rq *rq, rcu_read_lock(); xdp_prog = rcu_dereference(rq->xdp_prog); if (likely(xdp_prog)) { - struct xdp_buff xdp; + struct veth_xdp_buff vxbuf; + struct xdp_buff *xdp = &vxbuf.xdp; u32 act; - xdp_convert_frame_to_buff(frame, &xdp); - xdp.rxq = &rq->xdp_rxq; + xdp_convert_frame_to_buff(frame, xdp); + xdp->rxq = &rq->xdp_rxq; - act = bpf_prog_run_xdp(xdp_prog, &xdp); + act = bpf_prog_run_xdp(xdp_prog, xdp); switch (act) { case XDP_PASS: - if (xdp_update_frame_from_buff(&xdp, frame)) + if (xdp_update_frame_from_buff(xdp, frame)) goto err_xdp; break; case XDP_TX: orig_frame = *frame; - xdp.rxq->mem = frame->mem; - if (unlikely(veth_xdp_tx(rq, &xdp, bq) < 0)) { + xdp->rxq->mem = frame->mem; + if (unlikely(veth_xdp_tx(rq, xdp, bq) < 0)) { trace_xdp_exception(rq->dev, xdp_prog, act); frame = &orig_frame; stats->rx_drops++; @@ -619,8 +624,8 @@ static struct xdp_frame *veth_xdp_rcv_one(struct veth_rq *rq, goto xdp_xmit; case XDP_REDIRECT: orig_frame = *frame; - xdp.rxq->mem = frame->mem; - if (xdp_do_redirect(rq->dev, &xdp, xdp_prog)) { + xdp->rxq->mem = frame->mem; + if (xdp_do_redirect(rq->dev, xdp, xdp_prog)) { frame = &orig_frame; stats->rx_drops++; goto err_xdp; @@ -801,7 +806,8 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, { void *orig_data, *orig_data_end; struct bpf_prog *xdp_prog; - struct xdp_buff xdp; + struct veth_xdp_buff vxbuf; + struct xdp_buff *xdp = &vxbuf.xdp; u32 act, metalen; int off; @@ -815,22 +821,22 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, } __skb_push(skb, skb->data - skb_mac_header(skb)); - if (veth_convert_skb_to_xdp_buff(rq, &xdp, &skb)) + if (veth_convert_skb_to_xdp_buff(rq, xdp, &skb)) goto drop; - orig_data = xdp.data; - orig_data_end = xdp.data_end; + orig_data = xdp->data; + orig_data_end = xdp->data_end; - act = bpf_prog_run_xdp(xdp_prog, &xdp); + act = bpf_prog_run_xdp(xdp_prog, xdp); switch (act) { case XDP_PASS: break; case XDP_TX: - veth_xdp_get(&xdp); + veth_xdp_get(xdp); consume_skb(skb); - xdp.rxq->mem = rq->xdp_mem; - if (unlikely(veth_xdp_tx(rq, &xdp, bq) < 0)) { + xdp->rxq->mem = rq->xdp_mem; + if (unlikely(veth_xdp_tx(rq, xdp, bq) < 0)) { trace_xdp_exception(rq->dev, xdp_prog, act); stats->rx_drops++; goto err_xdp; @@ -839,10 +845,10 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, rcu_read_unlock(); goto xdp_xmit; case XDP_REDIRECT: - veth_xdp_get(&xdp); + veth_xdp_get(xdp); consume_skb(skb); - xdp.rxq->mem = rq->xdp_mem; - if (xdp_do_redirect(rq->dev, &xdp, xdp_prog)) { + xdp->rxq->mem = rq->xdp_mem; + if (xdp_do_redirect(rq->dev, xdp, xdp_prog)) { stats->rx_drops++; goto err_xdp; } @@ -862,7 +868,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, rcu_read_unlock(); /* check if bpf_xdp_adjust_head was used */ - off = orig_data - xdp.data; + off = orig_data - xdp->data; if (off > 0) __skb_push(skb, off); else if (off < 0) @@ -871,21 +877,21 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, skb_reset_mac_header(skb); /* check if bpf_xdp_adjust_tail was used */ - off = xdp.data_end - orig_data_end; + off = xdp->data_end - orig_data_end; if (off != 0) __skb_put(skb, off); /* positive on grow, negative on shrink */ /* XDP frag metadata (e.g. nr_frags) are updated in eBPF helpers * (e.g. bpf_xdp_adjust_tail), we need to update data_len here. */ - if (xdp_buff_has_frags(&xdp)) + if (xdp_buff_has_frags(xdp)) skb->data_len = skb_shinfo(skb)->xdp_frags_size; else skb->data_len = 0; skb->protocol = eth_type_trans(skb, rq->dev); - metalen = xdp.data - xdp.data_meta; + metalen = xdp->data - xdp->data_meta; if (metalen) skb_metadata_set(skb, metalen); out: @@ -898,7 +904,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, return NULL; err_xdp: rcu_read_unlock(); - xdp_return_buff(&xdp); + xdp_return_buff(xdp); xdp_xmit: return NULL; } From patchwork Tue Dec 13 02:35:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071656 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B59A7C4167B for ; Tue, 13 Dec 2022 02:37:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234285AbiLMChH (ORCPT ); Mon, 12 Dec 2022 21:37:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45568 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234280AbiLMCgv (ORCPT ); Mon, 12 Dec 2022 21:36:51 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E49FD1DA75 for ; Mon, 12 Dec 2022 18:36:20 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id 84-20020a630257000000b00477f88d334eso8763620pgc.11 for ; Mon, 12 Dec 2022 18:36:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=d8hcKXQLNNQ4WAgQXpWd59iukIOCYOBUZwXa1y2qCeI=; b=X02gtD4dAC/yQa/roDaEjtKXDqiYXTGbbQzFNXNkEXDcU0XIjrnb7WBa5AcE/ls0K4 wrl7BBUJ/9dpHFLA/zg+IGS9i2Y//8UE/UHzYgEh7z0cCMdX/vTMKkds7ihxeu0nxF1A 4Hyl758h2etOysY52RCSpNjHy5yslAJMdQSSqR47wDnNxpAW/vXu9uC5CIYrRu58JTn2 7SkCx7KVmusAbCGB6ZIInjrw7Lqk3z/Mq9/GbZBsHPXE1b7mCVC9WcWZMMyiG8nL6zsX k6xNL2utwZHIQv1FUs+bCqjA1LheiGrrIqwl3xiKOMFBtvEpuImE9aGIudpD8Tb5RVcJ xe6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=d8hcKXQLNNQ4WAgQXpWd59iukIOCYOBUZwXa1y2qCeI=; b=MOVOTJ8oDxKej2NyoYwHOFghDUqI+9XUdYotqTeVtJ77wMWvkIiMxPjgMh8QQK0R8M St5rZ1VznrpuprManIRBUpmZcc6QnMsEzd5Jvi8Y6hDv8XXU4dpb62EYIcLyut1Rgfxf 8pQWBYvL21dopNnSH+G9n8uaU4Faj8yTHqqE5LiQeN+eWlWdSMsZaloYQulw/hR0SXTA irFACFD448oyhSmtAJ/3CZu8zzh/HTaQgpvvycZObMw4cyJZDmD1Li4DyWm5VNuF3NhR PVDQGULoc3Oj4VCiSuYKanBXwXxHQkORT5YoruQ4qFub0OxRIzuysYKgU/FEeEF8hBHW ZF3Q== X-Gm-Message-State: ANoB5pn9ED+JaQ1jUDGj/8MXQ/8TYQVSAS8bZb0+1/YekDRy3Twp0qbm hiKHzuakcOTh0rZsd3+F8RzTNJw= X-Google-Smtp-Source: AA0mqf6//DJzqTv5zHLOXNZ5ezU8ozVCcxgYuAbglp754hLu10S9H7MxyeauJF2EIfRBCIOMJi7vVvQ= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:90a:d358:b0:20d:d531:97cc with SMTP id i24-20020a17090ad35800b0020dd53197ccmr107286pjx.164.1670898980424; Mon, 12 Dec 2022 18:36:20 -0800 (PST) Date: Mon, 12 Dec 2022 18:35:58 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-9-sdf@google.com> Subject: [PATCH bpf-next v4 08/15] veth: Support RX XDP metadata From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net The goal is to enable end-to-end testing of the metadata for AF_XDP. Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev --- drivers/net/veth.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 04ffd8cb2945..d5491e7a2798 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -118,6 +118,7 @@ static struct { struct veth_xdp_buff { struct xdp_buff xdp; + struct sk_buff *skb; }; static int veth_get_link_ksettings(struct net_device *dev, @@ -602,6 +603,7 @@ static struct xdp_frame *veth_xdp_rcv_one(struct veth_rq *rq, xdp_convert_frame_to_buff(frame, xdp); xdp->rxq = &rq->xdp_rxq; + vxbuf.skb = NULL; act = bpf_prog_run_xdp(xdp_prog, xdp); @@ -823,6 +825,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, __skb_push(skb, skb->data - skb_mac_header(skb)); if (veth_convert_skb_to_xdp_buff(rq, xdp, &skb)) goto drop; + vxbuf.skb = skb; orig_data = xdp->data; orig_data_end = xdp->data_end; @@ -1601,6 +1604,21 @@ static int veth_xdp(struct net_device *dev, struct netdev_bpf *xdp) } } +static int veth_xdp_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp) +{ + *timestamp = ktime_get_mono_fast_ns(); + return 0; +} + +static int veth_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash) +{ + struct veth_xdp_buff *_ctx = (void *)ctx; + + if (_ctx->skb) + *hash = skb_get_hash(_ctx->skb); + return 0; +} + static const struct net_device_ops veth_netdev_ops = { .ndo_init = veth_dev_init, .ndo_open = veth_open, @@ -1622,6 +1640,11 @@ static const struct net_device_ops veth_netdev_ops = { .ndo_get_peer_dev = veth_peer_dev, }; +static const struct xdp_metadata_ops veth_xdp_metadata_ops = { + .xmo_rx_timestamp = veth_xdp_rx_timestamp, + .xmo_rx_hash = veth_xdp_rx_hash, +}; + #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HW_CSUM | \ NETIF_F_RXCSUM | NETIF_F_SCTP_CRC | NETIF_F_HIGHDMA | \ NETIF_F_GSO_SOFTWARE | NETIF_F_GSO_ENCAP_ALL | \ @@ -1638,6 +1661,7 @@ static void veth_setup(struct net_device *dev) dev->priv_flags |= IFF_PHONY_HEADROOM; dev->netdev_ops = &veth_netdev_ops; + dev->xdp_metadata_ops = &veth_xdp_metadata_ops; dev->ethtool_ops = &veth_ethtool_ops; dev->features |= NETIF_F_LLTX; dev->features |= VETH_FEATURES; From patchwork Tue Dec 13 02:35:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071657 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C8E5C4332F for ; Tue, 13 Dec 2022 02:37:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234289AbiLMChJ (ORCPT ); Mon, 12 Dec 2022 21:37:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45710 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234238AbiLMCgw (ORCPT ); Mon, 12 Dec 2022 21:36:52 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8DAF51DF31 for ; Mon, 12 Dec 2022 18:36:22 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id b17-20020a170903229100b00189da3b178bso11845736plh.7 for ; Mon, 12 Dec 2022 18:36:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=YdDm+2udenCLyCXoXj0VwOODF0742Kvd3NdJ2W+lBgU=; b=fcGid/qGjSHsGlN7gsSVn4Wvy7Yg+R3N0c6bOPUJhf7B0XBe3QR+RAHRXyYr2Rlyr3 lwCSU8Iozug4QaPJHUnArZmj5bOrVwWBUorvdNU0dU+3mj+b+8teW2ffFzeLTxsYn9L/ OfeNiJL7XxQdAQHde8tmiU3RMtldUcjlYWSQXzMuBk+lwnRjMLGm6kMTNl2ghAblAizD FOUXORFZz5zqnqJKcTh3OOioJmSZpkILNVS7whcEXWHE/6M265k+VEVNQWpKFl8u1N3C /kWznuG7cXVZmpw6B4IizawxjMuKFU5Ne47onabmPbl4xcCsFo2IL/0crWKEeHOXOru0 l7Ig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=YdDm+2udenCLyCXoXj0VwOODF0742Kvd3NdJ2W+lBgU=; b=Dp6HV4gVpXPD+NiwLNV6+K7m8sWCdvI9jXhXB2SN2tToI1jK+jthBlCXmu+FKtS9DT tGWOKyQqcLFEIyzKZqr1iPSw4/qlbW8CsUaWGXq5UNZdCQQsgQdTTPrgN0oHPvBRojhy zusUfh0R7Lg/N57bY+Lb+sI6wHe9BHHTPxtJ2Xu7vB2cs0vLtVhIMgZR+l7rsyg/c+Vj RCL/EacUXFEp/Kn0/6NrHSIuT+4Moy0jfNSdbToUIs+Q0a7uOscce+d+3JeiaD+Ac4CN bQmGmC3kotmzcai3DVhuS0QFKEWPRzjPfg4Qa5EwtRPYv7uDH+Q1SAe+KCYoT4usdX9X 2A3w== X-Gm-Message-State: ANoB5plw/g8yAw1rt+1SNTdmQAy5e3cfTwiwMFaNSHHq9C2xe6mAzBmP HG9LQg7ukseEIgSlVFSsEx4W+3A= X-Google-Smtp-Source: AA0mqf5pAgSgA+cgbhvDYkqsllMnPfNyen01DIZMfN9io1RFF1LvjaZ8anh5HX8cLdsT99gxznm0Pik= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:90a:2881:b0:219:53fd:2cbb with SMTP id f1-20020a17090a288100b0021953fd2cbbmr132878pjd.185.1670898982054; Mon, 12 Dec 2022 18:36:22 -0800 (PST) Date: Mon, 12 Dec 2022 18:35:59 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-10-sdf@google.com> Subject: [PATCH bpf-next v4 09/15] selftests/bpf: Verify xdp_metadata xdp->af_xdp path From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net - create new netns - create veth pair (veTX+veRX) - setup AF_XDP socket for both interfaces - attach bpf to veRX - send packet via veTX - verify the packet has expected metadata at veRX Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev --- tools/testing/selftests/bpf/Makefile | 2 +- .../selftests/bpf/prog_tests/xdp_metadata.c | 412 ++++++++++++++++++ .../selftests/bpf/progs/xdp_metadata.c | 56 +++ .../selftests/bpf/progs/xdp_metadata2.c | 23 + tools/testing/selftests/bpf/xdp_metadata.h | 15 + 5 files changed, 507 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_metadata.c create mode 100644 tools/testing/selftests/bpf/progs/xdp_metadata.c create mode 100644 tools/testing/selftests/bpf/progs/xdp_metadata2.c create mode 100644 tools/testing/selftests/bpf/xdp_metadata.h diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index c22c43bbee19..e6cbc04a7920 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -527,7 +527,7 @@ TRUNNER_BPF_PROGS_DIR := progs TRUNNER_EXTRA_SOURCES := test_progs.c cgroup_helpers.c trace_helpers.c \ network_helpers.c testing_helpers.c \ btf_helpers.c flow_dissector_load.h \ - cap_helpers.c test_loader.c + cap_helpers.c test_loader.c xsk.c TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \ $(OUTPUT)/liburandom_read.so \ $(OUTPUT)/xdp_synproxy \ diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c new file mode 100644 index 000000000000..55f4c8cfba66 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c @@ -0,0 +1,412 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include "xdp_metadata.skel.h" +#include "xdp_metadata2.skel.h" +#include "xdp_metadata.h" +#include "xsk.h" + +#include +#include +#include +#include +#include +#include +#include +#include + +#define TX_NAME "veTX" +#define RX_NAME "veRX" + +#define UDP_PAYLOAD_BYTES 4 + +#define AF_XDP_SOURCE_PORT 1234 +#define AF_XDP_CONSUMER_PORT 8080 + +#define UMEM_NUM 16 +#define UMEM_FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE +#define UMEM_SIZE (UMEM_FRAME_SIZE * UMEM_NUM) +#define XDP_FLAGS XDP_FLAGS_DRV_MODE +#define QUEUE_ID 0 + +#define TX_ADDR "10.0.0.1" +#define RX_ADDR "10.0.0.2" +#define PREFIX_LEN "8" +#define FAMILY AF_INET + +#define SYS(cmd) ({ \ + if (!ASSERT_OK(system(cmd), (cmd))) \ + goto out; \ +}) + +struct xsk { + void *umem_area; + struct xsk_umem *umem; + struct xsk_ring_prod fill; + struct xsk_ring_cons comp; + struct xsk_ring_prod tx; + struct xsk_ring_cons rx; + struct xsk_socket *socket; +}; + +static int open_xsk(const char *ifname, struct xsk *xsk) +{ + int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE; + const struct xsk_socket_config socket_config = { + .rx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, + .tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, + .libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD, + .xdp_flags = XDP_FLAGS, + .bind_flags = XDP_COPY, + }; + const struct xsk_umem_config umem_config = { + .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, + .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, + .frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE, + .flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG, + }; + __u32 idx; + u64 addr; + int ret; + int i; + + xsk->umem_area = mmap(NULL, UMEM_SIZE, PROT_READ | PROT_WRITE, mmap_flags, -1, 0); + if (!ASSERT_NEQ(xsk->umem_area, MAP_FAILED, "mmap")) + return -1; + + ret = xsk_umem__create(&xsk->umem, + xsk->umem_area, UMEM_SIZE, + &xsk->fill, + &xsk->comp, + &umem_config); + if (!ASSERT_OK(ret, "xsk_umem__create")) + return ret; + + ret = xsk_socket__create(&xsk->socket, ifname, QUEUE_ID, + xsk->umem, + &xsk->rx, + &xsk->tx, + &socket_config); + if (!ASSERT_OK(ret, "xsk_socket__create")) + return ret; + + /* First half of umem is for TX. This way address matches 1-to-1 + * to the completion queue index. + */ + + for (i = 0; i < UMEM_NUM / 2; i++) { + addr = i * UMEM_FRAME_SIZE; + printf("%p: tx_desc[%d] -> %lx\n", xsk, i, addr); + } + + /* Second half of umem is for RX. */ + + ret = xsk_ring_prod__reserve(&xsk->fill, UMEM_NUM / 2, &idx); + if (!ASSERT_EQ(UMEM_NUM / 2, ret, "xsk_ring_prod__reserve")) + return ret; + if (!ASSERT_EQ(idx, 0, "fill idx != 0")) + return -1; + + for (i = 0; i < UMEM_NUM / 2; i++) { + addr = (UMEM_NUM / 2 + i) * UMEM_FRAME_SIZE; + printf("%p: rx_desc[%d] -> %lx\n", xsk, i, addr); + *xsk_ring_prod__fill_addr(&xsk->fill, i) = addr; + } + xsk_ring_prod__submit(&xsk->fill, ret); + + return 0; +} + +static void close_xsk(struct xsk *xsk) +{ + if (xsk->umem) + xsk_umem__delete(xsk->umem); + if (xsk->socket) + xsk_socket__delete(xsk->socket); + munmap(xsk->umem, UMEM_SIZE); +} + +static void ip_csum(struct iphdr *iph) +{ + __u32 sum = 0; + __u16 *p; + int i; + + iph->check = 0; + p = (void *)iph; + for (i = 0; i < sizeof(*iph) / sizeof(*p); i++) + sum += p[i]; + + while (sum >> 16) + sum = (sum & 0xffff) + (sum >> 16); + + iph->check = ~sum; +} + +static int generate_packet(struct xsk *xsk, __u16 dst_port) +{ + struct xdp_desc *tx_desc; + struct udphdr *udph; + struct ethhdr *eth; + struct iphdr *iph; + void *data; + __u32 idx; + int ret; + + ret = xsk_ring_prod__reserve(&xsk->tx, 1, &idx); + if (!ASSERT_EQ(ret, 1, "xsk_ring_prod__reserve")) + return -1; + + tx_desc = xsk_ring_prod__tx_desc(&xsk->tx, idx); + tx_desc->addr = idx % (UMEM_NUM / 2) * UMEM_FRAME_SIZE; + printf("%p: tx_desc[%u]->addr=%llx\n", xsk, idx, tx_desc->addr); + data = xsk_umem__get_data(xsk->umem_area, tx_desc->addr); + + eth = data; + iph = (void *)(eth + 1); + udph = (void *)(iph + 1); + + memcpy(eth->h_dest, "\x00\x00\x00\x00\x00\x02", ETH_ALEN); + memcpy(eth->h_source, "\x00\x00\x00\x00\x00\x01", ETH_ALEN); + eth->h_proto = htons(ETH_P_IP); + + iph->version = 0x4; + iph->ihl = 0x5; + iph->tos = 0x9; + iph->tot_len = htons(sizeof(*iph) + sizeof(*udph) + UDP_PAYLOAD_BYTES); + iph->id = 0; + iph->frag_off = 0; + iph->ttl = 0; + iph->protocol = IPPROTO_UDP; + ASSERT_EQ(inet_pton(FAMILY, TX_ADDR, &iph->saddr), 1, "inet_pton(TX_ADDR)"); + ASSERT_EQ(inet_pton(FAMILY, RX_ADDR, &iph->daddr), 1, "inet_pton(RX_ADDR)"); + ip_csum(iph); + + udph->source = htons(AF_XDP_SOURCE_PORT); + udph->dest = htons(dst_port); + udph->len = htons(sizeof(*udph) + UDP_PAYLOAD_BYTES); + udph->check = 0; + + memset(udph + 1, 0xAA, UDP_PAYLOAD_BYTES); + + tx_desc->len = sizeof(*eth) + sizeof(*iph) + sizeof(*udph) + UDP_PAYLOAD_BYTES; + xsk_ring_prod__submit(&xsk->tx, 1); + + ret = sendto(xsk_socket__fd(xsk->socket), NULL, 0, MSG_DONTWAIT, NULL, 0); + if (!ASSERT_GE(ret, 0, "sendto")) + return ret; + + return 0; +} + +static void complete_tx(struct xsk *xsk) +{ + __u32 idx; + __u64 addr; + + if (ASSERT_EQ(xsk_ring_cons__peek(&xsk->comp, 1, &idx), 1, "xsk_ring_cons__peek")) { + addr = *xsk_ring_cons__comp_addr(&xsk->comp, idx); + + printf("%p: refill idx=%u addr=%llx\n", xsk, idx, addr); + *xsk_ring_prod__fill_addr(&xsk->fill, idx) = addr; + xsk_ring_prod__submit(&xsk->fill, 1); + } +} + +static void refill_rx(struct xsk *xsk, __u64 addr) +{ + __u32 idx; + + if (ASSERT_EQ(xsk_ring_prod__reserve(&xsk->fill, 1, &idx), 1, "xsk_ring_prod__reserve")) { + printf("%p: complete idx=%u addr=%llx\n", xsk, idx, addr); + *xsk_ring_prod__fill_addr(&xsk->fill, idx) = addr; + xsk_ring_prod__submit(&xsk->fill, 1); + } +} + +static int verify_xsk_metadata(struct xsk *xsk) +{ + const struct xdp_desc *rx_desc; + struct pollfd fds = {}; + struct xdp_meta *meta; + struct ethhdr *eth; + struct iphdr *iph; + __u64 comp_addr; + void *data; + __u64 addr; + __u32 idx; + int ret; + + ret = recvfrom(xsk_socket__fd(xsk->socket), NULL, 0, MSG_DONTWAIT, NULL, NULL); + if (!ASSERT_EQ(ret, 0, "recvfrom")) + return -1; + + fds.fd = xsk_socket__fd(xsk->socket); + fds.events = POLLIN; + + ret = poll(&fds, 1, 1000); + if (!ASSERT_GT(ret, 0, "poll")) + return -1; + + ret = xsk_ring_cons__peek(&xsk->rx, 1, &idx); + if (!ASSERT_EQ(ret, 1, "xsk_ring_cons__peek")) + return -2; + + rx_desc = xsk_ring_cons__rx_desc(&xsk->rx, idx); + comp_addr = xsk_umem__extract_addr(rx_desc->addr); + addr = xsk_umem__add_offset_to_addr(rx_desc->addr); + printf("%p: rx_desc[%u]->addr=%llx addr=%llx comp_addr=%llx\n", + xsk, idx, rx_desc->addr, addr, comp_addr); + data = xsk_umem__get_data(xsk->umem_area, addr); + + /* Make sure we got the packet offset correctly. */ + + eth = data; + ASSERT_EQ(eth->h_proto, htons(ETH_P_IP), "eth->h_proto"); + iph = (void *)(eth + 1); + ASSERT_EQ((int)iph->version, 4, "iph->version"); + + /* custom metadata */ + + meta = data - sizeof(struct xdp_meta); + + if (!ASSERT_NEQ(meta->rx_timestamp, 0, "rx_timestamp")) + return -1; + + if (!ASSERT_NEQ(meta->rx_hash, 0, "rx_hash")) + return -1; + + xsk_ring_cons__release(&xsk->rx, 1); + refill_rx(xsk, comp_addr); + + return 0; +} + +void test_xdp_metadata(void) +{ + struct xdp_metadata2 *bpf_obj2 = NULL; + struct xdp_metadata *bpf_obj = NULL; + struct bpf_program *new_prog, *prog; + struct nstoken *tok = NULL; + __u32 queue_id = QUEUE_ID; + struct bpf_map *prog_arr; + struct xsk tx_xsk = {}; + struct xsk rx_xsk = {}; + __u32 val, key = 0; + int retries = 10; + int rx_ifindex; + int sock_fd; + int ret; + + /* Setup new networking namespace, with a veth pair. */ + + SYS("ip netns add xdp_metadata"); + tok = open_netns("xdp_metadata"); + SYS("ip link add numtxqueues 1 numrxqueues 1 " TX_NAME + " type veth peer " RX_NAME " numtxqueues 1 numrxqueues 1"); + SYS("ip link set dev " TX_NAME " address 00:00:00:00:00:01"); + SYS("ip link set dev " RX_NAME " address 00:00:00:00:00:02"); + SYS("ip link set dev " TX_NAME " up"); + SYS("ip link set dev " RX_NAME " up"); + SYS("ip addr add " TX_ADDR "/" PREFIX_LEN " dev " TX_NAME); + SYS("ip addr add " RX_ADDR "/" PREFIX_LEN " dev " RX_NAME); + + rx_ifindex = if_nametoindex(RX_NAME); + + /* Setup separate AF_XDP for TX and RX interfaces. */ + + ret = open_xsk(TX_NAME, &tx_xsk); + if (!ASSERT_OK(ret, "open_xsk(TX_NAME)")) + goto out; + + ret = open_xsk(RX_NAME, &rx_xsk); + if (!ASSERT_OK(ret, "open_xsk(RX_NAME)")) + goto out; + + bpf_obj = xdp_metadata__open(); + if (!ASSERT_OK_PTR(bpf_obj, "open skeleton")) + goto out; + + prog = bpf_object__find_program_by_name(bpf_obj->obj, "rx"); + bpf_program__set_ifindex(prog, rx_ifindex); + bpf_program__set_flags(prog, BPF_F_XDP_DEV_BOUND_ONLY); + + if (!ASSERT_OK(xdp_metadata__load(bpf_obj), "load skeleton")) + goto out; + + /* Make sure we can't add dev-bound programs to prog maps. */ + prog_arr = bpf_object__find_map_by_name(bpf_obj->obj, "prog_arr"); + if (!ASSERT_OK_PTR(prog_arr, "no prog_arr map")) + goto out; + + val = bpf_program__fd(prog); + if (!ASSERT_ERR(bpf_map__update_elem(prog_arr, &key, sizeof(key), + &val, sizeof(val), BPF_ANY), + "update prog_arr")) + goto out; + + /* Attach BPF program to RX interface. */ + + ret = bpf_xdp_attach(rx_ifindex, + bpf_program__fd(bpf_obj->progs.rx), + XDP_FLAGS, NULL); + if (!ASSERT_GE(ret, 0, "bpf_xdp_attach")) + goto out; + + sock_fd = xsk_socket__fd(rx_xsk.socket); + ret = bpf_map_update_elem(bpf_map__fd(bpf_obj->maps.xsk), &queue_id, &sock_fd, 0); + if (!ASSERT_GE(ret, 0, "bpf_map_update_elem")) + goto out; + + /* Send packet destined to RX AF_XDP socket. */ + if (!ASSERT_GE(generate_packet(&tx_xsk, AF_XDP_CONSUMER_PORT), 0, + "generate AF_XDP_CONSUMER_PORT")) + goto out; + + /* Verify AF_XDP RX packet has proper metadata. */ + if (!ASSERT_GE(verify_xsk_metadata(&rx_xsk), 0, + "verify_xsk_metadata")) + goto out; + + complete_tx(&tx_xsk); + + /* Make sure freplace correctly picks up original bound device + * and doesn't crash. + */ + + bpf_obj2 = xdp_metadata2__open(); + if (!ASSERT_OK_PTR(bpf_obj2, "open skeleton")) + goto out; + + new_prog = bpf_object__find_program_by_name(bpf_obj2->obj, "freplace_rx"); + bpf_program__set_attach_target(new_prog, bpf_program__fd(prog), "rx"); + + if (!ASSERT_OK(xdp_metadata2__load(bpf_obj2), "load freplace skeleton")) + goto out; + + if (!ASSERT_OK(xdp_metadata2__attach(bpf_obj2), "attach freplace")) + goto out; + + /* Send packet to trigger . */ + if (!ASSERT_GE(generate_packet(&tx_xsk, AF_XDP_CONSUMER_PORT), 0, + "generate freplace packet")) + goto out; + + while (!retries--) { + if (bpf_obj2->bss->called) + break; + usleep(10); + } + ASSERT_GT(bpf_obj2->bss->called, 0, "not called"); + +out: + close_xsk(&rx_xsk); + close_xsk(&tx_xsk); + if (bpf_obj2) + xdp_metadata2__destroy(bpf_obj2); + if (bpf_obj) + xdp_metadata__destroy(bpf_obj); + system("ip netns del xdp_metadata"); + if (tok) + close_netns(tok); +} diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata.c b/tools/testing/selftests/bpf/progs/xdp_metadata.c new file mode 100644 index 000000000000..2f5eb39ea4bc --- /dev/null +++ b/tools/testing/selftests/bpf/progs/xdp_metadata.c @@ -0,0 +1,56 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include "xdp_metadata.h" +#include +#include + +struct { + __uint(type, BPF_MAP_TYPE_XSKMAP); + __uint(max_entries, 4); + __type(key, __u32); + __type(value, __u32); +} xsk SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u32); +} prog_arr SEC(".maps"); + +extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, + __u64 *timestamp) __ksym; +extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, + __u32 *hash) __ksym; + +SEC("xdp") +int rx(struct xdp_md *ctx) +{ + void *data, *data_meta; + struct xdp_meta *meta; + int ret; + + /* Reserve enough for all custom metadata. */ + + ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct xdp_meta)); + if (ret != 0) + return XDP_DROP; + + data = (void *)(long)ctx->data; + data_meta = (void *)(long)ctx->data_meta; + + if (data_meta + sizeof(struct xdp_meta) > data) + return XDP_DROP; + + meta = data_meta; + + /* Export metadata. */ + + bpf_xdp_metadata_rx_timestamp(ctx, &meta->rx_timestamp); + bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash); + + return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS); +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata2.c b/tools/testing/selftests/bpf/progs/xdp_metadata2.c new file mode 100644 index 000000000000..cf69d05451c3 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/xdp_metadata2.c @@ -0,0 +1,23 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include "xdp_metadata.h" +#include +#include + +extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, + __u32 *hash) __ksym; + +int called; + +SEC("freplace/rx") +int freplace_rx(struct xdp_md *ctx) +{ + u32 hash = 0; + /* Call _any_ metadata function to make sure we don't crash. */ + bpf_xdp_metadata_rx_hash(ctx, &hash); + called++; + return XDP_PASS; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h new file mode 100644 index 000000000000..f6780fbb0a21 --- /dev/null +++ b/tools/testing/selftests/bpf/xdp_metadata.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#pragma once + +#ifndef ETH_P_IP +#define ETH_P_IP 0x0800 +#endif + +#ifndef ETH_P_IPV6 +#define ETH_P_IPV6 0x86DD +#endif + +struct xdp_meta { + __u64 rx_timestamp; + __u32 rx_hash; +}; From patchwork Tue Dec 13 02:36:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071658 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6077EC4167B for ; Tue, 13 Dec 2022 02:37:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234297AbiLMChZ (ORCPT ); Mon, 12 Dec 2022 21:37:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46488 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234142AbiLMCgx (ORCPT ); Mon, 12 Dec 2022 21:36:53 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A26021E3C0 for ; Mon, 12 Dec 2022 18:36:24 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id v195-20020a252fcc000000b007125383fe0dso15046358ybv.23 for ; Mon, 12 Dec 2022 18:36:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=C/11ZOv9dp252wO/iEY3od+QlXFjWbrw8VnfZvsdseQ=; b=iQaWTQ5i1WvyXxuNWFfpK7w9ehir2Z3DToOJHH41njvgWtJnqHTmF56TPUBGXig1nK phTS/1JNvryB2RIv+AC5pH6jXdnZBgHvdM/Tsmn29gU/FC/OD7y5i6jyjy2knETUFic8 Gku+m6kmuTBNdUJR0w1A5SRy5F3R8YTk8pLptcNBVpAaJ3DbiZ9NDYGI6d3N99zZXYmt 8+DykWXSOktCM8iKBybTe640ku7NoeyXNKL8Gj8n3eqpsSdJdOWMZMLWgq5c5Q0mOhtf xa6lEXdIoiVL/KqnlNVKect6GwYmgx4vocch2rPQfLchGTK5/mimE6khbFKQQGnw2W4I q12A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=C/11ZOv9dp252wO/iEY3od+QlXFjWbrw8VnfZvsdseQ=; b=3AIMV4jgBk1bO/AU5XF4quZygqup4Nah3k6Om6QdAW4FU2l2wjQaxygXACvDa/rTQK p0GeX4zd24RIs8ZzUs7OyAq3OaENhev7EiCT8Uh8MvIS/oX+uFg+gPUUj5i7dNLHjlJO JgHVp+Q8RZthOFyUw9POFEogVS4z+uwJoRrZos+mpn/84KN+5HcQVUCGnJRr1yZpIlwC +dUdBYXtc4PFkHWJYY70vceuZ+Sczy0cd5Luy4CccSTY8AbrTSrWp2KABcnmSX5MknKP DjfraYfbsAANN0RPK38sfyb9yQxwxfgg4gapz954k7RIGxvNNp3JVf1ttAKK264DtJUw cEzQ== X-Gm-Message-State: ANoB5pn4SPC1LYf7/edb6MF3vJch4q6k7g0t697T56LCoEauiuTmIuBC AbZE9ioMPo5u6JW7PjILkHplSG0= X-Google-Smtp-Source: AA0mqf5dAKxHGHyFgsfH9XrSykz/CB7VXBqYxTBVzbVlj7f8zEbeHflmDccnZ+4b52BT5bpCBcri3HI= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a81:194f:0:b0:3de:7599:c5fd with SMTP id 76-20020a81194f000000b003de7599c5fdmr30393429ywz.84.1670898983671; Mon, 12 Dec 2022 18:36:23 -0800 (PST) Date: Mon, 12 Dec 2022 18:36:00 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-11-sdf@google.com> Subject: [PATCH bpf-next v4 10/15] net/mlx4_en: Introduce wrapper for xdp_buff From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, Tariq Toukan , David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net No functional changes. Boilerplate to allow stuffing more data after xdp_buff. Cc: Tariq Toukan Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev Reviewed-by: Tariq Toukan --- drivers/net/ethernet/mellanox/mlx4/en_rx.c | 26 +++++++++++++--------- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c index 8f762fc170b3..014a80af2813 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c @@ -661,9 +661,14 @@ static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va, #define MLX4_CQE_STATUS_IP_ANY (MLX4_CQE_STATUS_IPV4) #endif +struct mlx4_en_xdp_buff { + struct xdp_buff xdp; +}; + int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int budget) { struct mlx4_en_priv *priv = netdev_priv(dev); + struct mlx4_en_xdp_buff mxbuf = {}; int factor = priv->cqe_factor; struct mlx4_en_rx_ring *ring; struct bpf_prog *xdp_prog; @@ -671,7 +676,6 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud bool doorbell_pending; bool xdp_redir_flush; struct mlx4_cqe *cqe; - struct xdp_buff xdp; int polled = 0; int index; @@ -681,7 +685,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud ring = priv->rx_ring[cq_ring]; xdp_prog = rcu_dereference_bh(ring->xdp_prog); - xdp_init_buff(&xdp, priv->frag_info[0].frag_stride, &ring->xdp_rxq); + xdp_init_buff(&mxbuf.xdp, priv->frag_info[0].frag_stride, &ring->xdp_rxq); doorbell_pending = false; xdp_redir_flush = false; @@ -776,24 +780,24 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud priv->frag_info[0].frag_size, DMA_FROM_DEVICE); - xdp_prepare_buff(&xdp, va - frags[0].page_offset, + xdp_prepare_buff(&mxbuf.xdp, va - frags[0].page_offset, frags[0].page_offset, length, false); - orig_data = xdp.data; + orig_data = mxbuf.xdp.data; - act = bpf_prog_run_xdp(xdp_prog, &xdp); + act = bpf_prog_run_xdp(xdp_prog, &mxbuf.xdp); - length = xdp.data_end - xdp.data; - if (xdp.data != orig_data) { - frags[0].page_offset = xdp.data - - xdp.data_hard_start; - va = xdp.data; + length = mxbuf.xdp.data_end - mxbuf.xdp.data; + if (mxbuf.xdp.data != orig_data) { + frags[0].page_offset = mxbuf.xdp.data - + mxbuf.xdp.data_hard_start; + va = mxbuf.xdp.data; } switch (act) { case XDP_PASS: break; case XDP_REDIRECT: - if (likely(!xdp_do_redirect(dev, &xdp, xdp_prog))) { + if (likely(!xdp_do_redirect(dev, &mxbuf.xdp, xdp_prog))) { ring->xdp_redirect++; xdp_redir_flush = true; frags[0].page = NULL; From patchwork Tue Dec 13 02:36:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071659 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 236E2C00145 for ; Tue, 13 Dec 2022 02:37:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234307AbiLMCh1 (ORCPT ); Mon, 12 Dec 2022 21:37:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46130 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234298AbiLMCg5 (ORCPT ); Mon, 12 Dec 2022 21:36:57 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 13A4C1E3C3 for ; Mon, 12 Dec 2022 18:36:26 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id 134-20020a63008c000000b00478b9313e0eso8677418pga.9 for ; Mon, 12 Dec 2022 18:36:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=QpcOlS8O+iuDE4q8v93o9r3bTjxxaLceef6G1IdJp60=; b=ZfhSnk19BzTLecLyGQAcv1E/UPDJClqPahgcrWUXPjCGZ6wOkNIYQX3CpsXcCq/zQk sBnWn+5kSDKD9R8+fpcYK3+Dowo912j+bgGBSSI/0HZ6PYeQGmao8IfXdgLbR4WTowER 3Yu//BwPXc/ihwM/BEiKTCdRKdDChj8XCl/42PpLa8v6vcrVMjmkfPQcxF77FGgj7xCZ UDsTb5r3ZUjILHsZG6IJi59neki8ABIUbNkVb/50om+cfrbStZEBhwXyeXeQzxAp6hEX yOL7q1/UR53a+jZOYlP8ICwGK+S7EFesMFTUfJ1fukDLRSbJcZaf2fRjf2kvwaec4PqI oKDQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=QpcOlS8O+iuDE4q8v93o9r3bTjxxaLceef6G1IdJp60=; b=2Rsv5moHghlE437X7hHiKTWOLvNgUKmXdRoz5tRkNCJ5srOlMKlgmwBVHrNMEKS6ea 3WuRgf/69TlaS3uLaoNiMA67Q8NI7SynGU8DvZ7DQ6qaGtZrwp2MGcTAUexGo8Uix8qD YvCCq5tBoUk5DhAZT4OyUOUfKTQ8cdH2RjVRwXZFQlg3I2daUo3gPTqzJJeu6p5NSXJl e1+7UOQW3DOB4j+9e9UVHsyrRRnlZws85jWMNk/mJ7QzVgDzatc34YpyQGlFD0R5Ignp QX1Wk3ZwdxoGhLK4BAENSjIWcBB7hzhCngDnjHNkjXnaheiS0gRFPJrftINy+FD5KN7C rMBg== X-Gm-Message-State: ANoB5pkee2hcij2ogzDJXnxNngprgvc8Vx0QW2PWx8brCctT4xBwsT3G BX5lxQD19eSPkVis72eEpqi6ya0= X-Google-Smtp-Source: AA0mqf4KekH88SFJ0ktqphFkTc4elIpJE6hYad5/nf4FGPzvo6ReWFOd39PCP1S3NPxsZv3q/jweyDA= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:90a:4fc6:b0:21a:a0f:6acc with SMTP id q64-20020a17090a4fc600b0021a0a0f6accmr175898pjh.77.1670898985610; Mon, 12 Dec 2022 18:36:25 -0800 (PST) Date: Mon, 12 Dec 2022 18:36:01 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-12-sdf@google.com> Subject: [PATCH bpf-next v4 11/15] net/mlx4_en: Support RX XDP metadata From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, Tariq Toukan , David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net RX timestamp and hash for now. Tested using the prog from the next patch. Also enabling xdp metadata support; don't see why it's disabled, there is enough headroom.. Cc: Tariq Toukan Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev Reviewed-by: Tariq Toukan --- drivers/net/ethernet/mellanox/mlx4/en_clock.c | 13 +++++--- .../net/ethernet/mellanox/mlx4/en_netdev.c | 6 ++++ drivers/net/ethernet/mellanox/mlx4/en_rx.c | 33 ++++++++++++++++++- drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 5 +++ 4 files changed, 52 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx4/en_clock.c b/drivers/net/ethernet/mellanox/mlx4/en_clock.c index 98b5ffb4d729..9e3b76182088 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_clock.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_clock.c @@ -58,9 +58,7 @@ u64 mlx4_en_get_cqe_ts(struct mlx4_cqe *cqe) return hi | lo; } -void mlx4_en_fill_hwtstamps(struct mlx4_en_dev *mdev, - struct skb_shared_hwtstamps *hwts, - u64 timestamp) +u64 mlx4_en_get_hwtstamp(struct mlx4_en_dev *mdev, u64 timestamp) { unsigned int seq; u64 nsec; @@ -70,8 +68,15 @@ void mlx4_en_fill_hwtstamps(struct mlx4_en_dev *mdev, nsec = timecounter_cyc2time(&mdev->clock, timestamp); } while (read_seqretry(&mdev->clock_lock, seq)); + return ns_to_ktime(nsec); +} + +void mlx4_en_fill_hwtstamps(struct mlx4_en_dev *mdev, + struct skb_shared_hwtstamps *hwts, + u64 timestamp) +{ memset(hwts, 0, sizeof(struct skb_shared_hwtstamps)); - hwts->hwtstamp = ns_to_ktime(nsec); + hwts->hwtstamp = mlx4_en_get_hwtstamp(mdev, timestamp); } /** diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c index 8800d3f1f55c..af4c4858f397 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c @@ -2889,6 +2889,11 @@ static const struct net_device_ops mlx4_netdev_ops_master = { .ndo_bpf = mlx4_xdp, }; +static const struct xdp_metadata_ops mlx4_xdp_metadata_ops = { + .xmo_rx_timestamp = mlx4_en_xdp_rx_timestamp, + .xmo_rx_hash = mlx4_en_xdp_rx_hash, +}; + struct mlx4_en_bond { struct work_struct work; struct mlx4_en_priv *priv; @@ -3310,6 +3315,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port, dev->netdev_ops = &mlx4_netdev_ops_master; else dev->netdev_ops = &mlx4_netdev_ops; + dev->xdp_metadata_ops = &mlx4_xdp_metadata_ops; dev->watchdog_timeo = MLX4_EN_WATCHDOG_TIMEOUT; netif_set_real_num_tx_queues(dev, priv->tx_ring_num[TX]); netif_set_real_num_rx_queues(dev, priv->rx_ring_num); diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c index 014a80af2813..0869d4fff17b 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c @@ -663,8 +663,35 @@ static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va, struct mlx4_en_xdp_buff { struct xdp_buff xdp; + struct mlx4_cqe *cqe; + struct mlx4_en_dev *mdev; + struct mlx4_en_rx_ring *ring; + struct net_device *dev; }; +int mlx4_en_xdp_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp) +{ + struct mlx4_en_xdp_buff *_ctx = (void *)ctx; + + if (unlikely(_ctx->ring->hwtstamp_rx_filter != HWTSTAMP_FILTER_ALL)) + return -EOPNOTSUPP; + + *timestamp = mlx4_en_get_hwtstamp(_ctx->mdev, + mlx4_en_get_cqe_ts(_ctx->cqe)); + return 0; +} + +int mlx4_en_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash) +{ + struct mlx4_en_xdp_buff *_ctx = (void *)ctx; + + if (unlikely(!(_ctx->dev->features & NETIF_F_RXHASH))) + return -EOPNOTSUPP; + + *hash = be32_to_cpu(_ctx->cqe->immed_rss_invalid); + return 0; +} + int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int budget) { struct mlx4_en_priv *priv = netdev_priv(dev); @@ -781,8 +808,12 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud DMA_FROM_DEVICE); xdp_prepare_buff(&mxbuf.xdp, va - frags[0].page_offset, - frags[0].page_offset, length, false); + frags[0].page_offset, length, true); orig_data = mxbuf.xdp.data; + mxbuf.cqe = cqe; + mxbuf.mdev = priv->mdev; + mxbuf.ring = ring; + mxbuf.dev = dev; act = bpf_prog_run_xdp(xdp_prog, &mxbuf.xdp); diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h index e132ff4c82f2..2f8ef0b30083 100644 --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h @@ -788,10 +788,15 @@ void mlx4_en_update_pfc_stats_bitmap(struct mlx4_dev *dev, int mlx4_en_netdev_event(struct notifier_block *this, unsigned long event, void *ptr); +struct xdp_md; +int mlx4_en_xdp_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp); +int mlx4_en_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash); + /* * Functions for time stamping */ u64 mlx4_en_get_cqe_ts(struct mlx4_cqe *cqe); +u64 mlx4_en_get_hwtstamp(struct mlx4_en_dev *mdev, u64 timestamp); void mlx4_en_fill_hwtstamps(struct mlx4_en_dev *mdev, struct skb_shared_hwtstamps *hwts, u64 timestamp); From patchwork Tue Dec 13 02:36:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071661 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07912C4167B for ; Tue, 13 Dec 2022 02:37:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234270AbiLMChb (ORCPT ); Mon, 12 Dec 2022 21:37:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46560 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234310AbiLMCg7 (ORCPT ); Mon, 12 Dec 2022 21:36:59 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EDF081E3CA for ; Mon, 12 Dec 2022 18:36:27 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id mn20-20020a17090b189400b0021941492f66so404679pjb.0 for ; Mon, 12 Dec 2022 18:36:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=EOrvK1gw9vfZIaV6escFweupEfR/SySrD3QD5/GqZUQ=; b=Hjf6Mn54lrEHFalZ3MDUKajTGcbTWEGqXALxq9bDY7Gk6dNlUM5uQOS8qPVZ6M632t Vz/mm9XBf1PYnBL9Omy7BnGiTrY9vhEt7wcMEqs3UQ8M/BWrjYMtMpc01WG55TO7TAUX qSkPhYqWK/Gq4xY/S4JOrrlAZA4rpl3gxDQasKBc2HVc2jXH2gFV2AMIHb28en5KtRfh PYTXf4HA7+xg9H0NU7bPGlfxS3nH9nxSIWDJAxpOzxepGt5N93GNvuUGnYBCreqiTAeg zzrcegDY83gyx/EYtTosPB5+lmPFNVX9wiAYDtjZyFaUU0XY9D1z0VUpp9qLoV6oCOIy 3tQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=EOrvK1gw9vfZIaV6escFweupEfR/SySrD3QD5/GqZUQ=; b=ktly3JoJ+xALNRpgaBa9i2aIwh2UcLmeWnBM3WG0sSzDNPd1dgdXGtCUHHa36+l9Gi nkA+k/DOGjeWXmFufrNXWw4cSre0MMmd/1XvL6+tQTcMZCdicey2D2h4Pzn2HdtA1lNE /SXX3UMZEQy+J/ePEryZGDWi3Rgs2gCMDZjo1W6GFwDaes3j9TyDAZhahRs553g5IDMw vHxyCBS4CTBKt0A8FptLxQPeiV/O6S88sQR5TEiDkDm0JgSJqPMxnYHYebqT5PiHMtuG S/OwqIZI0ffsxS0zSYd6nY8L+2mMea2saBOPUkZ8Q4qib41aIi3yU1ca6mcslpeidCII /aEw== X-Gm-Message-State: ANoB5pnl/8eKgUoG3EwZ6VLIEHRh9S5Cyd/L/iKbRlIJ/DDkoz8oOCWQ 2TiYVuc4qEX3db9XfBV6AVuIC7k= X-Google-Smtp-Source: AA0mqf5Yezl460xdj2mhHHtqGgfZz/ZzIk5c7VQN31jyGdgaoqi+dk4UuU0g3JO+ktli9sB56GKm2Eo= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:90b:1010:b0:219:1d0a:34a6 with SMTP id gm16-20020a17090b101000b002191d0a34a6mr11500pjb.1.1670898987095; Mon, 12 Dec 2022 18:36:27 -0800 (PST) Date: Mon, 12 Dec 2022 18:36:02 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-13-sdf@google.com> Subject: [PATCH bpf-next v4 12/15] xsk: Add cb area to struct xdp_buff_xsk From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, " =?utf-8?q?Toke_H=C3=B8iland-J=C3=B8rgensen?= " , David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Toke Høiland-Jørgensen Add an area after the xdp_buff in struct xdp_buff_xsk that drivers can use to stash extra information to use in metadata kfuncs. The maximum size of 24 bytes means the full xdp_buff_xsk structure will take up exactly two cache lines (with the cb field spanning both). Also add a macro drivers can use to check their own wrapping structs against the available size. Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Suggested-by: Jakub Kicinski Signed-off-by: Toke Høiland-Jørgensen Signed-off-by: Stanislav Fomichev --- include/net/xsk_buff_pool.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h index f787c3f524b0..3e952e569418 100644 --- a/include/net/xsk_buff_pool.h +++ b/include/net/xsk_buff_pool.h @@ -19,8 +19,11 @@ struct xdp_sock; struct device; struct page; +#define XSK_PRIV_MAX 24 + struct xdp_buff_xsk { struct xdp_buff xdp; + u8 cb[XSK_PRIV_MAX]; dma_addr_t dma; dma_addr_t frame_dma; struct xsk_buff_pool *pool; @@ -28,6 +31,8 @@ struct xdp_buff_xsk { struct list_head free_list_node; }; +#define XSK_CHECK_PRIV_TYPE(t) BUILD_BUG_ON(sizeof(t) > offsetofend(struct xdp_buff_xsk, cb)) + struct xsk_dma_map { dma_addr_t *dma_pages; struct device *dev; From patchwork Tue Dec 13 02:36:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071660 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4EDBBC4332F for ; Tue, 13 Dec 2022 02:37:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234118AbiLMCh3 (ORCPT ); Mon, 12 Dec 2022 21:37:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46576 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234173AbiLMChA (ORCPT ); Mon, 12 Dec 2022 21:37:00 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 776181E3CD for ; Mon, 12 Dec 2022 18:36:29 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id om16-20020a17090b3a9000b002216006cbffso1292306pjb.3 for ; Mon, 12 Dec 2022 18:36:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=jUn7yf++A6h6DuOJafUEH14BgG1tpLsbX68awleiLkk=; b=JPTxDz1cq+1reaKltMPIXNAJMbFxWkm+fDEYBNWPyZ4M9s8Om3ZkL05D7P3zHXcZcr +Rn4KnFT4ZTo1YAl1XP6zElnuwDZrtOuC8+Vuim9KzNYOCfKJPV/VtJOK4YgyxTlGklz qwSxDOGaUTpGQktMRfOVwE0/kRW9biwOBidJjeJEnMXUoNTUmzwzSdfFxBryADsXLAXL OoGEy6H5d05SSmXG2d1JljwC9BexQAyV7Ap58CZRs9UNQAjYKadBCw2r0mCxQmf8M6Vj EHi8KFt919l2a6m1VSYBD0ARUzMpQvNvLXbR4RN9yEyBFxvsnLVx8SroJzJz9MeBqFEi BVlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=jUn7yf++A6h6DuOJafUEH14BgG1tpLsbX68awleiLkk=; b=yrIKqvICr0hFtZfHEVfuABAL9Rvh+BfWgUYM5fawnpHCz/Jw9SP/goswhM/eIpHLbj Uf9cP8oPWxXWvWVgHYejZ3ryRuCb8dQR8uEZg/2ON067PYjbdqzO9M8ahnUJs6AY6dWL kbpw2JFfpHcfDNzcKgR6rufkHsIy2uu1QwyXO+CiehcQCKKzZ1ALC9ua1rNs3Vg1Xyn3 L68r3TwPC0JUFC02piZZDvQzA3bQF4+kFI3ClMFa1olMGWLVvSZdJD11kY6QcCra1MRp PyQHXN0+cO3+z9cIlqqWa7bxv07BODijh+VsnSd4tZ35oQmMFHmrEpeHcqCV4iRA9suJ ktwg== X-Gm-Message-State: ANoB5pmV+/xd+rkRNzdDkT+fTk3ozUO1lOiVx29C5dnAidj41ABI6Ask ppHtq/6T4DlX/V9M8eFhIMrVHlA= X-Google-Smtp-Source: AA0mqf5giqxaSOqhxZGD74rJj5qlYQTjMYXSUfGDKQ2jfOlRo++NywVhtozQmjTdfHzN7jr2PCNCtyc= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a05:6a00:24c1:b0:573:a1f0:5968 with SMTP id d1-20020a056a0024c100b00573a1f05968mr78956499pfv.0.1670898988999; Mon, 12 Dec 2022 18:36:28 -0800 (PST) Date: Mon, 12 Dec 2022 18:36:03 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-14-sdf@google.com> Subject: [PATCH bpf-next v4 13/15] net/mlx5e: Introduce wrapper for xdp_buff From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, " =?utf-8?q?Toke_H=C3=B8iland-J=C3=B8rgensen?= " , Saeed Mahameed , David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Toke Høiland-Jørgensen Preparation for implementing HW metadata kfuncs. No functional change. Cc: Saeed Mahameed Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Toke Høiland-Jørgensen Signed-off-by: Stanislav Fomichev --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + .../net/ethernet/mellanox/mlx5/core/en/xdp.c | 3 +- .../net/ethernet/mellanox/mlx5/core/en/xdp.h | 6 +- .../ethernet/mellanox/mlx5/core/en/xsk/rx.c | 25 +++++---- .../net/ethernet/mellanox/mlx5/core/en_rx.c | 56 +++++++++---------- 5 files changed, 49 insertions(+), 42 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index ff5b302531d5..be86a599df97 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -469,6 +469,7 @@ struct mlx5e_txqsq { union mlx5e_alloc_unit { struct page *page; struct xdp_buff *xsk; + struct mlx5e_xdp_buff *mxbuf; }; /* XDP packets can be transmitted in different ways. On completion, we need to diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c index 20507ef2f956..31bb6806bf5d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c @@ -158,8 +158,9 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, /* returns true if packet was consumed by xdp */ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page, - struct bpf_prog *prog, struct xdp_buff *xdp) + struct bpf_prog *prog, struct mlx5e_xdp_buff *mxbuf) { + struct xdp_buff *xdp = &mxbuf->xdp; u32 act; int err; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h index bc2d9034af5b..389818bf6833 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h @@ -44,10 +44,14 @@ (MLX5E_XDP_INLINE_WQE_MAX_DS_CNT * MLX5_SEND_WQE_DS - \ sizeof(struct mlx5_wqe_inline_seg)) +struct mlx5e_xdp_buff { + struct xdp_buff xdp; +}; + struct mlx5e_xsk_param; int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk); bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page, - struct bpf_prog *prog, struct xdp_buff *xdp); + struct bpf_prog *prog, struct mlx5e_xdp_buff *mlctx); void mlx5e_xdp_mpwqe_complete(struct mlx5e_xdpsq *sq); bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq); void mlx5e_free_xdpsq_descs(struct mlx5e_xdpsq *sq); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c index c91b54d9ff27..9cff82d764e3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c @@ -22,6 +22,7 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) goto err; BUILD_BUG_ON(sizeof(wi->alloc_units[0]) != sizeof(wi->alloc_units[0].xsk)); + XSK_CHECK_PRIV_TYPE(struct mlx5e_xdp_buff); batch = xsk_buff_alloc_batch(rq->xsk_pool, (struct xdp_buff **)wi->alloc_units, rq->mpwqe.pages_per_wqe); @@ -233,7 +234,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, u32 head_offset, u32 page_idx) { - struct xdp_buff *xdp = wi->alloc_units[page_idx].xsk; + struct mlx5e_xdp_buff *mxbuf = wi->alloc_units[page_idx].mxbuf; struct bpf_prog *prog; /* Check packet size. Note LRO doesn't use linear SKB */ @@ -249,9 +250,9 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, */ WARN_ON_ONCE(head_offset); - xsk_buff_set_size(xdp, cqe_bcnt); - xsk_buff_dma_sync_for_cpu(xdp, rq->xsk_pool); - net_prefetch(xdp->data); + xsk_buff_set_size(&mxbuf->xdp, cqe_bcnt); + xsk_buff_dma_sync_for_cpu(&mxbuf->xdp, rq->xsk_pool); + net_prefetch(mxbuf->xdp.data); /* Possible flows: * - XDP_REDIRECT to XSKMAP: @@ -269,7 +270,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, */ prog = rcu_dereference(rq->xdp_prog); - if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, xdp))) { + if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, mxbuf))) { if (likely(__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags))) __set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */ return NULL; /* page/packet was consumed by XDP */ @@ -278,14 +279,14 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, /* XDP_PASS: copy the data from the UMEM to a new SKB and reuse the * frame. On SKB allocation failure, NULL is returned. */ - return mlx5e_xsk_construct_skb(rq, xdp); + return mlx5e_xsk_construct_skb(rq, &mxbuf->xdp); } struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, u32 cqe_bcnt) { - struct xdp_buff *xdp = wi->au->xsk; + struct mlx5e_xdp_buff *mxbuf = wi->au->mxbuf; struct bpf_prog *prog; /* wi->offset is not used in this function, because xdp->data and the @@ -295,17 +296,17 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq, */ WARN_ON_ONCE(wi->offset); - xsk_buff_set_size(xdp, cqe_bcnt); - xsk_buff_dma_sync_for_cpu(xdp, rq->xsk_pool); - net_prefetch(xdp->data); + xsk_buff_set_size(&mxbuf->xdp, cqe_bcnt); + xsk_buff_dma_sync_for_cpu(&mxbuf->xdp, rq->xsk_pool); + net_prefetch(mxbuf->xdp.data); prog = rcu_dereference(rq->xdp_prog); - if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, xdp))) + if (likely(prog && mlx5e_xdp_handle(rq, NULL, prog, mxbuf))) return NULL; /* page/packet was consumed by XDP */ /* XDP_PASS: copy the data from the UMEM to a new SKB. The frame reuse * will be handled by mlx5e_free_rx_wqe. * On SKB allocation failure, NULL is returned. */ - return mlx5e_xsk_construct_skb(rq, xdp); + return mlx5e_xsk_construct_skb(rq, &mxbuf->xdp); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index b1ea0b995d9c..9785b58ff3bc 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -1565,10 +1565,10 @@ struct sk_buff *mlx5e_build_linear_skb(struct mlx5e_rq *rq, void *va, } static void mlx5e_fill_xdp_buff(struct mlx5e_rq *rq, void *va, u16 headroom, - u32 len, struct xdp_buff *xdp) + u32 len, struct mlx5e_xdp_buff *mxbuf) { - xdp_init_buff(xdp, rq->buff.frame0_sz, &rq->xdp_rxq); - xdp_prepare_buff(xdp, va, headroom, len, true); + xdp_init_buff(&mxbuf->xdp, rq->buff.frame0_sz, &rq->xdp_rxq); + xdp_prepare_buff(&mxbuf->xdp, va, headroom, len, true); } static struct sk_buff * @@ -1595,16 +1595,16 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, prog = rcu_dereference(rq->xdp_prog); if (prog) { - struct xdp_buff xdp; + struct mlx5e_xdp_buff mxbuf; net_prefetchw(va); /* xdp_frame data area */ - mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &xdp); - if (mlx5e_xdp_handle(rq, au->page, prog, &xdp)) + mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &mxbuf); + if (mlx5e_xdp_handle(rq, au->page, prog, &mxbuf)) return NULL; /* page/packet was consumed by XDP */ - rx_headroom = xdp.data - xdp.data_hard_start; - metasize = xdp.data - xdp.data_meta; - cqe_bcnt = xdp.data_end - xdp.data; + rx_headroom = mxbuf.xdp.data - mxbuf.xdp.data_hard_start; + metasize = mxbuf.xdp.data - mxbuf.xdp.data_meta; + cqe_bcnt = mxbuf.xdp.data_end - mxbuf.xdp.data; } frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt); skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt, metasize); @@ -1626,9 +1626,9 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi union mlx5e_alloc_unit *au = wi->au; u16 rx_headroom = rq->buff.headroom; struct skb_shared_info *sinfo; + struct mlx5e_xdp_buff mxbuf; u32 frag_consumed_bytes; struct bpf_prog *prog; - struct xdp_buff xdp; struct sk_buff *skb; dma_addr_t addr; u32 truesize; @@ -1643,8 +1643,8 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi net_prefetchw(va); /* xdp_frame data area */ net_prefetch(va + rx_headroom); - mlx5e_fill_xdp_buff(rq, va, rx_headroom, frag_consumed_bytes, &xdp); - sinfo = xdp_get_shared_info_from_buff(&xdp); + mlx5e_fill_xdp_buff(rq, va, rx_headroom, frag_consumed_bytes, &mxbuf); + sinfo = xdp_get_shared_info_from_buff(&mxbuf.xdp); truesize = 0; cqe_bcnt -= frag_consumed_bytes; @@ -1662,13 +1662,13 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi dma_sync_single_for_cpu(rq->pdev, addr + wi->offset, frag_consumed_bytes, rq->buff.map_dir); - if (!xdp_buff_has_frags(&xdp)) { + if (!xdp_buff_has_frags(&mxbuf.xdp)) { /* Init on the first fragment to avoid cold cache access * when possible. */ sinfo->nr_frags = 0; sinfo->xdp_frags_size = 0; - xdp_buff_set_frags_flag(&xdp); + xdp_buff_set_frags_flag(&mxbuf.xdp); } frag = &sinfo->frags[sinfo->nr_frags++]; @@ -1677,7 +1677,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi skb_frag_size_set(frag, frag_consumed_bytes); if (page_is_pfmemalloc(au->page)) - xdp_buff_set_frag_pfmemalloc(&xdp); + xdp_buff_set_frag_pfmemalloc(&mxbuf.xdp); sinfo->xdp_frags_size += frag_consumed_bytes; truesize += frag_info->frag_stride; @@ -1690,7 +1690,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi au = head_wi->au; prog = rcu_dereference(rq->xdp_prog); - if (prog && mlx5e_xdp_handle(rq, au->page, prog, &xdp)) { + if (prog && mlx5e_xdp_handle(rq, au->page, prog, &mxbuf)) { if (test_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { int i; @@ -1700,22 +1700,22 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi return NULL; /* page/packet was consumed by XDP */ } - skb = mlx5e_build_linear_skb(rq, xdp.data_hard_start, rq->buff.frame0_sz, - xdp.data - xdp.data_hard_start, - xdp.data_end - xdp.data, - xdp.data - xdp.data_meta); + skb = mlx5e_build_linear_skb(rq, mxbuf.xdp.data_hard_start, rq->buff.frame0_sz, + mxbuf.xdp.data - mxbuf.xdp.data_hard_start, + mxbuf.xdp.data_end - mxbuf.xdp.data, + mxbuf.xdp.data - mxbuf.xdp.data_meta); if (unlikely(!skb)) return NULL; page_ref_inc(au->page); - if (unlikely(xdp_buff_has_frags(&xdp))) { + if (unlikely(xdp_buff_has_frags(&mxbuf.xdp))) { int i; /* sinfo->nr_frags is reset by build_skb, calculate again. */ xdp_update_skb_shared_info(skb, wi - head_wi - 1, sinfo->xdp_frags_size, truesize, - xdp_buff_is_frag_pfmemalloc(&xdp)); + xdp_buff_is_frag_pfmemalloc(&mxbuf.xdp)); for (i = 0; i < sinfo->nr_frags; i++) { skb_frag_t *frag = &sinfo->frags[i]; @@ -1996,19 +1996,19 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, prog = rcu_dereference(rq->xdp_prog); if (prog) { - struct xdp_buff xdp; + struct mlx5e_xdp_buff mxbuf; net_prefetchw(va); /* xdp_frame data area */ - mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &xdp); - if (mlx5e_xdp_handle(rq, au->page, prog, &xdp)) { + mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &mxbuf); + if (mlx5e_xdp_handle(rq, au->page, prog, &mxbuf)) { if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) __set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */ return NULL; /* page/packet was consumed by XDP */ } - rx_headroom = xdp.data - xdp.data_hard_start; - metasize = xdp.data - xdp.data_meta; - cqe_bcnt = xdp.data_end - xdp.data; + rx_headroom = mxbuf.xdp.data - mxbuf.xdp.data_hard_start; + metasize = mxbuf.xdp.data - mxbuf.xdp.data_meta; + cqe_bcnt = mxbuf.xdp.data_end - mxbuf.xdp.data; } frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt); skb = mlx5e_build_linear_skb(rq, va, frag_size, rx_headroom, cqe_bcnt, metasize); From patchwork Tue Dec 13 02:36:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071662 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DC3AEC4332F for ; Tue, 13 Dec 2022 02:37:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234275AbiLMChe (ORCPT ); Mon, 12 Dec 2022 21:37:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46076 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234276AbiLMChB (ORCPT ); Mon, 12 Dec 2022 21:37:01 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 738681A83A for ; Mon, 12 Dec 2022 18:36:31 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id x18-20020a170902ec9200b00189d3797fc5so11954597plg.12 for ; Mon, 12 Dec 2022 18:36:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=eUUGP8axJTHw7H7VaVSaqhRj4+Te4WMeId4UDZ6Ut7Y=; b=KuEbNXoCjrPi7Tb2R23J86dzGPEiHzqcIcA/Y+zoYlXdN49ClKXtNWtG1SV51EYp98 hxq7ih/OtjtIHqzl85oVI0z9XFpJO5f3bKPJewT0GeMJbne8F0iGj27PKXQGV22J2HfN 0iksZpw6Ddap8ukD1hg7tJ7roDjanmJULJy0PjcLyjjE8bcATd1ikqwgwEfBDsU+Guzf pNv9GSsCxpK0KTsdYxQgC+hrI+B2ns8ih+0/fdGjul5l0KdNEcy2llEIJ7x7nkycDtxW 36oTBgigpCjxW8vUoeedQrzEZpdtGOjKDt1vsHMVBts7vYgGzyOaJ5nA+ME1h+Lxy7El wQ1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=eUUGP8axJTHw7H7VaVSaqhRj4+Te4WMeId4UDZ6Ut7Y=; b=6hCBbfWOlod/1uK9yVDUsRkEdtoq5e9y1so/XXDhWKmnzzNUPMgGHG6TGvaCG8oG/m 4CyMMpHzhu0X4j43DSeRFwdfIHDjLitupSfDA9y3JPf607l762iqF38FUAG9PblDEbQN sh0uW2MIwWJduLyIqR10q1TuCKiObO0zbV7N+6oBw/BXkqZT3tdtIBzIyz2qXelNgl2T div0D2DcsyOmaUeZA9pAKPL4DcLcA7xY/Ifr/5Y3RkeFGMKlw84GnGR8T0aVRvWjLrpl Did2wDzUY2FTFqfC8vC2qargwyEWmuOSWypbQrxRRjiizTgezpHerF4O7JNjVGiavKPC INyQ== X-Gm-Message-State: ANoB5pkS8gg+n/w6MqUBxVRBIpFBgaBHxgUdzFznrDfHijB0U8YmvRJ6 ivoJsWLsqmdQLyZh+OOG3gyA7ak= X-Google-Smtp-Source: AA0mqf5QJxsD2ao4N3W2e/VVbIdssVrO1lzClCthbKwDaikM4PHQgVkGS45RaplS8Qiek+g/vmkiQTs= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:902:b7cc:b0:189:b36a:544c with SMTP id v12-20020a170902b7cc00b00189b36a544cmr34640135plz.78.1670898990763; Mon, 12 Dec 2022 18:36:30 -0800 (PST) Date: Mon, 12 Dec 2022 18:36:04 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-15-sdf@google.com> Subject: [PATCH bpf-next v4 14/15] net/mlx5e: Support RX XDP metadata From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, " =?utf-8?q?Toke_H=C3=B8iland-J=C3=B8rgensen?= " , Saeed Mahameed , David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net From: Toke Høiland-Jørgensen Support RX hash and timestamp metadata kfuncs. We need to pass in the cqe pointer to the mlx5e_skb_from* functions so it can be retrieved from the XDP ctx to do this. Cc: Saeed Mahameed Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Toke Høiland-Jørgensen Signed-off-by: Stanislav Fomichev --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 10 +++- .../net/ethernet/mellanox/mlx5/core/en/xdp.c | 23 +++++++++ .../net/ethernet/mellanox/mlx5/core/en/xdp.h | 5 ++ .../ethernet/mellanox/mlx5/core/en/xsk/rx.c | 10 ++++ .../ethernet/mellanox/mlx5/core/en/xsk/rx.h | 2 + .../net/ethernet/mellanox/mlx5/core/en_main.c | 6 +++ .../net/ethernet/mellanox/mlx5/core/en_rx.c | 51 ++++++++++--------- 7 files changed, 81 insertions(+), 26 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index be86a599df97..b3f7a3e5926c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -627,10 +627,11 @@ struct mlx5e_rq; typedef void (*mlx5e_fp_handle_rx_cqe)(struct mlx5e_rq*, struct mlx5_cqe64*); typedef struct sk_buff * (*mlx5e_fp_skb_from_cqe_mpwrq)(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, - u16 cqe_bcnt, u32 head_offset, u32 page_idx); + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, + u32 head_offset, u32 page_idx); typedef struct sk_buff * (*mlx5e_fp_skb_from_cqe)(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, - u32 cqe_bcnt); + struct mlx5_cqe64 *cqe, u32 cqe_bcnt); typedef bool (*mlx5e_fp_post_rx_wqes)(struct mlx5e_rq *rq); typedef void (*mlx5e_fp_dealloc_wqe)(struct mlx5e_rq*, u16); typedef void (*mlx5e_fp_shampo_dealloc_hd)(struct mlx5e_rq*, u16, u16, bool); @@ -1036,6 +1037,11 @@ int mlx5e_vlan_rx_kill_vid(struct net_device *dev, __always_unused __be16 proto, u16 vid); void mlx5e_timestamp_init(struct mlx5e_priv *priv); +static inline bool mlx5e_rx_hw_stamp(struct hwtstamp_config *config) +{ + return config->rx_filter == HWTSTAMP_FILTER_ALL; +} + struct mlx5e_xsk_param; struct mlx5e_rq_param; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c index 31bb6806bf5d..d10d31e12ba2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c @@ -156,6 +156,29 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, return true; } +int mlx5e_xdp_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp) +{ + const struct mlx5e_xdp_buff *_ctx = (void *)ctx; + + if (unlikely(!mlx5e_rx_hw_stamp(_ctx->rq->tstamp))) + return -EOPNOTSUPP; + + *timestamp = mlx5e_cqe_ts_to_ns(_ctx->rq->ptp_cyc2time, + _ctx->rq->clock, get_cqe_ts(_ctx->cqe)); + return 0; +} + +int mlx5e_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash) +{ + const struct mlx5e_xdp_buff *_ctx = (void *)ctx; + + if (unlikely(!(_ctx->xdp.rxq->dev->features & NETIF_F_RXHASH))) + return -EOPNOTSUPP; + + *hash = be32_to_cpu(_ctx->cqe->rss_hash_result); + return 0; +} + /* returns true if packet was consumed by xdp */ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page, struct bpf_prog *prog, struct mlx5e_xdp_buff *mxbuf) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h index 389818bf6833..cb568c62aba0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h @@ -46,6 +46,8 @@ struct mlx5e_xdp_buff { struct xdp_buff xdp; + struct mlx5_cqe64 *cqe; + struct mlx5e_rq *rq; }; struct mlx5e_xsk_param; @@ -60,6 +62,9 @@ void mlx5e_xdp_rx_poll_complete(struct mlx5e_rq *rq); int mlx5e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames, u32 flags); +int mlx5e_xdp_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp); +int mlx5e_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash); + INDIRECT_CALLABLE_DECLARE(bool mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, struct skb_shared_info *sinfo, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c index 9cff82d764e3..8bf3029abd3c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c @@ -49,6 +49,7 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) umr_wqe->inline_mtts[i] = (struct mlx5_mtt) { .ptag = cpu_to_be64(addr | MLX5_EN_WR), }; + wi->alloc_units[i].mxbuf->rq = rq; } } else if (unlikely(rq->mpwqe.umr_mode == MLX5E_MPWRQ_UMR_MODE_UNALIGNED)) { for (i = 0; i < batch; i++) { @@ -58,6 +59,7 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) .key = rq->mkey_be, .va = cpu_to_be64(addr), }; + wi->alloc_units[i].mxbuf->rq = rq; } } else if (likely(rq->mpwqe.umr_mode == MLX5E_MPWRQ_UMR_MODE_TRIPLE)) { u32 mapping_size = 1 << (rq->mpwqe.page_shift - 2); @@ -81,6 +83,7 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) .key = rq->mkey_be, .va = cpu_to_be64(rq->wqe_overflow.addr), }; + wi->alloc_units[i].mxbuf->rq = rq; } } else { __be32 pad_size = cpu_to_be32((1 << rq->mpwqe.page_shift) - @@ -100,6 +103,7 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) .va = cpu_to_be64(rq->wqe_overflow.addr), .bcount = pad_size, }; + wi->alloc_units[i].mxbuf->rq = rq; } } @@ -230,6 +234,7 @@ static struct sk_buff *mlx5e_xsk_construct_skb(struct mlx5e_rq *rq, struct xdp_b struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx) @@ -250,6 +255,8 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, */ WARN_ON_ONCE(head_offset); + /* mxbuf->rq is set on allocation, but cqe is per-packet so set it here */ + mxbuf->cqe = cqe; xsk_buff_set_size(&mxbuf->xdp, cqe_bcnt); xsk_buff_dma_sync_for_cpu(&mxbuf->xdp, rq->xsk_pool); net_prefetch(mxbuf->xdp.data); @@ -284,6 +291,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) { struct mlx5e_xdp_buff *mxbuf = wi->au->mxbuf; @@ -296,6 +304,8 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq, */ WARN_ON_ONCE(wi->offset); + /* mxbuf->rq is set on allocation, but cqe is per-packet so set it here */ + mxbuf->cqe = cqe; xsk_buff_set_size(&mxbuf->xdp, cqe_bcnt); xsk_buff_dma_sync_for_cpu(&mxbuf->xdp, rq->xsk_pool); net_prefetch(mxbuf->xdp.data); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h index 087c943bd8e9..cefc0ef6105d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h @@ -13,11 +13,13 @@ int mlx5e_xsk_alloc_rx_wqes_batched(struct mlx5e_rq *rq, u16 ix, int wqe_bulk); int mlx5e_xsk_alloc_rx_wqes(struct mlx5e_rq *rq, u16 ix, int wqe_bulk); struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx); struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, + struct mlx5_cqe64 *cqe, u32 cqe_bcnt); #endif /* __MLX5_EN_XSK_RX_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 217c8a478977..6395ab141273 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -4913,6 +4913,11 @@ const struct net_device_ops mlx5e_netdev_ops = { #endif }; +static const struct xdp_metadata_ops mlx5_xdp_metadata_ops = { + .xmo_rx_timestamp = mlx5e_xdp_rx_timestamp, + .xmo_rx_hash = mlx5e_xdp_rx_hash, +}; + static u32 mlx5e_choose_lro_timeout(struct mlx5_core_dev *mdev, u32 wanted_timeout) { int i; @@ -5053,6 +5058,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev) SET_NETDEV_DEV(netdev, mdev->device); netdev->netdev_ops = &mlx5e_netdev_ops; + netdev->xdp_metadata_ops = &mlx5_xdp_metadata_ops; mlx5e_dcbnl_build_netdev(netdev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index 9785b58ff3bc..cbdb9e6199a5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -62,10 +62,12 @@ static struct sk_buff * mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, - u16 cqe_bcnt, u32 head_offset, u32 page_idx); + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, + u32 page_idx); static struct sk_buff * mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, - u16 cqe_bcnt, u32 head_offset, u32 page_idx); + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, + u32 page_idx); static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe); static void mlx5e_handle_rx_cqe_mpwrq(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe); static void mlx5e_handle_rx_cqe_mpwrq_shampo(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe); @@ -76,11 +78,6 @@ const struct mlx5e_rx_handlers mlx5e_rx_handlers_nic = { .handle_rx_cqe_mpwqe_shampo = mlx5e_handle_rx_cqe_mpwrq_shampo, }; -static inline bool mlx5e_rx_hw_stamp(struct hwtstamp_config *config) -{ - return config->rx_filter == HWTSTAMP_FILTER_ALL; -} - static inline void mlx5e_read_cqe_slot(struct mlx5_cqwq *wq, u32 cqcc, void *data) { @@ -1564,16 +1561,19 @@ struct sk_buff *mlx5e_build_linear_skb(struct mlx5e_rq *rq, void *va, return skb; } -static void mlx5e_fill_xdp_buff(struct mlx5e_rq *rq, void *va, u16 headroom, - u32 len, struct mlx5e_xdp_buff *mxbuf) +static void mlx5e_fill_xdp_buff(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe, + void *va, u16 headroom, u32 len, + struct mlx5e_xdp_buff *mxbuf) { xdp_init_buff(&mxbuf->xdp, rq->buff.frame0_sz, &rq->xdp_rxq); xdp_prepare_buff(&mxbuf->xdp, va, headroom, len, true); + mxbuf->cqe = cqe; + mxbuf->rq = rq; } static struct sk_buff * mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, - u32 cqe_bcnt) + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) { union mlx5e_alloc_unit *au = wi->au; u16 rx_headroom = rq->buff.headroom; @@ -1598,7 +1598,7 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, struct mlx5e_xdp_buff mxbuf; net_prefetchw(va); /* xdp_frame data area */ - mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &mxbuf); + mlx5e_fill_xdp_buff(rq, cqe, va, rx_headroom, cqe_bcnt, &mxbuf); if (mlx5e_xdp_handle(rq, au->page, prog, &mxbuf)) return NULL; /* page/packet was consumed by XDP */ @@ -1619,7 +1619,7 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, static struct sk_buff * mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, - u32 cqe_bcnt) + struct mlx5_cqe64 *cqe, u32 cqe_bcnt) { struct mlx5e_rq_frag_info *frag_info = &rq->wqe.info.arr[0]; struct mlx5e_wqe_frag_info *head_wi = wi; @@ -1643,7 +1643,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi net_prefetchw(va); /* xdp_frame data area */ net_prefetch(va + rx_headroom); - mlx5e_fill_xdp_buff(rq, va, rx_headroom, frag_consumed_bytes, &mxbuf); + mlx5e_fill_xdp_buff(rq, cqe, va, rx_headroom, frag_consumed_bytes, &mxbuf); sinfo = xdp_get_shared_info_from_buff(&mxbuf.xdp); truesize = 0; @@ -1766,7 +1766,7 @@ static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) mlx5e_skb_from_cqe_linear, mlx5e_skb_from_cqe_nonlinear, mlx5e_xsk_skb_from_cqe_linear, - rq, wi, cqe_bcnt); + rq, wi, cqe, cqe_bcnt); if (!skb) { /* probably for XDP */ if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { @@ -1819,7 +1819,7 @@ static void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) skb = INDIRECT_CALL_2(rq->wqe.skb_from_cqe, mlx5e_skb_from_cqe_linear, mlx5e_skb_from_cqe_nonlinear, - rq, wi, cqe_bcnt); + rq, wi, cqe, cqe_bcnt); if (!skb) { /* probably for XDP */ if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { @@ -1878,7 +1878,7 @@ static void mlx5e_handle_rx_cqe_mpwrq_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 skb = INDIRECT_CALL_2(rq->mpwqe.skb_from_cqe_mpwrq, mlx5e_skb_from_cqe_mpwrq_linear, mlx5e_skb_from_cqe_mpwrq_nonlinear, - rq, wi, cqe_bcnt, head_offset, page_idx); + rq, wi, cqe, cqe_bcnt, head_offset, page_idx); if (!skb) goto mpwrq_cqe_out; @@ -1929,7 +1929,8 @@ mlx5e_fill_skb_data(struct sk_buff *skb, struct mlx5e_rq *rq, static struct sk_buff * mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, - u16 cqe_bcnt, u32 head_offset, u32 page_idx) + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, + u32 page_idx) { union mlx5e_alloc_unit *au = &wi->alloc_units[page_idx]; u16 headlen = min_t(u16, MLX5E_RX_MAX_HEAD, cqe_bcnt); @@ -1968,7 +1969,8 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w static struct sk_buff * mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, - u16 cqe_bcnt, u32 head_offset, u32 page_idx) + struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, + u32 page_idx) { union mlx5e_alloc_unit *au = &wi->alloc_units[page_idx]; u16 rx_headroom = rq->buff.headroom; @@ -1999,7 +2001,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, struct mlx5e_xdp_buff mxbuf; net_prefetchw(va); /* xdp_frame data area */ - mlx5e_fill_xdp_buff(rq, va, rx_headroom, cqe_bcnt, &mxbuf); + mlx5e_fill_xdp_buff(rq, cqe, va, rx_headroom, cqe_bcnt, &mxbuf); if (mlx5e_xdp_handle(rq, au->page, prog, &mxbuf)) { if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) __set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */ @@ -2163,8 +2165,8 @@ static void mlx5e_handle_rx_cqe_mpwrq_shampo(struct mlx5e_rq *rq, struct mlx5_cq if (likely(head_size)) *skb = mlx5e_skb_from_cqe_shampo(rq, wi, cqe, header_index); else - *skb = mlx5e_skb_from_cqe_mpwrq_nonlinear(rq, wi, cqe_bcnt, data_offset, - page_idx); + *skb = mlx5e_skb_from_cqe_mpwrq_nonlinear(rq, wi, cqe, cqe_bcnt, + data_offset, page_idx); if (unlikely(!*skb)) goto free_hd_entry; @@ -2238,7 +2240,8 @@ static void mlx5e_handle_rx_cqe_mpwrq(struct mlx5e_rq *rq, struct mlx5_cqe64 *cq mlx5e_skb_from_cqe_mpwrq_linear, mlx5e_skb_from_cqe_mpwrq_nonlinear, mlx5e_xsk_skb_from_cqe_mpwrq_linear, - rq, wi, cqe_bcnt, head_offset, page_idx); + rq, wi, cqe, cqe_bcnt, head_offset, + page_idx); if (!skb) goto mpwrq_cqe_out; @@ -2483,7 +2486,7 @@ static void mlx5i_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) skb = INDIRECT_CALL_2(rq->wqe.skb_from_cqe, mlx5e_skb_from_cqe_linear, mlx5e_skb_from_cqe_nonlinear, - rq, wi, cqe_bcnt); + rq, wi, cqe, cqe_bcnt); if (!skb) goto wq_free_wqe; @@ -2575,7 +2578,7 @@ static void mlx5e_trap_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe goto free_wqe; } - skb = mlx5e_skb_from_cqe_nonlinear(rq, wi, cqe_bcnt); + skb = mlx5e_skb_from_cqe_nonlinear(rq, wi, cqe, cqe_bcnt); if (!skb) goto free_wqe; From patchwork Tue Dec 13 02:36:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 13071663 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 711C3C4708D for ; Tue, 13 Dec 2022 02:37:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234367AbiLMChp (ORCPT ); Mon, 12 Dec 2022 21:37:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46586 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234314AbiLMChB (ORCPT ); Mon, 12 Dec 2022 21:37:01 -0500 Received: from mail-pf1-x449.google.com (mail-pf1-x449.google.com [IPv6:2607:f8b0:4864:20::449]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D4471A04A for ; Mon, 12 Dec 2022 18:36:33 -0800 (PST) Received: by mail-pf1-x449.google.com with SMTP id f14-20020a056a0022ce00b005782e3b4704so1133230pfj.4 for ; Mon, 12 Dec 2022 18:36:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=0m6R6fzsh/fm8sAJzuMe3pI5P5q7by5MUsxZ9GF5kAs=; b=cFq8A/ecb6rOXem+ASLqGjmJMRwTmkOiVwIf28tAfKiuqAmaW9CD69cYpoHwh1pJAw 2AFehOyZXaIf8eCm8gQXGQoMCmOuDEBXXlht4wg/ESAaey+MZxlrIOnerKYSXtKWPuCn /eJNr+oepg0L2PMhYq3KmmjDPSbnp51DxBHpLhvFf6DeoxQ8izXg/QVJq97JahuShPT1 VqDd8yFOLfAwpYEwfFRYUPLCe4tKufZe7+kLqrBSvEsPqhe+LKVcBEzFwGztRnGJoDFY pk5HhWrkB+Wg+5JzvTjnVbCSEUxfae+B1uJrXwudCg+F90t+9xiHmOzMjd7uTJJPPkaA FCXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=0m6R6fzsh/fm8sAJzuMe3pI5P5q7by5MUsxZ9GF5kAs=; b=xwrYLyDWoiuYYBLz5A20fNF3MM6h920kEW9ETMBnirjTH8Qq8iT1fcq3vt4V2byPTm 0S3FLBvx6Lpp4uFyAzjfioANduYOFhN1hbWmzHFvgkMhgyZ01xty3w0nDPeMm9blx1wX HAaxu8rnmXlnNfKah3LGIPnKQBzbgRwlim0Qq0tZhRxtzORdq5Fqw9wHBY5XHAICLdmK PV2m3Er93F7YOZyXL74ghabSje65jOYS6vXyuGWXR1R36RdXx5eAVBcLQ2bB2yOgs+mO EXjBHK0swww7pIr/tVOL8K4S7/qNp7+Iwk6uFYOsfQmVK0enmqyszxZz/zF3Jo9TtKFb XN7w== X-Gm-Message-State: ANoB5pnZ7F+6ivoegHWZOrsT5eb8PlvBLc2AtJ4pxy+dq9J8NDAtdJFj FbVOs5cmVJfT9U818T8MYu5lhY4= X-Google-Smtp-Source: AA0mqf71DlvCiw9ZbG2Y+Xn16ubvfY+EQY3dKNagEPJq2GP+1rKY87Gwjx0uOU7vGYABV/0elbM9s2o= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a17:902:82c8:b0:189:f390:61d5 with SMTP id u8-20020a17090282c800b00189f39061d5mr8409273plz.141.1670898992507; Mon, 12 Dec 2022 18:36:32 -0800 (PST) Date: Mon, 12 Dec 2022 18:36:05 -0800 In-Reply-To: <20221213023605.737383-1-sdf@google.com> Mime-Version: 1.0 References: <20221213023605.737383-1-sdf@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213023605.737383-16-sdf@google.com> Subject: [PATCH bpf-next v4 15/15] selftests/bpf: Simple program to dump XDP RX metadata From: Stanislav Fomichev To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net To be used for verification of driver implementations. Note that the skb path is gone from the series, but I'm still keeping the implementation for any possible future work. $ xdp_hw_metadata On the other machine: $ echo -n xdp | nc -u -q1 9091 # for AF_XDP $ echo -n skb | nc -u -q1 9092 # for skb Sample output: # xdp xsk_ring_cons__peek: 1 0x19f9090: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000 rx_timestamp_supported: 1 rx_timestamp: 1667850075063948829 0x19f9090: complete idx=8 addr=8000 # skb found skb hwtstamp = 1668314052.854274681 Decoding: # xdp rx_timestamp=1667850075.063948829 $ date -d @1667850075 Mon Nov 7 11:41:15 AM PST 2022 $ date Mon Nov 7 11:42:05 AM PST 2022 # skb $ date -d @1668314052 Sat Nov 12 08:34:12 PM PST 2022 $ date Sat Nov 12 08:37:06 PM PST 2022 Cc: John Fastabend Cc: David Ahern Cc: Martin KaFai Lau Cc: Jakub Kicinski Cc: Willem de Bruijn Cc: Jesper Dangaard Brouer Cc: Anatoly Burakov Cc: Alexander Lobakin Cc: Magnus Karlsson Cc: Maryam Tahhan Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev --- tools/testing/selftests/bpf/.gitignore | 1 + tools/testing/selftests/bpf/Makefile | 6 +- .../selftests/bpf/progs/xdp_hw_metadata.c | 81 ++++ tools/testing/selftests/bpf/xdp_hw_metadata.c | 405 ++++++++++++++++++ 4 files changed, 492 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/bpf/progs/xdp_hw_metadata.c create mode 100644 tools/testing/selftests/bpf/xdp_hw_metadata.c diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore index 07d2d0a8c5cb..01e3baeefd4f 100644 --- a/tools/testing/selftests/bpf/.gitignore +++ b/tools/testing/selftests/bpf/.gitignore @@ -46,3 +46,4 @@ test_cpp xskxceiver xdp_redirect_multi xdp_synproxy +xdp_hw_metadata diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index e6cbc04a7920..b7d5d3aa554e 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -83,7 +83,7 @@ TEST_PROGS_EXTENDED := with_addr.sh \ TEST_GEN_PROGS_EXTENDED = test_sock_addr test_skb_cgroup_id_user \ flow_dissector_load test_flow_dissector test_tcp_check_syncookie_user \ test_lirc_mode2_user xdping test_cpp runqslower bench bpf_testmod.ko \ - xskxceiver xdp_redirect_multi xdp_synproxy veristat + xskxceiver xdp_redirect_multi xdp_synproxy veristat xdp_hw_metadata TEST_CUSTOM_PROGS = $(OUTPUT)/urandom_read $(OUTPUT)/sign-file TEST_GEN_FILES += liburandom_read.so @@ -241,6 +241,9 @@ $(OUTPUT)/test_maps: $(TESTING_HELPERS) $(OUTPUT)/test_verifier: $(TESTING_HELPERS) $(CAP_HELPERS) $(OUTPUT)/xsk.o: $(BPFOBJ) $(OUTPUT)/xskxceiver: $(OUTPUT)/xsk.o +$(OUTPUT)/xdp_hw_metadata: $(OUTPUT)/xsk.o $(OUTPUT)/xdp_hw_metadata.skel.h +$(OUTPUT)/xdp_hw_metadata: $(OUTPUT)/network_helpers.o +$(OUTPUT)/xdp_hw_metadata: LDFLAGS += -static BPFTOOL ?= $(DEFAULT_BPFTOOL) $(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \ @@ -383,6 +386,7 @@ linked_maps.skel.h-deps := linked_maps1.bpf.o linked_maps2.bpf.o test_subskeleton.skel.h-deps := test_subskeleton_lib2.bpf.o test_subskeleton_lib.bpf.o test_subskeleton.bpf.o test_subskeleton_lib.skel.h-deps := test_subskeleton_lib2.bpf.o test_subskeleton_lib.bpf.o test_usdt.skel.h-deps := test_usdt.bpf.o test_usdt_multispec.bpf.o +xdp_hw_metadata.skel.h-deps := xdp_hw_metadata.bpf.o LINKED_BPF_SRCS := $(patsubst %.bpf.o,%.c,$(foreach skel,$(LINKED_SKELS),$($(skel)-deps))) diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c new file mode 100644 index 000000000000..25b8178735ee --- /dev/null +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c @@ -0,0 +1,81 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include "xdp_metadata.h" +#include +#include + +struct { + __uint(type, BPF_MAP_TYPE_XSKMAP); + __uint(max_entries, 256); + __type(key, __u32); + __type(value, __u32); +} xsk SEC(".maps"); + +extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, + __u64 *timestamp) __ksym; +extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, + __u32 *hash) __ksym; + +SEC("xdp") +int rx(struct xdp_md *ctx) +{ + void *data, *data_meta, *data_end; + struct ipv6hdr *ip6h = NULL; + struct ethhdr *eth = NULL; + struct udphdr *udp = NULL; + struct iphdr *iph = NULL; + struct xdp_meta *meta; + int ret; + + data = (void *)(long)ctx->data; + data_end = (void *)(long)ctx->data_end; + eth = data; + if (eth + 1 < data_end) { + if (eth->h_proto == bpf_htons(ETH_P_IP)) { + iph = (void *)(eth + 1); + if (iph + 1 < data_end && iph->protocol == IPPROTO_UDP) + udp = (void *)(iph + 1); + } + if (eth->h_proto == bpf_htons(ETH_P_IPV6)) { + ip6h = (void *)(eth + 1); + if (ip6h + 1 < data_end && ip6h->nexthdr == IPPROTO_UDP) + udp = (void *)(ip6h + 1); + } + if (udp && udp + 1 > data_end) + udp = NULL; + } + + if (!udp) + return XDP_PASS; + + if (udp->dest != bpf_htons(9091)) + return XDP_PASS; + + bpf_printk("forwarding UDP:9091 to AF_XDP"); + + ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct xdp_meta)); + if (ret != 0) { + bpf_printk("bpf_xdp_adjust_meta returned %d", ret); + return XDP_PASS; + } + + data = (void *)(long)ctx->data; + data_meta = (void *)(long)ctx->data_meta; + meta = data_meta; + + if (meta + 1 > data) { + bpf_printk("bpf_xdp_adjust_meta doesn't appear to work"); + return XDP_PASS; + } + + if (!bpf_xdp_metadata_rx_timestamp(ctx, &meta->rx_timestamp)) + bpf_printk("populated rx_timestamp with %u", meta->rx_timestamp); + + if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash)) + bpf_printk("populated rx_hash with %u", meta->rx_hash); + + return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS); +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c new file mode 100644 index 000000000000..b8b7600dc275 --- /dev/null +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c @@ -0,0 +1,405 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* Reference program for verifying XDP metadata on real HW. Functional test + * only, doesn't test the performance. + * + * RX: + * - UDP 9091 packets are diverted into AF_XDP + * - Metadata verified: + * - rx_timestamp + * - rx_hash + * + * TX: + * - TBD + */ + +#include +#include +#include "xdp_hw_metadata.skel.h" +#include "xsk.h" + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "xdp_metadata.h" + +#define UMEM_NUM 16 +#define UMEM_FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE +#define UMEM_SIZE (UMEM_FRAME_SIZE * UMEM_NUM) +#define XDP_FLAGS (XDP_FLAGS_DRV_MODE | XDP_FLAGS_REPLACE) + +struct xsk { + void *umem_area; + struct xsk_umem *umem; + struct xsk_ring_prod fill; + struct xsk_ring_cons comp; + struct xsk_ring_prod tx; + struct xsk_ring_cons rx; + struct xsk_socket *socket; +}; + +struct xdp_hw_metadata *bpf_obj; +struct xsk *rx_xsk; +const char *ifname; +int ifindex; +int rxq; + +void test__fail(void) { /* for network_helpers.c */ } + +static int open_xsk(const char *ifname, struct xsk *xsk, __u32 queue_id) +{ + int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE; + const struct xsk_socket_config socket_config = { + .rx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, + .tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, + .libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD, + .xdp_flags = XDP_FLAGS, + .bind_flags = XDP_COPY, + }; + const struct xsk_umem_config umem_config = { + .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, + .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, + .frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE, + .flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG, + }; + __u32 idx; + u64 addr; + int ret; + int i; + + xsk->umem_area = mmap(NULL, UMEM_SIZE, PROT_READ | PROT_WRITE, mmap_flags, -1, 0); + if (xsk->umem_area == MAP_FAILED) + return -ENOMEM; + + ret = xsk_umem__create(&xsk->umem, + xsk->umem_area, UMEM_SIZE, + &xsk->fill, + &xsk->comp, + &umem_config); + if (ret) + return ret; + + ret = xsk_socket__create(&xsk->socket, ifname, queue_id, + xsk->umem, + &xsk->rx, + &xsk->tx, + &socket_config); + if (ret) + return ret; + + /* First half of umem is for TX. This way address matches 1-to-1 + * to the completion queue index. + */ + + for (i = 0; i < UMEM_NUM / 2; i++) { + addr = i * UMEM_FRAME_SIZE; + printf("%p: tx_desc[%d] -> %lx\n", xsk, i, addr); + } + + /* Second half of umem is for RX. */ + + ret = xsk_ring_prod__reserve(&xsk->fill, UMEM_NUM / 2, &idx); + for (i = 0; i < UMEM_NUM / 2; i++) { + addr = (UMEM_NUM / 2 + i) * UMEM_FRAME_SIZE; + printf("%p: rx_desc[%d] -> %lx\n", xsk, i, addr); + *xsk_ring_prod__fill_addr(&xsk->fill, i) = addr; + } + xsk_ring_prod__submit(&xsk->fill, ret); + + return 0; +} + +static void close_xsk(struct xsk *xsk) +{ + if (xsk->umem) + xsk_umem__delete(xsk->umem); + if (xsk->socket) + xsk_socket__delete(xsk->socket); + munmap(xsk->umem, UMEM_SIZE); +} + +static void refill_rx(struct xsk *xsk, __u64 addr) +{ + __u32 idx; + + if (xsk_ring_prod__reserve(&xsk->fill, 1, &idx) == 1) { + printf("%p: complete idx=%u addr=%llx\n", xsk, idx, addr); + *xsk_ring_prod__fill_addr(&xsk->fill, idx) = addr; + xsk_ring_prod__submit(&xsk->fill, 1); + } +} + +static void verify_xdp_metadata(void *data) +{ + struct xdp_meta *meta; + + meta = data - sizeof(*meta); + + printf("rx_timestamp: %llu\n", meta->rx_timestamp); + printf("rx_hash: %u\n", meta->rx_hash); +} + +static void verify_skb_metadata(int fd) +{ + char cmsg_buf[1024]; + char packet_buf[128]; + + struct scm_timestamping *ts; + struct iovec packet_iov; + struct cmsghdr *cmsg; + struct msghdr hdr; + + memset(&hdr, 0, sizeof(hdr)); + hdr.msg_iov = &packet_iov; + hdr.msg_iovlen = 1; + packet_iov.iov_base = packet_buf; + packet_iov.iov_len = sizeof(packet_buf); + + hdr.msg_control = cmsg_buf; + hdr.msg_controllen = sizeof(cmsg_buf); + + if (recvmsg(fd, &hdr, 0) < 0) + error(-1, errno, "recvmsg"); + + for (cmsg = CMSG_FIRSTHDR(&hdr); cmsg != NULL; + cmsg = CMSG_NXTHDR(&hdr, cmsg)) { + + if (cmsg->cmsg_level != SOL_SOCKET) + continue; + + switch (cmsg->cmsg_type) { + case SCM_TIMESTAMPING: + ts = (struct scm_timestamping *)CMSG_DATA(cmsg); + if (ts->ts[2].tv_sec || ts->ts[2].tv_nsec) { + printf("found skb hwtstamp = %lu.%lu\n", + ts->ts[2].tv_sec, ts->ts[2].tv_nsec); + return; + } + break; + default: + break; + } + } + + printf("skb hwtstamp is not found!\n"); +} + +static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd) +{ + const struct xdp_desc *rx_desc; + struct pollfd fds[rxq + 1]; + __u64 comp_addr; + __u64 addr; + __u32 idx; + int ret; + int i; + + for (i = 0; i < rxq; i++) { + fds[i].fd = xsk_socket__fd(rx_xsk[i].socket); + fds[i].events = POLLIN; + fds[i].revents = 0; + } + + fds[rxq].fd = server_fd; + fds[rxq].events = POLLIN; + fds[rxq].revents = 0; + + while (true) { + errno = 0; + ret = poll(fds, rxq + 1, 1000); + printf("poll: %d (%d)\n", ret, errno); + if (ret < 0) + break; + if (ret == 0) + continue; + + if (fds[rxq].revents) + verify_skb_metadata(server_fd); + + for (i = 0; i < rxq; i++) { + if (fds[i].revents == 0) + continue; + + struct xsk *xsk = &rx_xsk[i]; + + ret = xsk_ring_cons__peek(&xsk->rx, 1, &idx); + printf("xsk_ring_cons__peek: %d\n", ret); + if (ret != 1) + continue; + + rx_desc = xsk_ring_cons__rx_desc(&xsk->rx, idx); + comp_addr = xsk_umem__extract_addr(rx_desc->addr); + addr = xsk_umem__add_offset_to_addr(rx_desc->addr); + printf("%p: rx_desc[%u]->addr=%llx addr=%llx comp_addr=%llx\n", + xsk, idx, rx_desc->addr, addr, comp_addr); + verify_xdp_metadata(xsk_umem__get_data(xsk->umem_area, addr)); + xsk_ring_cons__release(&xsk->rx, 1); + refill_rx(xsk, comp_addr); + } + } + + return 0; +} + +struct ethtool_channels { + __u32 cmd; + __u32 max_rx; + __u32 max_tx; + __u32 max_other; + __u32 max_combined; + __u32 rx_count; + __u32 tx_count; + __u32 other_count; + __u32 combined_count; +}; + +#define ETHTOOL_GCHANNELS 0x0000003c /* Get no of channels */ + +static int rxq_num(const char *ifname) +{ + struct ethtool_channels ch = { + .cmd = ETHTOOL_GCHANNELS, + }; + + struct ifreq ifr = { + .ifr_data = (void *)&ch, + }; + strcpy(ifr.ifr_name, ifname); + int fd, ret; + + fd = socket(AF_UNIX, SOCK_DGRAM, 0); + if (fd < 0) + error(-1, errno, "socket"); + + ret = ioctl(fd, SIOCETHTOOL, &ifr); + if (ret < 0) + error(-1, errno, "socket"); + + close(fd); + + return ch.rx_count + ch.combined_count; +} + +static void cleanup(void) +{ + LIBBPF_OPTS(bpf_xdp_attach_opts, opts); + int ret; + int i; + + if (bpf_obj) { + opts.old_prog_fd = bpf_program__fd(bpf_obj->progs.rx); + if (opts.old_prog_fd >= 0) { + printf("detaching bpf program....\n"); + ret = bpf_xdp_detach(ifindex, XDP_FLAGS, &opts); + if (ret) + printf("failed to detach XDP program: %d\n", ret); + } + } + + for (i = 0; i < rxq; i++) + close_xsk(&rx_xsk[i]); + + if (bpf_obj) + xdp_hw_metadata__destroy(bpf_obj); +} + +static void handle_signal(int sig) +{ + /* interrupting poll() is all we need */ +} + +static void timestamping_enable(int fd, int val) +{ + int ret; + + ret = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val)); + if (ret < 0) + error(-1, errno, "setsockopt(SO_TIMESTAMPING)"); +} + +int main(int argc, char *argv[]) +{ + int server_fd = -1; + int ret; + int i; + + struct bpf_program *prog; + + if (argc != 2) { + fprintf(stderr, "pass device name\n"); + return -1; + } + + ifname = argv[1]; + ifindex = if_nametoindex(ifname); + rxq = rxq_num(ifname); + + printf("rxq: %d\n", rxq); + + rx_xsk = malloc(sizeof(struct xsk) * rxq); + if (!rx_xsk) + error(-1, ENOMEM, "malloc"); + + for (i = 0; i < rxq; i++) { + printf("open_xsk(%s, %p, %d)\n", ifname, &rx_xsk[i], i); + ret = open_xsk(ifname, &rx_xsk[i], i); + if (ret) + error(-1, -ret, "open_xsk"); + + printf("xsk_socket__fd() -> %d\n", xsk_socket__fd(rx_xsk[i].socket)); + } + + printf("open bpf program...\n"); + bpf_obj = xdp_hw_metadata__open(); + if (libbpf_get_error(bpf_obj)) + error(-1, libbpf_get_error(bpf_obj), "xdp_hw_metadata__open"); + + prog = bpf_object__find_program_by_name(bpf_obj->obj, "rx"); + bpf_program__set_ifindex(prog, ifindex); + bpf_program__set_flags(prog, BPF_F_XDP_DEV_BOUND_ONLY); + + printf("load bpf program...\n"); + ret = xdp_hw_metadata__load(bpf_obj); + if (ret) + error(-1, -ret, "xdp_hw_metadata__load"); + + printf("prepare skb endpoint...\n"); + server_fd = start_server(AF_INET6, SOCK_DGRAM, NULL, 9092, 1000); + if (server_fd < 0) + error(-1, errno, "start_server"); + timestamping_enable(server_fd, + SOF_TIMESTAMPING_SOFTWARE | + SOF_TIMESTAMPING_RAW_HARDWARE); + + printf("prepare xsk map...\n"); + for (i = 0; i < rxq; i++) { + int sock_fd = xsk_socket__fd(rx_xsk[i].socket); + __u32 queue_id = i; + + printf("map[%d] = %d\n", queue_id, sock_fd); + ret = bpf_map_update_elem(bpf_map__fd(bpf_obj->maps.xsk), &queue_id, &sock_fd, 0); + if (ret) + error(-1, -ret, "bpf_map_update_elem"); + } + + printf("attach bpf program...\n"); + ret = bpf_xdp_attach(ifindex, + bpf_program__fd(bpf_obj->progs.rx), + XDP_FLAGS, NULL); + if (ret) + error(-1, -ret, "bpf_xdp_attach"); + + signal(SIGINT, handle_signal); + ret = verify_metadata(rx_xsk, rxq, server_fd); + close(server_fd); + cleanup(); + if (ret) + error(-1, -ret, "verify_metadata"); +}