From patchwork Wed Jul 21 21:28:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Kumar Kartikeya Dwivedi X-Patchwork-Id: 12392239 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F55BC6377B for ; Wed, 21 Jul 2021 21:28:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 82A0961208 for ; Wed, 21 Jul 2021 21:28:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229793AbhGUUsJ (ORCPT ); Wed, 21 Jul 2021 16:48:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229830AbhGUUsI (ORCPT ); Wed, 21 Jul 2021 16:48:08 -0400 Received: from mail-pj1-x1043.google.com (mail-pj1-x1043.google.com [IPv6:2607:f8b0:4864:20::1043]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A4030C061757; Wed, 21 Jul 2021 14:28:43 -0700 (PDT) Received: by mail-pj1-x1043.google.com with SMTP id gx2so3207801pjb.5; Wed, 21 Jul 2021 14:28:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=iHMFoUx4ic/3jpkMbk5VC7m6rTgETqO4IDnqIqQNaMU=; b=u4b8cFGiBT7S0dHtJTzrwvScA958hkHWY1RE92eKNIeFCdxPuQ2tySuwA7LRhahDyU xdNn91oPX25suJ2H5Bidqfx6pT4cBSzxcUblnTTuNb3j5WXKBhmhMKmmOuR/XoSxtwAs nTJ+MDGQObxLFJn/UeJE5Prmw4htaBluwTNEbhPeU4NM2P08tbzi1/m/OPGuRwtShW2U 6PzpLbPADKg06gQYQH1wLS6SLhFRPChmrqznQM/yZcCTbwyOLGy2Ln93sQl29qZ60TSn iJIBjEXFtzmxXdoEXv3HTAabl5sESu9Y+jI09NQEmWrgLSVOdsBuA1raziZgFpDjpqQG egHg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=iHMFoUx4ic/3jpkMbk5VC7m6rTgETqO4IDnqIqQNaMU=; b=Mp0HKdjShfsJSLn0ibCQFdB45835bJ5Tw/DfS/23xdkwM+RrdQpAk657Xrkm+n5I+x AAx0Gc2lRY/zTAM3XRtJQaSsmFe1L+nLdDrIoT6LY299Zuafm2J1IzNdN0NONZlevqY1 LC5ZhC69GtXVcVbWbX+DlI5dl+pfmPwC71eq7jQUGRDlhGRN7BH+S4OE7cZzTp2eOxFy a4fAWKXxSNM4wGd5ukaZ49c7VmXTGyCOJkcc8CG7ywVgQdyGh8piyslRkQkRU8zOmgcU cba/fEfhzG/LbbFUu+VLMcTc9h417uiAMj0490QgqQo6/XUauMnGvSggEVIE9uXaJ/7a pfDA== X-Gm-Message-State: AOAM532gAGGTVaEhqzodzHZlC/dDtqEsO6BbtxU6X6HbAQF1mp7mRZ3V p/6LCH3Jf8/ZiFAXtMc/2sVjSk1MopDa7Q== X-Google-Smtp-Source: ABdhPJzH0/WDavYIB1SHJ4Qfs95YzN0vXGRCp0gwelkwTFxodwhGwarYCQ1S9dTNy9/6fnYNXGv4Og== X-Received: by 2002:a63:c041:: with SMTP id z1mr38400451pgi.49.1626902922944; Wed, 21 Jul 2021 14:28:42 -0700 (PDT) Received: from localhost ([2405:201:6014:d0bb:dc30:f309:2f53:5818]) by smtp.gmail.com with ESMTPSA id n12sm29800433pgr.2.2021.07.21.14.28.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Jul 2021 14:28:42 -0700 (PDT) From: Kumar Kartikeya Dwivedi To: bpf@vger.kernel.org Cc: Kumar Kartikeya Dwivedi , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Jesper Dangaard Brouer , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , netdev@vger.kernel.org Subject: [PATCH bpf-next v2 1/8] samples: bpf: fix a couple of warnings Date: Thu, 22 Jul 2021 02:58:26 +0530 Message-Id: <20210721212833.701342-2-memxor@gmail.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210721212833.701342-1-memxor@gmail.com> References: <20210721212833.701342-1-memxor@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net cookie_uid_helper_example.c: In function ‘main’: cookie_uid_helper_example.c:178:69: warning: ‘ -j ACCEPT’ directive writing 10 bytes into a region of size between 8 and 58 [-Wformat-overflow=] 178 | sprintf(rules, "iptables -A OUTPUT -m bpf --object-pinned %s -j ACCEPT", | ^~~~~~~~~~ /home/kkd/src/linux/samples/bpf/cookie_uid_helper_example.c:178:9: note: ‘sprintf’ output between 53 and 103 bytes into a destination of size 100 178 | sprintf(rules, "iptables -A OUTPUT -m bpf --object-pinned %s -j ACCEPT", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 179 | file); | ~~~~~ Fix by using snprintf and a sufficiently sized buffer. tracex4_user.c:35:15: warning: ‘write’ reading 12 bytes from a region of size 11 [-Wstringop-overread] 35 | key = write(1, "\e[1;1H\e[2J", 12); /* clear screen */ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ Use size as 11. Signed-off-by: Kumar Kartikeya Dwivedi --- samples/bpf/cookie_uid_helper_example.c | 12 +++++++++--- samples/bpf/tracex4_user.c | 2 +- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/samples/bpf/cookie_uid_helper_example.c b/samples/bpf/cookie_uid_helper_example.c index cc3bce8d3aac..30fdcd664da2 100644 --- a/samples/bpf/cookie_uid_helper_example.c +++ b/samples/bpf/cookie_uid_helper_example.c @@ -1,3 +1,4 @@ + /* This test is a demo of using get_socket_uid and get_socket_cookie * helper function to do per socket based network traffic monitoring. * It requires iptables version higher then 1.6.1. to load pinned eBPF @@ -167,7 +168,7 @@ static void prog_load(void) static void prog_attach_iptables(char *file) { int ret; - char rules[100]; + char rules[256]; if (bpf_obj_pin(prog_fd, file)) error(1, errno, "bpf_obj_pin"); @@ -175,8 +176,13 @@ static void prog_attach_iptables(char *file) printf("file path too long: %s\n", file); exit(1); } - sprintf(rules, "iptables -A OUTPUT -m bpf --object-pinned %s -j ACCEPT", - file); + ret = snprintf(rules, sizeof(rules), + "iptables -A OUTPUT -m bpf --object-pinned %s -j ACCEPT", + file); + if (ret < 0 || ret >= sizeof(rules)) { + printf("error constructing iptables command\n"); + exit(1); + } ret = system(rules); if (ret < 0) { printf("iptables rule update failed: %d/n", WEXITSTATUS(ret)); diff --git a/samples/bpf/tracex4_user.c b/samples/bpf/tracex4_user.c index cea399424bca..566e6440e8c2 100644 --- a/samples/bpf/tracex4_user.c +++ b/samples/bpf/tracex4_user.c @@ -32,7 +32,7 @@ static void print_old_objects(int fd) __u64 key, next_key; struct pair v; - key = write(1, "\e[1;1H\e[2J", 12); /* clear screen */ + key = write(1, "\e[1;1H\e[2J", 11); /* clear screen */ key = -1; while (bpf_map_get_next_key(fd, &key, &next_key) == 0) { From patchwork Wed Jul 21 21:28:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kumar Kartikeya Dwivedi X-Patchwork-Id: 12392241 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 51F26C6377A for ; Wed, 21 Jul 2021 21:28:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2FCCE6120D for ; Wed, 21 Jul 2021 21:28:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229862AbhGUUsM (ORCPT ); Wed, 21 Jul 2021 16:48:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40446 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229865AbhGUUsL (ORCPT ); Wed, 21 Jul 2021 16:48:11 -0400 Received: from mail-pj1-x1042.google.com (mail-pj1-x1042.google.com [IPv6:2607:f8b0:4864:20::1042]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3482CC061575; Wed, 21 Jul 2021 14:28:47 -0700 (PDT) Received: by mail-pj1-x1042.google.com with SMTP id jx7-20020a17090b46c7b02901757deaf2c8so2581069pjb.0; Wed, 21 Jul 2021 14:28:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=MVzerbGli6klnpw5XV3m99EM5Yhx5rb7oVN/1EafHak=; b=olXGjLa4wqcEyAhsWRfRiTrY1GHxbsuYtvhT+TVCzymJ2l71g/9d+YrA6a8AtWZ6jM 3+sKWLOYiZV7T/FuPHqaor/FxYR2FUnfgayFwpSIMl1HJElfqhTWVdqL3XW8kVFphG+2 rpMDTV4eVukL2HTBJeEYUTA2GT7aMqYGH7Jzm3t+JbMWz3cW0xiJ87Z4F1Oj8FLTelI6 a/CsbMVocPJ9AgotuDSSvmvHQFd0ilJflXFQWnD700wd8oZ4Ed9tRQkxqbm2W+Qb6qdp 839yywE4pc3T6+Hs77f7O/Tuy6kZ23grqnw73fae4Fkk5yhDucRN7CrU4p9GfD+9GvfL Vu6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=MVzerbGli6klnpw5XV3m99EM5Yhx5rb7oVN/1EafHak=; b=q9G8+ue8Ogr2mNK7MJejTIRf1VbxulQ0AKQdgS+7ra/6z04ZG3aNKmGoD46YuYYX3a xyT3QMtWenHZ5VoPqpoyYF874A+Gf8V4L9HOfqAQ3tnqVglcS8XRgm7R/IAqKePLFkzW AovepjVaN7z4QJQFWqDsV90ZERkq4nsQm/2/X2vpZ8v+B+B9Epdv+khhCuNmzaSNcetc aZGy1KVyJ4fjAOKEhUqdv68ScdMJLeKlqyAb9TlhlCvZwpqV8Qb0vIdkhMkqMmubeSdR ePe4pnjBkOHpRqflWYxgibdFNDDy27w85NmxWd5y/+xDScBV2igmvxj3D1Axac5Y54JP AOJQ== X-Gm-Message-State: AOAM533i8N4f3vwRKkhah9t+R9P82Lq75l7OCJ3Di17+Bi6t9E93+wV9 V4AwzsdVnbpmYZf+MSf1bn1jArIu/I8jIA== X-Google-Smtp-Source: ABdhPJx96gCUMV4CUKPCXtIKP/AHdSTGAQg3DjK4orimLXP63YPIoYfGI94/9kRaxHAQeK3Roq+VsQ== X-Received: by 2002:a17:90a:2906:: with SMTP id g6mr36633469pjd.100.1626902926345; Wed, 21 Jul 2021 14:28:46 -0700 (PDT) Received: from localhost ([2405:201:6014:d0bb:dc30:f309:2f53:5818]) by smtp.gmail.com with ESMTPSA id j2sm29285708pfb.53.2021.07.21.14.28.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Jul 2021 14:28:46 -0700 (PDT) From: Kumar Kartikeya Dwivedi To: bpf@vger.kernel.org Cc: Kumar Kartikeya Dwivedi , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Jesper Dangaard Brouer , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , netdev@vger.kernel.org Subject: [PATCH bpf-next v2 2/8] samples: bpf: Add common infrastructure for XDP samples Date: Thu, 22 Jul 2021 02:58:27 +0530 Message-Id: <20210721212833.701342-3-memxor@gmail.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210721212833.701342-1-memxor@gmail.com> References: <20210721212833.701342-1-memxor@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net This file implements some common helpers to consolidate differences in features and functionality between the various XDP samples and give them a consistent look, feel, and reporting capabilities. Some of the key features are: * A concise output format accompanied by helpful text explaining its fields. * An elaborate output format building upon the concise one, and folding out details in case of errors and staying out of view otherwise. * Extended reporting of redirect errors by capturing hits for each errno and displaying them inline (ENETDOWN, EINVAL, ENOSPC, etc.) to aid debugging. * Reporting of each xdp_exception action for all samples that use these helpers (XDP_ABORTED, etc.) to aid debugging. * Capturing per ifindex pair devmap_xmit counts for decomposing the total TX count per devmap redirection. * Ability to jump to source locations invoking tracepoints. * Faster retrieval of stats per polling interval using mmap'd eBPF array map (through .bss). * Printing driver names for devices redirecting packets. * Printing summarized total statistics for the entire session. * Ability to dynamically switch between concise and verbose mode, using SIGQUIT (Ctrl + \). The goal is sharing these helpers that most of the XDP samples implement in some form but differently for each, lacking in some respect compared to one another. Signed-off-by: Kumar Kartikeya Dwivedi --- samples/bpf/Makefile | 6 +- samples/bpf/xdp_sample_shared.h | 53 ++ samples/bpf/xdp_sample_user.c | 1380 +++++++++++++++++++++++++++++++ samples/bpf/xdp_sample_user.h | 202 +++++ 4 files changed, 1640 insertions(+), 1 deletion(-) create mode 100644 samples/bpf/xdp_sample_shared.h create mode 100644 samples/bpf/xdp_sample_user.c create mode 100644 samples/bpf/xdp_sample_user.h diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile index 036998d11ded..57ccff5ccac4 100644 --- a/samples/bpf/Makefile +++ b/samples/bpf/Makefile @@ -62,6 +62,7 @@ LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a CGROUP_HELPERS := ../../tools/testing/selftests/bpf/cgroup_helpers.o TRACE_HELPERS := ../../tools/testing/selftests/bpf/trace_helpers.o +XDP_SAMPLE := xdp_sample_user.o fds_example-objs := fds_example.o sockex1-objs := sockex1_user.o @@ -116,6 +117,8 @@ xdp_sample_pkts-objs := xdp_sample_pkts_user.o ibumad-objs := ibumad_user.o hbm-objs := hbm.o $(CGROUP_HELPERS) +xdp_sample_user-objs := xdp_sample_user.o $(LIBBPFDIR)/hashmap.o + # Tell kbuild to always build the programs always-y := $(tprogs-y) always-y += sockex1_kern.o @@ -201,6 +204,7 @@ TPROGS_CFLAGS += -Wstrict-prototypes TPROGS_CFLAGS += -I$(objtree)/usr/include TPROGS_CFLAGS += -I$(srctree)/tools/testing/selftests/bpf/ TPROGS_CFLAGS += -I$(srctree)/tools/lib/ +TPROGS_CFLAGS += -I$(srctree)/tools/lib/bpf TPROGS_CFLAGS += -I$(srctree)/tools/include TPROGS_CFLAGS += -I$(srctree)/tools/perf TPROGS_CFLAGS += -DHAVE_ATTR_TEST=0 @@ -210,7 +214,7 @@ TPROGS_CFLAGS += --sysroot=$(SYSROOT) TPROGS_LDFLAGS := -L$(SYSROOT)/usr/lib endif -TPROGS_LDLIBS += $(LIBBPF) -lelf -lz +TPROGS_LDLIBS += $(LIBBPF) -lelf -lz -lm TPROGLDLIBS_tracex4 += -lrt TPROGLDLIBS_trace_output += -lrt TPROGLDLIBS_map_perf_test += -lrt diff --git a/samples/bpf/xdp_sample_shared.h b/samples/bpf/xdp_sample_shared.h new file mode 100644 index 000000000000..b211dca233d9 --- /dev/null +++ b/samples/bpf/xdp_sample_shared.h @@ -0,0 +1,53 @@ +#ifndef _XDP_SAMPLE_SHARED_H +#define _XDP_SAMPLE_SHARED_H + +/* + * Best-effort relaxed load/store + * __atomic_load_n/__atomic_store_n built-in is not supported for BPF target + */ +#define ATOMIC_LOAD(var) ({ (*(volatile typeof(var) *)&(var)); }) +#define ATOMIC_STORE(var, val) ({ *((volatile typeof(var) *)&(var)) = (val); }) +/* This does a load + store instead of the expensive atomic fetch add, but store + * is still atomic so that userspace reading the value reads the old or the new + * one, but not a partial store. + */ +#define ATOMIC_ADD_NORMW(var, val) \ + ({ \ + typeof(val) __val = (val); \ + if (__val) \ + ATOMIC_STORE((var), (var) + __val); \ + }) + +#define ATOMIC_INC_NORMW(var) ATOMIC_ADD_NORMW((var), 1) + +#define MAX_CPUS 64 + +/* Values being read/stored must be word sized */ +#if __LP64__ +typedef __u64 datarec_t; +#else +typedef __u32 datarec_t; +#endif + +struct datarec { + datarec_t processed; + datarec_t dropped; + datarec_t issue; + union { + datarec_t xdp_pass; + datarec_t info; + }; + datarec_t xdp_drop; + datarec_t xdp_redirect; +} __attribute__((aligned(64))); + +struct sample_data { + struct datarec rx_cnt[MAX_CPUS]; + struct datarec redirect_err_cnt[7 * MAX_CPUS]; + struct datarec cpumap_enqueue_cnt[MAX_CPUS * MAX_CPUS]; + struct datarec cpumap_kthread_cnt[MAX_CPUS]; + struct datarec exception_cnt[6 * MAX_CPUS]; + struct datarec devmap_xmit_cnt[MAX_CPUS]; +}; + +#endif diff --git a/samples/bpf/xdp_sample_user.c b/samples/bpf/xdp_sample_user.c new file mode 100644 index 000000000000..8180ff56b415 --- /dev/null +++ b/samples/bpf/xdp_sample_user.c @@ -0,0 +1,1380 @@ +// SPDX-License-Identifier: GPL-2.0-only +#define _GNU_SOURCE + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#ifndef SIOCETHTOOL +#define SIOCETHTOOL 0x8946 +#endif +#include +#include +#include +#include +#include +#include +#include "bpf_util.h" +#include "xdp_sample_user.h" + +static struct { + enum log_level log_level; + struct sample_output out; + struct sample_data *data; + int map_fd[NUM_MAP]; + struct xdp_desc { + int ifindex; + __u32 prog_id; + int flags; + } xdp_progs[32]; + int xdp_cnt; + bool err_exp; + int n_cpus; + int sig_fd; + int mask; +} sample; + +#define __sample_print(fmt, cond, printer, ...) \ + ({ \ + if (cond) \ + printer(fmt, ##__VA_ARGS__); \ + }) + +#define print_always(fmt, ...) __sample_print(fmt, 1, printf, ##__VA_ARGS__) +#define print_default(fmt, ...) \ + __sample_print(fmt, sample.log_level & LL_DEFAULT, printf, ##__VA_ARGS__) +#define __print_err(err, fmt, printer, ...) \ + ({ \ + __sample_print(fmt, err > 0 || sample.log_level & LL_DEFAULT, \ + printer, ##__VA_ARGS__); \ + sample.err_exp = sample.err_exp ? true : err > 0; \ + }) +#define print_err(err, fmt, ...) __print_err(err, fmt, printf, ##__VA_ARGS__) + +#define print_link_err(err, str, width, type) \ + __print_err(err, str, print_link, width, type) + +#define __COLUMN(x) "%'10" x " %-13s" +#define FMT_COLUMNf __COLUMN(".0f") +#define FMT_COLUMNd __COLUMN("d") +#define FMT_COLUMNl __COLUMN("llu") +#define RX(rx) rx, "rx/s" +#define PPS(pps) pps, "pkt/s" +#define DROP(drop) drop, "drop/s" +#define ERR(err) err, "error/s" +#define HITS(hits) hits, "hit/s" +#define XMIT(xmit) xmit, "xmit/s" +#define PASS(pass) pass, "pass/s" +#define REDIR(redir) redir, "redir/s" +#define NANOSEC_PER_SEC 1000000000 /* 10^9 */ + +void sample_print_help(int mask) +{ + printf("Output format description\n\n" + "By default, redirect success statistics are disabled, use -s to enable.\n" + "The terse output mode is default, verbose mode can be activated using -v\n" + "Use SIGQUIT (Ctrl + \\) to switch the mode dynamically at runtime\n\n" + "Terse mode displays at most the following fields:\n" + " rx/s Number of packets received per second\n" + " redir/s Number of packets successfully redirected per second\n" + " err,drop/s Aggregated count of errors per second (including dropped packets)\n" + " xmit/s Number of packets transmitted on the output device per second\n\n" + "Output description for verbose mode:\n" + " FIELD DESCRIPTION\n"); + + if (mask & SAMPLE_RX_CNT) { + printf(" receive\tDisplays the number of packets received & errors encountered\n" + " \t\tWhenever an error or packet drop occurs, details of per CPU error\n" + " \t\tand drop statistics will be expanded inline in terse mode.\n" + " \t\t\tpkt/s - Packets received per second\n" + " \t\t\tdrop/s - Packets dropped per second\n" + " \t\t\terror/s - Errors encountered per second\n\n"); + } + if (mask & (SAMPLE_REDIRECT_CNT|SAMPLE_REDIRECT_ERR_CNT)) { + printf(" redirect\tDisplays the number of packets successfully redirected\n" + " \t\tErrors encountered are expanded under redirect_err field\n" + " \t\tNote that passing -s to enable it has a per packet overhead\n" + " \t\t\tredir/s - Packets redirected successfully per second\n\n" + " redirect_err\tDisplays the number of packets that failed redirection\n" + " \t\tThe errno is expanded under this field with per CPU count\n" + " \t\tThe recognized errors are EOPNOTSUPP, EINVAL, ENETDOWN and EMSGSIZE\n" + " \t\t\terror/s - Packets that failed redirection per second\n\n"); + } + + if (mask & SAMPLE_EXCEPTION_CNT) { + printf(" xdp_exception\tDisplays xdp_exception tracepoint events\n" + " \t\tThis can occur due to internal driver errors, unrecognized\n" + " \t\tXDP actions and due to explicit user trigger by use of XDP_ABORTED\n" + " \t\tEach action is expanded below this field with its count\n" + " \t\t\thit/s - Number of times the tracepoint was hit per second\n\n"); + } + + if (mask & SAMPLE_DEVMAP_XMIT_CNT) { + printf(" devmap_xmit\tDisplays devmap_xmit tracepoint events\n" + " \t\tThis tracepoint is invoked for successful transmissions on output\n" + " \t\tdevice but these statistics are not available for generic XDP mode,\n" + " \t\thence they will be omitted from the output when using SKB mode\n" + " \t\t\txmit/s - Number of packets that were transmitted per second\n" + " \t\t\tdrop/s - Number of packets that failed transmissions per second\n" + " \t\t\tdrv_err/s - Number of internal driver errors per second\n" + " \t\t\tbulk_avg - Average number of packets processed for each event\n\n"); + } +} + +static const char *elixir_search[NUM_TP] = { + [TP_REDIRECT_CNT] = "_trace_xdp_redirect", + [TP_REDIRECT_MAP_CNT] = "_trace_xdp_redirect_map", + [TP_REDIRECT_ERR_CNT] = "_trace_xdp_redirect_err", + [TP_REDIRECT_MAP_ERR_CNT] = "_trace_xdp_redirect_map_err", + [TP_CPUMAP_ENQUEUE_CNT] = "trace_xdp_cpumap_enqueue", + [TP_CPUMAP_KTHREAD_CNT] = "trace_xdp_cpumap_kthread", + [TP_EXCEPTION_CNT] = "trace_xdp_exception", + [TP_DEVMAP_XMIT_CNT] = "trace_xdp_devmap_xmit", +}; + +static const char *make_url(enum tp_type i) +{ + const char *key = elixir_search[i]; + static struct utsname uts = {}; + static char url[128]; + static bool uts_init; + int maj, min; + char c[2]; + + if (!uts_init) { + if (uname(&uts) < 0) + return NULL; + uts_init = true; + } + + if (!key || sscanf(uts.release, "%d.%d%1s", &maj, &min, c) != 3) + return NULL; + + snprintf(url, sizeof(url), "https://elixir.bootlin.com/linux/v%d.%d/C/ident/%s", + maj, min, key); + + return url; +} + +static void print_link(const char *str, int width, enum tp_type i) +{ + static int t = -1; + const char *s; + int fd, l; + + if (t < 0) { + fd = open("/proc/self/fd/1", O_RDONLY); + if (fd < 0) + return; + t = isatty(fd); + close(fd); + } + + s = make_url(i); + if (!s || !t) { + printf(" %-*s", width, str); + return; + } + + l = strlen(str); + width = width - l > 0 ? width - l : 0; + printf(" \x1B]8;;%s\a%s\x1B]8;;\a%*c", s, str, width, ' '); +} + +static __u64 gettime(void) +{ + struct timespec t; + int res; + + res = clock_gettime(CLOCK_MONOTONIC, &t); + if (res < 0) { + fprintf(stderr, "Error with gettimeofday! (%i)\n", res); + exit(EXIT_FAIL); + } + return (__u64) t.tv_sec * NANOSEC_PER_SEC + t.tv_nsec; +} + +static struct datarec *alloc_record_per_cpu(void) +{ + unsigned int nr_cpus = bpf_num_possible_cpus(); + struct datarec *array; + + array = calloc(nr_cpus, sizeof(struct datarec)); + if (!array) { + fprintf(stderr, "Mem alloc error (nr_cpus:%u)\n", nr_cpus); + exit(EXIT_FAIL_MEM); + } + return array; +} + +static void map_collect_percpu(struct datarec *values, struct record *rec) +{ + /* For percpu maps, userspace gets a value per possible CPU */ + unsigned int nr_cpus = bpf_num_possible_cpus(); + __u64 sum_xdp_redirect = 0; + __u64 sum_processed = 0; + __u64 sum_xdp_pass = 0; + __u64 sum_xdp_drop = 0; + __u64 sum_dropped = 0; + __u64 sum_issue = 0; + int i; + + /* Get time as close as possible to reading map contents */ + rec->timestamp = gettime(); + + /* Record and sum values from each CPU */ + for (i = 0; i < nr_cpus; i++) { + rec->cpu[i].processed = ATOMIC_LOAD(values[i].processed); + rec->cpu[i].dropped = ATOMIC_LOAD(values[i].dropped); + rec->cpu[i].issue = ATOMIC_LOAD(values[i].issue); + rec->cpu[i].xdp_pass = ATOMIC_LOAD(values[i].xdp_pass); + rec->cpu[i].xdp_drop = ATOMIC_LOAD(values[i].xdp_drop); + rec->cpu[i].xdp_redirect = ATOMIC_LOAD(values[i].xdp_redirect); + + sum_processed += rec->cpu[i].processed; + sum_dropped += rec->cpu[i].dropped; + sum_issue += rec->cpu[i].issue; + sum_xdp_pass += rec->cpu[i].xdp_pass; + sum_xdp_drop += rec->cpu[i].xdp_drop; + sum_xdp_redirect += rec->cpu[i].xdp_redirect; + } + + rec->total.processed = sum_processed; + rec->total.dropped = sum_dropped; + rec->total.issue = sum_issue; + rec->total.xdp_pass = sum_xdp_pass; + rec->total.xdp_drop = sum_xdp_drop; + rec->total.xdp_redirect = sum_xdp_redirect; +} + +static int map_collect_percpu_devmap(int map_fd, struct hashmap *map) +{ + unsigned int nr_cpus = bpf_num_possible_cpus(); + __u32 batch, count = 8; + struct datarec *values; + bool init = false; + __u64 *keys; + int i, ret; + + keys = calloc(8 * nr_cpus, sizeof(__u64)); + if (!keys) + return -ENOMEM; + values = calloc(8 * nr_cpus, sizeof(struct datarec)); + if (!values) + return -ENOMEM; + + for (;;) { + bool exit = false; + + ret = bpf_map_lookup_batch(map_fd, init ? &batch : NULL, &batch, + keys, values, &count, NULL); + if (ret < 0 && errno != ENOENT) + break; + if (errno == ENOENT) + exit = true; + + init = true; + for (i = 0; i < count; i++) { + __u64 pair = keys[i * nr_cpus]; + struct datarec *arr; + void *ex_rec; + + arr = &values[i * nr_cpus]; + if (!hashmap__find(map, &pair, &ex_rec)) { + struct record *rec; + __u64 *key; + + key = malloc(sizeof(*key)); + if (!key) + goto cleanup; + *key = pair; + + rec = malloc(sizeof(*rec)); + if (!rec) { + free(key); + goto cleanup; + } + rec->cpu = alloc_record_per_cpu(); + if (!rec) { + free(rec); + free(key); + goto cleanup; + } + + map_collect_percpu(arr, rec); + + if (hashmap__add(map, key, rec)) { + free(rec->cpu); + free(rec); + free(key); + goto cleanup; + } + } else { + map_collect_percpu(arr, ex_rec); + } + } + + if (exit) + break; + count = 8; + } + + free(values); + free(keys); + return 0; +cleanup: + free(values); + free(keys); + return -ENOMEM; +} + +static size_t u64_hash_fn(const void *key, void *ctx) +{ + const __u64 *k = key; + + return (size_t)*k; +} + +static bool u64_equal_fn(const void *key1, const void *key2, void *ctx) +{ + const __u64 *k1 = key1; + const __u64 *k2 = key2; + + return *k1 == *k2; +} + +static struct stats_record *alloc_stats_record(void) +{ + struct stats_record *rec; + int i; + + rec = calloc(1, sizeof(*rec) + sample.n_cpus * sizeof(struct record)); + if (!rec) { + fprintf(stderr, "Failed to allocate memory\n"); + return NULL; + } + + rec->rx_cnt.cpu = alloc_record_per_cpu(); + for (i = 0; i < XDP_REDIRECT_ERR_MAX; i++) + rec->redir_err[i].cpu = alloc_record_per_cpu(); + rec->kthread.cpu = alloc_record_per_cpu(); + for (i = 0; i < XDP_ACTION_MAX; i++) + rec->exception[i].cpu = alloc_record_per_cpu(); + rec->devmap_xmit.cpu = alloc_record_per_cpu(); + for (i = 0; i < sample.n_cpus; i++) + rec->enq[i].cpu = alloc_record_per_cpu(); + rec->devmap_xmit_multi = hashmap__new(u64_hash_fn, u64_equal_fn, NULL); + if (!rec->devmap_xmit_multi) { + fprintf(stderr, "Failed to allocate hashmap\n"); + return NULL; + } + + return rec; +} + +static void free_stats_record(struct stats_record *r) +{ + struct hashmap_entry *tmp; + int i, bkt; + + if (r->devmap_xmit_multi) { + hashmap__for_each_entry(r->devmap_xmit_multi, tmp, bkt) { + struct record *rec = tmp->value; + + free((void *)tmp->key); + free(rec->cpu); + free(rec); + } + } + hashmap__free(r->devmap_xmit_multi); + for (i = 0; i < sample.n_cpus; i++) + free(r->enq[i].cpu); + free(r->devmap_xmit.cpu); + for (i = 0; i < XDP_ACTION_MAX; i++) + free(r->exception[i].cpu); + free(r->kthread.cpu); + for (i = 0; i < XDP_REDIRECT_ERR_MAX; i++) + free(r->redir_err[i].cpu); + free(r->rx_cnt.cpu); + free(r); +} + +static double calc_period(struct record *r, struct record *p) +{ + double period_ = 0; + __u64 period = 0; + + period = r->timestamp - p->timestamp; + if (period > 0) + period_ = ((double) period / NANOSEC_PER_SEC); + + return period_; +} + +static double sample_round(double val) +{ + if (val - floor(val) < 0.5) + return floor(val); + return ceil(val); +} + +static __u64 calc_pps(struct datarec *r, struct datarec *p, double period_) +{ + __u64 packets = 0; + __u64 pps = 0; + + if (period_ > 0) { + packets = r->processed - p->processed; + pps = sample_round(packets / period_); + } + return pps; +} + +static __u64 calc_drop_pps(struct datarec *r, struct datarec *p, double period_) +{ + __u64 packets = 0; + __u64 pps = 0; + + if (period_ > 0) { + packets = r->dropped - p->dropped; + pps = sample_round(packets / period_); + } + return pps; +} + +static __u64 calc_errs_pps(struct datarec *r, + struct datarec *p, double period_) +{ + __u64 packets = 0; + __u64 pps = 0; + + if (period_ > 0) { + packets = r->issue - p->issue; + pps = sample_round(packets / period_); + } + return pps; +} + +static __u64 calc_info_pps(struct datarec *r, + struct datarec *p, double period_) +{ + __u64 packets = 0; + __u64 pps = 0; + + if (period_ > 0) { + packets = r->info - p->info; + pps = sample_round(packets / period_); + } + return pps; +} + +static void calc_xdp_pps(struct datarec *r, struct datarec *p, + double *xdp_pass, double *xdp_drop, + double *xdp_redirect, double period_) +{ + *xdp_pass = 0, *xdp_drop = 0, *xdp_redirect = 0; + if (period_ > 0) { + *xdp_redirect = (r->xdp_redirect - p->xdp_redirect) / period_; + *xdp_pass = (r->xdp_pass - p->xdp_pass) / period_; + *xdp_drop = (r->xdp_drop - p->xdp_drop) / period_; + } +} + +static void stats_get_rx_cnt(struct stats_record *stats_rec, + struct stats_record *stats_prev, + unsigned int nr_cpus, struct sample_output *out) +{ + struct record *rec, *prev; + double t, pps, drop, err; + int i; + + rec = &stats_rec->rx_cnt; + prev = &stats_prev->rx_cnt; + t = calc_period(rec, prev); + + for (i = 0; i < nr_cpus; i++) { + struct datarec *r = &rec->cpu[i]; + struct datarec *p = &prev->cpu[i]; + char str[64]; + + pps = calc_pps(r, p, t); + if (!pps) + continue; + + snprintf(str, sizeof(str), "cpu:%d", i); + drop = calc_drop_pps(r, p, t); + err = calc_errs_pps(r, p, t); + print_default(" %-18s " FMT_COLUMNf FMT_COLUMNf FMT_COLUMNf "\n", + str, PPS(pps), DROP(drop), ERR(err)); + } + + if (out) { + pps = calc_pps(&rec->total, &prev->total, t); + drop = calc_drop_pps(&rec->total, &prev->total, t); + err = calc_errs_pps(&rec->total, &prev->total, t); + + out->rx_cnt.pps = pps; + out->rx_cnt.drop = drop; + out->rx_cnt.err = err; + out->totals.rx += pps; + out->totals.drop += drop; + out->totals.err += err; + } +} + +static void stats_get_cpumap_enqueue(struct stats_record *stats_rec, + struct stats_record *stats_prev, + unsigned int nr_cpus) +{ + struct record *rec, *prev; + double t, pps, drop, err; + int i, to_cpu; + + /* cpumap enqueue stats */ + for (to_cpu = 0; to_cpu < sample.n_cpus; to_cpu++) { + rec = &stats_rec->enq[to_cpu]; + prev = &stats_prev->enq[to_cpu]; + t = calc_period(rec, prev); + + pps = calc_pps(&rec->total, &prev->total, t); + drop = calc_drop_pps(&rec->total, &prev->total, t); + err = calc_errs_pps(&rec->total, &prev->total, t); + + if (pps > 0) { + char str[64]; + + snprintf(str, sizeof(str), "enqueue to cpu %d", to_cpu); + + if (err > 0) + err = pps / err; /* calc average bulk size */ + + print_link_err(drop, str, 20, TP_CPUMAP_ENQUEUE_CNT); + print_err(drop, + " " FMT_COLUMNf FMT_COLUMNf __COLUMN(".2f") "\n", + PPS(pps), DROP(drop), err, "bulk_avg"); + } + + for (i = 0; i < nr_cpus; i++) { + struct datarec *r = &rec->cpu[i]; + struct datarec *p = &prev->cpu[i]; + char str[64]; + + pps = calc_pps(r, p, t); + if (!pps) + continue; + + snprintf(str, sizeof(str), "cpu:%d->%d", i, to_cpu); + drop = calc_drop_pps(r, p, t); + err = calc_errs_pps(r, p, t); + if (err > 0) + err = pps / err; /* calc average bulk size */ + print_default(" %-18s " FMT_COLUMNf FMT_COLUMNf __COLUMN(".2f") + "\n", str, PPS(pps), DROP(drop), err, "bulk_avg"); + } + } +} + +static void stats_get_cpumap_kthread(struct stats_record *stats_rec, + struct stats_record *stats_prev, + unsigned int nr_cpus) +{ + struct record *rec, *prev; + double t, pps, drop, err; + int i; + + rec = &stats_rec->kthread; + prev = &stats_prev->kthread; + t = calc_period(rec, prev); + + pps = calc_pps(&rec->total, &prev->total, t); + drop = calc_drop_pps(&rec->total, &prev->total, t); + err = calc_errs_pps(&rec->total, &prev->total, t); + + print_link_err(drop, pps ? "kthread total" : "kthread", 20, TP_CPUMAP_KTHREAD_CNT); + print_err(drop, " " FMT_COLUMNf FMT_COLUMNf FMT_COLUMNf "\n", + PPS(pps), DROP(drop), err, "sched"); + + for (i = 0; i < nr_cpus; i++) { + struct datarec *r = &rec->cpu[i]; + struct datarec *p = &prev->cpu[i]; + char str[64]; + + pps = calc_pps(r, p, t); + if (!pps) + continue; + + snprintf(str, sizeof(str), "cpu:%d", i); + drop = calc_drop_pps(r, p, t); + err = calc_errs_pps(r, p, t); + print_default(" %-18s " FMT_COLUMNf FMT_COLUMNf FMT_COLUMNf "\n", + str, PPS(pps), DROP(drop), err, "sched"); + } +} + +static void stats_get_redirect_cnt(struct stats_record *stats_rec, + struct stats_record *stats_prev, + unsigned int nr_cpus, struct sample_output *out) +{ + struct record *rec, *prev; + double t, pps; + int i; + + rec = &stats_rec->redir_err[0]; + prev = &stats_prev->redir_err[0]; + t = calc_period(rec, prev); + for (i = 0; i < nr_cpus; i++) { + struct datarec *r = &rec->cpu[i]; + struct datarec *p = &prev->cpu[i]; + char str[64]; + + pps = calc_pps(r, p, t); + if (!pps) + continue; + + snprintf(str, sizeof(str), "cpu:%d", i); + print_default(" %-18s " FMT_COLUMNf "\n", str, REDIR(pps)); + } + + if (out) { + pps = calc_pps(&rec->total, &prev->total, t); + out->redir_cnt.suc = pps; + out->totals.redir += pps; + } + +} + +static void stats_get_redirect_err_cnt(struct stats_record *stats_rec, + struct stats_record *stats_prev, + unsigned int nr_cpus, struct sample_output *out) +{ + struct record *rec, *prev; + double t, drop, sum = 0; + int rec_i, i; + + for (rec_i = 1; rec_i < XDP_REDIRECT_ERR_MAX; rec_i++) { + char str[64]; + + rec = &stats_rec->redir_err[rec_i]; + prev = &stats_prev->redir_err[rec_i]; + t = calc_period(rec, prev); + + drop = calc_drop_pps(&rec->total, &prev->total, t); + if (drop > 0 && !out) { + snprintf(str, sizeof(str), + sample.log_level & LL_DEFAULT ? "%s total" : + "%s", xdp_redirect_err_names[rec_i]); + print_err(drop, " %-18s " FMT_COLUMNf "\n", str, ERR(drop)); + } + + for (i = 0; i < nr_cpus; i++) { + struct datarec *r = &rec->cpu[i]; + struct datarec *p = &prev->cpu[i]; + double drop; + + drop = calc_drop_pps(r, p, t); + if (!drop) + continue; + + snprintf(str, sizeof(str), "cpu:%d", i); + print_default(" %-16s" FMT_COLUMNf "\n", str, ERR(drop)); + } + + sum += drop; + } + + if (out) { + out->redir_cnt.err = sum; + out->totals.err += sum; + } +} + +static void stats_get_exception_cnt(struct stats_record *stats_rec, + struct stats_record *stats_prev, + unsigned int nr_cpus, struct sample_output *out) +{ + double t, drop, sum = 0; + struct record *rec, *prev; + int rec_i, i; + + + for (rec_i = 0; rec_i < XDP_ACTION_MAX; rec_i++) { + rec = &stats_rec->exception[rec_i]; + prev = &stats_prev->exception[rec_i]; + t = calc_period(rec, prev); + + drop = calc_drop_pps(&rec->total, &prev->total, t); + /* Fold out errors after heading */ + sum += drop; + + if (drop > 0 && !out) { + print_always(" %-18s " FMT_COLUMNf "\n", action2str(rec_i), ERR(drop)); + + for (i = 0; i < nr_cpus; i++) { + struct datarec *r = &rec->cpu[i]; + struct datarec *p = &prev->cpu[i]; + char str[64]; + double drop; + + drop = calc_drop_pps(r, p, t); + if (!drop) + continue; + + snprintf(str, sizeof(str), "cpu:%d", i); + print_default(" %-16s" FMT_COLUMNf "\n", str, ERR(drop)); + } + } + } + + if (out) { + out->except_cnt.hits = sum; + out->totals.err += sum; + } +} + +static void stats_get_cpumap_remote(struct stats_record *stats_rec, + struct stats_record *stats_prev, + unsigned int nr_cpus) +{ + double xdp_pass, xdp_drop, xdp_redirect; + struct record *rec, *prev; + double t; + int i; + + rec = &stats_rec->kthread; + prev = &stats_prev->kthread; + t = calc_period(rec, prev); + + calc_xdp_pps(&rec->total, &prev->total, &xdp_pass, &xdp_drop, + &xdp_redirect, t); + if (xdp_pass || xdp_drop || xdp_redirect) { + print_default(" %-18s " FMT_COLUMNf FMT_COLUMNf FMT_COLUMNf "\n", + "xdp_stats", PASS(xdp_pass), DROP(xdp_drop), REDIR(xdp_redirect)); + } + + for (i = 0; i < nr_cpus; i++) { + struct datarec *r = &rec->cpu[i]; + struct datarec *p = &prev->cpu[i]; + char str[64]; + + calc_xdp_pps(r, p, &xdp_pass, &xdp_drop, &xdp_redirect, t); + if (!xdp_pass && !xdp_drop && !xdp_redirect) + continue; + + snprintf(str, sizeof(str), "cpu:%d", i); + print_default(" %-16s " FMT_COLUMNf FMT_COLUMNf FMT_COLUMNf "\n", + str, PASS(xdp_pass), DROP(xdp_drop), REDIR(xdp_redirect)); + } +} + +static void stats_get_devmap_xmit(struct stats_record *stats_rec, + struct stats_record *stats_prev, + unsigned int nr_cpus, struct sample_output *out) +{ + double pps, drop, info, err; + struct record *rec, *prev; + double t; + int i; + + rec = &stats_rec->devmap_xmit; + prev = &stats_prev->devmap_xmit; + t = calc_period(rec, prev); + for (i = 0; i < nr_cpus; i++) { + struct datarec *r = &rec->cpu[i]; + struct datarec *p = &prev->cpu[i]; + char str[64]; + + pps = calc_pps(r, p, t); + drop = calc_drop_pps(r, p, t); + + if (!pps) + continue; + + snprintf(str, sizeof(str), "cpu:%d", i); + info = calc_info_pps(r, p, t); + err = calc_errs_pps(r, p, t); + if (info > 0) + info = (pps + drop) / info; /* calc avg bulk */ + print_default(" %-18s" FMT_COLUMNf FMT_COLUMNf FMT_COLUMNf + __COLUMN(".2f") "\n", str, XMIT(pps), DROP(drop), + err, "drv_err/s", info, "bulk_avg"); + } + if (out) { + pps = calc_pps(&rec->total, &prev->total, t); + drop = calc_drop_pps(&rec->total, &prev->total, t); + info = calc_info_pps(&rec->total, &prev->total, t); + if (info > 0) + info = (pps + drop) / info; /* calc avg bulk */ + err = calc_errs_pps(&rec->total, &prev->total, t); + + out->xmit_cnt.pps = pps; + out->xmit_cnt.drop = drop; + out->xmit_cnt.bavg = info; + out->xmit_cnt.err = err; + out->totals.xmit += pps; + out->totals.drop_xmit += drop; + out->totals.err += err; + } +} + +static void stats_get_devmap_xmit_multi(struct stats_record *stats_rec, + struct stats_record *stats_prev, + unsigned int nr_cpus, + struct sample_output *out) +{ + double pps, drop, info, err; + struct hashmap_entry *entry; + struct hashmap *cur, *prev; + struct record *r, *p; + double t; + int bkt; + + cur = stats_rec->devmap_xmit_multi; + prev = stats_prev->devmap_xmit_multi; + + hashmap__for_each_entry(cur, entry, bkt) { + const char *estr, *fstr, *tstr; + char ifname_from[IFNAMSIZ]; + char ifname_to[IFNAMSIZ]; + __u32 from_idx, to_idx; + char str[128]; + __u64 pair; + void *val; + + pair = *(__u64 *)entry->key; + + from_idx = pair >> 32; + to_idx = pair & 0xFFFFFFFF; + + r = entry->value; + if (!hashmap__find(prev, &pair, &val)) + continue; + p = val; + + t = calc_period(r, p); + + pps = calc_pps(&r->total, &p->total, t); + drop = calc_drop_pps(&r->total, &p->total, t); + info = calc_info_pps(&r->total, &p->total, t); + if (info > 0) + info = (pps + drop) / info; /* calc avg bulk */ + err = calc_errs_pps(&r->total, &p->total, t); + + estr = fstr = tstr = "[error]"; + if (if_indextoname(from_idx, ifname_from)) + fstr = ifname_from; + if (if_indextoname(to_idx, ifname_to)) + tstr = ifname_to; + + if (fstr == estr || tstr == estr) + snprintf(str, sizeof(str), "tx %d->%d", from_idx, to_idx); + else + snprintf(str, sizeof(str), "tx %s->%s", fstr, tstr); + /* Handle idle streams of redirection */ + if (pps || drop || err) { + printf(" "); + print_link_err(drop, str, 18, TP_DEVMAP_XMIT_CNT); + print_err(drop, " " FMT_COLUMNf FMT_COLUMNf FMT_COLUMNf __COLUMN(".2f") "\n", + XMIT(pps), DROP(drop), err, "drv_err/s", info, "bulk_avg"); + } + } +} + +static void stats_print(const char *prefix, int mask, struct stats_record *r, + struct stats_record *p, struct sample_output *out) +{ + int nr_cpus = bpf_num_possible_cpus(); + const char *str; + + print_always("%-23s", prefix ?: "Summary"); + if (mask & SAMPLE_RX_CNT) + print_always(FMT_COLUMNl, RX(out->totals.rx)); + if (mask & SAMPLE_REDIRECT_CNT) + print_always(FMT_COLUMNl, REDIR(out->totals.redir)); + printf(FMT_COLUMNl, out->totals.err + out->totals.drop + + out->totals.drop_xmit, "err,drop/s"); + if (mask & SAMPLE_DEVMAP_XMIT_CNT) + printf(FMT_COLUMNl, XMIT(out->totals.xmit)); + printf("\n"); + + if (mask & SAMPLE_RX_CNT) { + str = (sample.log_level & LL_DEFAULT) && out->rx_cnt.pps ? + "receive total" : "receive"; + print_err( + (out->rx_cnt.err || out->rx_cnt.drop), + " %-20s " FMT_COLUMNl FMT_COLUMNl FMT_COLUMNl "\n", + str, PPS(out->rx_cnt.pps), DROP(out->rx_cnt.drop), + ERR(out->rx_cnt.err)); + + stats_get_rx_cnt(r, p, nr_cpus, NULL); + } + + if (mask & SAMPLE_CPUMAP_ENQUEUE_CNT) + stats_get_cpumap_enqueue(r, p, nr_cpus); + + if (mask & SAMPLE_CPUMAP_KTHREAD_CNT) { + stats_get_cpumap_kthread(r, p, nr_cpus); + stats_get_cpumap_remote(r, p, nr_cpus); + } + + if (mask & SAMPLE_REDIRECT_CNT) { + str = out->redir_cnt.suc ? "redirect total" : "redirect"; + print_link_err(0, str, 20, mask & _SAMPLE_REDIRECT_MAP ? + TP_REDIRECT_MAP_CNT : TP_REDIRECT_CNT); + print_default(" " FMT_COLUMNl "\n", REDIR(out->redir_cnt.suc)); + + stats_get_redirect_cnt(r, p, nr_cpus, NULL); + } + + if (mask & SAMPLE_REDIRECT_ERR_CNT) { + str = (sample.log_level & LL_DEFAULT) && out->redir_cnt.err ? + "redirect_err total" : "redirect_err"; + print_link_err(out->redir_cnt.err, str, 20, mask & _SAMPLE_REDIRECT_MAP ? + TP_REDIRECT_MAP_ERR_CNT : TP_REDIRECT_ERR_CNT); + print_err(out->redir_cnt.err, " " FMT_COLUMNl "\n", ERR(out->redir_cnt.err)); + + stats_get_redirect_err_cnt(r, p, nr_cpus, NULL); + } + + if (mask & SAMPLE_EXCEPTION_CNT) { + str = out->except_cnt.hits ? "xdp_exception total" : "xdp_exception"; + + print_link_err(out->except_cnt.hits, str, 20, TP_EXCEPTION_CNT); + print_err(out->except_cnt.hits, " " FMT_COLUMNl "\n", HITS(out->except_cnt.hits)); + + stats_get_exception_cnt(r, p, nr_cpus, NULL); + } + + if (mask & SAMPLE_DEVMAP_XMIT_CNT) { + str = (sample.log_level & LL_DEFAULT) && out->xmit_cnt.pps ? + "devmap_xmit total" : "devmap_xmit"; + + print_link_err(out->xmit_cnt.err || out->xmit_cnt.drop, str, 20, + TP_DEVMAP_XMIT_CNT); + print_err(out->xmit_cnt.err || out->xmit_cnt.drop, + " " FMT_COLUMNl FMT_COLUMNl FMT_COLUMNl __COLUMN(".2f") "\n", + XMIT(out->xmit_cnt.pps), DROP(out->xmit_cnt.drop), + out->xmit_cnt.err, "drv_err/s", out->xmit_cnt.bavg, + "bulk_avg"); + + stats_get_devmap_xmit(r, p, nr_cpus, NULL); + } + + if (mask & SAMPLE_DEVMAP_XMIT_CNT_MULTI) + stats_get_devmap_xmit_multi(r, p, nr_cpus, NULL); + + if (sample.log_level & LL_DEFAULT || ((sample.log_level & LL_SIMPLE) && sample.err_exp)) { + sample.err_exp = false; + printf("\n"); + } +} + +int __sample_init(struct sample_data *data, int map_fd, int mask) +{ + sigset_t st; + + sample.n_cpus = get_nprocs_conf(); + + sigemptyset(&st); + sigaddset(&st, SIGQUIT); + sigaddset(&st, SIGINT); + sigaddset(&st, SIGTERM); + + if (sigprocmask(SIG_BLOCK, &st, NULL) < 0) + return -errno; + + sample.sig_fd = signalfd(-1, &st, SFD_CLOEXEC|SFD_NONBLOCK); + if (sample.sig_fd < 0) + return -errno; + + sample.map_fd[MAP_DEVMAP_XMIT_MULTI] = map_fd; + sample.data = data; + sample.mask = mask; + + return 0; +} + +static int __sample_remove_xdp(int ifindex, __u32 prog_id, int xdp_flags) +{ + __u32 cur_prog_id = 0; + int ret; + + if (prog_id) { + ret = bpf_get_link_xdp_id(ifindex, &cur_prog_id, xdp_flags); + if (ret < 0) + return -errno; + + if (prog_id != cur_prog_id) { + print_always("Program on ifindex %d does not match installed " + "program, skipping unload\n", ifindex); + return -ENOENT; + } + } + + return bpf_set_link_xdp_fd(ifindex, -1, xdp_flags); +} + +int sample_install_xdp(struct bpf_program *xdp_prog, int ifindex, bool generic, bool force) +{ + int ret, xdp_flags = 0; + __u32 prog_id = 0; + + if (sample.xdp_cnt == 32) { + fprintf(stderr, + "Total limit for installed XDP programs in a sample reached\n"); + return -ENOTSUP; + } + + xdp_flags |= !force ? XDP_FLAGS_UPDATE_IF_NOEXIST : 0; + xdp_flags |= generic ? XDP_FLAGS_SKB_MODE : XDP_FLAGS_DRV_MODE; + ret = bpf_set_link_xdp_fd(ifindex, bpf_program__fd(xdp_prog), xdp_flags); + if (ret < 0) { + ret = -errno; + fprintf(stderr, "Failed to install program \"%s\" on ifindex %d, mode = %s, " + "force = %s: %s\n", bpf_program__name(xdp_prog), ifindex, + generic ? "skb" : "native", force ? "true" : "false", strerror(errno)); + return ret; + } + + ret = bpf_get_link_xdp_id(ifindex, &prog_id, xdp_flags); + if (ret < 0) { + ret = -errno; + fprintf(stderr, + "Failed to get XDP program id for ifindex %d, removing program: %s\n", + ifindex, strerror(errno)); + __sample_remove_xdp(ifindex, 0, xdp_flags); + return ret; + } + sample.xdp_progs[sample.xdp_cnt++] = (struct xdp_desc){ ifindex, prog_id, xdp_flags }; + + return 0; +} + +static void sample_summary_print(void) +{ + double period = sample.out.rx_cnt.pps; + + print_always("\nTotals\n"); + if (sample.out.totals.rx) { + double pkts = sample.out.totals.rx; + + print_always(" Packets received : %'-10llu\n", sample.out.totals.rx); + print_always(" Average packets/s : %'-10.0f\n", sample_round(pkts/period)); + } + if (sample.out.totals.redir) { + double pkts = sample.out.totals.redir; + + print_always(" Packets redirected : %'-10llu\n", sample.out.totals.redir); + print_always(" Average redir/s : %'-10.0f\n", sample_round(pkts/period)); + } + print_always(" Rx dropped : %'-10llu\n", sample.out.totals.drop); + print_always(" Tx dropped : %'-10llu\n", sample.out.totals.drop_xmit); + print_always(" Errors recorded : %'-10llu\n", sample.out.totals.err); + if (sample.out.totals.xmit) { + double pkts = sample.out.totals.xmit; + + print_always(" Packets transmitted : %'-10llu\n", sample.out.totals.xmit); + print_always(" Average transmit/s : %'-10.0f\n", sample_round(pkts/period)); + } +} + +void sample_exit(int status) +{ + while (sample.xdp_cnt--) { + int i = sample.xdp_cnt, ifindex, xdp_flags; + __u32 prog_id; + + prog_id = sample.xdp_progs[i].prog_id; + ifindex = sample.xdp_progs[i].ifindex; + xdp_flags = sample.xdp_progs[i].flags; + + __sample_remove_xdp(ifindex, prog_id, xdp_flags); + } + sample_summary_print(); + close(sample.sig_fd); + exit(status); +} + +static int sample_stats_collect(struct stats_record *rec) +{ + int i; + + if (sample.mask & SAMPLE_RX_CNT) + map_collect_percpu(sample.data->rx_cnt, &rec->rx_cnt); + + if (sample.mask & SAMPLE_REDIRECT_CNT) + map_collect_percpu(sample.data->redirect_err_cnt, &rec->redir_err[0]); + + if (sample.mask & SAMPLE_REDIRECT_ERR_CNT) { + for (i = 1; i < XDP_REDIRECT_ERR_MAX; i++) + map_collect_percpu(&sample.data->redirect_err_cnt[i * MAX_CPUS], + &rec->redir_err[i]); + } + + if (sample.mask & SAMPLE_CPUMAP_ENQUEUE_CNT) + for (i = 0; i < sample.n_cpus; i++) + map_collect_percpu(&sample.data->cpumap_enqueue_cnt[i * MAX_CPUS], + &rec->enq[i]); + + if (sample.mask & SAMPLE_CPUMAP_KTHREAD_CNT) + map_collect_percpu(sample.data->cpumap_kthread_cnt, &rec->kthread); + + if (sample.mask & SAMPLE_EXCEPTION_CNT) + for (i = 0; i < XDP_ACTION_MAX; i++) + map_collect_percpu(&sample.data->exception_cnt[i * MAX_CPUS], + &rec->exception[i]); + + if (sample.mask & SAMPLE_DEVMAP_XMIT_CNT) + map_collect_percpu(sample.data->devmap_xmit_cnt, &rec->devmap_xmit); + + if (sample.mask & SAMPLE_DEVMAP_XMIT_CNT_MULTI) { + if (map_collect_percpu_devmap(sample.map_fd[MAP_DEVMAP_XMIT_MULTI], + rec->devmap_xmit_multi) < 0) + return -EINVAL; + } + return 0; +} + +static void sample_summary_update(struct sample_output *out, int interval) +{ + sample.out.totals.rx += out->totals.rx; + sample.out.totals.redir += out->totals.redir; + sample.out.totals.drop += out->totals.drop; + sample.out.totals.drop_xmit += out->totals.drop_xmit; + sample.out.totals.err += out->totals.err; + sample.out.totals.xmit += out->totals.xmit; + sample.out.rx_cnt.pps += interval; +} + + +static void sample_stats_print(int mask, struct stats_record *cur, + struct stats_record *prev, char *prog_name, + int interval) +{ + struct sample_output out = {}; + + if (mask & SAMPLE_RX_CNT) + stats_get_rx_cnt(cur, prev, 0, &out); + + if (mask & SAMPLE_REDIRECT_CNT) + stats_get_redirect_cnt(cur, prev, 0, &out); + + if (mask & SAMPLE_REDIRECT_ERR_CNT) + stats_get_redirect_err_cnt(cur, prev, 0, &out); + + if (mask & SAMPLE_EXCEPTION_CNT) + stats_get_exception_cnt(cur, prev, 0, &out); + + if (mask & SAMPLE_DEVMAP_XMIT_CNT) + stats_get_devmap_xmit(cur, prev, 0, &out); + + sample_summary_update(&out, interval); + + stats_print(prog_name, mask, cur, prev, &out); +} + +void sample_switch_mode(void) +{ + sample.log_level ^= LL_DEBUG - 1; +} + +static int sample_signal_cb(void) +{ + struct signalfd_siginfo si; + int r; + + r = read(sample.sig_fd, &si, sizeof(si)); + if (r < 0) + return -errno; + + switch (si.ssi_signo) { + case SIGQUIT: + sample_switch_mode(); + printf("\n"); + break; + default: + return 1; + } + + return 0; +} + +static int sample_timer_cb(int timerfd, struct stats_record **rec, + struct stats_record **prev, int interval) +{ + char line[64] = "Summary"; + int ret; + __u64 t; + + ret = read(timerfd, &t, sizeof(t)); + if (ret < 0) + return -errno; + + swap(prev, rec); + ret = sample_stats_collect(*rec); + if (ret < 0) + return ret; + + if (sample.xdp_cnt == 2) { + char fi[IFNAMSIZ]; + char to[IFNAMSIZ]; + const char *f, *t; + + f = t = "[error]"; + if (if_indextoname(sample.xdp_progs[0].ifindex, fi)) + f = fi; + if (if_indextoname(sample.xdp_progs[1].ifindex, to)) + t = to; + + snprintf(line, sizeof(line), "%s->%s", f, t); + } + + sample_stats_print(sample.mask, *rec, *prev, line, interval); + return 0; +} + +int sample_run(int interval, void (*post_cb)(void *), void *ctx) +{ + struct timespec ts = { interval, 0 }; + struct itimerspec its = { ts, ts }; + struct stats_record *rec, *prev; + struct pollfd pfd[2] = {}; + int timerfd, ret; + + /* Pretty print numbers */ + setlocale(LC_NUMERIC, "en_US"); + + timerfd = timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC|TFD_NONBLOCK); + if (timerfd < 0) + return -errno; + timerfd_settime(timerfd, 0, &its, NULL); + + pfd[0].fd = sample.sig_fd; + pfd[0].events = POLLIN; + + pfd[1].fd = timerfd; + pfd[1].events = POLLIN; + + ret = -ENOMEM; + rec = alloc_stats_record(); + if (!rec) + goto end; + prev = alloc_stats_record(); + if (!prev) + goto end_rec; + + ret = sample_stats_collect(rec); + if (ret < 0) + goto end_rec_prev; + + for (;;) { + ret = poll(pfd, 2, -1); + if (ret < 0) { + if (errno == EINTR) + continue; + else + break; + } + + if (pfd[0].revents & POLLIN) + ret = sample_signal_cb(); + else if (pfd[1].revents & POLLIN) + ret = sample_timer_cb(timerfd, &rec, &prev, interval); + + if (ret) + break; + + if (post_cb) + post_cb(ctx); + } + +end_rec_prev: + free_stats_record(prev); +end_rec: + free_stats_record(rec); +end: + close(timerfd); + + return ret; +} + +const char *get_driver_name(int ifindex) +{ + struct ethtool_drvinfo drv = {}; + char ifname[IF_NAMESIZE]; + static char drvname[32]; + struct ifreq ifr = {}; + int fd, r = 0; + + fd = socket(AF_INET, SOCK_DGRAM, 0); + if (fd < 0) + return "[error]"; + + if (!if_indextoname(ifindex, ifname)) + goto end; + + drv.cmd = ETHTOOL_GDRVINFO; + safe_strncpy(ifr.ifr_name, ifname, sizeof(ifr.ifr_name)); + ifr.ifr_data = (void *)&drv; + + r = ioctl(fd, SIOCETHTOOL, &ifr); + if (r) + goto end; + + safe_strncpy(drvname, drv.driver, sizeof(drvname)); + + close(fd); + return drvname; + +end: + r = errno; + close(fd); + return r == EOPNOTSUPP ? "loopback" : "[error]"; +} + +int get_mac_addr(int ifindex, void *mac_addr) +{ + char ifname[IF_NAMESIZE]; + struct ifreq ifr = {}; + int fd, r; + + fd = socket(AF_INET, SOCK_DGRAM, 0); + if (fd < 0) + return -errno; + + if (!if_indextoname(ifindex, ifname)) { + r = -errno; + goto end; + } + + safe_strncpy(ifr.ifr_name, ifname, sizeof(ifr.ifr_name)); + + r = ioctl(fd, SIOCGIFHWADDR, &ifr); + if (r) { + r = -errno; + goto end; + } + + memcpy(mac_addr, ifr.ifr_hwaddr.sa_data, 6 * sizeof(char)); + +end: + close(fd); + return r; +} diff --git a/samples/bpf/xdp_sample_user.h b/samples/bpf/xdp_sample_user.h new file mode 100644 index 000000000000..f22e86e89b38 --- /dev/null +++ b/samples/bpf/xdp_sample_user.h @@ -0,0 +1,202 @@ +// SPDX-License-Identifier: GPL-2.0-only +#pragma once + +#include +#include + +#include "xdp_sample_shared.h" + +enum tp_type { + TP_REDIRECT_CNT, + TP_REDIRECT_MAP_CNT, + TP_REDIRECT_ERR_CNT, + TP_REDIRECT_MAP_ERR_CNT, + TP_CPUMAP_ENQUEUE_CNT, + TP_CPUMAP_KTHREAD_CNT, + TP_EXCEPTION_CNT, + TP_DEVMAP_XMIT_CNT, + NUM_TP, +}; + +enum map_type { + MAP_DEVMAP_XMIT_MULTI, + NUM_MAP, +}; + +enum stats_mask { + _SAMPLE_REDIRECT_MAP = 1U << 0, + SAMPLE_RX_CNT = 1U << 1, + SAMPLE_REDIRECT_ERR_CNT = 1U << 2, + SAMPLE_CPUMAP_ENQUEUE_CNT = 1U << 3, + SAMPLE_CPUMAP_KTHREAD_CNT = 1U << 4, + SAMPLE_EXCEPTION_CNT = 1U << 5, + SAMPLE_DEVMAP_XMIT_CNT = 1U << 6, + SAMPLE_REDIRECT_CNT = 1U << 7, + SAMPLE_REDIRECT_MAP_CNT = SAMPLE_REDIRECT_CNT | _SAMPLE_REDIRECT_MAP, + SAMPLE_REDIRECT_ERR_MAP_CNT = SAMPLE_REDIRECT_ERR_CNT | _SAMPLE_REDIRECT_MAP, + SAMPLE_DEVMAP_XMIT_CNT_MULTI = 1U << 8, +}; + +enum log_level { + LL_DEFAULT = 1U << 0, + LL_SIMPLE = 1U << 1, + LL_DEBUG = 1U << 2, +}; + +/* Exit return codes */ +#define EXIT_OK 0 +#define EXIT_FAIL 1 +#define EXIT_FAIL_OPTION 2 +#define EXIT_FAIL_XDP 3 +#define EXIT_FAIL_BPF 4 +#define EXIT_FAIL_MEM 5 + +#define XDP_REDIRECT_ERR_MAX 7 + +__attribute__((unused)) +static const char *xdp_redirect_err_names[XDP_REDIRECT_ERR_MAX] = { + /* Key=1 keeps unknown errors */ + "Success", "Unknown", "EINVAL", "ENETDOWN", "EMSGSIZE", + "EOPNOTSUPP", "ENOSPC", +}; + +#define XDP_UNKNOWN (XDP_REDIRECT + 1) +#define XDP_ACTION_MAX (XDP_UNKNOWN + 1) + +__attribute__((unused)) +static const char *xdp_action_names[XDP_ACTION_MAX] = { + [XDP_ABORTED] = "XDP_ABORTED", + [XDP_DROP] = "XDP_DROP", + [XDP_PASS] = "XDP_PASS", + [XDP_TX] = "XDP_TX", + [XDP_REDIRECT] = "XDP_REDIRECT", + [XDP_UNKNOWN] = "XDP_UNKNOWN", +}; + +__attribute__((unused)) +static inline const char *action2str(int action) +{ + if (action < XDP_ACTION_MAX) + return xdp_action_names[action]; + return NULL; +} + +struct record { + __u64 timestamp; + struct datarec total; + struct datarec *cpu; +}; + +struct stats_record { + struct record rx_cnt; + struct record redir_err[XDP_REDIRECT_ERR_MAX]; + struct record kthread; + struct record exception[XDP_ACTION_MAX]; + struct record devmap_xmit; + struct hashmap *devmap_xmit_multi; + struct record enq[]; +}; + +struct sample_output { + struct { + __u64 rx; + __u64 redir; + __u64 drop; + __u64 drop_xmit; + __u64 err; + __u64 xmit; + } totals; + struct { + __u64 pps; + __u64 drop; + __u64 err; + } rx_cnt; + struct { + __u64 suc; + __u64 err; + } redir_cnt; + struct { + __u64 hits; + } except_cnt; + struct { + __u64 pps; + __u64 drop; + __u64 err; + double bavg; + } xmit_cnt; +}; + +int __sample_init(struct sample_data *data, int map_fd, int mask); +void sample_exit(int status); +int sample_run(int interval, void (*post_cb)(void *), void *ctx); + +void sample_switch_mode(void); +int sample_install_xdp(struct bpf_program *xdp_prog, int ifindex, bool generic, + bool force); +void sample_print_help(int mask); + +const char *get_driver_name(int ifindex); +int get_mac_addr(int ifindex, void *mac_addr); + +/* Pointer swap trick */ +__attribute__((unused)) +static inline void swap(struct stats_record **a, struct stats_record **b) +{ + struct stats_record *tmp; + + tmp = *a; + *a = *b; + *b = tmp; +} + +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wstringop-truncation" +__attribute__((unused)) +static inline char *safe_strncpy(char *dst, const char *src, size_t size) +{ + if (!size) + return dst; + strncpy(dst, src, size - 1); + dst[size - 1] = '\0'; + return dst; +} +#pragma GCC diagnostic pop + +#define __attach_tp(name) \ + ({ \ + if (!bpf_program__is_tracing(skel->progs.name)) \ + return -EINVAL; \ + skel->links.name = bpf_program__attach(skel->progs.name); \ + if (!skel->links.name) \ + return -errno; \ + }) + +#define DEFINE_SAMPLE_INIT(name) \ + static int sample_init(struct name *skel, int mask) \ + { \ + int ret; \ + ret = __sample_init( \ + &skel->bss->sample_data, \ + bpf_map__fd(skel->maps.devmap_xmit_cnt_multi), mask); \ + if (ret < 0) \ + return ret; \ + if (mask & SAMPLE_REDIRECT_MAP_CNT) \ + __attach_tp(tp_xdp_redirect_map); \ + if (mask & SAMPLE_REDIRECT_CNT) \ + __attach_tp(tp_xdp_redirect); \ + if (mask & SAMPLE_REDIRECT_ERR_MAP_CNT) \ + __attach_tp(tp_xdp_redirect_map_err); \ + if (mask & SAMPLE_REDIRECT_ERR_CNT) \ + __attach_tp(tp_xdp_redirect_err); \ + if (mask & SAMPLE_CPUMAP_ENQUEUE_CNT) \ + __attach_tp(tp_xdp_cpumap_enqueue); \ + if (mask & SAMPLE_CPUMAP_KTHREAD_CNT) \ + __attach_tp(tp_xdp_cpumap_kthread); \ + if (mask & SAMPLE_EXCEPTION_CNT) \ + __attach_tp(tp_xdp_exception); \ + if (mask & SAMPLE_DEVMAP_XMIT_CNT) \ + __attach_tp(tp_xdp_devmap_xmit); \ + if (mask & SAMPLE_DEVMAP_XMIT_CNT_MULTI) \ + __attach_tp(tp_xdp_devmap_xmit_multi); \ + return 0; \ + } From patchwork Wed Jul 21 21:28:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kumar Kartikeya Dwivedi X-Patchwork-Id: 12392243 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 905EBC6377D for ; Wed, 21 Jul 2021 21:28:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7BE456120C for ; Wed, 21 Jul 2021 21:28:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229865AbhGUUsQ (ORCPT ); Wed, 21 Jul 2021 16:48:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40464 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229882AbhGUUsN (ORCPT ); Wed, 21 Jul 2021 16:48:13 -0400 Received: from mail-pj1-x1043.google.com (mail-pj1-x1043.google.com [IPv6:2607:f8b0:4864:20::1043]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F2C63C0613D3; Wed, 21 Jul 2021 14:28:49 -0700 (PDT) Received: by mail-pj1-x1043.google.com with SMTP id k4-20020a17090a5144b02901731c776526so1179994pjm.4; Wed, 21 Jul 2021 14:28:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=OOhR5AMX8RfXZC1KUEZgLZRVbLDpQr1d3ZP+g6mM52o=; b=f7I8PSYGSFwh0UMH4Rhb9th1KzMoxCRwihGokO/oSzSYWOk0FwZjnckyyLYuy9s1Xr x4lkDi4IjLgpfNv6/EMZBW6AnbX3iWZaWNQ/EY6bidb5EYBbwnciseVB6VAineL0INsq gv8jdpZ1QpsPvHzkiHa7Ouu42LhNC1sDbXS+w6JC7tvjS7/7H6jHuKvJK6KYHf2LE3ew f8LhrNNa07TqM9kxu08syfOXfIaMCqKZLfYigact000ct8YDJJtweT6TL40XOoiqIkRC 4Ljb1LH73YGUhandUwAo+g44eQkpjEBJoZLSdyK06Vf9xNP5LCSQeyPGDCLa0T/QohTt IFlQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=OOhR5AMX8RfXZC1KUEZgLZRVbLDpQr1d3ZP+g6mM52o=; b=D9OmNH1yOjl+ryB/2KYOPW/dGNYfqzEj/bwMr567jhpspCQZ3FJvH95/S5v3rs0lN2 mryE+zFqzHaaCkHLH2waFWmrlOlRnS6OK0heEASIX+dtw+ljJfzGFeZoVCTZ8GE6M2hA HmX3QMggWoZ5Ar5iPqSG/nJSypqfGhJYAwl/p31743s1jhyij3V7Xm7ZU8OQ3efZKsTb YPOkAhM87V0Sb5R8DO+RydWD5q2lHmaIJbBKZ/bLv/X6ASoBSsnizl4LrmG8wbVn1hDB xHpGbwS4dIgjYL2fAa/fIkTnzccRm+EhSDwOHsRVElMI002NJNvn0+KKbzc+xHA4XspY gzJA== X-Gm-Message-State: AOAM530txG2AVHHsBMMdSjPQvm22gbLBBpaaiMG+MMyoqhMRjE/nfOwz 9HcKDR1n+uDujQ4QtSi0PFI8tb1j6QN/4A== X-Google-Smtp-Source: ABdhPJyadMfFAIwnXKozqdBkOxwqP6kDKsqJqGzoGfTlWaWQGipbSAXFygDMyPpCDgXePBZdbrUfBQ== X-Received: by 2002:a65:568c:: with SMTP id v12mr38874281pgs.88.1626902929305; Wed, 21 Jul 2021 14:28:49 -0700 (PDT) Received: from localhost ([2405:201:6014:d0bb:dc30:f309:2f53:5818]) by smtp.gmail.com with ESMTPSA id r10sm31119125pga.48.2021.07.21.14.28.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Jul 2021 14:28:49 -0700 (PDT) From: Kumar Kartikeya Dwivedi To: bpf@vger.kernel.org Cc: Kumar Kartikeya Dwivedi , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Jesper Dangaard Brouer , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , netdev@vger.kernel.org Subject: [PATCH bpf-next v2 3/8] samples: bpf: Add BPF support for XDP samples helper Date: Thu, 22 Jul 2021 02:58:28 +0530 Message-Id: <20210721212833.701342-4-memxor@gmail.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210721212833.701342-1-memxor@gmail.com> References: <20210721212833.701342-1-memxor@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net These eBPF tracepoint programs export data that is consumed by the helpers added in the previous commit. Also add support int the Makefile to generate a vmlinux.h header that would be used by other bpf.c files in later commits. Signed-off-by: Kumar Kartikeya Dwivedi --- samples/bpf/Makefile | 25 ++++ samples/bpf/xdp_sample.bpf.c | 215 +++++++++++++++++++++++++++++++++++ samples/bpf/xdp_sample.bpf.h | 57 ++++++++++ 3 files changed, 297 insertions(+) create mode 100644 samples/bpf/xdp_sample.bpf.c create mode 100644 samples/bpf/xdp_sample.bpf.h diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile index 57ccff5ccac4..52ac1c6b56e2 100644 --- a/samples/bpf/Makefile +++ b/samples/bpf/Makefile @@ -280,6 +280,11 @@ $(LIBBPF): FORCE $(MAKE) -C $(dir $@) RM='rm -rf' EXTRA_CFLAGS="$(TPROGS_CFLAGS)" \ LDFLAGS=$(TPROGS_LDFLAGS) srctree=$(BPF_SAMPLES_PATH)/../../ O= +BPFTOOLDIR := $(TOOLS_PATH)/bpf/bpftool +BPFTOOL := $(BPFTOOLDIR)/bpftool +$(BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) + $(MAKE) -C $(BPFTOOLDIR) srctree=$(BPF_SAMPLES_PATH)/../../ + $(obj)/syscall_nrs.h: $(obj)/syscall_nrs.s FORCE $(call filechk,offsets,__SYSCALL_NRS_H__) @@ -317,6 +322,26 @@ $(obj)/hbm_edt_kern.o: $(src)/hbm.h $(src)/hbm_kern.h -include $(BPF_SAMPLES_PATH)/Makefile.target +VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \ + $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux) \ + ../../../../vmlinux \ + /sys/kernel/btf/vmlinux \ + /boot/vmlinux-$(shell uname -r) +VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS)))) + +ifeq ($(VMLINUX_BTF),) +$(error Cannot find a vmlinux for VMLINUX_BTF at any of "$(VMLINUX_BTF_PATHS)") +endif + +$(obj)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL) +ifeq ($(VMLINUX_H),) + $(Q)$(BPFTOOL) btf dump file $(VMLINUX_BTF) format c > $@ +else + $(Q)cp "$(VMLINUX_H)" $@ +endif + +clean-files += vmlinux.h + # asm/sysreg.h - inline assembly used by it is incompatible with llvm. # But, there is no easy way to fix it, so just exclude it since it is # useless for BPF samples. diff --git a/samples/bpf/xdp_sample.bpf.c b/samples/bpf/xdp_sample.bpf.c new file mode 100644 index 000000000000..3359b4791ca4 --- /dev/null +++ b/samples/bpf/xdp_sample.bpf.c @@ -0,0 +1,215 @@ +// SPDX-License-Identifier: GPL-2.0 +/* GPLv2, Copyright(c) 2017 Jesper Dangaard Brouer, Red Hat, Inc. */ +#include "xdp_sample.bpf.h" + +#include +#include +#include + +struct sample_data sample_data; +devmap_xmit_cnt_multi_t devmap_xmit_cnt_multi SEC(".maps"); + +static __always_inline +__u32 xdp_get_err_key(int err) +{ + switch (err) { + case 0: + return 0; + case -EINVAL: + return 2; + case -ENETDOWN: + return 3; + case -EMSGSIZE: + return 4; + case -EOPNOTSUPP: + return 5; + case -ENOSPC: + return 6; + default: + return 1; + } +} + +static __always_inline +int xdp_redirect_collect_stat(int err) +{ + u32 cpu = bpf_get_smp_processor_id(); + u32 key = XDP_REDIRECT_ERROR; + struct datarec *rec; + u32 idx; + + key = xdp_get_err_key(err); + idx = key * MAX_CPUS + cpu; + + if (idx < 7 * MAX_CPUS) { + rec = &sample_data.redirect_err_cnt[idx]; + if (key) + ATOMIC_INC_NORMW(rec->dropped); + else + ATOMIC_INC_NORMW(rec->processed); + } + return 0; /* Indicate event was filtered (no further processing)*/ + /* + * Returning 1 here would allow e.g. a perf-record tracepoint + * to see and record these events, but it doesn't work well + * in-practice as stopping perf-record also unload this + * bpf_prog. Plus, there is additional overhead of doing so. + */ +} + +SEC("tp_btf/xdp_redirect_err") +int BPF_PROG(tp_xdp_redirect_err, const struct net_device *dev, + const struct bpf_prog *xdp, + const void *tgt, int err, + const struct bpf_map *map, u32 index) +{ + return xdp_redirect_collect_stat(err); +} + +SEC("tp_btf/xdp_redirect_map_err") +int BPF_PROG(tp_xdp_redirect_map_err, const struct net_device *dev, + const struct bpf_prog *xdp, + const void *tgt, int err, + const struct bpf_map *map, u32 index) +{ + return xdp_redirect_collect_stat(err); +} + +SEC("tp_btf/xdp_redirect") +int BPF_PROG(tp_xdp_redirect, const struct net_device *dev, + const struct bpf_prog *xdp, + const void *tgt, int err, + const struct bpf_map *map, u32 index) +{ + return xdp_redirect_collect_stat(err); +} + +SEC("tp_btf/xdp_redirect_map") +int BPF_PROG(tp_xdp_redirect_map, const struct net_device *dev, + const struct bpf_prog *xdp, + const void *tgt, int err, + const struct bpf_map *map, u32 index) +{ + return xdp_redirect_collect_stat(err); +} + +SEC("tp_btf/xdp_exception") +int BPF_PROG(tp_xdp_exception, const struct net_device *dev, + const struct bpf_prog *xdp, u32 act) +{ + u32 cpu = bpf_get_smp_processor_id(); + struct datarec *rec; + u32 key = act, idx; + + if (key > XDP_REDIRECT) + key = XDP_REDIRECT + 1; + idx = key * MAX_CPUS + cpu; + + if (idx < 6 * MAX_CPUS) { + rec = &sample_data.exception_cnt[idx]; + ATOMIC_INC_NORMW(rec->dropped); + } + + return 0; +} + +SEC("tp_btf/xdp_cpumap_enqueue") +int BPF_PROG(tp_xdp_cpumap_enqueue, int map_id, unsigned int processed, + unsigned int drops, int to_cpu) +{ + u32 cpu = bpf_get_smp_processor_id(); + struct datarec *rec; + u32 idx; + + idx = to_cpu * MAX_CPUS + cpu; + + if (idx < MAX_CPUS * MAX_CPUS) { + rec = &sample_data.cpumap_enqueue_cnt[idx]; + + ATOMIC_ADD_NORMW(rec->processed, processed); + ATOMIC_ADD_NORMW(rec->dropped, drops); + /* Record bulk events, then userspace can calc average bulk size */ + if (processed > 0) + ATOMIC_INC_NORMW(rec->issue); + } + /* Inception: It's possible to detect overload situations, via + * this tracepoint. This can be used for creating a feedback + * loop to XDP, which can take appropriate actions to mitigate + * this overload situation. + */ + return 0; +} + +SEC("tp_btf/xdp_cpumap_kthread") +int BPF_PROG(tp_xdp_cpumap_kthread, int map_id, unsigned int processed, + unsigned int drops, int sched, struct xdp_cpumap_stats *xdp_stats) +{ + // Using xdp_stats directly fails verification + u32 cpu = bpf_get_smp_processor_id(); + struct datarec *rec; + + if (cpu < MAX_CPUS) { + rec = &sample_data.cpumap_kthread_cnt[cpu]; + + ATOMIC_ADD_NORMW(rec->processed, processed); + ATOMIC_ADD_NORMW(rec->dropped, drops); + ATOMIC_ADD_NORMW(rec->xdp_pass, xdp_stats->pass); + ATOMIC_ADD_NORMW(rec->xdp_drop, xdp_stats->drop); + ATOMIC_ADD_NORMW(rec->xdp_redirect, xdp_stats->redirect); + /* Count times kthread yielded CPU via schedule call */ + if (sched) + ATOMIC_INC_NORMW(rec->issue); + } + return 0; +} + +SEC("tp_btf/xdp_devmap_xmit") +int BPF_PROG(tp_xdp_devmap_xmit, const struct net_device *from_dev, + const struct net_device *to_dev, int sent, int drops, int err) +{ + u32 cpu = bpf_get_smp_processor_id(); + struct datarec *rec; + + if (cpu < MAX_CPUS) { + rec = &sample_data.devmap_xmit_cnt[cpu]; + + ATOMIC_ADD_NORMW(rec->processed, sent); + ATOMIC_ADD_NORMW(rec->dropped, drops); + /* Record bulk events, then userspace can calc average bulk size */ + ATOMIC_INC_NORMW(rec->info); + /* Record error cases, where no frame were sent */ + /* Catch API error of drv ndo_xdp_xmit sent more than count */ + if (err || drops < 0) + ATOMIC_INC_NORMW(rec->issue); + } + return 0; +} + +SEC("tp_btf/xdp_devmap_xmit") +int BPF_PROG(tp_xdp_devmap_xmit_multi, const struct net_device *from_dev, + const struct net_device *to_dev, int sent, int drops, int err) +{ + struct datarec empty = {}; + struct datarec *rec; + int idx_in, idx_out; + u64 idx; + + idx_in = from_dev->ifindex; + idx_out = to_dev->ifindex; + idx = idx_in; + idx = idx << 32 | idx_out; + + bpf_map_update_elem(&devmap_xmit_cnt_multi, &idx, &empty, BPF_NOEXIST); + rec = bpf_map_lookup_elem(&devmap_xmit_cnt_multi, &idx); + if (!rec) + return 0; + + ATOMIC_ADD_NORMW(rec->processed, sent); + ATOMIC_ADD_NORMW(rec->dropped, drops); + ATOMIC_INC_NORMW(rec->info); + + if (err || drops < 0) + ATOMIC_INC_NORMW(rec->issue); + + return 0; +} diff --git a/samples/bpf/xdp_sample.bpf.h b/samples/bpf/xdp_sample.bpf.h new file mode 100644 index 000000000000..20fbafe28be5 --- /dev/null +++ b/samples/bpf/xdp_sample.bpf.h @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: GPL-2.0 +#ifndef _XDP_SAMPLE_BPF_H +#define _XDP_SAMPLE_BPF_H + +#include "vmlinux.h" +#include +#include +#include + +#include "xdp_sample_shared.h" + +#define ETH_ALEN 6 +#define ETH_P_802_3_MIN 0x0600 +#define ETH_P_8021Q 0x8100 +#define ETH_P_8021AD 0x88A8 +#define ETH_P_IP 0x0800 +#define ETH_P_IPV6 0x86DD +#define ETH_P_ARP 0x0806 +#define IPPROTO_ICMPV6 58 + +#define EINVAL 22 +#define ENETDOWN 100 +#define EMSGSIZE 90 +#define EOPNOTSUPP 95 +#define ENOSPC 28 + +extern struct sample_data sample_data; + +typedef struct { + __uint(type, BPF_MAP_TYPE_PERCPU_HASH); + __uint(max_entries, 32); + __type(key, u64); + __type(value, struct datarec); +} devmap_xmit_cnt_multi_t; + +enum { + XDP_REDIRECT_SUCCESS = 0, + XDP_REDIRECT_ERROR = 1 +}; + +static __always_inline void swap_src_dst_mac(void *data) +{ + unsigned short *p = data; + unsigned short dst[3]; + + dst[0] = p[0]; + dst[1] = p[1]; + dst[2] = p[2]; + p[0] = p[3]; + p[1] = p[4]; + p[2] = p[5]; + p[3] = dst[0]; + p[4] = dst[1]; + p[5] = dst[2]; +} + +#endif From patchwork Wed Jul 21 21:28:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kumar Kartikeya Dwivedi X-Patchwork-Id: 12392245 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 96BB2C6377A for ; Wed, 21 Jul 2021 21:28:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7DB086120D for ; Wed, 21 Jul 2021 21:28:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229871AbhGUUsU (ORCPT ); Wed, 21 Jul 2021 16:48:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40470 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229900AbhGUUsR (ORCPT ); Wed, 21 Jul 2021 16:48:17 -0400 Received: from mail-pj1-x1042.google.com (mail-pj1-x1042.google.com [IPv6:2607:f8b0:4864:20::1042]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 549B9C061575; Wed, 21 Jul 2021 14:28:53 -0700 (PDT) Received: by mail-pj1-x1042.google.com with SMTP id bt15so3233867pjb.2; Wed, 21 Jul 2021 14:28:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=huKNr7tn2SOucVczag+CDXgcZAudbGsrmWWCgckhSXA=; b=Wd9ec7sKGZuxYb8lTNz4PT8c5nvf3wHi4SOk7/+0ZFjrb4OyPqLUQzCARcWczBC8it QF7kAirSrG3BNilShi9I5yOsFQkF4eyrqrajD2R5A9ybDtDfm/uOlM/bzakDa/bUCx1z Wkh5PDVf0gxHsQ7UHTfwyuIB7HBXw0EOWUjDN78kpB50HJd9b70JA4NdD5ZTi6UvgV/y pbCSwrWFue3gNIMLlzCNTlO/DRNKo9n1mbx/o193+UveZlTteGDXOQ7NxIfP++0dgKR1 d7UdT3ACLzzhB9DHr16hJaSN9jT2+oeYEx/OC/1QwOtxUI0Mkw3pkSK/gVfAsGJ9Vkb5 6nKg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=huKNr7tn2SOucVczag+CDXgcZAudbGsrmWWCgckhSXA=; b=cKKTMmVD+QyI08s9U1ZA+YfliGUlfPjAPygGQy++WQGlYlUqGyh9JjlvI5GgxDQRzQ Xcy4DIQ9vEggezFp1hDpmFwH9cHVgzitDCw81CYiA0wrBb7m+9yzkvTxdKC/3j3bL/Pj LNbWgn6oHjUTvNvxkJwLx8APHTOt1XBuD+2qJvZqqWCqZf//tsOqhDQ0G69+WisBmxHR 66BHLJDRSF1xVVvTzBM6Jiz01qJ6Pr1LD1vutZNNqqIItHfsfyZQCzidBzEIIwq1xbrd Av+ja5Ynbd2vPuQfkYjAlAyp+nfWp9N737FMn4w39/dTeiI7R02c/S65qYHDT4Dam+A5 5eeA== X-Gm-Message-State: AOAM5338aRKWulR5QKB9EBPOuh7gNHwmyFL0eOWCWroyFHMDaepfUfZt LtGMW++z7jsEgkdO382LqQ86x5oVXmzyuQ== X-Google-Smtp-Source: ABdhPJwYJgJvLwRWKnmetZj1SJw8J0hQQVCufAI8e23RKLRAxERQwnmmKPgEZzIM6MW0Ya2OfVrWNw== X-Received: by 2002:a17:90a:a087:: with SMTP id r7mr5844666pjp.84.1626902932456; Wed, 21 Jul 2021 14:28:52 -0700 (PDT) Received: from localhost ([2405:201:6014:d0bb:dc30:f309:2f53:5818]) by smtp.gmail.com with ESMTPSA id w23sm14298766pfc.60.2021.07.21.14.28.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Jul 2021 14:28:52 -0700 (PDT) From: Kumar Kartikeya Dwivedi To: bpf@vger.kernel.org Cc: Kumar Kartikeya Dwivedi , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Jesper Dangaard Brouer , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , netdev@vger.kernel.org Subject: [PATCH bpf-next v2 4/8] samples: bpf: Convert xdp_monitor to use XDP samples helper Date: Thu, 22 Jul 2021 02:58:29 +0530 Message-Id: <20210721212833.701342-5-memxor@gmail.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210721212833.701342-1-memxor@gmail.com> References: <20210721212833.701342-1-memxor@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net This change converts XDP monitor tool to use the XDP samples support introduced in previous changes. Makefile support is added to produce the BPF skeleton and link xdp_monitor bpf object with xdp_sample bpf object. This is then used to open and load the bpf object and attach the needed tracepoints only. Signed-off-by: Kumar Kartikeya Dwivedi --- samples/bpf/Makefile | 49 ++- samples/bpf/xdp_monitor.bpf.c | 8 + samples/bpf/xdp_monitor_kern.c | 257 ----------- samples/bpf/xdp_monitor_user.c | 768 +++------------------------------ 4 files changed, 109 insertions(+), 973 deletions(-) create mode 100644 samples/bpf/xdp_monitor.bpf.c delete mode 100644 samples/bpf/xdp_monitor_kern.c diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile index 52ac1c6b56e2..1b4838b2beb0 100644 --- a/samples/bpf/Makefile +++ b/samples/bpf/Makefile @@ -43,7 +43,6 @@ tprogs-y += xdp_redirect tprogs-y += xdp_redirect_map tprogs-y += xdp_redirect_map_multi tprogs-y += xdp_redirect_cpu -tprogs-y += xdp_monitor tprogs-y += xdp_rxq_info tprogs-y += syscall_tp tprogs-y += cpustat @@ -57,6 +56,8 @@ tprogs-y += xdp_sample_pkts tprogs-y += ibumad tprogs-y += hbm +tprogs-y += xdp_monitor + # Libbpf dependencies LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a @@ -103,7 +104,6 @@ xdp_redirect-objs := xdp_redirect_user.o xdp_redirect_map-objs := xdp_redirect_map_user.o xdp_redirect_map_multi-objs := xdp_redirect_map_multi_user.o xdp_redirect_cpu-objs := xdp_redirect_cpu_user.o -xdp_monitor-objs := xdp_monitor_user.o xdp_rxq_info-objs := xdp_rxq_info_user.o syscall_tp-objs := syscall_tp_user.o cpustat-objs := cpustat_user.o @@ -118,6 +118,7 @@ ibumad-objs := ibumad_user.o hbm-objs := hbm.o $(CGROUP_HELPERS) xdp_sample_user-objs := xdp_sample_user.o $(LIBBPFDIR)/hashmap.o +xdp_monitor-objs := xdp_monitor_user.o $(XDP_SAMPLE) # Tell kbuild to always build the programs always-y := $(tprogs-y) @@ -167,7 +168,6 @@ always-y += xdp_redirect_kern.o always-y += xdp_redirect_map_kern.o always-y += xdp_redirect_map_multi_kern.o always-y += xdp_redirect_cpu_kern.o -always-y += xdp_monitor_kern.o always-y += xdp_rxq_info_kern.o always-y += xdp2skb_meta_kern.o always-y += syscall_tp_kern.o @@ -315,6 +315,8 @@ verify_target_bpf: verify_cmds $(BPF_SAMPLES_PATH)/*.c: verify_target_bpf $(LIBBPF) $(src)/*.c: verify_target_bpf $(LIBBPF) +$(obj)/xdp_monitor_user.o: $(obj)/xdp_monitor.skel.h + $(obj)/tracex5_kern.o: $(obj)/syscall_nrs.h $(obj)/hbm_out_kern.o: $(src)/hbm.h $(src)/hbm_kern.h $(obj)/hbm.o: $(src)/hbm.h @@ -342,6 +344,47 @@ endif clean-files += vmlinux.h +# Get Clang's default includes on this system, as opposed to those seen by +# '-target bpf'. This fixes "missing" files on some architectures/distros, +# such as asm/byteorder.h, asm/socket.h, asm/sockios.h, sys/cdefs.h etc. +# +# Use '-idirafter': Don't interfere with include mechanics except where the +# build would have failed anyways. +define get_sys_includes +$(shell $(1) -v -E - &1 \ + | sed -n '/<...> search starts here:/,/End of search list./{ s| \(/.*\)|-idirafter \1|p }') \ +$(shell $(1) -dM -E - $@ + # asm/sysreg.h - inline assembly used by it is incompatible with llvm. # But, there is no easy way to fix it, so just exclude it since it is # useless for BPF samples. diff --git a/samples/bpf/xdp_monitor.bpf.c b/samples/bpf/xdp_monitor.bpf.c new file mode 100644 index 000000000000..cfb41e2205f4 --- /dev/null +++ b/samples/bpf/xdp_monitor.bpf.c @@ -0,0 +1,8 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2017-2018 Jesper Dangaard Brouer, Red Hat Inc. + * + * XDP monitor tool, based on tracepoints + */ +#include "xdp_sample.bpf.h" + +char _license[] SEC("license") = "GPL"; diff --git a/samples/bpf/xdp_monitor_kern.c b/samples/bpf/xdp_monitor_kern.c deleted file mode 100644 index 5c955b812c47..000000000000 --- a/samples/bpf/xdp_monitor_kern.c +++ /dev/null @@ -1,257 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 - * Copyright(c) 2017-2018 Jesper Dangaard Brouer, Red Hat Inc. - * - * XDP monitor tool, based on tracepoints - */ -#include -#include - -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, u64); - __uint(max_entries, 2); - /* TODO: have entries for all possible errno's */ -} redirect_err_cnt SEC(".maps"); - -#define XDP_UNKNOWN XDP_REDIRECT + 1 -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, u64); - __uint(max_entries, XDP_UNKNOWN + 1); -} exception_cnt SEC(".maps"); - -/* Tracepoint format: /sys/kernel/debug/tracing/events/xdp/xdp_redirect/format - * Code in: kernel/include/trace/events/xdp.h - */ -struct xdp_redirect_ctx { - u64 __pad; // First 8 bytes are not accessible by bpf code - int prog_id; // offset:8; size:4; signed:1; - u32 act; // offset:12 size:4; signed:0; - int ifindex; // offset:16 size:4; signed:1; - int err; // offset:20 size:4; signed:1; - int to_ifindex; // offset:24 size:4; signed:1; - u32 map_id; // offset:28 size:4; signed:0; - int map_index; // offset:32 size:4; signed:1; -}; // offset:36 - -enum { - XDP_REDIRECT_SUCCESS = 0, - XDP_REDIRECT_ERROR = 1 -}; - -static __always_inline -int xdp_redirect_collect_stat(struct xdp_redirect_ctx *ctx) -{ - u32 key = XDP_REDIRECT_ERROR; - int err = ctx->err; - u64 *cnt; - - if (!err) - key = XDP_REDIRECT_SUCCESS; - - cnt = bpf_map_lookup_elem(&redirect_err_cnt, &key); - if (!cnt) - return 1; - *cnt += 1; - - return 0; /* Indicate event was filtered (no further processing)*/ - /* - * Returning 1 here would allow e.g. a perf-record tracepoint - * to see and record these events, but it doesn't work well - * in-practice as stopping perf-record also unload this - * bpf_prog. Plus, there is additional overhead of doing so. - */ -} - -SEC("tracepoint/xdp/xdp_redirect_err") -int trace_xdp_redirect_err(struct xdp_redirect_ctx *ctx) -{ - return xdp_redirect_collect_stat(ctx); -} - - -SEC("tracepoint/xdp/xdp_redirect_map_err") -int trace_xdp_redirect_map_err(struct xdp_redirect_ctx *ctx) -{ - return xdp_redirect_collect_stat(ctx); -} - -/* Likely unloaded when prog starts */ -SEC("tracepoint/xdp/xdp_redirect") -int trace_xdp_redirect(struct xdp_redirect_ctx *ctx) -{ - return xdp_redirect_collect_stat(ctx); -} - -/* Likely unloaded when prog starts */ -SEC("tracepoint/xdp/xdp_redirect_map") -int trace_xdp_redirect_map(struct xdp_redirect_ctx *ctx) -{ - return xdp_redirect_collect_stat(ctx); -} - -/* Tracepoint format: /sys/kernel/debug/tracing/events/xdp/xdp_exception/format - * Code in: kernel/include/trace/events/xdp.h - */ -struct xdp_exception_ctx { - u64 __pad; // First 8 bytes are not accessible by bpf code - int prog_id; // offset:8; size:4; signed:1; - u32 act; // offset:12; size:4; signed:0; - int ifindex; // offset:16; size:4; signed:1; -}; - -SEC("tracepoint/xdp/xdp_exception") -int trace_xdp_exception(struct xdp_exception_ctx *ctx) -{ - u64 *cnt; - u32 key; - - key = ctx->act; - if (key > XDP_REDIRECT) - key = XDP_UNKNOWN; - - cnt = bpf_map_lookup_elem(&exception_cnt, &key); - if (!cnt) - return 1; - *cnt += 1; - - return 0; -} - -/* Common stats data record shared with _user.c */ -struct datarec { - u64 processed; - u64 dropped; - u64 info; - u64 err; -}; -#define MAX_CPUS 64 - -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, struct datarec); - __uint(max_entries, MAX_CPUS); -} cpumap_enqueue_cnt SEC(".maps"); - -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, struct datarec); - __uint(max_entries, 1); -} cpumap_kthread_cnt SEC(".maps"); - -/* Tracepoint: /sys/kernel/debug/tracing/events/xdp/xdp_cpumap_enqueue/format - * Code in: kernel/include/trace/events/xdp.h - */ -struct cpumap_enqueue_ctx { - u64 __pad; // First 8 bytes are not accessible by bpf code - int map_id; // offset:8; size:4; signed:1; - u32 act; // offset:12; size:4; signed:0; - int cpu; // offset:16; size:4; signed:1; - unsigned int drops; // offset:20; size:4; signed:0; - unsigned int processed; // offset:24; size:4; signed:0; - int to_cpu; // offset:28; size:4; signed:1; -}; - -SEC("tracepoint/xdp/xdp_cpumap_enqueue") -int trace_xdp_cpumap_enqueue(struct cpumap_enqueue_ctx *ctx) -{ - u32 to_cpu = ctx->to_cpu; - struct datarec *rec; - - if (to_cpu >= MAX_CPUS) - return 1; - - rec = bpf_map_lookup_elem(&cpumap_enqueue_cnt, &to_cpu); - if (!rec) - return 0; - rec->processed += ctx->processed; - rec->dropped += ctx->drops; - - /* Record bulk events, then userspace can calc average bulk size */ - if (ctx->processed > 0) - rec->info += 1; - - return 0; -} - -/* Tracepoint: /sys/kernel/debug/tracing/events/xdp/xdp_cpumap_kthread/format - * Code in: kernel/include/trace/events/xdp.h - */ -struct cpumap_kthread_ctx { - u64 __pad; // First 8 bytes are not accessible by bpf code - int map_id; // offset:8; size:4; signed:1; - u32 act; // offset:12; size:4; signed:0; - int cpu; // offset:16; size:4; signed:1; - unsigned int drops; // offset:20; size:4; signed:0; - unsigned int processed; // offset:24; size:4; signed:0; - int sched; // offset:28; size:4; signed:1; -}; - -SEC("tracepoint/xdp/xdp_cpumap_kthread") -int trace_xdp_cpumap_kthread(struct cpumap_kthread_ctx *ctx) -{ - struct datarec *rec; - u32 key = 0; - - rec = bpf_map_lookup_elem(&cpumap_kthread_cnt, &key); - if (!rec) - return 0; - rec->processed += ctx->processed; - rec->dropped += ctx->drops; - - /* Count times kthread yielded CPU via schedule call */ - if (ctx->sched) - rec->info++; - - return 0; -} - -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, struct datarec); - __uint(max_entries, 1); -} devmap_xmit_cnt SEC(".maps"); - -/* Tracepoint: /sys/kernel/debug/tracing/events/xdp/xdp_devmap_xmit/format - * Code in: kernel/include/trace/events/xdp.h - */ -struct devmap_xmit_ctx { - u64 __pad; // First 8 bytes are not accessible by bpf code - int from_ifindex; // offset:8; size:4; signed:1; - u32 act; // offset:12; size:4; signed:0; - int to_ifindex; // offset:16; size:4; signed:1; - int drops; // offset:20; size:4; signed:1; - int sent; // offset:24; size:4; signed:1; - int err; // offset:28; size:4; signed:1; -}; - -SEC("tracepoint/xdp/xdp_devmap_xmit") -int trace_xdp_devmap_xmit(struct devmap_xmit_ctx *ctx) -{ - struct datarec *rec; - u32 key = 0; - - rec = bpf_map_lookup_elem(&devmap_xmit_cnt, &key); - if (!rec) - return 0; - rec->processed += ctx->sent; - rec->dropped += ctx->drops; - - /* Record bulk events, then userspace can calc average bulk size */ - rec->info += 1; - - /* Record error cases, where no frame were sent */ - if (ctx->err) - rec->err++; - - /* Catch API error of drv ndo_xdp_xmit sent more than count */ - if (ctx->drops < 0) - rec->err++; - - return 1; -} diff --git a/samples/bpf/xdp_monitor_user.c b/samples/bpf/xdp_monitor_user.c index 49ebc49aefc3..3ad2d2a0d439 100644 --- a/samples/bpf/xdp_monitor_user.c +++ b/samples/bpf/xdp_monitor_user.c @@ -1,14 +1,13 @@ -/* SPDX-License-Identifier: GPL-2.0 - * Copyright(c) 2017 Jesper Dangaard Brouer, Red Hat, Inc. - */ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2017 Jesper Dangaard Brouer, Red Hat, Inc. */ static const char *__doc__= - "XDP monitor tool, based on tracepoints\n" +"XDP monitor tool, based on tracepoints\n" ; static const char *__doc_err_only__= - " NOTICE: Only tracking XDP redirect errors\n" - " Enable TX success stats via '--stats'\n" - " (which comes with a per packet processing overhead)\n" +" NOTICE: Only tracking XDP redirect errors\n" +" Enable redirect success stats via '-s/--stats'\n" +" (which comes with a per packet processing overhead)\n" ; #include @@ -20,68 +19,37 @@ static const char *__doc_err_only__= #include #include #include - #include #include #include #include - #include #include #include #include "bpf_util.h" +#include "xdp_sample_user.h" +#include "xdp_monitor.skel.h" -enum map_type { - REDIRECT_ERR_CNT, - EXCEPTION_CNT, - CPUMAP_ENQUEUE_CNT, - CPUMAP_KTHREAD_CNT, - DEVMAP_XMIT_CNT, -}; +static int mask = SAMPLE_REDIRECT_ERR_CNT | SAMPLE_CPUMAP_ENQUEUE_CNT | + SAMPLE_CPUMAP_KTHREAD_CNT | SAMPLE_EXCEPTION_CNT | + SAMPLE_DEVMAP_XMIT_CNT | SAMPLE_DEVMAP_XMIT_CNT_MULTI; -static const char *const map_type_strings[] = { - [REDIRECT_ERR_CNT] = "redirect_err_cnt", - [EXCEPTION_CNT] = "exception_cnt", - [CPUMAP_ENQUEUE_CNT] = "cpumap_enqueue_cnt", - [CPUMAP_KTHREAD_CNT] = "cpumap_kthread_cnt", - [DEVMAP_XMIT_CNT] = "devmap_xmit_cnt", -}; - -#define NUM_MAP 5 -#define NUM_TP 8 - -static int tp_cnt; -static int map_cnt; -static int verbose = 1; -static bool debug = false; -struct bpf_map *map_data[NUM_MAP] = {}; -struct bpf_link *tp_links[NUM_TP] = {}; -struct bpf_object *obj; +DEFINE_SAMPLE_INIT(xdp_monitor); static const struct option long_options[] = { - {"help", no_argument, NULL, 'h' }, - {"debug", no_argument, NULL, 'D' }, - {"stats", no_argument, NULL, 'S' }, - {"sec", required_argument, NULL, 's' }, - {0, 0, NULL, 0 } + { "help", no_argument, NULL, 'h' }, + { "stats", no_argument, NULL, 's' }, + { "interval", required_argument, NULL, 'i' }, + { "verbose", no_argument, NULL, 'v' }, + {} }; -static void int_exit(int sig) -{ - /* Detach tracepoints */ - while (tp_cnt) - bpf_link__destroy(tp_links[--tp_cnt]); - - bpf_object__close(obj); - exit(0); -} - -/* C standard specifies two constants, EXIT_SUCCESS(0) and EXIT_FAILURE(1) */ -#define EXIT_FAIL_MEM 5 - static void usage(char *argv[]) { int i; + + sample_print_help(mask); + printf("\nDOCUMENTATION:\n%s\n", __doc__); printf("\n"); printf(" Usage: %s (options-see-below)\n", @@ -100,615 +68,27 @@ static void usage(char *argv[]) printf("\n"); } -#define NANOSEC_PER_SEC 1000000000 /* 10^9 */ -static __u64 gettime(void) -{ - struct timespec t; - int res; - - res = clock_gettime(CLOCK_MONOTONIC, &t); - if (res < 0) { - fprintf(stderr, "Error with gettimeofday! (%i)\n", res); - exit(EXIT_FAILURE); - } - return (__u64) t.tv_sec * NANOSEC_PER_SEC + t.tv_nsec; -} - -enum { - REDIR_SUCCESS = 0, - REDIR_ERROR = 1, -}; -#define REDIR_RES_MAX 2 -static const char *redir_names[REDIR_RES_MAX] = { - [REDIR_SUCCESS] = "Success", - [REDIR_ERROR] = "Error", -}; -static const char *err2str(int err) -{ - if (err < REDIR_RES_MAX) - return redir_names[err]; - return NULL; -} -/* enum xdp_action */ -#define XDP_UNKNOWN XDP_REDIRECT + 1 -#define XDP_ACTION_MAX (XDP_UNKNOWN + 1) -static const char *xdp_action_names[XDP_ACTION_MAX] = { - [XDP_ABORTED] = "XDP_ABORTED", - [XDP_DROP] = "XDP_DROP", - [XDP_PASS] = "XDP_PASS", - [XDP_TX] = "XDP_TX", - [XDP_REDIRECT] = "XDP_REDIRECT", - [XDP_UNKNOWN] = "XDP_UNKNOWN", -}; -static const char *action2str(int action) -{ - if (action < XDP_ACTION_MAX) - return xdp_action_names[action]; - return NULL; -} - -/* Common stats data record shared with _kern.c */ -struct datarec { - __u64 processed; - __u64 dropped; - __u64 info; - __u64 err; -}; -#define MAX_CPUS 64 - -/* Userspace structs for collection of stats from maps */ -struct record { - __u64 timestamp; - struct datarec total; - struct datarec *cpu; -}; -struct u64rec { - __u64 processed; -}; -struct record_u64 { - /* record for _kern side __u64 values */ - __u64 timestamp; - struct u64rec total; - struct u64rec *cpu; -}; - -struct stats_record { - struct record_u64 xdp_redirect[REDIR_RES_MAX]; - struct record_u64 xdp_exception[XDP_ACTION_MAX]; - struct record xdp_cpumap_kthread; - struct record xdp_cpumap_enqueue[MAX_CPUS]; - struct record xdp_devmap_xmit; -}; - -static bool map_collect_record(int fd, __u32 key, struct record *rec) -{ - /* For percpu maps, userspace gets a value per possible CPU */ - unsigned int nr_cpus = bpf_num_possible_cpus(); - struct datarec values[nr_cpus]; - __u64 sum_processed = 0; - __u64 sum_dropped = 0; - __u64 sum_info = 0; - __u64 sum_err = 0; - int i; - - if ((bpf_map_lookup_elem(fd, &key, values)) != 0) { - fprintf(stderr, - "ERR: bpf_map_lookup_elem failed key:0x%X\n", key); - return false; - } - /* Get time as close as possible to reading map contents */ - rec->timestamp = gettime(); - - /* Record and sum values from each CPU */ - for (i = 0; i < nr_cpus; i++) { - rec->cpu[i].processed = values[i].processed; - sum_processed += values[i].processed; - rec->cpu[i].dropped = values[i].dropped; - sum_dropped += values[i].dropped; - rec->cpu[i].info = values[i].info; - sum_info += values[i].info; - rec->cpu[i].err = values[i].err; - sum_err += values[i].err; - } - rec->total.processed = sum_processed; - rec->total.dropped = sum_dropped; - rec->total.info = sum_info; - rec->total.err = sum_err; - return true; -} - -static bool map_collect_record_u64(int fd, __u32 key, struct record_u64 *rec) -{ - /* For percpu maps, userspace gets a value per possible CPU */ - unsigned int nr_cpus = bpf_num_possible_cpus(); - struct u64rec values[nr_cpus]; - __u64 sum_total = 0; - int i; - - if ((bpf_map_lookup_elem(fd, &key, values)) != 0) { - fprintf(stderr, - "ERR: bpf_map_lookup_elem failed key:0x%X\n", key); - return false; - } - /* Get time as close as possible to reading map contents */ - rec->timestamp = gettime(); - - /* Record and sum values from each CPU */ - for (i = 0; i < nr_cpus; i++) { - rec->cpu[i].processed = values[i].processed; - sum_total += values[i].processed; - } - rec->total.processed = sum_total; - return true; -} - -static double calc_period(struct record *r, struct record *p) -{ - double period_ = 0; - __u64 period = 0; - - period = r->timestamp - p->timestamp; - if (period > 0) - period_ = ((double) period / NANOSEC_PER_SEC); - - return period_; -} - -static double calc_period_u64(struct record_u64 *r, struct record_u64 *p) -{ - double period_ = 0; - __u64 period = 0; - - period = r->timestamp - p->timestamp; - if (period > 0) - period_ = ((double) period / NANOSEC_PER_SEC); - - return period_; -} - -static double calc_pps(struct datarec *r, struct datarec *p, double period) -{ - __u64 packets = 0; - double pps = 0; - - if (period > 0) { - packets = r->processed - p->processed; - pps = packets / period; - } - return pps; -} - -static double calc_pps_u64(struct u64rec *r, struct u64rec *p, double period) -{ - __u64 packets = 0; - double pps = 0; - - if (period > 0) { - packets = r->processed - p->processed; - pps = packets / period; - } - return pps; -} - -static double calc_drop(struct datarec *r, struct datarec *p, double period) -{ - __u64 packets = 0; - double pps = 0; - - if (period > 0) { - packets = r->dropped - p->dropped; - pps = packets / period; - } - return pps; -} - -static double calc_info(struct datarec *r, struct datarec *p, double period) -{ - __u64 packets = 0; - double pps = 0; - - if (period > 0) { - packets = r->info - p->info; - pps = packets / period; - } - return pps; -} - -static double calc_err(struct datarec *r, struct datarec *p, double period) -{ - __u64 packets = 0; - double pps = 0; - - if (period > 0) { - packets = r->err - p->err; - pps = packets / period; - } - return pps; -} - -static void stats_print(struct stats_record *stats_rec, - struct stats_record *stats_prev, - bool err_only) -{ - unsigned int nr_cpus = bpf_num_possible_cpus(); - int rec_i = 0, i, to_cpu; - double t = 0, pps = 0; - - /* Header */ - printf("%-15s %-7s %-12s %-12s %-9s\n", - "XDP-event", "CPU:to", "pps", "drop-pps", "extra-info"); - - /* tracepoint: xdp:xdp_redirect_* */ - if (err_only) - rec_i = REDIR_ERROR; - - for (; rec_i < REDIR_RES_MAX; rec_i++) { - struct record_u64 *rec, *prev; - char *fmt1 = "%-15s %-7d %'-12.0f %'-12.0f %s\n"; - char *fmt2 = "%-15s %-7s %'-12.0f %'-12.0f %s\n"; - - rec = &stats_rec->xdp_redirect[rec_i]; - prev = &stats_prev->xdp_redirect[rec_i]; - t = calc_period_u64(rec, prev); - - for (i = 0; i < nr_cpus; i++) { - struct u64rec *r = &rec->cpu[i]; - struct u64rec *p = &prev->cpu[i]; - - pps = calc_pps_u64(r, p, t); - if (pps > 0) - printf(fmt1, "XDP_REDIRECT", i, - rec_i ? 0.0: pps, rec_i ? pps : 0.0, - err2str(rec_i)); - } - pps = calc_pps_u64(&rec->total, &prev->total, t); - printf(fmt2, "XDP_REDIRECT", "total", - rec_i ? 0.0: pps, rec_i ? pps : 0.0, err2str(rec_i)); - } - - /* tracepoint: xdp:xdp_exception */ - for (rec_i = 0; rec_i < XDP_ACTION_MAX; rec_i++) { - struct record_u64 *rec, *prev; - char *fmt1 = "%-15s %-7d %'-12.0f %'-12.0f %s\n"; - char *fmt2 = "%-15s %-7s %'-12.0f %'-12.0f %s\n"; - - rec = &stats_rec->xdp_exception[rec_i]; - prev = &stats_prev->xdp_exception[rec_i]; - t = calc_period_u64(rec, prev); - - for (i = 0; i < nr_cpus; i++) { - struct u64rec *r = &rec->cpu[i]; - struct u64rec *p = &prev->cpu[i]; - - pps = calc_pps_u64(r, p, t); - if (pps > 0) - printf(fmt1, "Exception", i, - 0.0, pps, action2str(rec_i)); - } - pps = calc_pps_u64(&rec->total, &prev->total, t); - if (pps > 0) - printf(fmt2, "Exception", "total", - 0.0, pps, action2str(rec_i)); - } - - /* cpumap enqueue stats */ - for (to_cpu = 0; to_cpu < MAX_CPUS; to_cpu++) { - char *fmt1 = "%-15s %3d:%-3d %'-12.0f %'-12.0f %'-10.2f %s\n"; - char *fmt2 = "%-15s %3s:%-3d %'-12.0f %'-12.0f %'-10.2f %s\n"; - struct record *rec, *prev; - char *info_str = ""; - double drop, info; - - rec = &stats_rec->xdp_cpumap_enqueue[to_cpu]; - prev = &stats_prev->xdp_cpumap_enqueue[to_cpu]; - t = calc_period(rec, prev); - for (i = 0; i < nr_cpus; i++) { - struct datarec *r = &rec->cpu[i]; - struct datarec *p = &prev->cpu[i]; - - pps = calc_pps(r, p, t); - drop = calc_drop(r, p, t); - info = calc_info(r, p, t); - if (info > 0) { - info_str = "bulk-average"; - info = pps / info; /* calc average bulk size */ - } - if (pps > 0) - printf(fmt1, "cpumap-enqueue", - i, to_cpu, pps, drop, info, info_str); - } - pps = calc_pps(&rec->total, &prev->total, t); - if (pps > 0) { - drop = calc_drop(&rec->total, &prev->total, t); - info = calc_info(&rec->total, &prev->total, t); - if (info > 0) { - info_str = "bulk-average"; - info = pps / info; /* calc average bulk size */ - } - printf(fmt2, "cpumap-enqueue", - "sum", to_cpu, pps, drop, info, info_str); - } - } - - /* cpumap kthread stats */ - { - char *fmt1 = "%-15s %-7d %'-12.0f %'-12.0f %'-10.0f %s\n"; - char *fmt2 = "%-15s %-7s %'-12.0f %'-12.0f %'-10.0f %s\n"; - struct record *rec, *prev; - double drop, info; - char *i_str = ""; - - rec = &stats_rec->xdp_cpumap_kthread; - prev = &stats_prev->xdp_cpumap_kthread; - t = calc_period(rec, prev); - for (i = 0; i < nr_cpus; i++) { - struct datarec *r = &rec->cpu[i]; - struct datarec *p = &prev->cpu[i]; - - pps = calc_pps(r, p, t); - drop = calc_drop(r, p, t); - info = calc_info(r, p, t); - if (info > 0) - i_str = "sched"; - if (pps > 0 || drop > 0) - printf(fmt1, "cpumap-kthread", - i, pps, drop, info, i_str); - } - pps = calc_pps(&rec->total, &prev->total, t); - drop = calc_drop(&rec->total, &prev->total, t); - info = calc_info(&rec->total, &prev->total, t); - if (info > 0) - i_str = "sched-sum"; - printf(fmt2, "cpumap-kthread", "total", pps, drop, info, i_str); - } - - /* devmap ndo_xdp_xmit stats */ - { - char *fmt1 = "%-15s %-7d %'-12.0f %'-12.0f %'-10.2f %s %s\n"; - char *fmt2 = "%-15s %-7s %'-12.0f %'-12.0f %'-10.2f %s %s\n"; - struct record *rec, *prev; - double drop, info, err; - char *i_str = ""; - char *err_str = ""; - - rec = &stats_rec->xdp_devmap_xmit; - prev = &stats_prev->xdp_devmap_xmit; - t = calc_period(rec, prev); - for (i = 0; i < nr_cpus; i++) { - struct datarec *r = &rec->cpu[i]; - struct datarec *p = &prev->cpu[i]; - - pps = calc_pps(r, p, t); - drop = calc_drop(r, p, t); - info = calc_info(r, p, t); - err = calc_err(r, p, t); - if (info > 0) { - i_str = "bulk-average"; - info = (pps+drop) / info; /* calc avg bulk */ - } - if (err > 0) - err_str = "drv-err"; - if (pps > 0 || drop > 0) - printf(fmt1, "devmap-xmit", - i, pps, drop, info, i_str, err_str); - } - pps = calc_pps(&rec->total, &prev->total, t); - drop = calc_drop(&rec->total, &prev->total, t); - info = calc_info(&rec->total, &prev->total, t); - err = calc_err(&rec->total, &prev->total, t); - if (info > 0) { - i_str = "bulk-average"; - info = (pps+drop) / info; /* calc avg bulk */ - } - if (err > 0) - err_str = "drv-err"; - printf(fmt2, "devmap-xmit", "total", pps, drop, - info, i_str, err_str); - } - - printf("\n"); -} - -static bool stats_collect(struct stats_record *rec) -{ - int fd; - int i; - - /* TODO: Detect if someone unloaded the perf event_fd's, as - * this can happen by someone running perf-record -e - */ - - fd = bpf_map__fd(map_data[REDIRECT_ERR_CNT]); - for (i = 0; i < REDIR_RES_MAX; i++) - map_collect_record_u64(fd, i, &rec->xdp_redirect[i]); - - fd = bpf_map__fd(map_data[EXCEPTION_CNT]); - for (i = 0; i < XDP_ACTION_MAX; i++) { - map_collect_record_u64(fd, i, &rec->xdp_exception[i]); - } - - fd = bpf_map__fd(map_data[CPUMAP_ENQUEUE_CNT]); - for (i = 0; i < MAX_CPUS; i++) - map_collect_record(fd, i, &rec->xdp_cpumap_enqueue[i]); - - fd = bpf_map__fd(map_data[CPUMAP_KTHREAD_CNT]); - map_collect_record(fd, 0, &rec->xdp_cpumap_kthread); - - fd = bpf_map__fd(map_data[DEVMAP_XMIT_CNT]); - map_collect_record(fd, 0, &rec->xdp_devmap_xmit); - - return true; -} - -static void *alloc_rec_per_cpu(int record_size) -{ - unsigned int nr_cpus = bpf_num_possible_cpus(); - void *array; - - array = calloc(nr_cpus, record_size); - if (!array) { - fprintf(stderr, "Mem alloc error (nr_cpus:%u)\n", nr_cpus); - exit(EXIT_FAIL_MEM); - } - return array; -} - -static struct stats_record *alloc_stats_record(void) -{ - struct stats_record *rec; - int rec_sz; - int i; - - /* Alloc main stats_record structure */ - rec = calloc(1, sizeof(*rec)); - if (!rec) { - fprintf(stderr, "Mem alloc error\n"); - exit(EXIT_FAIL_MEM); - } - - /* Alloc stats stored per CPU for each record */ - rec_sz = sizeof(struct u64rec); - for (i = 0; i < REDIR_RES_MAX; i++) - rec->xdp_redirect[i].cpu = alloc_rec_per_cpu(rec_sz); - - for (i = 0; i < XDP_ACTION_MAX; i++) - rec->xdp_exception[i].cpu = alloc_rec_per_cpu(rec_sz); - - rec_sz = sizeof(struct datarec); - rec->xdp_cpumap_kthread.cpu = alloc_rec_per_cpu(rec_sz); - rec->xdp_devmap_xmit.cpu = alloc_rec_per_cpu(rec_sz); - - for (i = 0; i < MAX_CPUS; i++) - rec->xdp_cpumap_enqueue[i].cpu = alloc_rec_per_cpu(rec_sz); - - return rec; -} - -static void free_stats_record(struct stats_record *r) -{ - int i; - - for (i = 0; i < REDIR_RES_MAX; i++) - free(r->xdp_redirect[i].cpu); - - for (i = 0; i < XDP_ACTION_MAX; i++) - free(r->xdp_exception[i].cpu); - - free(r->xdp_cpumap_kthread.cpu); - free(r->xdp_devmap_xmit.cpu); - - for (i = 0; i < MAX_CPUS; i++) - free(r->xdp_cpumap_enqueue[i].cpu); - - free(r); -} - -/* Pointer swap trick */ -static inline void swap(struct stats_record **a, struct stats_record **b) -{ - struct stats_record *tmp; - - tmp = *a; - *a = *b; - *b = tmp; -} - -static void stats_poll(int interval, bool err_only) -{ - struct stats_record *rec, *prev; - - rec = alloc_stats_record(); - prev = alloc_stats_record(); - stats_collect(rec); - - if (err_only) - printf("\n%s\n", __doc_err_only__); - - /* Trick to pretty printf with thousands separators use %' */ - setlocale(LC_NUMERIC, "en_US"); - - /* Header */ - if (verbose) - printf("\n%s", __doc__); - - /* TODO Need more advanced stats on error types */ - if (verbose) { - printf(" - Stats map0: %s\n", bpf_map__name(map_data[0])); - printf(" - Stats map1: %s\n", bpf_map__name(map_data[1])); - printf("\n"); - } - fflush(stdout); - - while (1) { - swap(&prev, &rec); - stats_collect(rec); - stats_print(rec, prev, err_only); - fflush(stdout); - sleep(interval); - } - - free_stats_record(rec); - free_stats_record(prev); -} - -static void print_bpf_prog_info(void) -{ - struct bpf_program *prog; - struct bpf_map *map; - int i = 0; - - /* Prog info */ - printf("Loaded BPF prog have %d bpf program(s)\n", tp_cnt); - bpf_object__for_each_program(prog, obj) { - printf(" - prog_fd[%d] = fd(%d)\n", i, bpf_program__fd(prog)); - i++; - } - - i = 0; - /* Maps info */ - printf("Loaded BPF prog have %d map(s)\n", map_cnt); - bpf_object__for_each_map(map, obj) { - const char *name = bpf_map__name(map); - int fd = bpf_map__fd(map); - - printf(" - map_data[%d] = fd(%d) name:%s\n", i, fd, name); - i++; - } - - /* Event info */ - printf("Searching for (max:%d) event file descriptor(s)\n", tp_cnt); - for (i = 0; i < tp_cnt; i++) { - int fd = bpf_link__fd(tp_links[i]); - - if (fd != -1) - printf(" - event_fd[%d] = fd(%d)\n", i, fd); - } -} - int main(int argc, char **argv) { - struct bpf_program *prog; - int longindex = 0, opt; - int ret = EXIT_FAILURE; - enum map_type type; - char filename[256]; - - /* Default settings: */ + unsigned long interval = 2; + int ret = EXIT_FAIL_OPTION; + struct xdp_monitor *skel; bool errors_only = true; - int interval = 2; + int longindex = 0, opt; /* Parse commands line args */ - while ((opt = getopt_long(argc, argv, "hDSs:", + while ((opt = getopt_long(argc, argv, "si:vh", long_options, &longindex)) != -1) { switch (opt) { - case 'D': - debug = true; - break; - case 'S': + case 's': errors_only = false; + mask |= SAMPLE_REDIRECT_CNT; break; - case 's': - interval = atoi(optarg); + case 'i': + interval = strtoul(optarg, NULL, 0); + break; + case 'v': + sample_switch_mode(); break; case 'h': default: @@ -717,71 +97,33 @@ int main(int argc, char **argv) } } - snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]); - - /* Remove tracepoint program when program is interrupted or killed */ - signal(SIGINT, int_exit); - signal(SIGTERM, int_exit); - - obj = bpf_object__open_file(filename, NULL); - if (libbpf_get_error(obj)) { - printf("ERROR: opening BPF object file failed\n"); - obj = NULL; - goto cleanup; - } - - /* load BPF program */ - if (bpf_object__load(obj)) { - printf("ERROR: loading BPF object file failed\n"); - goto cleanup; - } - - for (type = 0; type < NUM_MAP; type++) { - map_data[type] = - bpf_object__find_map_by_name(obj, map_type_strings[type]); - - if (libbpf_get_error(map_data[type])) { - printf("ERROR: finding a map in obj file failed\n"); - goto cleanup; - } - map_cnt++; - } - - bpf_object__for_each_program(prog, obj) { - tp_links[tp_cnt] = bpf_program__attach(prog); - if (libbpf_get_error(tp_links[tp_cnt])) { - printf("ERROR: bpf_program__attach failed\n"); - tp_links[tp_cnt] = NULL; - goto cleanup; - } - tp_cnt++; + skel = xdp_monitor__open_and_load(); + if (!skel) { + fprintf(stderr, "Failed to xdp_monitor__open_and_load: %s\n", + strerror(errno)); + ret = EXIT_FAIL_BPF; + goto end; } - if (debug) { - print_bpf_prog_info(); + ret = sample_init(skel, mask); + if (ret < 0) { + fprintf(stderr, "Failed to initialize sample: %s\n", strerror(-ret)); + ret = EXIT_FAIL_BPF; + goto end_destroy; } - /* Unload/stop tracepoint event by closing bpf_link's */ - if (errors_only) { - /* The bpf_link[i] depend on the order of - * the functions was defined in _kern.c - */ - bpf_link__destroy(tp_links[2]); /* tracepoint/xdp/xdp_redirect */ - tp_links[2] = NULL; + if (errors_only) + printf("%s", __doc_err_only__); - bpf_link__destroy(tp_links[3]); /* tracepoint/xdp/xdp_redirect_map */ - tp_links[3] = NULL; + ret = sample_run(interval, NULL, NULL); + if (ret < 0) { + fprintf(stderr, "Failed during sample run: %s\n", strerror(-ret)); + ret = EXIT_FAIL; + goto end_destroy; } - - stats_poll(interval, errors_only); - - ret = EXIT_SUCCESS; - -cleanup: - /* Detach tracepoints */ - while (tp_cnt) - bpf_link__destroy(tp_links[--tp_cnt]); - - bpf_object__close(obj); - return ret; + ret = EXIT_OK; +end_destroy: + xdp_monitor__destroy(skel); +end: + sample_exit(ret); } From patchwork Wed Jul 21 21:28:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kumar Kartikeya Dwivedi X-Patchwork-Id: 12392247 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A16F9C6377A for ; Wed, 21 Jul 2021 21:29:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8793E6120C for ; Wed, 21 Jul 2021 21:29:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229915AbhGUUs2 (ORCPT ); Wed, 21 Jul 2021 16:48:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40504 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229916AbhGUUsV (ORCPT ); Wed, 21 Jul 2021 16:48:21 -0400 Received: from mail-pl1-x644.google.com (mail-pl1-x644.google.com [IPv6:2607:f8b0:4864:20::644]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 30300C061799; Wed, 21 Jul 2021 14:28:56 -0700 (PDT) Received: by mail-pl1-x644.google.com with SMTP id u3so2105307plf.5; Wed, 21 Jul 2021 14:28:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=7L4NhoGJlhjlx/ujnoqd5H+g7981hLVzfyi89WlvdpE=; b=m83w0nvu9rNI5MW9wIKL3fhi4Ipqq80t4yWv1oJDTIg0sJwtmi+H1F+jCJ9hgS2L9r NP3Hm8HcQHfH6HVnbj2eXA3Fym7ARjuVyJTHwsNW6l+iN9h4Uf+DEJ/f94uybuEWltVE xXvdBw1ENNa+jHTSLLyns9ml/w97SkpGHx+SXbeExU7Pkim/9lm/wyRQCF9irADzAtQB vU5FuSLiLSElAqSRrUFrDuaB892QRxqt8/HjogJnR4qhTQYFcaTMsIOq28hiio8NaFsB TBZEKI3Qk91ro1aeXLotdf2xtjYDJ6+/CG5N3k1Q9NPeiyBA3d7qlmTQCf5X3s+CIBoF 57/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=7L4NhoGJlhjlx/ujnoqd5H+g7981hLVzfyi89WlvdpE=; b=EeABIt6n+q23fxfCrwfUxJlXtSlWrmEewhCJc4K013QvaRxOWztxzWNd4MPl4yI6Tl L7QX6u0d25Z8IGSIN/8FzSYNfXiOOn7x6msiauqH/FzaF/O8Vr+75KRSXYzXyhfc6JD4 TJORAcZ0DsWr+VX0sQCbwR24kou+cNA0Vf+wsDQl5AMmv4DLHwQZ871RTbW9/8oENAoh Wpuvqqfg3u0fU+aUJZP21nAIZAsmCn9IAVOk3PLsSAr/0aMN44tr7auh+sK1uhVdoKbM 7bdUKdLOIrLwaI+m6QYxnjJ9KoaVayzDi3SIceen7F9YUt3eKISZrozI1G6OfLPCGsqc eDVg== X-Gm-Message-State: AOAM5310BRxyQk2ioYk8Zy4H1Rq9jY4lCcWo7kDd8wuDDdlDb30/N9Rp uCJN4xypd/oJj+CWnRXQ86hNDvwmoJkmcA== X-Google-Smtp-Source: ABdhPJyEFdN1u852sG7BzcWet4LR9z4wW958JL24yorC1P5pHQcVm6AtzCcQsaoph6+v7MNmHsbFfg== X-Received: by 2002:a63:5643:: with SMTP id g3mr38081992pgm.263.1626902935486; Wed, 21 Jul 2021 14:28:55 -0700 (PDT) Received: from localhost ([2405:201:6014:d0bb:dc30:f309:2f53:5818]) by smtp.gmail.com with ESMTPSA id v23sm794860pje.33.2021.07.21.14.28.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Jul 2021 14:28:55 -0700 (PDT) From: Kumar Kartikeya Dwivedi To: bpf@vger.kernel.org Cc: Kumar Kartikeya Dwivedi , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Jesper Dangaard Brouer , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , netdev@vger.kernel.org Subject: [PATCH bpf-next v2 5/8] samples: bpf: Convert xdp_redirect to use XDP samples helper Date: Thu, 22 Jul 2021 02:58:30 +0530 Message-Id: <20210721212833.701342-6-memxor@gmail.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210721212833.701342-1-memxor@gmail.com> References: <20210721212833.701342-1-memxor@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net This change converts XDP redirect tool to use the XDP samples support introduced in previous changes. Signed-off-by: Kumar Kartikeya Dwivedi --- samples/bpf/Makefile | 10 +- samples/bpf/xdp_redirect.bpf.c | 52 +++++++ samples/bpf/xdp_redirect_kern.c | 90 ----------- samples/bpf/xdp_redirect_user.c | 262 +++++++++++++------------------- 4 files changed, 160 insertions(+), 254 deletions(-) create mode 100644 samples/bpf/xdp_redirect.bpf.c delete mode 100644 samples/bpf/xdp_redirect_kern.c diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile index 1b4838b2beb0..b94b6dac09ff 100644 --- a/samples/bpf/Makefile +++ b/samples/bpf/Makefile @@ -39,7 +39,6 @@ tprogs-y += lwt_len_hist tprogs-y += xdp_tx_iptunnel tprogs-y += test_map_in_map tprogs-y += per_socket_stats_example -tprogs-y += xdp_redirect tprogs-y += xdp_redirect_map tprogs-y += xdp_redirect_map_multi tprogs-y += xdp_redirect_cpu @@ -56,6 +55,7 @@ tprogs-y += xdp_sample_pkts tprogs-y += ibumad tprogs-y += hbm +tprogs-y += xdp_redirect tprogs-y += xdp_monitor # Libbpf dependencies @@ -100,7 +100,6 @@ lwt_len_hist-objs := lwt_len_hist_user.o xdp_tx_iptunnel-objs := xdp_tx_iptunnel_user.o test_map_in_map-objs := test_map_in_map_user.o per_socket_stats_example-objs := cookie_uid_helper_example.o -xdp_redirect-objs := xdp_redirect_user.o xdp_redirect_map-objs := xdp_redirect_map_user.o xdp_redirect_map_multi-objs := xdp_redirect_map_multi_user.o xdp_redirect_cpu-objs := xdp_redirect_cpu_user.o @@ -118,6 +117,7 @@ ibumad-objs := ibumad_user.o hbm-objs := hbm.o $(CGROUP_HELPERS) xdp_sample_user-objs := xdp_sample_user.o $(LIBBPFDIR)/hashmap.o +xdp_redirect-objs := xdp_redirect_user.o $(XDP_SAMPLE) xdp_monitor-objs := xdp_monitor_user.o $(XDP_SAMPLE) # Tell kbuild to always build the programs @@ -164,7 +164,6 @@ always-y += tcp_clamp_kern.o always-y += tcp_basertt_kern.o always-y += tcp_tos_reflect_kern.o always-y += tcp_dumpstats_kern.o -always-y += xdp_redirect_kern.o always-y += xdp_redirect_map_kern.o always-y += xdp_redirect_map_multi_kern.o always-y += xdp_redirect_cpu_kern.o @@ -315,6 +314,7 @@ verify_target_bpf: verify_cmds $(BPF_SAMPLES_PATH)/*.c: verify_target_bpf $(LIBBPF) $(src)/*.c: verify_target_bpf $(LIBBPF) +$(obj)/xdp_redirect_user.o: $(obj)/xdp_redirect.skel.h $(obj)/xdp_monitor_user.o: $(obj)/xdp_monitor.skel.h $(obj)/tracex5_kern.o: $(obj)/syscall_nrs.h @@ -358,6 +358,7 @@ endef CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG)) +$(obj)/xdp_redirect.bpf.o: $(obj)/xdp_sample.bpf.o $(obj)/xdp_monitor.bpf.o: $(obj)/xdp_sample.bpf.o $(obj)/%.bpf.o: $(src)/%.bpf.c $(obj)/vmlinux.h $(src)/xdp_sample.bpf.h $(src)/xdp_sample_shared.h @@ -368,9 +369,10 @@ $(obj)/%.bpf.o: $(src)/%.bpf.c $(obj)/vmlinux.h $(src)/xdp_sample.bpf.h $(src)/x -I$(srctree)/tools/lib $(CLANG_SYS_INCLUDES) \ -c $(filter %.bpf.c,$^) -o $@ -LINKED_SKELS := xdp_monitor.skel.h +LINKED_SKELS := xdp_redirect.skel.h xdp_monitor.skel.h clean-files += $(LINKED_SKELS) +xdp_redirect.skel.h-deps := xdp_redirect.bpf.o xdp_sample.bpf.o xdp_monitor.skel.h-deps := xdp_monitor.bpf.o xdp_sample.bpf.o LINKED_BPF_SRCS := $(patsubst %.bpf.o,%.bpf.c,$(foreach skel,$(LINKED_SKELS),$($(skel)-deps))) diff --git a/samples/bpf/xdp_redirect.bpf.c b/samples/bpf/xdp_redirect.bpf.c new file mode 100644 index 000000000000..f5098812db36 --- /dev/null +++ b/samples/bpf/xdp_redirect.bpf.c @@ -0,0 +1,52 @@ +/* Copyright (c) 2016 John Fastabend + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of version 2 of the GNU General Public + * License as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + */ +#define KBUILD_MODNAME "foo" + +#include "vmlinux.h" +#include "xdp_sample.bpf.h" +#include "xdp_sample_shared.h" + +const volatile int ifindex_out; + +SEC("xdp_redirect") +int xdp_redirect_prog(struct xdp_md *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + u32 key = bpf_get_smp_processor_id(); + struct ethhdr *eth = data; + struct datarec *rec; + u64 nh_off; + + nh_off = sizeof(*eth); + if (data + nh_off > data_end) + return XDP_DROP; + + if (key < MAX_CPUS) { + rec = &sample_data.rx_cnt[key]; + ATOMIC_INC_NORMW(rec->processed); + + swap_src_dst_mac(data); + return bpf_redirect(ifindex_out, 0); + } + + return XDP_PASS; +} + +/* Redirect require an XDP bpf_prog loaded on the TX device */ +SEC("xdp_redirect_dummy") +int xdp_redirect_dummy_prog(struct xdp_md *ctx) +{ + return XDP_PASS; +} + +char _license[] SEC("license") = "GPL"; diff --git a/samples/bpf/xdp_redirect_kern.c b/samples/bpf/xdp_redirect_kern.c deleted file mode 100644 index d26ec3aa215e..000000000000 --- a/samples/bpf/xdp_redirect_kern.c +++ /dev/null @@ -1,90 +0,0 @@ -/* Copyright (c) 2016 John Fastabend - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of version 2 of the GNU General Public - * License as published by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. - */ -#define KBUILD_MODNAME "foo" -#include -#include -#include -#include -#include -#include -#include -#include - -struct { - __uint(type, BPF_MAP_TYPE_ARRAY); - __type(key, int); - __type(value, int); - __uint(max_entries, 1); -} tx_port SEC(".maps"); - -/* Count RX packets, as XDP bpf_prog doesn't get direct TX-success - * feedback. Redirect TX errors can be caught via a tracepoint. - */ -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, long); - __uint(max_entries, 1); -} rxcnt SEC(".maps"); - -static void swap_src_dst_mac(void *data) -{ - unsigned short *p = data; - unsigned short dst[3]; - - dst[0] = p[0]; - dst[1] = p[1]; - dst[2] = p[2]; - p[0] = p[3]; - p[1] = p[4]; - p[2] = p[5]; - p[3] = dst[0]; - p[4] = dst[1]; - p[5] = dst[2]; -} - -SEC("xdp_redirect") -int xdp_redirect_prog(struct xdp_md *ctx) -{ - void *data_end = (void *)(long)ctx->data_end; - void *data = (void *)(long)ctx->data; - struct ethhdr *eth = data; - int rc = XDP_DROP; - int *ifindex, port = 0; - long *value; - u32 key = 0; - u64 nh_off; - - nh_off = sizeof(*eth); - if (data + nh_off > data_end) - return rc; - - ifindex = bpf_map_lookup_elem(&tx_port, &port); - if (!ifindex) - return rc; - - value = bpf_map_lookup_elem(&rxcnt, &key); - if (value) - *value += 1; - - swap_src_dst_mac(data); - return bpf_redirect(*ifindex, 0); -} - -/* Redirect require an XDP bpf_prog loaded on the TX device */ -SEC("xdp_redirect_dummy") -int xdp_redirect_dummy_prog(struct xdp_md *ctx) -{ - return XDP_PASS; -} - -char _license[] SEC("license") = "GPL"; diff --git a/samples/bpf/xdp_redirect_user.c b/samples/bpf/xdp_redirect_user.c index 93854e135134..192fd620573f 100644 --- a/samples/bpf/xdp_redirect_user.c +++ b/samples/bpf/xdp_redirect_user.c @@ -13,126 +13,91 @@ #include #include #include +#include #include - -#include "bpf_util.h" #include #include +#include "bpf_util.h" +#include "xdp_sample_user.h" +#include "xdp_redirect.skel.h" -static int ifindex_in; -static int ifindex_out; -static bool ifindex_out_xdp_dummy_attached = true; -static __u32 prog_id; -static __u32 dummy_prog_id; - -static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; -static int rxcnt_map_fd; +static int mask = SAMPLE_RX_CNT | SAMPLE_REDIRECT_ERR_CNT | + SAMPLE_EXCEPTION_CNT; -static void int_exit(int sig) -{ - __u32 curr_prog_id = 0; +DEFINE_SAMPLE_INIT(xdp_redirect); - if (bpf_get_link_xdp_id(ifindex_in, &curr_prog_id, xdp_flags)) { - printf("bpf_get_link_xdp_id failed\n"); - exit(1); - } - if (prog_id == curr_prog_id) - bpf_set_link_xdp_fd(ifindex_in, -1, xdp_flags); - else if (!curr_prog_id) - printf("couldn't find a prog id on iface IN\n"); - else - printf("program on iface IN changed, not removing\n"); - - if (ifindex_out_xdp_dummy_attached) { - curr_prog_id = 0; - if (bpf_get_link_xdp_id(ifindex_out, &curr_prog_id, - xdp_flags)) { - printf("bpf_get_link_xdp_id failed\n"); - exit(1); - } - if (dummy_prog_id == curr_prog_id) - bpf_set_link_xdp_fd(ifindex_out, -1, xdp_flags); - else if (!curr_prog_id) - printf("couldn't find a prog id on iface OUT\n"); - else - printf("program on iface OUT changed, not removing\n"); - } - exit(0); -} +static const struct option long_options[] = { + {"help", no_argument, NULL, 'h' }, + {"skb-mode", no_argument, NULL, 'S' }, + {"force", no_argument, NULL, 'F' }, + {"stats", no_argument, NULL, 's' }, + {"interval", required_argument, NULL, 'i' }, + {"verbose", no_argument, NULL, 'v' }, + {} +}; -static void poll_stats(int interval, int ifindex) +static void usage(char *argv[]) { - unsigned int nr_cpus = bpf_num_possible_cpus(); - __u64 values[nr_cpus], prev[nr_cpus]; - - memset(prev, 0, sizeof(prev)); - - while (1) { - __u64 sum = 0; - __u32 key = 0; - int i; - - sleep(interval); - assert(bpf_map_lookup_elem(rxcnt_map_fd, &key, values) == 0); - for (i = 0; i < nr_cpus; i++) - sum += (values[i] - prev[i]); - if (sum) - printf("ifindex %i: %10llu pkt/s\n", - ifindex, sum / interval); - memcpy(prev, values, sizeof(values)); + int i; + + sample_print_help(mask); + + printf("\n"); + printf(" Usage: %s (options-see-below)\n", + argv[0]); + printf(" Listing options:\n"); + for (i = 0; long_options[i].name != 0; i++) { + printf(" --%-15s", long_options[i].name); + if (long_options[i].flag != NULL) + printf(" flag (internal value:%d)", + *long_options[i].flag); + else + printf("short-option: -%c", + long_options[i].val); + printf("\n"); } + printf("\n"); } -static void usage(const char *prog) -{ - fprintf(stderr, - "usage: %s [OPTS] _IN _OUT\n\n" - "OPTS:\n" - " -S use skb-mode\n" - " -N enforce native mode\n" - " -F force loading prog\n", - prog); -} - - int main(int argc, char **argv) { - struct bpf_prog_load_attr prog_load_attr = { - .prog_type = BPF_PROG_TYPE_XDP, - }; - struct bpf_program *prog, *dummy_prog; - int prog_fd, tx_port_map_fd, opt; - struct bpf_prog_info info = {}; - __u32 info_len = sizeof(info); - const char *optstr = "FSN"; - struct bpf_object *obj; - char filename[256]; - int dummy_prog_fd; - int ret, key = 0; - - while ((opt = getopt(argc, argv, optstr)) != -1) { + int ifindex_in, ifindex_out, opt; + char str[2 * IF_NAMESIZE + 1]; + char ifname_out[IF_NAMESIZE]; + char ifname_in[IF_NAMESIZE]; + int ret = EXIT_FAIL_OPTION; + unsigned long interval = 2; + struct xdp_redirect *skel; + bool generic = false; + bool force = false; + + while ((opt = getopt_long(argc, argv, "SFi:vs", + long_options, NULL)) != -1) { switch (opt) { case 'S': - xdp_flags |= XDP_FLAGS_SKB_MODE; - break; - case 'N': - /* default, set below */ + generic = true; break; case 'F': - xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST; + force = true; + break; + case 'i': + interval = strtoul(optarg, NULL, 0); + break; + case 'v': + sample_switch_mode(); + break; + case 's': + mask |= SAMPLE_REDIRECT_CNT; break; default: - usage(basename(argv[0])); - return 1; + usage(argv); + return ret; } } - if (!(xdp_flags & XDP_FLAGS_SKB_MODE)) - xdp_flags |= XDP_FLAGS_DRV_MODE; - - if (optind + 2 != argc) { - printf("usage: %s _IN _OUT\n", argv[0]); - return 1; + if (argc <= optind + 1) { + usage(argv); + return ret; } ifindex_in = if_nametoindex(argv[optind]); @@ -143,75 +108,52 @@ int main(int argc, char **argv) if (!ifindex_out) ifindex_out = strtoul(argv[optind + 1], NULL, 0); - printf("input: %d output: %d\n", ifindex_in, ifindex_out); - - snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]); - prog_load_attr.file = filename; - - if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd)) - return 1; - - prog = bpf_program__next(NULL, obj); - dummy_prog = bpf_program__next(prog, obj); - if (!prog || !dummy_prog) { - printf("finding a prog in obj file failed\n"); - return 1; - } - /* bpf_prog_load_xattr gives us the pointer to first prog's fd, - * so we're missing only the fd for dummy prog - */ - dummy_prog_fd = bpf_program__fd(dummy_prog); - if (prog_fd < 0 || dummy_prog_fd < 0) { - printf("bpf_prog_load_xattr: %s\n", strerror(errno)); - return 1; - } - - tx_port_map_fd = bpf_object__find_map_fd_by_name(obj, "tx_port"); - rxcnt_map_fd = bpf_object__find_map_fd_by_name(obj, "rxcnt"); - if (tx_port_map_fd < 0 || rxcnt_map_fd < 0) { - printf("bpf_object__find_map_fd_by_name failed\n"); - return 1; + skel = xdp_redirect__open(); + if (!skel) { + fprintf(stderr, "Failed to xdp_redirect__open: %s\n", strerror(errno)); + ret = EXIT_FAIL_BPF; + goto end; } - if (bpf_set_link_xdp_fd(ifindex_in, prog_fd, xdp_flags) < 0) { - printf("ERROR: link set xdp fd failed on %d\n", ifindex_in); - return 1; - } + skel->rodata->ifindex_out = ifindex_out; - ret = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); - if (ret) { - printf("can't get prog info - %s\n", strerror(errno)); - return ret; + ret = xdp_redirect__load(skel); + if (ret < 0) { + fprintf(stderr, "Failed to xdp_redirect__load: %s\n", strerror(errno)); + ret = EXIT_FAIL_BPF; + goto end_destroy; } - prog_id = info.id; - /* Loading dummy XDP prog on out-device */ - if (bpf_set_link_xdp_fd(ifindex_out, dummy_prog_fd, - (xdp_flags | XDP_FLAGS_UPDATE_IF_NOEXIST)) < 0) { - printf("WARN: link set xdp fd failed on %d\n", ifindex_out); - ifindex_out_xdp_dummy_attached = false; + ret = sample_init(skel, mask); + if (ret < 0) { + fprintf(stderr, "Failed to initialize sample: %s\n", strerror(-ret)); + ret = EXIT_FAIL; + goto end_destroy; } - memset(&info, 0, sizeof(info)); - ret = bpf_obj_get_info_by_fd(dummy_prog_fd, &info, &info_len); - if (ret) { - printf("can't get prog info - %s\n", strerror(errno)); - return ret; - } - dummy_prog_id = info.id; + ret = EXIT_FAIL_XDP; + if (sample_install_xdp(skel->progs.xdp_redirect_prog, ifindex_in, + generic, force) < 0) + goto end_destroy; - signal(SIGINT, int_exit); - signal(SIGTERM, int_exit); - - /* bpf redirect port */ - ret = bpf_map_update_elem(tx_port_map_fd, &key, &ifindex_out, 0); - if (ret) { - perror("bpf_update_elem"); - goto out; + /* Loading dummy XDP prog on out-device */ + sample_install_xdp(skel->progs.xdp_redirect_dummy_prog, ifindex_out, + generic, force); + + safe_strncpy(str, get_driver_name(ifindex_in), sizeof(str)); + printf("Redirecting from %s (ifindex %d; driver %s) to %s (ifindex %d; driver %s)\n", + ifname_in, ifindex_in, str, ifname_out, ifindex_out, get_driver_name(ifindex_out)); + snprintf(str, sizeof(str), "%s->%s", ifname_in, ifname_out); + + ret = sample_run(interval, NULL, NULL); + if (ret < 0) { + fprintf(stderr, "Failed during sample run: %s\n", strerror(-ret)); + ret = EXIT_FAIL; + goto end_destroy; } - - poll_stats(2, ifindex_out); - -out: - return ret; + ret = EXIT_OK; +end_destroy: + xdp_redirect__destroy(skel); +end: + sample_exit(ret); } From patchwork Wed Jul 21 21:28:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kumar Kartikeya Dwivedi X-Patchwork-Id: 12392251 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FE0DC6377E for ; Wed, 21 Jul 2021 21:29:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1928E61208 for ; Wed, 21 Jul 2021 21:29:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229830AbhGUUs2 (ORCPT ); Wed, 21 Jul 2021 16:48:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40518 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229900AbhGUUsY (ORCPT ); Wed, 21 Jul 2021 16:48:24 -0400 Received: from mail-pj1-x1044.google.com (mail-pj1-x1044.google.com [IPv6:2607:f8b0:4864:20::1044]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 35E27C061575; Wed, 21 Jul 2021 14:28:59 -0700 (PDT) Received: by mail-pj1-x1044.google.com with SMTP id pf12-20020a17090b1d8cb0290175c085e7a5so1235265pjb.0; Wed, 21 Jul 2021 14:28:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=8ga1wzbkQAHVBAo9qcp53OvXs8Z8kN38u7DL/9uNS0U=; b=jGQv0WBMpAXAcSSDuQ15g65xEqErLckqVJIK0hdKLB5qbWuF5DvbrxLVk8n7SpGzmn leyrq7h20CA72d9ZwW2842tyBRXGrOBBskq+E5nDC9PtACpd2EtTeTiN+pkFPLSLDI1r eBWSedmZi85m1Y479101jFxeiQz4Y2TYcU0RAJEJuGG7I3kzg2cb61G5JRjr7ALRxzbt YqWwcbooCgQ48VmekaMN2S94s+hBuvpUcptEv20S5b5hgmk3Y7ddQ5PqOx/cV2msOQ8l zofuylrVAnUL7KS/q02Ux4Spk45YTavHUVwDAUUjNqLvfeT7A/dPY2r510C9RPThjC9B L7Rw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8ga1wzbkQAHVBAo9qcp53OvXs8Z8kN38u7DL/9uNS0U=; b=sUALVRsTMd/mhKMS4lHMeTjLtA98ta6wbb3rgIY/suu3zd/NxEUe5GdJ8RYf7LadTW GKFqyw14xF2x1NVIt2AhrLEloON44xSVdhiQXSXuUTVDqiKNC51GbH146AENTti9+t3Y i+dRvtOFpcqMxHqrp7BjbfpvmcEo6ZVaXTGC+16ru4pi9j6j4ovRYc05wcpU6XZIb56Q 8NknH5uQCpsp1X2CEXCmJZmzm/BU1IEcb/OaApncb2z3qZPgswT6ZqqCyTAcO9u3LlZx MiMxRHHofS4esdavGdFEb+TH5g1cxzxthkvZFD6fsEBXq1sfmgdifWgU7/zQMGbMZaO0 QoJw== X-Gm-Message-State: AOAM532iuVX1j1adzjJhocNyrw/6Zldz2lbtn9J0P7o82w9vPHp8VHb/ 5WCxpUt9+u5o3Q++YwFRT3SCfbAE1BAcSA== X-Google-Smtp-Source: ABdhPJyRwSCnrpNTFv6T6E1GZlrGH5ibTRVFW7TF62bg9z1lIAuDOH02NeQobaX4L6ykArzELGMFUg== X-Received: by 2002:aa7:8d10:0:b029:303:8d17:7b8d with SMTP id j16-20020aa78d100000b02903038d177b8dmr37952764pfe.26.1626902938471; Wed, 21 Jul 2021 14:28:58 -0700 (PDT) Received: from localhost ([2405:201:6014:d0bb:dc30:f309:2f53:5818]) by smtp.gmail.com with ESMTPSA id 37sm13741645pgt.28.2021.07.21.14.28.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Jul 2021 14:28:58 -0700 (PDT) From: Kumar Kartikeya Dwivedi To: bpf@vger.kernel.org Cc: Kumar Kartikeya Dwivedi , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Jesper Dangaard Brouer , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , netdev@vger.kernel.org Subject: [PATCH bpf-next v2 6/8] samples: bpf: Convert xdp_redirect_map to use XDP samples helpers Date: Thu, 22 Jul 2021 02:58:31 +0530 Message-Id: <20210721212833.701342-7-memxor@gmail.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210721212833.701342-1-memxor@gmail.com> References: <20210721212833.701342-1-memxor@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net This change converts XDP redirect_map tool to use the XDP samples support introduced in previous changes. Signed-off-by: Kumar Kartikeya Dwivedi --- samples/bpf/Makefile | 10 +- ...rect_map_kern.c => xdp_redirect_map.bpf.c} | 75 +--- samples/bpf/xdp_redirect_map_user.c | 385 +++++++----------- 3 files changed, 181 insertions(+), 289 deletions(-) rename samples/bpf/{xdp_redirect_map_kern.c => xdp_redirect_map.bpf.c} (62%) diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile index b94b6dac09ff..2cccb2aa8f2b 100644 --- a/samples/bpf/Makefile +++ b/samples/bpf/Makefile @@ -39,7 +39,6 @@ tprogs-y += lwt_len_hist tprogs-y += xdp_tx_iptunnel tprogs-y += test_map_in_map tprogs-y += per_socket_stats_example -tprogs-y += xdp_redirect_map tprogs-y += xdp_redirect_map_multi tprogs-y += xdp_redirect_cpu tprogs-y += xdp_rxq_info @@ -55,6 +54,7 @@ tprogs-y += xdp_sample_pkts tprogs-y += ibumad tprogs-y += hbm +tprogs-y += xdp_redirect_map tprogs-y += xdp_redirect tprogs-y += xdp_monitor @@ -100,7 +100,6 @@ lwt_len_hist-objs := lwt_len_hist_user.o xdp_tx_iptunnel-objs := xdp_tx_iptunnel_user.o test_map_in_map-objs := test_map_in_map_user.o per_socket_stats_example-objs := cookie_uid_helper_example.o -xdp_redirect_map-objs := xdp_redirect_map_user.o xdp_redirect_map_multi-objs := xdp_redirect_map_multi_user.o xdp_redirect_cpu-objs := xdp_redirect_cpu_user.o xdp_rxq_info-objs := xdp_rxq_info_user.o @@ -117,6 +116,7 @@ ibumad-objs := ibumad_user.o hbm-objs := hbm.o $(CGROUP_HELPERS) xdp_sample_user-objs := xdp_sample_user.o $(LIBBPFDIR)/hashmap.o +xdp_redirect_map-objs := xdp_redirect_map_user.o $(XDP_SAMPLE) xdp_redirect-objs := xdp_redirect_user.o $(XDP_SAMPLE) xdp_monitor-objs := xdp_monitor_user.o $(XDP_SAMPLE) @@ -164,7 +164,6 @@ always-y += tcp_clamp_kern.o always-y += tcp_basertt_kern.o always-y += tcp_tos_reflect_kern.o always-y += tcp_dumpstats_kern.o -always-y += xdp_redirect_map_kern.o always-y += xdp_redirect_map_multi_kern.o always-y += xdp_redirect_cpu_kern.o always-y += xdp_rxq_info_kern.o @@ -314,6 +313,7 @@ verify_target_bpf: verify_cmds $(BPF_SAMPLES_PATH)/*.c: verify_target_bpf $(LIBBPF) $(src)/*.c: verify_target_bpf $(LIBBPF) +$(obj)/xdp_redirect_map_user.o: $(obj)/xdp_redirect_map.skel.h $(obj)/xdp_redirect_user.o: $(obj)/xdp_redirect.skel.h $(obj)/xdp_monitor_user.o: $(obj)/xdp_monitor.skel.h @@ -358,6 +358,7 @@ endef CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG)) +$(obj)/xdp_redirect_map.bpf.o: $(obj)/xdp_sample.bpf.o $(obj)/xdp_redirect.bpf.o: $(obj)/xdp_sample.bpf.o $(obj)/xdp_monitor.bpf.o: $(obj)/xdp_sample.bpf.o @@ -369,9 +370,10 @@ $(obj)/%.bpf.o: $(src)/%.bpf.c $(obj)/vmlinux.h $(src)/xdp_sample.bpf.h $(src)/x -I$(srctree)/tools/lib $(CLANG_SYS_INCLUDES) \ -c $(filter %.bpf.c,$^) -o $@ -LINKED_SKELS := xdp_redirect.skel.h xdp_monitor.skel.h +LINKED_SKELS := xdp_redirect_map.skel.h xdp_redirect.skel.h xdp_monitor.skel.h clean-files += $(LINKED_SKELS) +xdp_redirect_map.skel.h-deps := xdp_redirect_map.bpf.o xdp_sample.bpf.o xdp_redirect.skel.h-deps := xdp_redirect.bpf.o xdp_sample.bpf.o xdp_monitor.skel.h-deps := xdp_monitor.bpf.o xdp_sample.bpf.o diff --git a/samples/bpf/xdp_redirect_map_kern.c b/samples/bpf/xdp_redirect_map.bpf.c similarity index 62% rename from samples/bpf/xdp_redirect_map_kern.c rename to samples/bpf/xdp_redirect_map.bpf.c index a92b8e567bdd..2e8d807ccd57 100644 --- a/samples/bpf/xdp_redirect_map_kern.c +++ b/samples/bpf/xdp_redirect_map.bpf.c @@ -10,14 +10,10 @@ * General Public License for more details. */ #define KBUILD_MODNAME "foo" -#include -#include -#include -#include -#include -#include -#include -#include + +#include "vmlinux.h" +#include "xdp_sample.bpf.h" +#include "xdp_sample_shared.h" /* The 2nd xdp prog on egress does not support skb mode, so we define two * maps, tx_port_general and tx_port_native. @@ -36,67 +32,38 @@ struct { __uint(max_entries, 100); } tx_port_native SEC(".maps"); -/* Count RX packets, as XDP bpf_prog doesn't get direct TX-success - * feedback. Redirect TX errors can be caught via a tracepoint. - */ -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, long); - __uint(max_entries, 1); -} rxcnt SEC(".maps"); - -/* map to store egress interface mac address */ -struct { - __uint(type, BPF_MAP_TYPE_ARRAY); - __type(key, u32); - __type(value, __be64); - __uint(max_entries, 1); -} tx_mac SEC(".maps"); - -static void swap_src_dst_mac(void *data) -{ - unsigned short *p = data; - unsigned short dst[3]; - - dst[0] = p[0]; - dst[1] = p[1]; - dst[2] = p[2]; - p[0] = p[3]; - p[1] = p[4]; - p[2] = p[5]; - p[3] = dst[0]; - p[4] = dst[1]; - p[5] = dst[2]; -} +/* store egress interface mac address */ +const volatile char tx_mac_addr[ETH_ALEN]; static __always_inline int xdp_redirect_map(struct xdp_md *ctx, void *redirect_map) { void *data_end = (void *)(long)ctx->data_end; void *data = (void *)(long)ctx->data; struct ethhdr *eth = data; - int rc = XDP_DROP; - long *value; + struct datarec *rec; u32 key = 0; u64 nh_off; int vport; nh_off = sizeof(*eth); if (data + nh_off > data_end) - return rc; + return XDP_DROP; /* constant virtual port */ vport = 0; - /* count packet in global counter */ - value = bpf_map_lookup_elem(&rxcnt, &key); - if (value) - *value += 1; + if (key < MAX_CPUS) { + /* count packet in global counter */ + rec = &sample_data.rx_cnt[key]; + ATOMIC_INC_NORMW(rec->processed); - swap_src_dst_mac(data); + swap_src_dst_mac(data); - /* send packet out physical port */ - return bpf_redirect_map(redirect_map, vport, 0); + /* send packet out physical port */ + return bpf_redirect_map(redirect_map, vport, 0); + } + + return XDP_PASS; } SEC("xdp_redirect_general") @@ -117,17 +84,13 @@ int xdp_redirect_map_egress(struct xdp_md *ctx) void *data_end = (void *)(long)ctx->data_end; void *data = (void *)(long)ctx->data; struct ethhdr *eth = data; - __be64 *mac; - u32 key = 0; u64 nh_off; nh_off = sizeof(*eth); if (data + nh_off > data_end) return XDP_DROP; - mac = bpf_map_lookup_elem(&tx_mac, &key); - if (mac) - __builtin_memcpy(eth->h_source, mac, ETH_ALEN); + __builtin_memcpy(eth->h_source, (const char *)tx_mac_addr, ETH_ALEN); return XDP_PASS; } diff --git a/samples/bpf/xdp_redirect_map_user.c b/samples/bpf/xdp_redirect_map_user.c index ad3cdc4c07d3..d305beba902e 100644 --- a/samples/bpf/xdp_redirect_map_user.c +++ b/samples/bpf/xdp_redirect_map_user.c @@ -13,165 +13,102 @@ #include #include #include -#include -#include -#include -#include -#include - -#include "bpf_util.h" +#include #include #include - -static int ifindex_in; -static int ifindex_out; -static bool ifindex_out_xdp_dummy_attached = true; -static bool xdp_devmap_attached; -static __u32 prog_id; -static __u32 dummy_prog_id; - -static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; -static int rxcnt_map_fd; - -static void int_exit(int sig) +#include "bpf_util.h" +#include "xdp_sample_user.h" +#include "xdp_redirect_map.skel.h" + +static int mask = SAMPLE_RX_CNT | SAMPLE_REDIRECT_ERR_MAP_CNT | + SAMPLE_EXCEPTION_CNT | SAMPLE_DEVMAP_XMIT_CNT; + +DEFINE_SAMPLE_INIT(xdp_redirect_map); + +static const struct option long_options[] = { + { "help", no_argument, NULL, 'h' }, + { "skb-mode", no_argument, NULL, 'S' }, + { "force", no_argument, NULL, 'F' }, + { "load-egress", no_argument, NULL, 'X' }, + { "stats", no_argument, NULL, 's' }, + { "interval", required_argument, NULL, 'i' }, + { "verbose", no_argument, NULL, 'v' }, + {} +}; + +static void usage(char *argv[]) { - __u32 curr_prog_id = 0; - - if (bpf_get_link_xdp_id(ifindex_in, &curr_prog_id, xdp_flags)) { - printf("bpf_get_link_xdp_id failed\n"); - exit(1); - } - if (prog_id == curr_prog_id) - bpf_set_link_xdp_fd(ifindex_in, -1, xdp_flags); - else if (!curr_prog_id) - printf("couldn't find a prog id on iface IN\n"); - else - printf("program on iface IN changed, not removing\n"); - - if (ifindex_out_xdp_dummy_attached) { - curr_prog_id = 0; - if (bpf_get_link_xdp_id(ifindex_out, &curr_prog_id, - xdp_flags)) { - printf("bpf_get_link_xdp_id failed\n"); - exit(1); - } - if (dummy_prog_id == curr_prog_id) - bpf_set_link_xdp_fd(ifindex_out, -1, xdp_flags); - else if (!curr_prog_id) - printf("couldn't find a prog id on iface OUT\n"); + int i; + + sample_print_help(mask); + + printf("\n"); + printf(" Usage: %s (options-see-below)\n", + argv[0]); + printf(" Listing options:\n"); + for (i = 0; long_options[i].name != 0; i++) { + printf(" --%-15s", long_options[i].name); + if (long_options[i].flag != NULL) + printf(" flag (internal value:%d)", + *long_options[i].flag); else - printf("program on iface OUT changed, not removing\n"); - } - exit(0); -} - -static void poll_stats(int interval, int ifindex) -{ - unsigned int nr_cpus = bpf_num_possible_cpus(); - __u64 values[nr_cpus], prev[nr_cpus]; - - memset(prev, 0, sizeof(prev)); - - while (1) { - __u64 sum = 0; - __u32 key = 0; - int i; - - sleep(interval); - assert(bpf_map_lookup_elem(rxcnt_map_fd, &key, values) == 0); - for (i = 0; i < nr_cpus; i++) - sum += (values[i] - prev[i]); - if (sum) - printf("ifindex %i: %10llu pkt/s\n", - ifindex, sum / interval); - memcpy(prev, values, sizeof(values)); + printf("short-option: -%c", + long_options[i].val); + printf("\n"); } -} - -static int get_mac_addr(unsigned int ifindex_out, void *mac_addr) -{ - char ifname[IF_NAMESIZE]; - struct ifreq ifr; - int fd, ret = -1; - - fd = socket(AF_INET, SOCK_DGRAM, 0); - if (fd < 0) - return ret; - - if (!if_indextoname(ifindex_out, ifname)) - goto err_out; - - strcpy(ifr.ifr_name, ifname); - - if (ioctl(fd, SIOCGIFHWADDR, &ifr) != 0) - goto err_out; - - memcpy(mac_addr, ifr.ifr_hwaddr.sa_data, 6 * sizeof(char)); - ret = 0; - -err_out: - close(fd); - return ret; -} - -static void usage(const char *prog) -{ - fprintf(stderr, - "usage: %s [OPTS] _IN _OUT\n\n" - "OPTS:\n" - " -S use skb-mode\n" - " -N enforce native mode\n" - " -F force loading prog\n" - " -X load xdp program on egress\n", - prog); + printf("\n"); } int main(int argc, char **argv) { - struct bpf_prog_load_attr prog_load_attr = { - .prog_type = BPF_PROG_TYPE_UNSPEC, - }; - struct bpf_program *prog, *dummy_prog, *devmap_prog; - int prog_fd, dummy_prog_fd, devmap_prog_fd = 0; - int tx_port_map_fd, tx_mac_map_fd; - struct bpf_devmap_val devmap_val; - struct bpf_prog_info info = {}; - __u32 info_len = sizeof(info); - const char *optstr = "FSNX"; - struct bpf_object *obj; - int ret, opt, key = 0; - char filename[256]; - - while ((opt = getopt(argc, argv, optstr)) != -1) { + struct bpf_devmap_val devmap_val = {}; + bool xdp_devmap_attached = false; + struct xdp_redirect_map *skel; + char str[2 * IF_NAMESIZE + 1]; + char ifname_out[IF_NAMESIZE]; + struct bpf_map *tx_port_map; + char ifname_in[IF_NAMESIZE]; + int ifindex_in, ifindex_out; + unsigned long interval = 2; + int ret = EXIT_FAIL_OPTION; + struct bpf_program *prog; + bool generic = false; + bool force = false; + bool tried = false; + int opt, key = 0; + + while ((opt = getopt_long(argc, argv, "SFXi:vs", + long_options, NULL)) != -1) { switch (opt) { case 'S': - xdp_flags |= XDP_FLAGS_SKB_MODE; - break; - case 'N': - /* default, set below */ + generic = true; + /* devmap_xmit tracepoint not available */ + mask &= ~SAMPLE_DEVMAP_XMIT_CNT; break; case 'F': - xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST; + force = true; break; case 'X': xdp_devmap_attached = true; break; + case 'i': + interval = strtoul(optarg, NULL, 0); + break; + case 'v': + sample_switch_mode(); + break; + case 's': + mask |= SAMPLE_REDIRECT_MAP_CNT; + break; default: - usage(basename(argv[0])); - return 1; + usage(argv); + return ret; } } - if (!(xdp_flags & XDP_FLAGS_SKB_MODE)) { - xdp_flags |= XDP_FLAGS_DRV_MODE; - } else if (xdp_devmap_attached) { - printf("Load xdp program on egress with SKB mode not supported yet\n"); - return 1; - } - if (argc <= optind + 1) { - usage(basename(argv[0])); - return 1; + usage(argv); + return ret; } ifindex_in = if_nametoindex(argv[optind]); @@ -182,107 +119,97 @@ int main(int argc, char **argv) if (!ifindex_out) ifindex_out = strtoul(argv[optind + 1], NULL, 0); - printf("input: %d output: %d\n", ifindex_in, ifindex_out); - - snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]); - prog_load_attr.file = filename; - - if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd)) - return 1; - - if (xdp_flags & XDP_FLAGS_SKB_MODE) { - prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_general"); - tx_port_map_fd = bpf_object__find_map_fd_by_name(obj, "tx_port_general"); - } else { - prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_native"); - tx_port_map_fd = bpf_object__find_map_fd_by_name(obj, "tx_port_native"); - } - dummy_prog = bpf_object__find_program_by_name(obj, "xdp_redirect_dummy_prog"); - if (!prog || dummy_prog < 0 || tx_port_map_fd < 0) { - printf("finding prog/dummy_prog/tx_port_map in obj file failed\n"); - goto out; - } - prog_fd = bpf_program__fd(prog); - dummy_prog_fd = bpf_program__fd(dummy_prog); - if (prog_fd < 0 || dummy_prog_fd < 0 || tx_port_map_fd < 0) { - printf("bpf_prog_load_xattr: %s\n", strerror(errno)); - return 1; - } - - tx_mac_map_fd = bpf_object__find_map_fd_by_name(obj, "tx_mac"); - rxcnt_map_fd = bpf_object__find_map_fd_by_name(obj, "rxcnt"); - if (tx_mac_map_fd < 0 || rxcnt_map_fd < 0) { - printf("bpf_object__find_map_fd_by_name failed\n"); - return 1; - } - - if (bpf_set_link_xdp_fd(ifindex_in, prog_fd, xdp_flags) < 0) { - printf("ERROR: link set xdp fd failed on %d\n", ifindex_in); - return 1; - } - - ret = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); - if (ret) { - printf("can't get prog info - %s\n", strerror(errno)); - return ret; - } - prog_id = info.id; - - /* Loading dummy XDP prog on out-device */ - if (bpf_set_link_xdp_fd(ifindex_out, dummy_prog_fd, - (xdp_flags | XDP_FLAGS_UPDATE_IF_NOEXIST)) < 0) { - printf("WARN: link set xdp fd failed on %d\n", ifindex_out); - ifindex_out_xdp_dummy_attached = false; + skel = xdp_redirect_map__open(); + if (!skel) { + fprintf(stderr, "Failed to xdp_redirect_map__open: %s\n", + strerror(errno)); + ret = EXIT_FAIL_BPF; + goto end; } - memset(&info, 0, sizeof(info)); - ret = bpf_obj_get_info_by_fd(dummy_prog_fd, &info, &info_len); - if (ret) { - printf("can't get prog info - %s\n", strerror(errno)); - return ret; - } - dummy_prog_id = info.id; - /* Load 2nd xdp prog on egress. */ if (xdp_devmap_attached) { - unsigned char mac_addr[6]; - - devmap_prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_egress"); - if (!devmap_prog) { - printf("finding devmap_prog in obj file failed\n"); - goto out; - } - devmap_prog_fd = bpf_program__fd(devmap_prog); - if (devmap_prog_fd < 0) { - printf("finding devmap_prog fd failed\n"); - goto out; - } - - if (get_mac_addr(ifindex_out, mac_addr) < 0) { - printf("get interface %d mac failed\n", ifindex_out); - goto out; + ret = get_mac_addr(ifindex_out, skel->rodata->tx_mac_addr); + if (ret < 0) { + fprintf(stderr, "Failed to get interface %d mac address: %s\n", + ifindex_out, strerror(-ret)); + ret = EXIT_FAIL; + goto end_destroy; } + } - ret = bpf_map_update_elem(tx_mac_map_fd, &key, mac_addr, 0); - if (ret) { - perror("bpf_update_elem tx_mac_map_fd"); - goto out; + ret = xdp_redirect_map__load(skel); + if (ret < 0) { + fprintf(stderr, "Failed to xdp_redirect_map__load: %s\n", + strerror(errno)); + ret = EXIT_FAIL_BPF; + goto end_destroy; + } + + ret = sample_init(skel, mask); + if (ret < 0) { + fprintf(stderr, "Failed to initialize sample: %s\n", strerror(-ret)); + ret = EXIT_FAIL; + goto end_destroy; + } + + prog = skel->progs.xdp_redirect_map_native; + tx_port_map = skel->maps.tx_port_native; +restart: + if (sample_install_xdp(prog, ifindex_in, generic, force) < 0) { + /* First try with struct bpf_devmap_val as value for generic + * mode, then fallback to sizeof(int) for older kernels. + */ + fprintf(stderr, "Trying fallback to sizeof(int) as value_size for devmap in generic mode\n"); + if (generic && !tried) { + prog = skel->progs.xdp_redirect_map_general; + tx_port_map = skel->maps.tx_port_general; + tried = true; + goto restart; } + ret = EXIT_FAIL_XDP; + goto end_destroy; } - signal(SIGINT, int_exit); - signal(SIGTERM, int_exit); + /* Loading dummy XDP prog on out-device */ + sample_install_xdp(skel->progs.xdp_redirect_dummy_prog, ifindex_out, generic, force); devmap_val.ifindex = ifindex_out; - devmap_val.bpf_prog.fd = devmap_prog_fd; - ret = bpf_map_update_elem(tx_port_map_fd, &key, &devmap_val, 0); - if (ret) { - perror("bpf_update_elem"); - goto out; - } - - poll_stats(2, ifindex_out); - -out: - return 0; + if (xdp_devmap_attached) + devmap_val.bpf_prog.fd = bpf_program__fd(skel->progs.xdp_redirect_map_egress); + ret = bpf_map_update_elem(bpf_map__fd(tx_port_map), &key, &devmap_val, 0); + if (ret < 0) { + fprintf(stderr, "Failed to update devmap value: %s\n", + strerror(errno)); + ret = EXIT_FAIL_BPF; + goto end_destroy; + } + + ret = EXIT_FAIL; + if (!if_indextoname(ifindex_in, ifname_in)) { + perror("if_nametoindex"); + goto end_destroy; + } + + if (!if_indextoname(ifindex_out, ifname_out)) { + perror("if_nametoindex"); + goto end_destroy; + } + + safe_strncpy(str, get_driver_name(ifindex_in), sizeof(str)); + printf("Redirecting from %s (ifindex %d; driver %s) to %s (ifindex %d; driver %s)\n", + ifname_in, ifindex_in, str, ifname_out, ifindex_out, get_driver_name(ifindex_out)); + snprintf(str, sizeof(str), "%s->%s", ifname_in, ifname_out); + + ret = sample_run(interval, NULL, NULL); + if (ret < 0) { + fprintf(stderr, "Failed during sample run: %s\n", strerror(-ret)); + ret = EXIT_FAIL; + goto end_destroy; + } + ret = EXIT_OK; +end_destroy: + xdp_redirect_map__destroy(skel); +end: + sample_exit(ret); } From patchwork Wed Jul 21 21:28:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kumar Kartikeya Dwivedi X-Patchwork-Id: 12392249 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.0 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70B12C6377C for ; Wed, 21 Jul 2021 21:29:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5A9CA6120D for ; Wed, 21 Jul 2021 21:29:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229870AbhGUUs3 (ORCPT ); Wed, 21 Jul 2021 16:48:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40492 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229944AbhGUUsZ (ORCPT ); Wed, 21 Jul 2021 16:48:25 -0400 Received: from mail-pj1-x1043.google.com (mail-pj1-x1043.google.com [IPv6:2607:f8b0:4864:20::1043]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 29CF2C061757; Wed, 21 Jul 2021 14:29:02 -0700 (PDT) Received: by mail-pj1-x1043.google.com with SMTP id cu14so3251496pjb.0; Wed, 21 Jul 2021 14:29:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=o6/upuUx9DU2rgvPVajuNyq/DpblDtC48eTme7SnQTs=; b=mow6XHZDFtxN19XOwNcx90Pqfj69bO3EnxTGHWPrXhcxViwIyv922kkB1yO54H0wPV jrsgOCkTMiIyq1zoakWY3F2wIttzbEVp8n9Bf1jcrihuZNoqBeabuPRhgQTzgz4RidL6 uTq2yKE3nK4Az8su7pRwPF/M4cidlglzLoCXZLCUU7rHDCZVr3wNStCYlnAO02bOOgQI TFSKfAK3zOfcSnXoKmYl7OgSv5mCv5GI7VhF9BwFc+5Yoy3MZPfC//HKTDBpOYb52CnS t3PWIvVCAnVNUUo1WsI4Jr1tmazqAQt2r+98Mn1l1efni21pO7P3SMSik7sAVryGNuAL VLiw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=o6/upuUx9DU2rgvPVajuNyq/DpblDtC48eTme7SnQTs=; b=sHcK3sAIdQVAVpkJybKBJHj8NBO09DWHTxYsKQyItiH5J2blWei4zK60YB/SI2oztj HZt7YoiIwPwgWaf4aPDcxt3gaoVxWzRh7PAsT0aFSNKceqyth7VIfn4Oz4SgN7pbC1ft 08OFA7SPrb5t1VQFG57zO2xvrcinjKRT40MdCk6ugvQ+Cfsyesde4UDKOusZt9lZ4W3y vn1L2qNLq6xgflw3eLlHAwvAaEa4lwR0YN/TKvNf685jpzPR7gHP8m9x9jegMFCPaWKz c40NHbYK37OaZUwCtH0m1LXHH2WIDHlkBJZ1ZcNTgVIrlKH4HPxZt1hyULSdCtBfqDZ2 Tevw== X-Gm-Message-State: AOAM532THtv7fGoSMqqRS3X4jNJupXl6FkUnc6DzL9QLs6q63UfIVhxN 6b4LCSMekkQ8sQ5NgaOy21A7558L+XuYyQ== X-Google-Smtp-Source: ABdhPJyMPWIM1LOTuLUNW+LMQ51hYpEhzpLYiCdTWsxSQsynhQ7yL6fjnRBV8BwS3ZjXeDlRc1zx3w== X-Received: by 2002:a63:1622:: with SMTP id w34mr38090283pgl.354.1626902941448; Wed, 21 Jul 2021 14:29:01 -0700 (PDT) Received: from localhost ([2405:201:6014:d0bb:dc30:f309:2f53:5818]) by smtp.gmail.com with ESMTPSA id g12sm18277009pfv.167.2021.07.21.14.29.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Jul 2021 14:29:01 -0700 (PDT) From: Kumar Kartikeya Dwivedi To: bpf@vger.kernel.org Cc: Kumar Kartikeya Dwivedi , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Jesper Dangaard Brouer , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , netdev@vger.kernel.org Subject: [PATCH bpf-next v2 7/8] samples: bpf: Convert xdp_redirect_map_multi to use XDP samples helpers Date: Thu, 22 Jul 2021 02:58:32 +0530 Message-Id: <20210721212833.701342-8-memxor@gmail.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210721212833.701342-1-memxor@gmail.com> References: <20210721212833.701342-1-memxor@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net This change converts XDP redirect_map_multi tool to use the XDP samples support introduced in previous changes. Signed-off-by: Kumar Kartikeya Dwivedi --- samples/bpf/Makefile | 11 +- ...ti_kern.c => xdp_redirect_map_multi.bpf.c} | 40 ++- samples/bpf/xdp_redirect_map_multi_user.c | 316 +++++++----------- 3 files changed, 153 insertions(+), 214 deletions(-) rename samples/bpf/{xdp_redirect_map_multi_kern.c => xdp_redirect_map_multi.bpf.c} (74%) diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile index 2cccb2aa8f2b..064f16c12ccc 100644 --- a/samples/bpf/Makefile +++ b/samples/bpf/Makefile @@ -39,7 +39,6 @@ tprogs-y += lwt_len_hist tprogs-y += xdp_tx_iptunnel tprogs-y += test_map_in_map tprogs-y += per_socket_stats_example -tprogs-y += xdp_redirect_map_multi tprogs-y += xdp_redirect_cpu tprogs-y += xdp_rxq_info tprogs-y += syscall_tp @@ -54,6 +53,7 @@ tprogs-y += xdp_sample_pkts tprogs-y += ibumad tprogs-y += hbm +tprogs-y += xdp_redirect_map_multi tprogs-y += xdp_redirect_map tprogs-y += xdp_redirect tprogs-y += xdp_monitor @@ -100,7 +100,6 @@ lwt_len_hist-objs := lwt_len_hist_user.o xdp_tx_iptunnel-objs := xdp_tx_iptunnel_user.o test_map_in_map-objs := test_map_in_map_user.o per_socket_stats_example-objs := cookie_uid_helper_example.o -xdp_redirect_map_multi-objs := xdp_redirect_map_multi_user.o xdp_redirect_cpu-objs := xdp_redirect_cpu_user.o xdp_rxq_info-objs := xdp_rxq_info_user.o syscall_tp-objs := syscall_tp_user.o @@ -116,6 +115,7 @@ ibumad-objs := ibumad_user.o hbm-objs := hbm.o $(CGROUP_HELPERS) xdp_sample_user-objs := xdp_sample_user.o $(LIBBPFDIR)/hashmap.o +xdp_redirect_map_multi-objs := xdp_redirect_map_multi_user.o $(XDP_SAMPLE) xdp_redirect_map-objs := xdp_redirect_map_user.o $(XDP_SAMPLE) xdp_redirect-objs := xdp_redirect_user.o $(XDP_SAMPLE) xdp_monitor-objs := xdp_monitor_user.o $(XDP_SAMPLE) @@ -164,7 +164,6 @@ always-y += tcp_clamp_kern.o always-y += tcp_basertt_kern.o always-y += tcp_tos_reflect_kern.o always-y += tcp_dumpstats_kern.o -always-y += xdp_redirect_map_multi_kern.o always-y += xdp_redirect_cpu_kern.o always-y += xdp_rxq_info_kern.o always-y += xdp2skb_meta_kern.o @@ -313,6 +312,7 @@ verify_target_bpf: verify_cmds $(BPF_SAMPLES_PATH)/*.c: verify_target_bpf $(LIBBPF) $(src)/*.c: verify_target_bpf $(LIBBPF) +$(obj)/xdp_redirect_map_multi_user.o: $(obj)/xdp_redirect_map_multi.skel.h $(obj)/xdp_redirect_map_user.o: $(obj)/xdp_redirect_map.skel.h $(obj)/xdp_redirect_user.o: $(obj)/xdp_redirect.skel.h $(obj)/xdp_monitor_user.o: $(obj)/xdp_monitor.skel.h @@ -358,6 +358,7 @@ endef CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG)) +$(obj)/xdp_redirect_map_multi.bpf.o: $(obj)/xdp_sample.bpf.o $(obj)/xdp_redirect_map.bpf.o: $(obj)/xdp_sample.bpf.o $(obj)/xdp_redirect.bpf.o: $(obj)/xdp_sample.bpf.o $(obj)/xdp_monitor.bpf.o: $(obj)/xdp_sample.bpf.o @@ -370,9 +371,11 @@ $(obj)/%.bpf.o: $(src)/%.bpf.c $(obj)/vmlinux.h $(src)/xdp_sample.bpf.h $(src)/x -I$(srctree)/tools/lib $(CLANG_SYS_INCLUDES) \ -c $(filter %.bpf.c,$^) -o $@ -LINKED_SKELS := xdp_redirect_map.skel.h xdp_redirect.skel.h xdp_monitor.skel.h +LINKED_SKELS := xdp_redirect_map_multi.skel.h xdp_redirect_map.skel.h \ + xdp_redirect.skel.h xdp_monitor.skel.h clean-files += $(LINKED_SKELS) +xdp_redirect_map_multi.skel.h-deps := xdp_redirect_map_multi.bpf.o xdp_sample.bpf.o xdp_redirect_map.skel.h-deps := xdp_redirect_map.bpf.o xdp_sample.bpf.o xdp_redirect.skel.h-deps := xdp_redirect.bpf.o xdp_sample.bpf.o xdp_monitor.skel.h-deps := xdp_monitor.bpf.o xdp_sample.bpf.o diff --git a/samples/bpf/xdp_redirect_map_multi_kern.c b/samples/bpf/xdp_redirect_map_multi.bpf.c similarity index 74% rename from samples/bpf/xdp_redirect_map_multi_kern.c rename to samples/bpf/xdp_redirect_map_multi.bpf.c index 71aa23d1cb2b..66144e3a69f5 100644 --- a/samples/bpf/xdp_redirect_map_multi_kern.c +++ b/samples/bpf/xdp_redirect_map_multi.bpf.c @@ -1,11 +1,14 @@ // SPDX-License-Identifier: GPL-2.0 #define KBUILD_MODNAME "foo" -#include -#include -#include -#include -#include -#include + +#include "vmlinux.h" +#include "xdp_sample.bpf.h" +#include "xdp_sample_shared.h" + +enum { + BPF_F_BROADCAST = (1ULL << 3), + BPF_F_EXCLUDE_INGRESS = (1ULL << 4), +}; struct { __uint(type, BPF_MAP_TYPE_DEVMAP_HASH); @@ -21,13 +24,6 @@ struct { __uint(max_entries, 32); } forward_map_native SEC(".maps"); -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, long); - __uint(max_entries, 1); -} rxcnt SEC(".maps"); - /* map to store egress interfaces mac addresses, set the * max_entries to 1 and extend it in user sapce prog. */ @@ -40,16 +36,18 @@ struct { static int xdp_redirect_map(struct xdp_md *ctx, void *forward_map) { - long *value; - u32 key = 0; + u32 key = bpf_get_smp_processor_id(); + struct datarec *rec; - /* count packet in global counter */ - value = bpf_map_lookup_elem(&rxcnt, &key); - if (value) - *value += 1; + if (key < MAX_CPUS) { + rec = &sample_data.rx_cnt[key]; + ATOMIC_INC_NORMW(rec->processed); - return bpf_redirect_map(forward_map, key, - BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS); + return bpf_redirect_map(forward_map, 0, + BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS); + } + + return XDP_PASS; } SEC("xdp_redirect_general") diff --git a/samples/bpf/xdp_redirect_map_multi_user.c b/samples/bpf/xdp_redirect_map_multi_user.c index 84cdbbed20b7..a4239f8af627 100644 --- a/samples/bpf/xdp_redirect_map_multi_user.c +++ b/samples/bpf/xdp_redirect_map_multi_user.c @@ -2,6 +2,7 @@ #include #include #include +#include #include #include #include @@ -15,93 +16,60 @@ #include #include #include - -#include "bpf_util.h" #include #include +#include "bpf_util.h" +#include "xdp_sample_user.h" +#include "xdp_redirect_map_multi.skel.h" #define MAX_IFACE_NUM 32 - -static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; static int ifaces[MAX_IFACE_NUM] = {}; -static int rxcnt_map_fd; -static void int_exit(int sig) -{ - __u32 prog_id = 0; - int i; +static int mask = SAMPLE_RX_CNT | SAMPLE_REDIRECT_ERR_MAP_CNT | + SAMPLE_EXCEPTION_CNT | SAMPLE_DEVMAP_XMIT_CNT | + SAMPLE_DEVMAP_XMIT_CNT_MULTI; - for (i = 0; ifaces[i] > 0; i++) { - if (bpf_get_link_xdp_id(ifaces[i], &prog_id, xdp_flags)) { - printf("bpf_get_link_xdp_id failed\n"); - exit(1); - } - if (prog_id) - bpf_set_link_xdp_fd(ifaces[i], -1, xdp_flags); - } +DEFINE_SAMPLE_INIT(xdp_redirect_map_multi); - exit(0); -} +static const struct option long_options[] = { + { "help", no_argument, NULL, 'h' }, + { "skb-mode", no_argument, NULL, 'S' }, + { "force", no_argument, NULL, 'F' }, + { "load-egress", no_argument, NULL, 'X' }, + { "stats", no_argument, NULL, 's' }, + { "interval", required_argument, NULL, 'i' }, + { "verbose", no_argument, NULL, 'v' }, + {} +}; -static void poll_stats(int interval) +static void usage(char *argv[]) { - unsigned int nr_cpus = bpf_num_possible_cpus(); - __u64 values[nr_cpus], prev[nr_cpus]; - - memset(prev, 0, sizeof(prev)); + int i; - while (1) { - __u64 sum = 0; - __u32 key = 0; - int i; + sample_print_help(mask); - sleep(interval); - assert(bpf_map_lookup_elem(rxcnt_map_fd, &key, values) == 0); - for (i = 0; i < nr_cpus; i++) - sum += (values[i] - prev[i]); - if (sum) - printf("Forwarding %10llu pkt/s\n", sum / interval); - memcpy(prev, values, sizeof(values)); + printf("\n"); + printf(" Usage: %s (options-see-below)\n", + argv[0]); + printf(" Listing options:\n"); + for (i = 0; long_options[i].name != 0; i++) { + printf(" --%-15s", long_options[i].name); + if (long_options[i].flag != NULL) + printf(" flag (internal value:%d)", + *long_options[i].flag); + else + printf("short-option: -%c", + long_options[i].val); + printf("\n"); } + printf("\n"); } -static int get_mac_addr(unsigned int ifindex, void *mac_addr) -{ - char ifname[IF_NAMESIZE]; - struct ifreq ifr; - int fd, ret = -1; - - fd = socket(AF_INET, SOCK_DGRAM, 0); - if (fd < 0) - return ret; - - if (!if_indextoname(ifindex, ifname)) - goto err_out; - - strcpy(ifr.ifr_name, ifname); - - if (ioctl(fd, SIOCGIFHWADDR, &ifr) != 0) - goto err_out; - - memcpy(mac_addr, ifr.ifr_hwaddr.sa_data, 6 * sizeof(char)); - ret = 0; - -err_out: - close(fd); - return ret; -} - -static int update_mac_map(struct bpf_object *obj) +static int update_mac_map(int mac_map_fd) { - int i, ret = -1, mac_map_fd; unsigned char mac_addr[6]; unsigned int ifindex; - - mac_map_fd = bpf_object__find_map_fd_by_name(obj, "mac_map"); - if (mac_map_fd < 0) { - printf("find mac map fd failed\n"); - return ret; - } + int i, ret = -1; for (i = 0; ifaces[i] > 0; i++) { ifindex = ifaces[i]; @@ -122,181 +90,151 @@ static int update_mac_map(struct bpf_object *obj) return 0; } -static void usage(const char *prog) -{ - fprintf(stderr, - "usage: %s [OPTS] ...\n" - "OPTS:\n" - " -S use skb-mode\n" - " -N enforce native mode\n" - " -F force loading prog\n" - " -X load xdp program on egress\n", - prog); -} - int main(int argc, char **argv) { - int i, ret, opt, forward_map_fd, max_ifindex = 0; - struct bpf_program *ingress_prog, *egress_prog; - int ingress_prog_fd, egress_prog_fd = 0; + struct bpf_map *mac_map, *forward_map; + struct xdp_redirect_map_multi *skel; + struct bpf_program *ingress_prog; struct bpf_devmap_val devmap_val; - bool attach_egress_prog = false; + bool xdp_devmap_attached = false; + int i, opt, max_ifindex = 0; + int ret = EXIT_FAIL_OPTION; + unsigned long interval = 2; char ifname[IF_NAMESIZE]; - struct bpf_map *mac_map; - struct bpf_object *obj; unsigned int ifindex; - char filename[256]; + bool generic = false; + bool force = false; + bool tried = false; - while ((opt = getopt(argc, argv, "SNFX")) != -1) { + while ((opt = getopt_long(argc, argv, "SFXi:vs", + long_options, NULL)) != -1) { switch (opt) { case 'S': - xdp_flags |= XDP_FLAGS_SKB_MODE; - break; - case 'N': - /* default, set below */ + generic = true; + /* devmap_xmit tracepoint not available */ + mask &= ~SAMPLE_DEVMAP_XMIT_CNT; break; case 'F': - xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST; + force = true; break; case 'X': - attach_egress_prog = true; + xdp_devmap_attached = true; + break; + case 'i': + interval = strtoul(optarg, NULL, 0); + break; + case 'v': + sample_switch_mode(); + break; + case 's': + mask |= SAMPLE_REDIRECT_MAP_CNT; break; default: - usage(basename(argv[0])); - return 1; + usage(argv); + return ret; } } - if (!(xdp_flags & XDP_FLAGS_SKB_MODE)) { - xdp_flags |= XDP_FLAGS_DRV_MODE; - } else if (attach_egress_prog) { - printf("Load xdp program on egress with SKB mode not supported yet\n"); - return 1; - } - - if (optind == argc) { - printf("usage: %s ...\n", argv[0]); - return 1; + if (argc <= optind + 1) { + usage(argv); + return ret; } - printf("Get interfaces"); for (i = 0; i < MAX_IFACE_NUM && argv[optind + i]; i++) { ifaces[i] = if_nametoindex(argv[optind + i]); if (!ifaces[i]) ifaces[i] = strtoul(argv[optind + i], NULL, 0); if (!if_indextoname(ifaces[i], ifname)) { perror("Invalid interface name or i"); - return 1; + return EXIT_FAIL; } /* Find the largest index number */ if (ifaces[i] > max_ifindex) max_ifindex = ifaces[i]; - - printf(" %d", ifaces[i]); } - printf("\n"); - - snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]); - obj = bpf_object__open(filename); - if (libbpf_get_error(obj)) { - printf("ERROR: opening BPF object file failed\n"); - obj = NULL; - goto err_out; + skel = xdp_redirect_map_multi__open(); + if (!skel) { + fprintf(stderr, "Failed to xdp_redirect_map_multi__open: %s\n", + strerror(errno)); + ret = EXIT_FAIL_BPF; + goto end; } /* Reset the map size to max ifindex + 1 */ - if (attach_egress_prog) { - mac_map = bpf_object__find_map_by_name(obj, "mac_map"); + if (xdp_devmap_attached) { + mac_map = skel->maps.mac_map; ret = bpf_map__resize(mac_map, max_ifindex + 1); if (ret < 0) { - printf("ERROR: reset mac map size failed\n"); - goto err_out; + fprintf(stderr, "Resizing mac_map failed: %s\n", strerror(errno)); + ret = EXIT_FAIL_BPF; + goto end_destroy; + } + /* Update mac_map with all egress interfaces' mac addr */ + if (update_mac_map(bpf_map__fd(mac_map)) < 0) { + fprintf(stderr, "Updating mac address failed\n"); + ret = EXIT_FAIL; + goto end_destroy; } } - /* load BPF program */ - if (bpf_object__load(obj)) { - printf("ERROR: loading BPF object file failed\n"); - goto err_out; - } - - if (xdp_flags & XDP_FLAGS_SKB_MODE) { - ingress_prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_general"); - forward_map_fd = bpf_object__find_map_fd_by_name(obj, "forward_map_general"); - } else { - ingress_prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_native"); - forward_map_fd = bpf_object__find_map_fd_by_name(obj, "forward_map_native"); - } - if (!ingress_prog || forward_map_fd < 0) { - printf("finding ingress_prog/forward_map in obj file failed\n"); - goto err_out; - } - - ingress_prog_fd = bpf_program__fd(ingress_prog); - if (ingress_prog_fd < 0) { - printf("find ingress_prog fd failed\n"); - goto err_out; - } - - rxcnt_map_fd = bpf_object__find_map_fd_by_name(obj, "rxcnt"); - if (rxcnt_map_fd < 0) { - printf("bpf_object__find_map_fd_by_name failed\n"); - goto err_out; + if (xdp_redirect_map_multi__load(skel)) { + fprintf(stderr, "Failed to xdp_redirect_map_multi__load: %s\n", + strerror(errno)); + ret = EXIT_FAIL_BPF; + goto end_destroy; } - if (attach_egress_prog) { - /* Update mac_map with all egress interfaces' mac addr */ - if (update_mac_map(obj) < 0) { - printf("Error: update mac map failed"); - goto err_out; - } - - /* Find egress prog fd */ - egress_prog = bpf_object__find_program_by_name(obj, "xdp_devmap_prog"); - if (!egress_prog) { - printf("finding egress_prog in obj file failed\n"); - goto err_out; - } - egress_prog_fd = bpf_program__fd(egress_prog); - if (egress_prog_fd < 0) { - printf("find egress_prog fd failed\n"); - goto err_out; - } + ret = sample_init(skel, mask); + if (ret < 0) { + fprintf(stderr, "Failed to initialize sample: %s\n", strerror(-ret)); + ret = EXIT_FAIL; + goto end_destroy; } - /* Remove attached program when program is interrupted or killed */ - signal(SIGINT, int_exit); - signal(SIGTERM, int_exit); + ingress_prog = skel->progs.xdp_redirect_map_native; + forward_map = skel->maps.forward_map_native; - /* Init forward multicast groups */ for (i = 0; ifaces[i] > 0; i++) { ifindex = ifaces[i]; + ret = EXIT_FAIL_XDP; +restart: /* bind prog_fd to each interface */ - ret = bpf_set_link_xdp_fd(ifindex, ingress_prog_fd, xdp_flags); - if (ret) { - printf("Set xdp fd failed on %d\n", ifindex); - goto err_out; + if (sample_install_xdp(ingress_prog, ifindex, generic, force) < 0) { + if (generic && !tried) { + ingress_prog = skel->progs.xdp_redirect_map_general; + forward_map = skel->maps.forward_map_general; + tried = true; + goto restart; + } + goto end_destroy; } /* Add all the interfaces to forward group and attach - * egress devmap programe if exist + * egress devmap program if exist */ devmap_val.ifindex = ifindex; - devmap_val.bpf_prog.fd = egress_prog_fd; - ret = bpf_map_update_elem(forward_map_fd, &ifindex, &devmap_val, 0); - if (ret) { - perror("bpf_map_update_elem forward_map"); - goto err_out; + devmap_val.bpf_prog.fd = bpf_program__fd(skel->progs.xdp_devmap_prog); + ret = bpf_map_update_elem(bpf_map__fd(forward_map), &ifindex, &devmap_val, 0); + if (ret < 0) { + fprintf(stderr, "Failed to update devmap value: %s\n", + strerror(errno)); + ret = EXIT_FAIL_BPF; + goto end_destroy; } } - poll_stats(2); - - return 0; - -err_out: - return 1; + ret = sample_run(interval, NULL, NULL); + if (ret < 0) { + fprintf(stderr, "Failed during sample run: %s\n", strerror(-ret)); + ret = EXIT_FAIL; + goto end_destroy; + } + ret = EXIT_OK; +end_destroy: + xdp_redirect_map_multi__destroy(skel); +end: + sample_exit(ret); } From patchwork Wed Jul 21 21:28:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kumar Kartikeya Dwivedi X-Patchwork-Id: 12392253 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B5A41C6377B for ; Wed, 21 Jul 2021 21:29:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9353E6120D for ; Wed, 21 Jul 2021 21:29:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229948AbhGUUsb (ORCPT ); Wed, 21 Jul 2021 16:48:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40546 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229934AbhGUUsa (ORCPT ); Wed, 21 Jul 2021 16:48:30 -0400 Received: from mail-pj1-x1043.google.com (mail-pj1-x1043.google.com [IPv6:2607:f8b0:4864:20::1043]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CD5C3C061575; Wed, 21 Jul 2021 14:29:05 -0700 (PDT) Received: by mail-pj1-x1043.google.com with SMTP id i16-20020a17090acf90b02901736d9d2218so2561550pju.1; Wed, 21 Jul 2021 14:29:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=AyUcJHXtggFP9GeH14iKuvz8UHkWxeYEpcfQbz2akt8=; b=WrnCd5RA5tRdqsyTSZkLZSCGgG1uAYMmTVh96iSuCOYv6mJ7NoyHvLMq/n8WO9wmk3 xySDS+5AO5NTIXIdEOaqTWf4+ntBYfBHvdUQSM/f4MjhTGHUvnLwO+6YL+fT0C3ufgZQ 82/UxmvPdLlZjba5MiyHTdDU86yLFdq/G6rjSQNrXQ2CphvbpdYHd+Kpo9R4Hhjv2++6 hM2Mcxff+bgEFajnc7rjAl06oU7Etp7F6KtM1NTNj4YDfNV1aKHPqXblAfn5R79Ff4c5 /QFZqG5Bydf4SjSBh7MVXrEQyO7rfcJg0zuJVRKNCys3UsoREzzKh5k80SA64pj3JxsY BWxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=AyUcJHXtggFP9GeH14iKuvz8UHkWxeYEpcfQbz2akt8=; b=s2aSUkZPGBxKJ7aXAb/l3Va6ODXbr01wcM5ogLCxBu5mx+5gOfOnaLEBD47zrQVrKr GcgGqwOSTaOkGlX6PrVbMuTntr3ATPdV9D2YtFZ+KmbsMe26zU5kGYS40nAccG9RT7+J VpdchHsUqqDWi1LjcvVfrxjY8k8eWSoNLnTVAx10l63JBhp8g3abFXeZpIqBDWzFyITR yPucb9ltKEOujj4CAzoxiDC7pck9341InFZtN/Zn+EV6C3Kex0EVmGoNLi9K67jPeU1U JwhVwMgwzGtvhkcEQtGy+fNrNHvkW4DUa9n+2bm6WBErSL4eBLNCtj5yXjvSbhBm2cZi DdLw== X-Gm-Message-State: AOAM532SJJH6w9RoDje3mSnUv+UsnxH1TlsYyHUhnkjY7JWTr6/J5MBG AhTNs6lni0BkFa2aYJiqeri9GRgPkAOWKw== X-Google-Smtp-Source: ABdhPJz5cweFc6Imb1Bl0V+nH5qpU8fFnUSItj8eV/eyJlysZ9NKV5qLlB1Vk+rQzf7Oy3mZypYjZQ== X-Received: by 2002:aa7:804f:0:b029:334:4951:da88 with SMTP id y15-20020aa7804f0000b02903344951da88mr32298326pfm.29.1626902944774; Wed, 21 Jul 2021 14:29:04 -0700 (PDT) Received: from localhost ([2405:201:6014:d0bb:dc30:f309:2f53:5818]) by smtp.gmail.com with ESMTPSA id 125sm369722pge.34.2021.07.21.14.29.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Jul 2021 14:29:04 -0700 (PDT) From: Kumar Kartikeya Dwivedi To: bpf@vger.kernel.org Cc: Kumar Kartikeya Dwivedi , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Jesper Dangaard Brouer , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , netdev@vger.kernel.org Subject: [PATCH bpf-next v2 8/8] samples: bpf: Convert xdp_redirect_cpu to use XDP samples helpers Date: Thu, 22 Jul 2021 02:58:33 +0530 Message-Id: <20210721212833.701342-9-memxor@gmail.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210721212833.701342-1-memxor@gmail.com> References: <20210721212833.701342-1-memxor@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net This change converts XDP redirect_cpu tool to use the XDP samples support introduced in previous changes. One of the notable changes is removal of options to specify a separate eBPF program that supplies a redirection program that is attached to the CPUMAP, so that the packet can be redirected to a device after redirecting it to a CPU. This is replaced by a program built into the BPF object file and now the user only needs to specify the out interface to do the same. The program also uses devmap to redirect XDP frames, instead of directly redirecting with bpf_redirect, for performance reasons. Signed-off-by: Kumar Kartikeya Dwivedi --- samples/bpf/Makefile | 12 +- samples/bpf/xdp_redirect_cpu.bpf.c | 561 +++++++++++++++++ samples/bpf/xdp_redirect_cpu_kern.c | 730 ---------------------- samples/bpf/xdp_redirect_cpu_user.c | 916 +++++----------------------- 4 files changed, 732 insertions(+), 1487 deletions(-) create mode 100644 samples/bpf/xdp_redirect_cpu.bpf.c delete mode 100644 samples/bpf/xdp_redirect_cpu_kern.c diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile index 064f16c12ccc..60d3a5ae6630 100644 --- a/samples/bpf/Makefile +++ b/samples/bpf/Makefile @@ -39,7 +39,6 @@ tprogs-y += lwt_len_hist tprogs-y += xdp_tx_iptunnel tprogs-y += test_map_in_map tprogs-y += per_socket_stats_example -tprogs-y += xdp_redirect_cpu tprogs-y += xdp_rxq_info tprogs-y += syscall_tp tprogs-y += cpustat @@ -53,6 +52,7 @@ tprogs-y += xdp_sample_pkts tprogs-y += ibumad tprogs-y += hbm +tprogs-y += xdp_redirect_cpu tprogs-y += xdp_redirect_map_multi tprogs-y += xdp_redirect_map tprogs-y += xdp_redirect @@ -100,7 +100,6 @@ lwt_len_hist-objs := lwt_len_hist_user.o xdp_tx_iptunnel-objs := xdp_tx_iptunnel_user.o test_map_in_map-objs := test_map_in_map_user.o per_socket_stats_example-objs := cookie_uid_helper_example.o -xdp_redirect_cpu-objs := xdp_redirect_cpu_user.o xdp_rxq_info-objs := xdp_rxq_info_user.o syscall_tp-objs := syscall_tp_user.o cpustat-objs := cpustat_user.o @@ -115,6 +114,7 @@ ibumad-objs := ibumad_user.o hbm-objs := hbm.o $(CGROUP_HELPERS) xdp_sample_user-objs := xdp_sample_user.o $(LIBBPFDIR)/hashmap.o +xdp_redirect_cpu-objs := xdp_redirect_cpu_user.o $(XDP_SAMPLE) xdp_redirect_map_multi-objs := xdp_redirect_map_multi_user.o $(XDP_SAMPLE) xdp_redirect_map-objs := xdp_redirect_map_user.o $(XDP_SAMPLE) xdp_redirect-objs := xdp_redirect_user.o $(XDP_SAMPLE) @@ -164,7 +164,6 @@ always-y += tcp_clamp_kern.o always-y += tcp_basertt_kern.o always-y += tcp_tos_reflect_kern.o always-y += tcp_dumpstats_kern.o -always-y += xdp_redirect_cpu_kern.o always-y += xdp_rxq_info_kern.o always-y += xdp2skb_meta_kern.o always-y += syscall_tp_kern.o @@ -312,6 +311,7 @@ verify_target_bpf: verify_cmds $(BPF_SAMPLES_PATH)/*.c: verify_target_bpf $(LIBBPF) $(src)/*.c: verify_target_bpf $(LIBBPF) +$(obj)/xdp_redirect_cpu_user.o: $(obj)/xdp_redirect_cpu.skel.h $(obj)/xdp_redirect_map_multi_user.o: $(obj)/xdp_redirect_map_multi.skel.h $(obj)/xdp_redirect_map_user.o: $(obj)/xdp_redirect_map.skel.h $(obj)/xdp_redirect_user.o: $(obj)/xdp_redirect.skel.h @@ -358,6 +358,7 @@ endef CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG)) +$(obj)/xdp_redirect_cpu.bpf.o: $(obj)/xdp_sample.bpf.o $(obj)/xdp_redirect_map_multi.bpf.o: $(obj)/xdp_sample.bpf.o $(obj)/xdp_redirect_map.bpf.o: $(obj)/xdp_sample.bpf.o $(obj)/xdp_redirect.bpf.o: $(obj)/xdp_sample.bpf.o @@ -371,10 +372,11 @@ $(obj)/%.bpf.o: $(src)/%.bpf.c $(obj)/vmlinux.h $(src)/xdp_sample.bpf.h $(src)/x -I$(srctree)/tools/lib $(CLANG_SYS_INCLUDES) \ -c $(filter %.bpf.c,$^) -o $@ -LINKED_SKELS := xdp_redirect_map_multi.skel.h xdp_redirect_map.skel.h \ - xdp_redirect.skel.h xdp_monitor.skel.h +LINKED_SKELS := xdp_redirect_cpu.skel.h xdp_redirect_map_multi.skel.h \ + xdp_redirect_map.skel.h xdp_redirect.skel.h xdp_monitor.skel.h clean-files += $(LINKED_SKELS) +xdp_redirect_cpu.skel.h-deps := xdp_redirect_cpu.bpf.o xdp_sample.bpf.o xdp_redirect_map_multi.skel.h-deps := xdp_redirect_map_multi.bpf.o xdp_sample.bpf.o xdp_redirect_map.skel.h-deps := xdp_redirect_map.bpf.o xdp_sample.bpf.o xdp_redirect.skel.h-deps := xdp_redirect.bpf.o xdp_sample.bpf.o diff --git a/samples/bpf/xdp_redirect_cpu.bpf.c b/samples/bpf/xdp_redirect_cpu.bpf.c new file mode 100644 index 000000000000..0ceddf6960e8 --- /dev/null +++ b/samples/bpf/xdp_redirect_cpu.bpf.c @@ -0,0 +1,561 @@ +/* XDP redirect to CPUs via cpumap (BPF_MAP_TYPE_CPUMAP) + * + * GPLv2, Copyright(c) 2017 Jesper Dangaard Brouer, Red Hat, Inc. + */ +#include "vmlinux.h" +#include "xdp_sample.bpf.h" +#include "xdp_sample_shared.h" +#include "hash_func01.h" + +#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \ + __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ +#define bpf_ntohs(x) __builtin_bswap16(x) +#define bpf_htons(x) __builtin_bswap16(x) +#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \ + __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ +#define bpf_ntohs(x) (x) +#define bpf_htons(x) (x) +#else +# error "Endianness detection needs to be set up for your compiler?!" +#endif + +/* Special map type that can XDP_REDIRECT frames to another CPU */ +struct { + __uint(type, BPF_MAP_TYPE_CPUMAP); + __uint(key_size, sizeof(u32)); + __uint(value_size, sizeof(struct bpf_cpumap_val)); + __uint(max_entries, MAX_CPUS); +} cpu_map SEC(".maps"); + +/* Set of maps controlling available CPU, and for iterating through + * selectable redirect CPUs. + */ +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __type(key, u32); + __type(value, u32); + __uint(max_entries, MAX_CPUS); +} cpus_available SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __type(key, u32); + __type(value, u32); + __uint(max_entries, 1); +} cpus_count SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __type(key, u32); + __type(value, u32); + __uint(max_entries, 1); +} cpus_iterator SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_DEVMAP); + __uint(key_size, sizeof(int)); + __uint(value_size, sizeof(struct bpf_devmap_val)); + __uint(max_entries, 1); +} tx_port SEC(".maps"); + +char tx_mac_addr[ETH_ALEN]; + +/* Helper parse functions */ + +static __always_inline +bool parse_eth(struct ethhdr *eth, void *data_end, + u16 *eth_proto, u64 *l3_offset) +{ + u16 eth_type; + u64 offset; + + offset = sizeof(*eth); + if ((void *)eth + offset > data_end) + return false; + + eth_type = eth->h_proto; + + /* Skip non 802.3 Ethertypes */ + if (__builtin_expect(bpf_ntohs(eth_type) < ETH_P_802_3_MIN, 0)) + return false; + + /* Handle VLAN tagged packet */ + if (eth_type == bpf_htons(ETH_P_8021Q) || + eth_type == bpf_htons(ETH_P_8021AD)) { + struct vlan_hdr *vlan_hdr; + + vlan_hdr = (void *)eth + offset; + offset += sizeof(*vlan_hdr); + if ((void *)eth + offset > data_end) + return false; + eth_type = vlan_hdr->h_vlan_encapsulated_proto; + } + /* Handle double VLAN tagged packet */ + if (eth_type == bpf_htons(ETH_P_8021Q) || + eth_type == bpf_htons(ETH_P_8021AD)) { + struct vlan_hdr *vlan_hdr; + + vlan_hdr = (void *)eth + offset; + offset += sizeof(*vlan_hdr); + if ((void *)eth + offset > data_end) + return false; + eth_type = vlan_hdr->h_vlan_encapsulated_proto; + } + + *eth_proto = bpf_ntohs(eth_type); + *l3_offset = offset; + return true; +} + +static __always_inline +u16 get_dest_port_ipv4_udp(struct xdp_md *ctx, u64 nh_off) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + struct iphdr *iph = data + nh_off; + struct udphdr *udph; + u16 dport; + + if (iph + 1 > data_end) + return 0; + if (!(iph->protocol == IPPROTO_UDP)) + return 0; + + udph = (void *)(iph + 1); + if (udph + 1 > data_end) + return 0; + + dport = bpf_ntohs(udph->dest); + return dport; +} + +static __always_inline +int get_proto_ipv4(struct xdp_md *ctx, u64 nh_off) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + struct iphdr *iph = data + nh_off; + + if (iph + 1 > data_end) + return 0; + return iph->protocol; +} + +static __always_inline +int get_proto_ipv6(struct xdp_md *ctx, u64 nh_off) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + struct ipv6hdr *ip6h = data + nh_off; + + if (ip6h + 1 > data_end) + return 0; + return ip6h->nexthdr; +} + +SEC("xdp_cpu_map0") +int xdp_prognum0_no_touch(struct xdp_md *ctx) +{ + u32 key = bpf_get_smp_processor_id(); + struct datarec *rec; + u32 *cpu_selected; + u32 cpu_dest = 0; + u32 key0 = 0; + + /* Only use first entry in cpus_available */ + cpu_selected = bpf_map_lookup_elem(&cpus_available, &key0); + if (!cpu_selected) + return XDP_ABORTED; + cpu_dest = *cpu_selected; + + if (key < MAX_CPUS) { + /* Count RX packet in map */ + rec = &sample_data.rx_cnt[key]; + ATOMIC_INC_NORMW(rec->processed); + + if (cpu_dest >= MAX_CPUS) { + ATOMIC_INC_NORMW(rec->issue); + return XDP_ABORTED; + } + return bpf_redirect_map(&cpu_map, cpu_dest, 0); + } + + return XDP_PASS; +} + +SEC("xdp_cpu_map1_touch_data") +int xdp_prognum1_touch_data(struct xdp_md *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + u32 key = bpf_get_smp_processor_id(); + struct ethhdr *eth = data; + struct datarec *rec; + u32 *cpu_selected; + u32 cpu_dest = 0; + u32 key0 = 0; + u16 eth_type; + + /* Only use first entry in cpus_available */ + cpu_selected = bpf_map_lookup_elem(&cpus_available, &key0); + if (!cpu_selected) + return XDP_ABORTED; + cpu_dest = *cpu_selected; + + /* Validate packet length is minimum Eth header size */ + if (eth + 1 > data_end) + return XDP_ABORTED; + + if (key < MAX_CPUS) { + /* Count RX packet in map */ + rec = &sample_data.rx_cnt[key]; + ATOMIC_INC_NORMW(rec->processed); + + /* Read packet data, and use it (drop non 802.3 Ethertypes) */ + eth_type = eth->h_proto; + if (bpf_ntohs(eth_type) < ETH_P_802_3_MIN) { + ATOMIC_INC_NORMW(rec->dropped); + return XDP_DROP; + } + + if (cpu_dest >= MAX_CPUS) { + ATOMIC_INC_NORMW(rec->issue); + return XDP_ABORTED; + } + return bpf_redirect_map(&cpu_map, cpu_dest, 0); + } + + return XDP_PASS; +} + +SEC("xdp_cpu_map2_round_robin") +int xdp_prognum2_round_robin(struct xdp_md *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + u32 key = bpf_get_smp_processor_id(); + struct datarec *rec; + u32 cpu_dest = 0; + u32 key0 = 0; + + u32 *cpu_selected; + u32 *cpu_iterator; + u32 *cpu_max; + u32 cpu_idx; + + cpu_max = bpf_map_lookup_elem(&cpus_count, &key0); + if (!cpu_max) + return XDP_ABORTED; + + cpu_iterator = bpf_map_lookup_elem(&cpus_iterator, &key0); + if (!cpu_iterator) + return XDP_ABORTED; + cpu_idx = *cpu_iterator; + + *cpu_iterator += 1; + if (*cpu_iterator == *cpu_max) + *cpu_iterator = 0; + + cpu_selected = bpf_map_lookup_elem(&cpus_available, &cpu_idx); + if (!cpu_selected) + return XDP_ABORTED; + cpu_dest = *cpu_selected; + + if (key < MAX_CPUS) { + /* Count RX packet in map */ + rec = &sample_data.rx_cnt[key]; + ATOMIC_INC_NORMW(rec->processed); + + if (cpu_dest >= MAX_CPUS) { + ATOMIC_INC_NORMW(rec->issue); + return XDP_ABORTED; + } + return bpf_redirect_map(&cpu_map, cpu_dest, 0); + } + + return XDP_PASS; +} + +SEC("xdp_cpu_map3_proto_separate") +int xdp_prognum3_proto_separate(struct xdp_md *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + u32 key = bpf_get_smp_processor_id(); + struct ethhdr *eth = data; + u8 ip_proto = IPPROTO_UDP; + struct datarec *rec; + u16 eth_proto = 0; + u64 l3_offset = 0; + u32 cpu_dest = 0; + u32 *cpu_lookup; + u32 cpu_idx = 0; + + if (key < MAX_CPUS) { + /* Count RX packet in map */ + rec = &sample_data.rx_cnt[key]; + ATOMIC_INC_NORMW(rec->processed); + + if (!(parse_eth(eth, data_end, ð_proto, &l3_offset))) + return XDP_PASS; /* Just skip */ + + /* Extract L4 protocol */ + switch (eth_proto) { + case ETH_P_IP: + ip_proto = get_proto_ipv4(ctx, l3_offset); + break; + case ETH_P_IPV6: + ip_proto = get_proto_ipv6(ctx, l3_offset); + break; + case ETH_P_ARP: + cpu_idx = 0; /* ARP packet handled on separate CPU */ + break; + default: + cpu_idx = 0; + } + + /* Choose CPU based on L4 protocol */ + switch (ip_proto) { + case IPPROTO_ICMP: + case IPPROTO_ICMPV6: + cpu_idx = 2; + break; + case IPPROTO_TCP: + cpu_idx = 0; + break; + case IPPROTO_UDP: + cpu_idx = 1; + break; + default: + cpu_idx = 0; + } + + cpu_lookup = bpf_map_lookup_elem(&cpus_available, &cpu_idx); + if (!cpu_lookup) + return XDP_ABORTED; + cpu_dest = *cpu_lookup; + + if (cpu_dest >= MAX_CPUS) { + ATOMIC_INC_NORMW(rec->issue); + return XDP_ABORTED; + } + return bpf_redirect_map(&cpu_map, cpu_dest, 0); + } + + return XDP_PASS; +} + +SEC("xdp_cpu_map4_ddos_filter_pktgen") +int xdp_prognum4_ddos_filter_pktgen(struct xdp_md *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + u32 key = bpf_get_smp_processor_id(); + struct ethhdr *eth = data; + u8 ip_proto = IPPROTO_UDP; + struct datarec *rec; + u16 eth_proto = 0; + u64 l3_offset = 0; + u32 cpu_dest = 0; + u32 *cpu_lookup; + u32 cpu_idx = 0; + u16 dest_port; + + if (key < MAX_CPUS) { + /* Count RX packet in map */ + rec = &sample_data.rx_cnt[key]; + ATOMIC_INC_NORMW(rec->processed); + + if (!(parse_eth(eth, data_end, ð_proto, &l3_offset))) + return XDP_PASS; /* Just skip */ + + /* Extract L4 protocol */ + switch (eth_proto) { + case ETH_P_IP: + ip_proto = get_proto_ipv4(ctx, l3_offset); + break; + case ETH_P_IPV6: + ip_proto = get_proto_ipv6(ctx, l3_offset); + break; + case ETH_P_ARP: + cpu_idx = 0; /* ARP packet handled on separate CPU */ + break; + default: + cpu_idx = 0; + } + + /* Choose CPU based on L4 protocol */ + switch (ip_proto) { + case IPPROTO_ICMP: + case IPPROTO_ICMPV6: + cpu_idx = 2; + break; + case IPPROTO_TCP: + cpu_idx = 0; + break; + case IPPROTO_UDP: + cpu_idx = 1; + /* DDoS filter UDP port 9 (pktgen) */ + dest_port = get_dest_port_ipv4_udp(ctx, l3_offset); + if (dest_port == 9) { + ATOMIC_INC_NORMW(rec->dropped); + return XDP_DROP; + } + break; + default: + cpu_idx = 0; + } + + cpu_lookup = bpf_map_lookup_elem(&cpus_available, &cpu_idx); + if (!cpu_lookup) + return XDP_ABORTED; + cpu_dest = *cpu_lookup; + + if (cpu_dest >= MAX_CPUS) { + ATOMIC_INC_NORMW(rec->issue); + return XDP_ABORTED; + } + return bpf_redirect_map(&cpu_map, cpu_dest, 0); + } + + return XDP_PASS; +} + +/* Hashing initval */ +#define INITVAL 15485863 + +static __always_inline +u32 get_ipv4_hash_ip_pair(struct xdp_md *ctx, u64 nh_off) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + struct iphdr *iph = data + nh_off; + u32 cpu_hash; + + if (iph + 1 > data_end) + return 0; + + cpu_hash = iph->saddr + iph->daddr; + cpu_hash = SuperFastHash((char *)&cpu_hash, 4, INITVAL + iph->protocol); + + return cpu_hash; +} + +static __always_inline +u32 get_ipv6_hash_ip_pair(struct xdp_md *ctx, u64 nh_off) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + struct ipv6hdr *ip6h = data + nh_off; + u32 cpu_hash; + + if (ip6h + 1 > data_end) + return 0; + + cpu_hash = ip6h->saddr.in6_u.u6_addr32[0] + ip6h->daddr.in6_u.u6_addr32[0]; + cpu_hash += ip6h->saddr.in6_u.u6_addr32[1] + ip6h->daddr.in6_u.u6_addr32[1]; + cpu_hash += ip6h->saddr.in6_u.u6_addr32[2] + ip6h->daddr.in6_u.u6_addr32[2]; + cpu_hash += ip6h->saddr.in6_u.u6_addr32[3] + ip6h->daddr.in6_u.u6_addr32[3]; + cpu_hash = SuperFastHash((char *)&cpu_hash, 4, INITVAL + ip6h->nexthdr); + + return cpu_hash; +} + +/* Load-Balance traffic based on hashing IP-addrs + L4-proto. The + * hashing scheme is symmetric, meaning swapping IP src/dest still hit + * same CPU. + */ +SEC("xdp_cpu_map5_lb_hash_ip_pairs") +int xdp_prognum5_lb_hash_ip_pairs(struct xdp_md *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + u32 key = bpf_get_smp_processor_id(); + struct ethhdr *eth = data; + struct datarec *rec; + u16 eth_proto = 0; + u64 l3_offset = 0; + u32 cpu_dest = 0; + u32 cpu_idx = 0; + u32 *cpu_lookup; + u32 key0 = 0; + u32 *cpu_max; + u32 cpu_hash; + + if (key < MAX_CPUS) { + /* Count RX packet in map */ + rec = &sample_data.rx_cnt[key]; + ATOMIC_INC_NORMW(rec->processed); + + cpu_max = bpf_map_lookup_elem(&cpus_count, &key0); + if (!cpu_max) + return XDP_ABORTED; + + if (!(parse_eth(eth, data_end, ð_proto, &l3_offset))) + return XDP_PASS; /* Just skip */ + + /* Hash for IPv4 and IPv6 */ + switch (eth_proto) { + case ETH_P_IP: + cpu_hash = get_ipv4_hash_ip_pair(ctx, l3_offset); + break; + case ETH_P_IPV6: + cpu_hash = get_ipv6_hash_ip_pair(ctx, l3_offset); + break; + case ETH_P_ARP: /* ARP packet handled on CPU idx 0 */ + default: + cpu_hash = 0; + } + + /* Choose CPU based on hash */ + cpu_idx = cpu_hash % *cpu_max; + + cpu_lookup = bpf_map_lookup_elem(&cpus_available, &cpu_idx); + if (!cpu_lookup) + return XDP_ABORTED; + cpu_dest = *cpu_lookup; + + if (cpu_dest >= MAX_CPUS) { + ATOMIC_INC_NORMW(rec->issue); + return XDP_ABORTED; + } + return bpf_redirect_map(&cpu_map, cpu_dest, 0); + } + + return XDP_PASS; +} + +SEC("xdp_cpumap/devmap") +int xdp_redirect_cpu_devmap(struct xdp_md *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + struct ethhdr *eth = data; + u64 nh_off; + + nh_off = sizeof(*eth); + if (data + nh_off > data_end) + return XDP_DROP; + + swap_src_dst_mac(data); + return bpf_redirect_map(&tx_port, 0, 0); +} + +SEC("xdp_devmap/egress") +int xdp_redirect_egress_prog(struct xdp_md *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + struct ethhdr *eth = data; + u64 nh_off; + + nh_off = sizeof(*eth); + if (data + nh_off > data_end) + return XDP_DROP; + + __builtin_memcpy(eth->h_source, (const char *)tx_mac_addr, ETH_ALEN); + + return XDP_PASS; +} + +char _license[] SEC("license") = "GPL"; diff --git a/samples/bpf/xdp_redirect_cpu_kern.c b/samples/bpf/xdp_redirect_cpu_kern.c deleted file mode 100644 index 8255025dea97..000000000000 --- a/samples/bpf/xdp_redirect_cpu_kern.c +++ /dev/null @@ -1,730 +0,0 @@ -/* XDP redirect to CPUs via cpumap (BPF_MAP_TYPE_CPUMAP) - * - * GPLv2, Copyright(c) 2017 Jesper Dangaard Brouer, Red Hat, Inc. - */ -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include "hash_func01.h" - -#define MAX_CPUS NR_CPUS - -/* Special map type that can XDP_REDIRECT frames to another CPU */ -struct { - __uint(type, BPF_MAP_TYPE_CPUMAP); - __uint(key_size, sizeof(u32)); - __uint(value_size, sizeof(struct bpf_cpumap_val)); - __uint(max_entries, MAX_CPUS); -} cpu_map SEC(".maps"); - -/* Common stats data record to keep userspace more simple */ -struct datarec { - __u64 processed; - __u64 dropped; - __u64 issue; - __u64 xdp_pass; - __u64 xdp_drop; - __u64 xdp_redirect; -}; - -/* Count RX packets, as XDP bpf_prog doesn't get direct TX-success - * feedback. Redirect TX errors can be caught via a tracepoint. - */ -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, struct datarec); - __uint(max_entries, 1); -} rx_cnt SEC(".maps"); - -/* Used by trace point */ -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, struct datarec); - __uint(max_entries, 2); - /* TODO: have entries for all possible errno's */ -} redirect_err_cnt SEC(".maps"); - -/* Used by trace point */ -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, struct datarec); - __uint(max_entries, MAX_CPUS); -} cpumap_enqueue_cnt SEC(".maps"); - -/* Used by trace point */ -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, struct datarec); - __uint(max_entries, 1); -} cpumap_kthread_cnt SEC(".maps"); - -/* Set of maps controlling available CPU, and for iterating through - * selectable redirect CPUs. - */ -struct { - __uint(type, BPF_MAP_TYPE_ARRAY); - __type(key, u32); - __type(value, u32); - __uint(max_entries, MAX_CPUS); -} cpus_available SEC(".maps"); -struct { - __uint(type, BPF_MAP_TYPE_ARRAY); - __type(key, u32); - __type(value, u32); - __uint(max_entries, 1); -} cpus_count SEC(".maps"); -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, u32); - __uint(max_entries, 1); -} cpus_iterator SEC(".maps"); - -/* Used by trace point */ -struct { - __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); - __type(key, u32); - __type(value, struct datarec); - __uint(max_entries, 1); -} exception_cnt SEC(".maps"); - -/* Helper parse functions */ - -/* Parse Ethernet layer 2, extract network layer 3 offset and protocol - * - * Returns false on error and non-supported ether-type - */ -struct vlan_hdr { - __be16 h_vlan_TCI; - __be16 h_vlan_encapsulated_proto; -}; - -static __always_inline -bool parse_eth(struct ethhdr *eth, void *data_end, - u16 *eth_proto, u64 *l3_offset) -{ - u16 eth_type; - u64 offset; - - offset = sizeof(*eth); - if ((void *)eth + offset > data_end) - return false; - - eth_type = eth->h_proto; - - /* Skip non 802.3 Ethertypes */ - if (unlikely(ntohs(eth_type) < ETH_P_802_3_MIN)) - return false; - - /* Handle VLAN tagged packet */ - if (eth_type == htons(ETH_P_8021Q) || eth_type == htons(ETH_P_8021AD)) { - struct vlan_hdr *vlan_hdr; - - vlan_hdr = (void *)eth + offset; - offset += sizeof(*vlan_hdr); - if ((void *)eth + offset > data_end) - return false; - eth_type = vlan_hdr->h_vlan_encapsulated_proto; - } - /* Handle double VLAN tagged packet */ - if (eth_type == htons(ETH_P_8021Q) || eth_type == htons(ETH_P_8021AD)) { - struct vlan_hdr *vlan_hdr; - - vlan_hdr = (void *)eth + offset; - offset += sizeof(*vlan_hdr); - if ((void *)eth + offset > data_end) - return false; - eth_type = vlan_hdr->h_vlan_encapsulated_proto; - } - - *eth_proto = ntohs(eth_type); - *l3_offset = offset; - return true; -} - -static __always_inline -u16 get_dest_port_ipv4_udp(struct xdp_md *ctx, u64 nh_off) -{ - void *data_end = (void *)(long)ctx->data_end; - void *data = (void *)(long)ctx->data; - struct iphdr *iph = data + nh_off; - struct udphdr *udph; - u16 dport; - - if (iph + 1 > data_end) - return 0; - if (!(iph->protocol == IPPROTO_UDP)) - return 0; - - udph = (void *)(iph + 1); - if (udph + 1 > data_end) - return 0; - - dport = ntohs(udph->dest); - return dport; -} - -static __always_inline -int get_proto_ipv4(struct xdp_md *ctx, u64 nh_off) -{ - void *data_end = (void *)(long)ctx->data_end; - void *data = (void *)(long)ctx->data; - struct iphdr *iph = data + nh_off; - - if (iph + 1 > data_end) - return 0; - return iph->protocol; -} - -static __always_inline -int get_proto_ipv6(struct xdp_md *ctx, u64 nh_off) -{ - void *data_end = (void *)(long)ctx->data_end; - void *data = (void *)(long)ctx->data; - struct ipv6hdr *ip6h = data + nh_off; - - if (ip6h + 1 > data_end) - return 0; - return ip6h->nexthdr; -} - -SEC("xdp_cpu_map0") -int xdp_prognum0_no_touch(struct xdp_md *ctx) -{ - void *data_end = (void *)(long)ctx->data_end; - void *data = (void *)(long)ctx->data; - struct datarec *rec; - u32 *cpu_selected; - u32 cpu_dest; - u32 key = 0; - - /* Only use first entry in cpus_available */ - cpu_selected = bpf_map_lookup_elem(&cpus_available, &key); - if (!cpu_selected) - return XDP_ABORTED; - cpu_dest = *cpu_selected; - - /* Count RX packet in map */ - rec = bpf_map_lookup_elem(&rx_cnt, &key); - if (!rec) - return XDP_ABORTED; - rec->processed++; - - if (cpu_dest >= MAX_CPUS) { - rec->issue++; - return XDP_ABORTED; - } - - return bpf_redirect_map(&cpu_map, cpu_dest, 0); -} - -SEC("xdp_cpu_map1_touch_data") -int xdp_prognum1_touch_data(struct xdp_md *ctx) -{ - void *data_end = (void *)(long)ctx->data_end; - void *data = (void *)(long)ctx->data; - struct ethhdr *eth = data; - struct datarec *rec; - u32 *cpu_selected; - u32 cpu_dest; - u16 eth_type; - u32 key = 0; - - /* Only use first entry in cpus_available */ - cpu_selected = bpf_map_lookup_elem(&cpus_available, &key); - if (!cpu_selected) - return XDP_ABORTED; - cpu_dest = *cpu_selected; - - /* Validate packet length is minimum Eth header size */ - if (eth + 1 > data_end) - return XDP_ABORTED; - - /* Count RX packet in map */ - rec = bpf_map_lookup_elem(&rx_cnt, &key); - if (!rec) - return XDP_ABORTED; - rec->processed++; - - /* Read packet data, and use it (drop non 802.3 Ethertypes) */ - eth_type = eth->h_proto; - if (ntohs(eth_type) < ETH_P_802_3_MIN) { - rec->dropped++; - return XDP_DROP; - } - - if (cpu_dest >= MAX_CPUS) { - rec->issue++; - return XDP_ABORTED; - } - - return bpf_redirect_map(&cpu_map, cpu_dest, 0); -} - -SEC("xdp_cpu_map2_round_robin") -int xdp_prognum2_round_robin(struct xdp_md *ctx) -{ - void *data_end = (void *)(long)ctx->data_end; - void *data = (void *)(long)ctx->data; - struct ethhdr *eth = data; - struct datarec *rec; - u32 cpu_dest; - u32 *cpu_lookup; - u32 key0 = 0; - - u32 *cpu_selected; - u32 *cpu_iterator; - u32 *cpu_max; - u32 cpu_idx; - - cpu_max = bpf_map_lookup_elem(&cpus_count, &key0); - if (!cpu_max) - return XDP_ABORTED; - - cpu_iterator = bpf_map_lookup_elem(&cpus_iterator, &key0); - if (!cpu_iterator) - return XDP_ABORTED; - cpu_idx = *cpu_iterator; - - *cpu_iterator += 1; - if (*cpu_iterator == *cpu_max) - *cpu_iterator = 0; - - cpu_selected = bpf_map_lookup_elem(&cpus_available, &cpu_idx); - if (!cpu_selected) - return XDP_ABORTED; - cpu_dest = *cpu_selected; - - /* Count RX packet in map */ - rec = bpf_map_lookup_elem(&rx_cnt, &key0); - if (!rec) - return XDP_ABORTED; - rec->processed++; - - if (cpu_dest >= MAX_CPUS) { - rec->issue++; - return XDP_ABORTED; - } - - return bpf_redirect_map(&cpu_map, cpu_dest, 0); -} - -SEC("xdp_cpu_map3_proto_separate") -int xdp_prognum3_proto_separate(struct xdp_md *ctx) -{ - void *data_end = (void *)(long)ctx->data_end; - void *data = (void *)(long)ctx->data; - struct ethhdr *eth = data; - u8 ip_proto = IPPROTO_UDP; - struct datarec *rec; - u16 eth_proto = 0; - u64 l3_offset = 0; - u32 cpu_dest = 0; - u32 cpu_idx = 0; - u32 *cpu_lookup; - u32 key = 0; - - /* Count RX packet in map */ - rec = bpf_map_lookup_elem(&rx_cnt, &key); - if (!rec) - return XDP_ABORTED; - rec->processed++; - - if (!(parse_eth(eth, data_end, ð_proto, &l3_offset))) - return XDP_PASS; /* Just skip */ - - /* Extract L4 protocol */ - switch (eth_proto) { - case ETH_P_IP: - ip_proto = get_proto_ipv4(ctx, l3_offset); - break; - case ETH_P_IPV6: - ip_proto = get_proto_ipv6(ctx, l3_offset); - break; - case ETH_P_ARP: - cpu_idx = 0; /* ARP packet handled on separate CPU */ - break; - default: - cpu_idx = 0; - } - - /* Choose CPU based on L4 protocol */ - switch (ip_proto) { - case IPPROTO_ICMP: - case IPPROTO_ICMPV6: - cpu_idx = 2; - break; - case IPPROTO_TCP: - cpu_idx = 0; - break; - case IPPROTO_UDP: - cpu_idx = 1; - break; - default: - cpu_idx = 0; - } - - cpu_lookup = bpf_map_lookup_elem(&cpus_available, &cpu_idx); - if (!cpu_lookup) - return XDP_ABORTED; - cpu_dest = *cpu_lookup; - - if (cpu_dest >= MAX_CPUS) { - rec->issue++; - return XDP_ABORTED; - } - - return bpf_redirect_map(&cpu_map, cpu_dest, 0); -} - -SEC("xdp_cpu_map4_ddos_filter_pktgen") -int xdp_prognum4_ddos_filter_pktgen(struct xdp_md *ctx) -{ - void *data_end = (void *)(long)ctx->data_end; - void *data = (void *)(long)ctx->data; - struct ethhdr *eth = data; - u8 ip_proto = IPPROTO_UDP; - struct datarec *rec; - u16 eth_proto = 0; - u64 l3_offset = 0; - u32 cpu_dest = 0; - u32 cpu_idx = 0; - u16 dest_port; - u32 *cpu_lookup; - u32 key = 0; - - /* Count RX packet in map */ - rec = bpf_map_lookup_elem(&rx_cnt, &key); - if (!rec) - return XDP_ABORTED; - rec->processed++; - - if (!(parse_eth(eth, data_end, ð_proto, &l3_offset))) - return XDP_PASS; /* Just skip */ - - /* Extract L4 protocol */ - switch (eth_proto) { - case ETH_P_IP: - ip_proto = get_proto_ipv4(ctx, l3_offset); - break; - case ETH_P_IPV6: - ip_proto = get_proto_ipv6(ctx, l3_offset); - break; - case ETH_P_ARP: - cpu_idx = 0; /* ARP packet handled on separate CPU */ - break; - default: - cpu_idx = 0; - } - - /* Choose CPU based on L4 protocol */ - switch (ip_proto) { - case IPPROTO_ICMP: - case IPPROTO_ICMPV6: - cpu_idx = 2; - break; - case IPPROTO_TCP: - cpu_idx = 0; - break; - case IPPROTO_UDP: - cpu_idx = 1; - /* DDoS filter UDP port 9 (pktgen) */ - dest_port = get_dest_port_ipv4_udp(ctx, l3_offset); - if (dest_port == 9) { - if (rec) - rec->dropped++; - return XDP_DROP; - } - break; - default: - cpu_idx = 0; - } - - cpu_lookup = bpf_map_lookup_elem(&cpus_available, &cpu_idx); - if (!cpu_lookup) - return XDP_ABORTED; - cpu_dest = *cpu_lookup; - - if (cpu_dest >= MAX_CPUS) { - rec->issue++; - return XDP_ABORTED; - } - - return bpf_redirect_map(&cpu_map, cpu_dest, 0); -} - -/* Hashing initval */ -#define INITVAL 15485863 - -static __always_inline -u32 get_ipv4_hash_ip_pair(struct xdp_md *ctx, u64 nh_off) -{ - void *data_end = (void *)(long)ctx->data_end; - void *data = (void *)(long)ctx->data; - struct iphdr *iph = data + nh_off; - u32 cpu_hash; - - if (iph + 1 > data_end) - return 0; - - cpu_hash = iph->saddr + iph->daddr; - cpu_hash = SuperFastHash((char *)&cpu_hash, 4, INITVAL + iph->protocol); - - return cpu_hash; -} - -static __always_inline -u32 get_ipv6_hash_ip_pair(struct xdp_md *ctx, u64 nh_off) -{ - void *data_end = (void *)(long)ctx->data_end; - void *data = (void *)(long)ctx->data; - struct ipv6hdr *ip6h = data + nh_off; - u32 cpu_hash; - - if (ip6h + 1 > data_end) - return 0; - - cpu_hash = ip6h->saddr.s6_addr32[0] + ip6h->daddr.s6_addr32[0]; - cpu_hash += ip6h->saddr.s6_addr32[1] + ip6h->daddr.s6_addr32[1]; - cpu_hash += ip6h->saddr.s6_addr32[2] + ip6h->daddr.s6_addr32[2]; - cpu_hash += ip6h->saddr.s6_addr32[3] + ip6h->daddr.s6_addr32[3]; - cpu_hash = SuperFastHash((char *)&cpu_hash, 4, INITVAL + ip6h->nexthdr); - - return cpu_hash; -} - -/* Load-Balance traffic based on hashing IP-addrs + L4-proto. The - * hashing scheme is symmetric, meaning swapping IP src/dest still hit - * same CPU. - */ -SEC("xdp_cpu_map5_lb_hash_ip_pairs") -int xdp_prognum5_lb_hash_ip_pairs(struct xdp_md *ctx) -{ - void *data_end = (void *)(long)ctx->data_end; - void *data = (void *)(long)ctx->data; - struct ethhdr *eth = data; - u8 ip_proto = IPPROTO_UDP; - struct datarec *rec; - u16 eth_proto = 0; - u64 l3_offset = 0; - u32 cpu_dest = 0; - u32 cpu_idx = 0; - u32 *cpu_lookup; - u32 *cpu_max; - u32 cpu_hash; - u32 key = 0; - - /* Count RX packet in map */ - rec = bpf_map_lookup_elem(&rx_cnt, &key); - if (!rec) - return XDP_ABORTED; - rec->processed++; - - cpu_max = bpf_map_lookup_elem(&cpus_count, &key); - if (!cpu_max) - return XDP_ABORTED; - - if (!(parse_eth(eth, data_end, ð_proto, &l3_offset))) - return XDP_PASS; /* Just skip */ - - /* Hash for IPv4 and IPv6 */ - switch (eth_proto) { - case ETH_P_IP: - cpu_hash = get_ipv4_hash_ip_pair(ctx, l3_offset); - break; - case ETH_P_IPV6: - cpu_hash = get_ipv6_hash_ip_pair(ctx, l3_offset); - break; - case ETH_P_ARP: /* ARP packet handled on CPU idx 0 */ - default: - cpu_hash = 0; - } - - /* Choose CPU based on hash */ - cpu_idx = cpu_hash % *cpu_max; - - cpu_lookup = bpf_map_lookup_elem(&cpus_available, &cpu_idx); - if (!cpu_lookup) - return XDP_ABORTED; - cpu_dest = *cpu_lookup; - - if (cpu_dest >= MAX_CPUS) { - rec->issue++; - return XDP_ABORTED; - } - - return bpf_redirect_map(&cpu_map, cpu_dest, 0); -} - -char _license[] SEC("license") = "GPL"; - -/*** Trace point code ***/ - -/* Tracepoint format: /sys/kernel/debug/tracing/events/xdp/xdp_redirect/format - * Code in: kernel/include/trace/events/xdp.h - */ -struct xdp_redirect_ctx { - u64 __pad; // First 8 bytes are not accessible by bpf code - int prog_id; // offset:8; size:4; signed:1; - u32 act; // offset:12 size:4; signed:0; - int ifindex; // offset:16 size:4; signed:1; - int err; // offset:20 size:4; signed:1; - int to_ifindex; // offset:24 size:4; signed:1; - u32 map_id; // offset:28 size:4; signed:0; - int map_index; // offset:32 size:4; signed:1; -}; // offset:36 - -enum { - XDP_REDIRECT_SUCCESS = 0, - XDP_REDIRECT_ERROR = 1 -}; - -static __always_inline -int xdp_redirect_collect_stat(struct xdp_redirect_ctx *ctx) -{ - u32 key = XDP_REDIRECT_ERROR; - struct datarec *rec; - int err = ctx->err; - - if (!err) - key = XDP_REDIRECT_SUCCESS; - - rec = bpf_map_lookup_elem(&redirect_err_cnt, &key); - if (!rec) - return 0; - rec->dropped += 1; - - return 0; /* Indicate event was filtered (no further processing)*/ - /* - * Returning 1 here would allow e.g. a perf-record tracepoint - * to see and record these events, but it doesn't work well - * in-practice as stopping perf-record also unload this - * bpf_prog. Plus, there is additional overhead of doing so. - */ -} - -SEC("tracepoint/xdp/xdp_redirect_err") -int trace_xdp_redirect_err(struct xdp_redirect_ctx *ctx) -{ - return xdp_redirect_collect_stat(ctx); -} - -SEC("tracepoint/xdp/xdp_redirect_map_err") -int trace_xdp_redirect_map_err(struct xdp_redirect_ctx *ctx) -{ - return xdp_redirect_collect_stat(ctx); -} - -/* Tracepoint format: /sys/kernel/debug/tracing/events/xdp/xdp_exception/format - * Code in: kernel/include/trace/events/xdp.h - */ -struct xdp_exception_ctx { - u64 __pad; // First 8 bytes are not accessible by bpf code - int prog_id; // offset:8; size:4; signed:1; - u32 act; // offset:12; size:4; signed:0; - int ifindex; // offset:16; size:4; signed:1; -}; - -SEC("tracepoint/xdp/xdp_exception") -int trace_xdp_exception(struct xdp_exception_ctx *ctx) -{ - struct datarec *rec; - u32 key = 0; - - rec = bpf_map_lookup_elem(&exception_cnt, &key); - if (!rec) - return 1; - rec->dropped += 1; - - return 0; -} - -/* Tracepoint: /sys/kernel/debug/tracing/events/xdp/xdp_cpumap_enqueue/format - * Code in: kernel/include/trace/events/xdp.h - */ -struct cpumap_enqueue_ctx { - u64 __pad; // First 8 bytes are not accessible by bpf code - int map_id; // offset:8; size:4; signed:1; - u32 act; // offset:12; size:4; signed:0; - int cpu; // offset:16; size:4; signed:1; - unsigned int drops; // offset:20; size:4; signed:0; - unsigned int processed; // offset:24; size:4; signed:0; - int to_cpu; // offset:28; size:4; signed:1; -}; - -SEC("tracepoint/xdp/xdp_cpumap_enqueue") -int trace_xdp_cpumap_enqueue(struct cpumap_enqueue_ctx *ctx) -{ - u32 to_cpu = ctx->to_cpu; - struct datarec *rec; - - if (to_cpu >= MAX_CPUS) - return 1; - - rec = bpf_map_lookup_elem(&cpumap_enqueue_cnt, &to_cpu); - if (!rec) - return 0; - rec->processed += ctx->processed; - rec->dropped += ctx->drops; - - /* Record bulk events, then userspace can calc average bulk size */ - if (ctx->processed > 0) - rec->issue += 1; - - /* Inception: It's possible to detect overload situations, via - * this tracepoint. This can be used for creating a feedback - * loop to XDP, which can take appropriate actions to mitigate - * this overload situation. - */ - return 0; -} - -/* Tracepoint: /sys/kernel/debug/tracing/events/xdp/xdp_cpumap_kthread/format - * Code in: kernel/include/trace/events/xdp.h - */ -struct cpumap_kthread_ctx { - u64 __pad; // First 8 bytes are not accessible - int map_id; // offset:8; size:4; signed:1; - u32 act; // offset:12; size:4; signed:0; - int cpu; // offset:16; size:4; signed:1; - unsigned int drops; // offset:20; size:4; signed:0; - unsigned int processed; // offset:24; size:4; signed:0; - int sched; // offset:28; size:4; signed:1; - unsigned int xdp_pass; // offset:32; size:4; signed:0; - unsigned int xdp_drop; // offset:36; size:4; signed:0; - unsigned int xdp_redirect; // offset:40; size:4; signed:0; -}; - -SEC("tracepoint/xdp/xdp_cpumap_kthread") -int trace_xdp_cpumap_kthread(struct cpumap_kthread_ctx *ctx) -{ - struct datarec *rec; - u32 key = 0; - - rec = bpf_map_lookup_elem(&cpumap_kthread_cnt, &key); - if (!rec) - return 0; - rec->processed += ctx->processed; - rec->dropped += ctx->drops; - rec->xdp_pass += ctx->xdp_pass; - rec->xdp_drop += ctx->xdp_drop; - rec->xdp_redirect += ctx->xdp_redirect; - - /* Count times kthread yielded CPU via schedule call */ - if (ctx->sched) - rec->issue++; - - return 0; -} diff --git a/samples/bpf/xdp_redirect_cpu_user.c b/samples/bpf/xdp_redirect_cpu_user.c index d3ecdc18b9c1..68eda4fa4d7d 100644 --- a/samples/bpf/xdp_redirect_cpu_user.c +++ b/samples/bpf/xdp_redirect_cpu_user.c @@ -18,110 +18,40 @@ static const char *__doc__ = #include #include #include - #include #include - -/* How many xdp_progs are defined in _kern.c */ -#define MAX_PROG 6 - #include #include - #include "bpf_util.h" +#include "xdp_sample_user.h" +#include "xdp_redirect_cpu.skel.h" -static int ifindex = -1; -static char ifname_buf[IF_NAMESIZE]; -static char *ifname; -static __u32 prog_id; - -static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; -static int n_cpus; - -enum map_type { - CPU_MAP, - RX_CNT, - REDIRECT_ERR_CNT, - CPUMAP_ENQUEUE_CNT, - CPUMAP_KTHREAD_CNT, - CPUS_AVAILABLE, - CPUS_COUNT, - CPUS_ITERATOR, - EXCEPTION_CNT, -}; +static int map_fd; +static int avail_fd; +static int count_fd; -static const char *const map_type_strings[] = { - [CPU_MAP] = "cpu_map", - [RX_CNT] = "rx_cnt", - [REDIRECT_ERR_CNT] = "redirect_err_cnt", - [CPUMAP_ENQUEUE_CNT] = "cpumap_enqueue_cnt", - [CPUMAP_KTHREAD_CNT] = "cpumap_kthread_cnt", - [CPUS_AVAILABLE] = "cpus_available", - [CPUS_COUNT] = "cpus_count", - [CPUS_ITERATOR] = "cpus_iterator", - [EXCEPTION_CNT] = "exception_cnt", -}; +static int mask = SAMPLE_RX_CNT | SAMPLE_REDIRECT_ERR_MAP_CNT | + SAMPLE_CPUMAP_ENQUEUE_CNT | SAMPLE_CPUMAP_KTHREAD_CNT | + SAMPLE_EXCEPTION_CNT; -#define NUM_TP 5 -#define NUM_MAP 9 -struct bpf_link *tp_links[NUM_TP] = {}; -static int map_fds[NUM_MAP]; -static int tp_cnt = 0; - -/* Exit return codes */ -#define EXIT_OK 0 -#define EXIT_FAIL 1 -#define EXIT_FAIL_OPTION 2 -#define EXIT_FAIL_XDP 3 -#define EXIT_FAIL_BPF 4 -#define EXIT_FAIL_MEM 5 +DEFINE_SAMPLE_INIT(xdp_redirect_cpu); static const struct option long_options[] = { - {"help", no_argument, NULL, 'h' }, - {"dev", required_argument, NULL, 'd' }, - {"skb-mode", no_argument, NULL, 'S' }, - {"sec", required_argument, NULL, 's' }, - {"progname", required_argument, NULL, 'p' }, - {"qsize", required_argument, NULL, 'q' }, - {"cpu", required_argument, NULL, 'c' }, - {"stress-mode", no_argument, NULL, 'x' }, - {"no-separators", no_argument, NULL, 'z' }, - {"force", no_argument, NULL, 'F' }, - {"mprog-disable", no_argument, NULL, 'n' }, - {"mprog-name", required_argument, NULL, 'e' }, - {"mprog-filename", required_argument, NULL, 'f' }, - {"redirect-device", required_argument, NULL, 'r' }, - {"redirect-map", required_argument, NULL, 'm' }, - {0, 0, NULL, 0 } + { "help", no_argument, NULL, 'h' }, + { "dev", required_argument, NULL, 'd' }, + { "skb-mode", no_argument, NULL, 'S' }, + { "progname", required_argument, NULL, 'p' }, + { "qsize", required_argument, NULL, 'q' }, + { "cpu", required_argument, NULL, 'c' }, + { "stress-mode", no_argument, NULL, 'x' }, + { "force", no_argument, NULL, 'F' }, + { "redirect-device", required_argument, NULL, 'r' }, + { "interval", required_argument, NULL, 'i' }, + { "verbose", no_argument, NULL, 'v' }, + { "stats", no_argument, NULL, 's' }, + {} }; -static void int_exit(int sig) -{ - __u32 curr_prog_id = 0; - - if (ifindex > -1) { - if (bpf_get_link_xdp_id(ifindex, &curr_prog_id, xdp_flags)) { - printf("bpf_get_link_xdp_id failed\n"); - exit(EXIT_FAIL); - } - if (prog_id == curr_prog_id) { - fprintf(stderr, - "Interrupted: Removing XDP program on ifindex:%d device:%s\n", - ifindex, ifname); - bpf_set_link_xdp_fd(ifindex, -1, xdp_flags); - } else if (!curr_prog_id) { - printf("couldn't find a prog id on a given iface\n"); - } else { - printf("program on interface changed, not removing\n"); - } - } - /* Detach tracepoints */ - while (tp_cnt) - bpf_link__destroy(tp_links[--tp_cnt]); - - exit(EXIT_OK); -} - static void print_avail_progs(struct bpf_object *obj) { struct bpf_program *pos; @@ -136,6 +66,8 @@ static void usage(char *argv[], struct bpf_object *obj) { int i; + sample_print_help(mask); + printf("\nDOCUMENTATION:\n%s\n", __doc__); printf("\n"); printf(" Usage: %s (options-see-below)\n", argv[0]); @@ -150,428 +82,11 @@ static void usage(char *argv[], struct bpf_object *obj) long_options[i].val); printf("\n"); } - printf("\n Programs to be used for --progname:\n"); + printf("\n Programs to be used for -p/--progname:\n"); print_avail_progs(obj); printf("\n"); } -/* gettime returns the current time of day in nanoseconds. - * Cost: clock_gettime (ns) => 26ns (CLOCK_MONOTONIC) - * clock_gettime (ns) => 9ns (CLOCK_MONOTONIC_COARSE) - */ -#define NANOSEC_PER_SEC 1000000000 /* 10^9 */ -static __u64 gettime(void) -{ - struct timespec t; - int res; - - res = clock_gettime(CLOCK_MONOTONIC, &t); - if (res < 0) { - fprintf(stderr, "Error with gettimeofday! (%i)\n", res); - exit(EXIT_FAIL); - } - return (__u64) t.tv_sec * NANOSEC_PER_SEC + t.tv_nsec; -} - -/* Common stats data record shared with _kern.c */ -struct datarec { - __u64 processed; - __u64 dropped; - __u64 issue; - __u64 xdp_pass; - __u64 xdp_drop; - __u64 xdp_redirect; -}; -struct record { - __u64 timestamp; - struct datarec total; - struct datarec *cpu; -}; -struct stats_record { - struct record rx_cnt; - struct record redir_err; - struct record kthread; - struct record exception; - struct record enq[]; -}; - -static bool map_collect_percpu(int fd, __u32 key, struct record *rec) -{ - /* For percpu maps, userspace gets a value per possible CPU */ - unsigned int nr_cpus = bpf_num_possible_cpus(); - struct datarec values[nr_cpus]; - __u64 sum_xdp_redirect = 0; - __u64 sum_xdp_pass = 0; - __u64 sum_xdp_drop = 0; - __u64 sum_processed = 0; - __u64 sum_dropped = 0; - __u64 sum_issue = 0; - int i; - - if ((bpf_map_lookup_elem(fd, &key, values)) != 0) { - fprintf(stderr, - "ERR: bpf_map_lookup_elem failed key:0x%X\n", key); - return false; - } - /* Get time as close as possible to reading map contents */ - rec->timestamp = gettime(); - - /* Record and sum values from each CPU */ - for (i = 0; i < nr_cpus; i++) { - rec->cpu[i].processed = values[i].processed; - sum_processed += values[i].processed; - rec->cpu[i].dropped = values[i].dropped; - sum_dropped += values[i].dropped; - rec->cpu[i].issue = values[i].issue; - sum_issue += values[i].issue; - rec->cpu[i].xdp_pass = values[i].xdp_pass; - sum_xdp_pass += values[i].xdp_pass; - rec->cpu[i].xdp_drop = values[i].xdp_drop; - sum_xdp_drop += values[i].xdp_drop; - rec->cpu[i].xdp_redirect = values[i].xdp_redirect; - sum_xdp_redirect += values[i].xdp_redirect; - } - rec->total.processed = sum_processed; - rec->total.dropped = sum_dropped; - rec->total.issue = sum_issue; - rec->total.xdp_pass = sum_xdp_pass; - rec->total.xdp_drop = sum_xdp_drop; - rec->total.xdp_redirect = sum_xdp_redirect; - return true; -} - -static struct datarec *alloc_record_per_cpu(void) -{ - unsigned int nr_cpus = bpf_num_possible_cpus(); - struct datarec *array; - - array = calloc(nr_cpus, sizeof(struct datarec)); - if (!array) { - fprintf(stderr, "Mem alloc error (nr_cpus:%u)\n", nr_cpus); - exit(EXIT_FAIL_MEM); - } - return array; -} - -static struct stats_record *alloc_stats_record(void) -{ - struct stats_record *rec; - int i, size; - - size = sizeof(*rec) + n_cpus * sizeof(struct record); - rec = malloc(size); - if (!rec) { - fprintf(stderr, "Mem alloc error\n"); - exit(EXIT_FAIL_MEM); - } - memset(rec, 0, size); - rec->rx_cnt.cpu = alloc_record_per_cpu(); - rec->redir_err.cpu = alloc_record_per_cpu(); - rec->kthread.cpu = alloc_record_per_cpu(); - rec->exception.cpu = alloc_record_per_cpu(); - for (i = 0; i < n_cpus; i++) - rec->enq[i].cpu = alloc_record_per_cpu(); - - return rec; -} - -static void free_stats_record(struct stats_record *r) -{ - int i; - - for (i = 0; i < n_cpus; i++) - free(r->enq[i].cpu); - free(r->exception.cpu); - free(r->kthread.cpu); - free(r->redir_err.cpu); - free(r->rx_cnt.cpu); - free(r); -} - -static double calc_period(struct record *r, struct record *p) -{ - double period_ = 0; - __u64 period = 0; - - period = r->timestamp - p->timestamp; - if (period > 0) - period_ = ((double) period / NANOSEC_PER_SEC); - - return period_; -} - -static __u64 calc_pps(struct datarec *r, struct datarec *p, double period_) -{ - __u64 packets = 0; - __u64 pps = 0; - - if (period_ > 0) { - packets = r->processed - p->processed; - pps = packets / period_; - } - return pps; -} - -static __u64 calc_drop_pps(struct datarec *r, struct datarec *p, double period_) -{ - __u64 packets = 0; - __u64 pps = 0; - - if (period_ > 0) { - packets = r->dropped - p->dropped; - pps = packets / period_; - } - return pps; -} - -static __u64 calc_errs_pps(struct datarec *r, - struct datarec *p, double period_) -{ - __u64 packets = 0; - __u64 pps = 0; - - if (period_ > 0) { - packets = r->issue - p->issue; - pps = packets / period_; - } - return pps; -} - -static void calc_xdp_pps(struct datarec *r, struct datarec *p, - double *xdp_pass, double *xdp_drop, - double *xdp_redirect, double period_) -{ - *xdp_pass = 0, *xdp_drop = 0, *xdp_redirect = 0; - if (period_ > 0) { - *xdp_redirect = (r->xdp_redirect - p->xdp_redirect) / period_; - *xdp_pass = (r->xdp_pass - p->xdp_pass) / period_; - *xdp_drop = (r->xdp_drop - p->xdp_drop) / period_; - } -} - -static void stats_print(struct stats_record *stats_rec, - struct stats_record *stats_prev, - char *prog_name, char *mprog_name, int mprog_fd) -{ - unsigned int nr_cpus = bpf_num_possible_cpus(); - double pps = 0, drop = 0, err = 0; - bool mprog_enabled = false; - struct record *rec, *prev; - int to_cpu; - double t; - int i; - - if (mprog_fd > 0) - mprog_enabled = true; - - /* Header */ - printf("Running XDP/eBPF prog_name:%s\n", prog_name); - printf("%-15s %-7s %-14s %-11s %-9s\n", - "XDP-cpumap", "CPU:to", "pps", "drop-pps", "extra-info"); - - /* XDP rx_cnt */ - { - char *fmt_rx = "%-15s %-7d %'-14.0f %'-11.0f %'-10.0f %s\n"; - char *fm2_rx = "%-15s %-7s %'-14.0f %'-11.0f\n"; - char *errstr = ""; - - rec = &stats_rec->rx_cnt; - prev = &stats_prev->rx_cnt; - t = calc_period(rec, prev); - for (i = 0; i < nr_cpus; i++) { - struct datarec *r = &rec->cpu[i]; - struct datarec *p = &prev->cpu[i]; - - pps = calc_pps(r, p, t); - drop = calc_drop_pps(r, p, t); - err = calc_errs_pps(r, p, t); - if (err > 0) - errstr = "cpu-dest/err"; - if (pps > 0) - printf(fmt_rx, "XDP-RX", - i, pps, drop, err, errstr); - } - pps = calc_pps(&rec->total, &prev->total, t); - drop = calc_drop_pps(&rec->total, &prev->total, t); - err = calc_errs_pps(&rec->total, &prev->total, t); - printf(fm2_rx, "XDP-RX", "total", pps, drop); - } - - /* cpumap enqueue stats */ - for (to_cpu = 0; to_cpu < n_cpus; to_cpu++) { - char *fmt = "%-15s %3d:%-3d %'-14.0f %'-11.0f %'-10.2f %s\n"; - char *fm2 = "%-15s %3s:%-3d %'-14.0f %'-11.0f %'-10.2f %s\n"; - char *errstr = ""; - - rec = &stats_rec->enq[to_cpu]; - prev = &stats_prev->enq[to_cpu]; - t = calc_period(rec, prev); - for (i = 0; i < nr_cpus; i++) { - struct datarec *r = &rec->cpu[i]; - struct datarec *p = &prev->cpu[i]; - - pps = calc_pps(r, p, t); - drop = calc_drop_pps(r, p, t); - err = calc_errs_pps(r, p, t); - if (err > 0) { - errstr = "bulk-average"; - err = pps / err; /* calc average bulk size */ - } - if (pps > 0) - printf(fmt, "cpumap-enqueue", - i, to_cpu, pps, drop, err, errstr); - } - pps = calc_pps(&rec->total, &prev->total, t); - if (pps > 0) { - drop = calc_drop_pps(&rec->total, &prev->total, t); - err = calc_errs_pps(&rec->total, &prev->total, t); - if (err > 0) { - errstr = "bulk-average"; - err = pps / err; /* calc average bulk size */ - } - printf(fm2, "cpumap-enqueue", - "sum", to_cpu, pps, drop, err, errstr); - } - } - - /* cpumap kthread stats */ - { - char *fmt_k = "%-15s %-7d %'-14.0f %'-11.0f %'-10.0f %s\n"; - char *fm2_k = "%-15s %-7s %'-14.0f %'-11.0f %'-10.0f %s\n"; - char *e_str = ""; - - rec = &stats_rec->kthread; - prev = &stats_prev->kthread; - t = calc_period(rec, prev); - for (i = 0; i < nr_cpus; i++) { - struct datarec *r = &rec->cpu[i]; - struct datarec *p = &prev->cpu[i]; - - pps = calc_pps(r, p, t); - drop = calc_drop_pps(r, p, t); - err = calc_errs_pps(r, p, t); - if (err > 0) - e_str = "sched"; - if (pps > 0) - printf(fmt_k, "cpumap_kthread", - i, pps, drop, err, e_str); - } - pps = calc_pps(&rec->total, &prev->total, t); - drop = calc_drop_pps(&rec->total, &prev->total, t); - err = calc_errs_pps(&rec->total, &prev->total, t); - if (err > 0) - e_str = "sched-sum"; - printf(fm2_k, "cpumap_kthread", "total", pps, drop, err, e_str); - } - - /* XDP redirect err tracepoints (very unlikely) */ - { - char *fmt_err = "%-15s %-7d %'-14.0f %'-11.0f\n"; - char *fm2_err = "%-15s %-7s %'-14.0f %'-11.0f\n"; - - rec = &stats_rec->redir_err; - prev = &stats_prev->redir_err; - t = calc_period(rec, prev); - for (i = 0; i < nr_cpus; i++) { - struct datarec *r = &rec->cpu[i]; - struct datarec *p = &prev->cpu[i]; - - pps = calc_pps(r, p, t); - drop = calc_drop_pps(r, p, t); - if (pps > 0) - printf(fmt_err, "redirect_err", i, pps, drop); - } - pps = calc_pps(&rec->total, &prev->total, t); - drop = calc_drop_pps(&rec->total, &prev->total, t); - printf(fm2_err, "redirect_err", "total", pps, drop); - } - - /* XDP general exception tracepoints */ - { - char *fmt_err = "%-15s %-7d %'-14.0f %'-11.0f\n"; - char *fm2_err = "%-15s %-7s %'-14.0f %'-11.0f\n"; - - rec = &stats_rec->exception; - prev = &stats_prev->exception; - t = calc_period(rec, prev); - for (i = 0; i < nr_cpus; i++) { - struct datarec *r = &rec->cpu[i]; - struct datarec *p = &prev->cpu[i]; - - pps = calc_pps(r, p, t); - drop = calc_drop_pps(r, p, t); - if (pps > 0) - printf(fmt_err, "xdp_exception", i, pps, drop); - } - pps = calc_pps(&rec->total, &prev->total, t); - drop = calc_drop_pps(&rec->total, &prev->total, t); - printf(fm2_err, "xdp_exception", "total", pps, drop); - } - - /* CPUMAP attached XDP program that runs on remote/destination CPU */ - if (mprog_enabled) { - char *fmt_k = "%-15s %-7d %'-14.0f %'-11.0f %'-10.0f\n"; - char *fm2_k = "%-15s %-7s %'-14.0f %'-11.0f %'-10.0f\n"; - double xdp_pass, xdp_drop, xdp_redirect; - - printf("\n2nd remote XDP/eBPF prog_name: %s\n", mprog_name); - printf("%-15s %-7s %-14s %-11s %-9s\n", - "XDP-cpumap", "CPU:to", "xdp-pass", "xdp-drop", "xdp-redir"); - - rec = &stats_rec->kthread; - prev = &stats_prev->kthread; - t = calc_period(rec, prev); - for (i = 0; i < nr_cpus; i++) { - struct datarec *r = &rec->cpu[i]; - struct datarec *p = &prev->cpu[i]; - - calc_xdp_pps(r, p, &xdp_pass, &xdp_drop, - &xdp_redirect, t); - if (xdp_pass > 0 || xdp_drop > 0 || xdp_redirect > 0) - printf(fmt_k, "xdp-in-kthread", i, xdp_pass, xdp_drop, - xdp_redirect); - } - calc_xdp_pps(&rec->total, &prev->total, &xdp_pass, &xdp_drop, - &xdp_redirect, t); - printf(fm2_k, "xdp-in-kthread", "total", xdp_pass, xdp_drop, xdp_redirect); - } - - printf("\n"); - fflush(stdout); -} - -static void stats_collect(struct stats_record *rec) -{ - int fd, i; - - fd = map_fds[RX_CNT]; - map_collect_percpu(fd, 0, &rec->rx_cnt); - - fd = map_fds[REDIRECT_ERR_CNT]; - map_collect_percpu(fd, 1, &rec->redir_err); - - fd = map_fds[CPUMAP_ENQUEUE_CNT]; - for (i = 0; i < n_cpus; i++) - map_collect_percpu(fd, i, &rec->enq[i]); - - fd = map_fds[CPUMAP_KTHREAD_CNT]; - map_collect_percpu(fd, 0, &rec->kthread); - - fd = map_fds[EXCEPTION_CNT]; - map_collect_percpu(fd, 0, &rec->exception); -} - - -/* Pointer swap trick */ -static inline void swap(struct stats_record **a, struct stats_record **b) -{ - struct stats_record *tmp; - - tmp = *a; - *a = *b; - *b = tmp; -} - static int create_cpu_entry(__u32 cpu, struct bpf_cpumap_val *value, __u32 avail_idx, bool new) { @@ -582,38 +97,40 @@ static int create_cpu_entry(__u32 cpu, struct bpf_cpumap_val *value, /* Add a CPU entry to cpumap, as this allocate a cpu entry in * the kernel for the cpu. */ - ret = bpf_map_update_elem(map_fds[CPU_MAP], &cpu, value, 0); - if (ret) { - fprintf(stderr, "Create CPU entry failed (err:%d)\n", ret); - exit(EXIT_FAIL_BPF); + ret = bpf_map_update_elem(map_fd, &cpu, value, 0); + if (ret < 0) { + fprintf(stderr, "Create CPU entry failed: %s\n", strerror(errno)); + return ret; } /* Inform bpf_prog's that a new CPU is available to select * from via some control maps. */ - ret = bpf_map_update_elem(map_fds[CPUS_AVAILABLE], &avail_idx, &cpu, 0); - if (ret) { - fprintf(stderr, "Add to avail CPUs failed\n"); - exit(EXIT_FAIL_BPF); + ret = bpf_map_update_elem(avail_fd, &avail_idx, &cpu, 0); + if (ret < 0) { + fprintf(stderr, "Add to avail CPUs failed: %s\n", strerror(errno)); + return ret; } /* When not replacing/updating existing entry, bump the count */ - ret = bpf_map_lookup_elem(map_fds[CPUS_COUNT], &key, &curr_cpus_count); - if (ret) { - fprintf(stderr, "Failed reading curr cpus_count\n"); - exit(EXIT_FAIL_BPF); + ret = bpf_map_lookup_elem(count_fd, &key, &curr_cpus_count); + if (ret < 0) { + fprintf(stderr, "Failed reading curr cpus_count: %s\n", + strerror(errno)); + return ret; } if (new) { curr_cpus_count++; - ret = bpf_map_update_elem(map_fds[CPUS_COUNT], &key, + ret = bpf_map_update_elem(count_fd, &key, &curr_cpus_count, 0); - if (ret) { - fprintf(stderr, "Failed write curr cpus_count\n"); - exit(EXIT_FAIL_BPF); + if (ret < 0) { + fprintf(stderr, "Failed write curr cpus_count: %s\n", + strerror(errno)); + return ret; } } - /* map_fd[7] = cpus_iterator */ - printf("%s CPU:%u as idx:%u qsize:%d prog_fd: %d (cpus_count:%u)\n", + + printf("%s CPU:%u as idx:%u qsize:%d cpumap_prog_fd: %d (cpus_count:%u)\n", new ? "Add-new":"Replace", cpu, avail_idx, value->qsize, value->bpf_prog.fd, curr_cpus_count); @@ -623,24 +140,29 @@ static int create_cpu_entry(__u32 cpu, struct bpf_cpumap_val *value, /* CPUs are zero-indexed. Thus, add a special sentinel default value * in map cpus_available to mark CPU index'es not configured */ -static void mark_cpus_unavailable(void) +static int mark_cpus_unavailable(void) { - __u32 invalid_cpu = n_cpus; - int ret, i; + int ret, i, n_cpus = get_nprocs_conf(); + __u32 invalid_cpu; for (i = 0; i < n_cpus; i++) { - ret = bpf_map_update_elem(map_fds[CPUS_AVAILABLE], &i, + ret = bpf_map_update_elem(avail_fd, &i, &invalid_cpu, 0); - if (ret) { - fprintf(stderr, "Failed marking CPU unavailable\n"); - exit(EXIT_FAIL_BPF); + if (ret < 0) { + fprintf(stderr, "Failed marking CPU unavailable: %s\n", + strerror(errno)); + return ret; } } + + return 0; } /* Stress cpumap management code by concurrently changing underlying cpumap */ -static void stress_cpumap(struct bpf_cpumap_val *value) +static void stress_cpumap(void *ctx) { + struct bpf_cpumap_val *value = ctx; + /* Changing qsize will cause kernel to free and alloc a new * bpf_cpu_map_entry, with an associated/complicated tear-down * procedure. @@ -653,142 +175,54 @@ static void stress_cpumap(struct bpf_cpumap_val *value) create_cpu_entry(1, value, 0, false); } -static void stats_poll(int interval, bool use_separators, char *prog_name, - char *mprog_name, struct bpf_cpumap_val *value, - bool stress_mode) -{ - struct stats_record *record, *prev; - int mprog_fd; - - record = alloc_stats_record(); - prev = alloc_stats_record(); - stats_collect(record); - - /* Trick to pretty printf with thousands separators use %' */ - if (use_separators) - setlocale(LC_NUMERIC, "en_US"); - - while (1) { - swap(&prev, &record); - mprog_fd = value->bpf_prog.fd; - stats_collect(record); - stats_print(record, prev, prog_name, mprog_name, mprog_fd); - sleep(interval); - if (stress_mode) - stress_cpumap(value); - } - - free_stats_record(record); - free_stats_record(prev); -} - -static int init_tracepoints(struct bpf_object *obj) +static int set_cpumap_prog(struct xdp_redirect_cpu *skel, char *redir_interface) { - struct bpf_program *prog; - - bpf_object__for_each_program(prog, obj) { - if (bpf_program__is_tracepoint(prog) != true) - continue; - - tp_links[tp_cnt] = bpf_program__attach(prog); - if (libbpf_get_error(tp_links[tp_cnt])) { - tp_links[tp_cnt] = NULL; - return -EINVAL; - } - tp_cnt++; - } - - return 0; -} - -static int init_map_fds(struct bpf_object *obj) -{ - enum map_type type; - - for (type = 0; type < NUM_MAP; type++) { - map_fds[type] = - bpf_object__find_map_fd_by_name(obj, - map_type_strings[type]); - - if (map_fds[type] < 0) - return -ENOENT; - } + struct bpf_devmap_val val = {}; + int ifindex_out, err; + __u32 key = 0; - return 0; -} + if (!redir_interface) + return 0; -static int load_cpumap_prog(char *file_name, char *prog_name, - char *redir_interface, char *redir_map) -{ - struct bpf_prog_load_attr prog_load_attr = { - .prog_type = BPF_PROG_TYPE_XDP, - .expected_attach_type = BPF_XDP_CPUMAP, - .file = file_name, - }; - struct bpf_program *prog; - struct bpf_object *obj; - int fd; - - if (bpf_prog_load_xattr(&prog_load_attr, &obj, &fd)) + ifindex_out = if_nametoindex(redir_interface); + if (!ifindex_out) return -1; - if (fd < 0) { - fprintf(stderr, "ERR: bpf_prog_load_xattr: %s\n", - strerror(errno)); - return fd; - } - - if (redir_interface && redir_map) { - int err, map_fd, ifindex_out, key = 0; - - map_fd = bpf_object__find_map_fd_by_name(obj, redir_map); - if (map_fd < 0) - return map_fd; - - ifindex_out = if_nametoindex(redir_interface); - if (!ifindex_out) - return -1; - - err = bpf_map_update_elem(map_fd, &key, &ifindex_out, 0); - if (err < 0) - return err; + if (get_mac_addr(ifindex_out, skel->bss->tx_mac_addr) < 0) { + printf("get interface %d mac failed\n", ifindex_out); + return -1; } - prog = bpf_object__find_program_by_title(obj, prog_name); - if (!prog) { - fprintf(stderr, "bpf_object__find_program_by_title failed\n"); - return EXIT_FAIL; - } + val.ifindex = ifindex_out; + val.bpf_prog.fd = bpf_program__fd(skel->progs.xdp_redirect_egress_prog); + err = bpf_map_update_elem(bpf_map__fd(skel->maps.tx_port), &key, &val, 0); + if (err < 0) + return err; - return bpf_program__fd(prog); + return bpf_program__fd(skel->progs.xdp_redirect_cpu_devmap); } int main(int argc, char **argv) { - char *prog_name = "xdp_cpu_map5_lb_hash_ip_pairs"; - char *mprog_filename = "xdp_redirect_kern.o"; - char *redir_interface = NULL, *redir_map = NULL; - char *mprog_name = "xdp_redirect_dummy"; - bool mprog_disable = false; - struct bpf_prog_load_attr prog_load_attr = { - .prog_type = BPF_PROG_TYPE_UNSPEC, - }; - struct bpf_prog_info info = {}; - __u32 info_len = sizeof(info); + struct xdp_redirect_cpu *skel; + char *redir_interface = NULL; + char ifname_buf[IF_NAMESIZE]; struct bpf_cpumap_val value; - bool use_separators = true; + int ret = EXIT_FAIL_OPTION; + unsigned long interval = 2; bool stress_mode = false; struct bpf_program *prog; - struct bpf_object *obj; - int err = EXIT_FAIL; - char filename[256]; + const char *prog_name; + bool generic = false; + bool force = false; int added_cpus = 0; int longindex = 0; - int interval = 2; int add_cpu = -1; - int opt, prog_fd; - int *cpu, i; + int ifindex = -1; + int *cpu, i, opt; + char *ifname; __u32 qsize; + int n_cpus; n_cpus = get_nprocs_conf(); @@ -810,85 +244,79 @@ int main(int argc, char **argv) */ qsize = 2048; - snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]); - prog_load_attr.file = filename; - - if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd)) - return err; - - if (prog_fd < 0) { - fprintf(stderr, "ERR: bpf_prog_load_xattr: %s\n", + skel = xdp_redirect_cpu__open_and_load(); + if (!skel) { + fprintf(stderr, "Failed to xdp_redirect_cpu__open_and_load: %s\n", strerror(errno)); - return err; + ret = EXIT_FAIL_BPF; + goto end; } - if (init_tracepoints(obj) < 0) { - fprintf(stderr, "ERR: bpf_program__attach failed\n"); - return err; - } + map_fd = bpf_map__fd(skel->maps.cpu_map); + avail_fd = bpf_map__fd(skel->maps.cpus_available); + count_fd = bpf_map__fd(skel->maps.cpus_count); - if (init_map_fds(obj) < 0) { - fprintf(stderr, "bpf_object__find_map_fd_by_name failed\n"); - return err; + ret = mark_cpus_unavailable(); + if (ret < 0) { + fprintf(stderr, "Unable to mark CPUs as unavailable\n"); + goto end_destroy; } - mark_cpus_unavailable(); - cpu = malloc(n_cpus * sizeof(int)); + cpu = calloc(n_cpus, sizeof(int)); if (!cpu) { - fprintf(stderr, "failed to allocate cpu array\n"); - return err; + fprintf(stderr, "Failed to allocate cpu array\n"); + goto end_destroy; } - memset(cpu, 0, n_cpus * sizeof(int)); - /* Parse commands line args */ - while ((opt = getopt_long(argc, argv, "hSd:s:p:q:c:xzFf:e:r:m:", + prog = skel->progs.xdp_prognum5_lb_hash_ip_pairs; + while ((opt = getopt_long(argc, argv, "d:si:Sxp:r:c:q:Fvh", long_options, &longindex)) != -1) { switch (opt) { case 'd': if (strlen(optarg) >= IF_NAMESIZE) { fprintf(stderr, "ERR: --dev name too long\n"); - goto error; + goto end_cpu; } ifname = (char *)&ifname_buf; - strncpy(ifname, optarg, IF_NAMESIZE); + safe_strncpy(ifname, optarg, sizeof(ifname)); ifindex = if_nametoindex(ifname); if (ifindex == 0) { fprintf(stderr, "ERR: --dev name unknown err(%d):%s\n", errno, strerror(errno)); - goto error; + goto end_cpu; } break; case 's': - interval = atoi(optarg); + mask |= SAMPLE_REDIRECT_MAP_CNT; + break; + case 'i': + interval = strtoul(optarg, NULL, 0); break; case 'S': - xdp_flags |= XDP_FLAGS_SKB_MODE; + generic = true; break; case 'x': stress_mode = true; break; - case 'z': - use_separators = false; - break; case 'p': /* Selecting eBPF prog to load */ prog_name = optarg; - break; - case 'n': - mprog_disable = true; - break; - case 'f': - mprog_filename = optarg; - break; - case 'e': - mprog_name = optarg; + prog = bpf_object__find_program_by_title(skel->obj, + prog_name); + if (!prog) { + fprintf(stderr, + "Failed to find program %s specified by" + " option -p/--progname\n", + prog_name); + usage(argv, skel->obj); + goto end_cpu; + } break; case 'r': redir_interface = optarg; - break; - case 'm': - redir_map = optarg; + mask |= SAMPLE_DEVMAP_XMIT_CNT | + SAMPLE_DEVMAP_XMIT_CNT_MULTI; break; case 'c': /* Add multiple CPUs */ @@ -897,91 +325,75 @@ int main(int argc, char **argv) fprintf(stderr, "--cpu nr too large for cpumap err(%d):%s\n", errno, strerror(errno)); - goto error; + goto end_cpu; } cpu[added_cpus++] = add_cpu; break; case 'q': - qsize = atoi(optarg); + qsize = strtoul(optarg, NULL, 0); break; case 'F': - xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST; + force = true; + break; + case 'v': + sample_switch_mode(); break; case 'h': - error: default: - free(cpu); - usage(argv, obj); - return EXIT_FAIL_OPTION; + usage(argv, skel->obj); + goto end_cpu; } } - if (!(xdp_flags & XDP_FLAGS_SKB_MODE)) - xdp_flags |= XDP_FLAGS_DRV_MODE; + ret = sample_init(skel, mask); + if (ret < 0) { + fprintf(stderr, "Failed to initialize sample: %s\n", strerror(-ret)); + ret = EXIT_FAIL; + goto end_cpu; + } /* Required option */ if (ifindex == -1) { fprintf(stderr, "ERR: required option --dev missing\n"); - usage(argv, obj); - err = EXIT_FAIL_OPTION; - goto out; + usage(argv, skel->obj); + goto end_cpu; } /* Required option */ if (add_cpu == -1) { - fprintf(stderr, "ERR: required option --cpu missing\n"); - fprintf(stderr, " Specify multiple --cpu option to add more\n"); - usage(argv, obj); - err = EXIT_FAIL_OPTION; - goto out; + fprintf(stderr, "ERR: required option --cpu missing\n" + "Specify multiple --cpu option to add more\n"); + usage(argv, skel->obj); + ret = EXIT_FAIL_OPTION; + goto end_cpu; } - value.bpf_prog.fd = 0; - if (!mprog_disable) - value.bpf_prog.fd = load_cpumap_prog(mprog_filename, mprog_name, - redir_interface, redir_map); + value.bpf_prog.fd = set_cpumap_prog(skel, redir_interface); if (value.bpf_prog.fd < 0) { - err = value.bpf_prog.fd; - goto out; + fprintf(stderr, "Failed to set cpumap prog to redirect to interface %s\n", + redir_interface); + ret = EXIT_FAIL_BPF; + goto end_cpu; } value.qsize = qsize; for (i = 0; i < added_cpus; i++) create_cpu_entry(cpu[i], &value, i, true); - /* Remove XDP program when program is interrupted or killed */ - signal(SIGINT, int_exit); - signal(SIGTERM, int_exit); - - prog = bpf_object__find_program_by_title(obj, prog_name); - if (!prog) { - fprintf(stderr, "bpf_object__find_program_by_title failed\n"); - goto out; - } - - prog_fd = bpf_program__fd(prog); - if (prog_fd < 0) { - fprintf(stderr, "bpf_program__fd failed\n"); - goto out; - } - - if (bpf_set_link_xdp_fd(ifindex, prog_fd, xdp_flags) < 0) { - fprintf(stderr, "link set xdp fd failed\n"); - err = EXIT_FAIL_XDP; - goto out; - } + ret = EXIT_FAIL_XDP; + if (sample_install_xdp(prog, ifindex, generic, force) < 0) + goto end_cpu; - err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); - if (err) { - printf("can't get prog info - %s\n", strerror(errno)); - goto out; + ret = sample_run(interval, stress_mode ? stress_cpumap : NULL, &value); + if (ret < 0) { + fprintf(stderr, "Failed during sample run: %s\n", strerror(-ret)); + ret = EXIT_FAIL; + goto end_cpu; } - prog_id = info.id; - - stats_poll(interval, use_separators, prog_name, mprog_name, - &value, stress_mode); - - err = EXIT_OK; -out: + ret = EXIT_OK; +end_cpu: free(cpu); - return err; +end_destroy: + xdp_redirect_cpu__destroy(skel); +end: + sample_exit(ret); }