
[RFC,bpf-next,1/2] bpf: Add generic kfunc bpf_ffs64()

Message ID 20240131155607.51157-2-hffilwlqm@gmail.com (mailing list archive)
State RFC
Delegated to: BPF
Series bpf: Add generic kfunc bpf_ffs64()

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-18 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-42 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next
netdev/ynl success SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 1094 this patch: 1095
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 0 of 0 maintainers
netdev/build_clang success Errors and warnings before: 1066 this patch: 1066
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 1111 this patch: 1112
netdev/checkpatch warning WARNING: Commit log lines starting with '#' are dropped by git as comments
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-14 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc

Commit Message

Leon Hwang Jan. 31, 2024, 3:56 p.m. UTC
On an XDP-based virtual network gateway, the ffs (find first set)
algorithm is used in the gateway's ACL module to find the index of the
first set bit in a bitmap, which is an array of u64.

The ACL module's design is based on these two papers:

* "eBPF / XDP based firewall and packet filtering"[1]
* "Securing Linux with a Faster and Scalable Iptables"[2]

In the ACL module, the key details are:

1. Match source address to get a bitmap.
2. Match destination address to get a bitmap.
3. Match l4 protocol to get a bitmap.
4. Match source port to get a bitmap.
5. Match destination port to get a bitmap.

Finally, the module traverses these 5 bitmaps together, doing a
bitwise AND of the 5 corresponding u64 words at each position. Whenever
the resulting u64 is non-zero, ffs finds the index of its first set bit.
That index is then converted to a rule index into a rule policy bpf map,
whose type is BPF_MAP_TYPE_ARRAY or BPF_MAP_TYPE_PERCPU_ARRAY.

If the kernel's __ffs64() function can be reused in bpf, finding the index
of the first set bit in a u64 takes less time.

__ffs64() is hardware-accelerated: on x86 it compiles to a single
instruction, "rep bsf" (which executes as tzcnt on CPUs with BMI1).

I then compared a bpf-implemented __ffs64() against this kfunc bpf_ffs64()
using the following bpf code snippet:

#include "vmlinux.h"

#include "bpf/bpf_helpers.h"

unsigned long bpf_ffs64(u64 word) __ksym;

static __noinline __u64
__ffs64(__u64 word)
{
	__u64 shift = 0;
	if ((word & 0xffffffff) == 0) {
		word >>= 32;
		shift += 32;
	}
	if ((word & 0xffff) == 0) {
		word >>= 16;
		shift += 16;
	}
	if ((word & 0xff) == 0) {
		word >>= 8;
		shift += 8;
	}
	if ((word & 0xf) == 0) {
		word >>= 4;
		shift += 4;
	}
	if ((word & 0x3) == 0) {
		word >>= 2;
		shift += 2;
	}
	if ((word & 0x1) == 0) {
		shift += 1;
	}

	return shift;
}

SEC("tc")
int tc_ffs1(struct __sk_buff *skb)
{
	void *data_end = (void *)(long) skb->data_end;
	u64 *data = (u64 *)(long) skb->data;

	if ((void *)(u64) (data + 1) > data_end)
		return 0;

	return __ffs64(*data);
}

SEC("tc")
int tc_ffs2(struct __sk_buff *skb)
{
	void *data_end = (void *)(long) skb->data_end;
	u64 *data = (u64 *)(long) skb->data;

	if ((void *)(u64) (data + 1) > data_end)
		return 0;

	return bpf_ffs64(*data);
}

char _license[] SEC("license") = "GPL";

I ran them on a KVM-based VM hosted on a 48-core server with an "Intel(R)
Xeon(R) Silver 4116 CPU @ 2.10GHz" CPU.

With the first set bit at offset 0, each run executes the bpf progs
10000000 times. The average time per execution in each run is:

+----------+---------------+-------------------+
| Nth time | bpf __ffs64() | kfunc bpf_ffs64() |
+----------+---------------+-------------------+
|        1 | 164ns         | 154ns             |
|        2 | 166ns         | 155ns             |
|        3 | 160ns         | 154ns             |
|        4 | 161ns         | 157ns             |
|        5 | 161ns         | 155ns             |
|        6 | 163ns         | 155ns             |
|        7 | 164ns         | 155ns             |
|        8 | 159ns         | 159ns             |
|        9 | 171ns         | 154ns             |
|       10 | 164ns         | 156ns             |
|       11 | 161ns         | 155ns             |
|       12 | 160ns         | 155ns             |
|       13 | 161ns         | 154ns             |
|       14 | 165ns         | 154ns             |
|       15 | 161ns         | 162ns             |
|       16 | 161ns         | 157ns             |
|       17 | 164ns         | 154ns             |
|       18 | 162ns         | 154ns             |
|       19 | 159ns         | 156ns             |
|       20 | 160ns         | 154ns             |
+----------+---------------+-------------------+

With the first set bit at offset 63, each run executes the bpf progs
10000000 times. The average time per execution in each run is:

+----------+---------------+-------------------+
| Nth time | bpf __ffs64() | kfunc bpf_ffs64() |
+----------+---------------+-------------------+
|        1 | 163ns         | 157ns             |
|        2 | 163ns         | 154ns             |
|        3 | 165ns         | 155ns             |
|        4 | 167ns         | 155ns             |
|        5 | 165ns         | 155ns             |
|        6 | 163ns         | 155ns             |
|        7 | 162ns         | 155ns             |
|        8 | 162ns         | 156ns             |
|        9 | 174ns         | 155ns             |
|       10 | 162ns         | 156ns             |
|       11 | 168ns         | 155ns             |
|       12 | 169ns         | 156ns             |
|       13 | 162ns         | 155ns             |
|       14 | 169ns         | 155ns             |
|       15 | 162ns         | 154ns             |
|       16 | 163ns         | 155ns             |
|       17 | 162ns         | 154ns             |
|       18 | 166ns         | 154ns             |
|       19 | 165ns         | 154ns             |
|       20 | 165ns         | 154ns             |
+----------+---------------+-------------------+

As the tables show, bpf __ffs64() averages about 162ns (offset 0) and
165ns (offset 63) per execution, while kfunc bpf_ffs64() averages about
155ns in both cases. So kfunc bpf_ffs64() saves roughly 7 to 10ns per
call.

At 1M PPS on the gateway, with one ffs per packet, that is up to about
10ms of CPU time saved every second.

Links:

[1] http://vger.kernel.org/lpc_net2018_talks/ebpf-firewall-paper-LPC.pdf
[2] https://mbertrone.github.io/documents/21-Securing_Linux_with_a_Faster_and_Scalable_Iptables.pdf

Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
---
 kernel/bpf/helpers.c | 7 +++++++
 1 file changed, 7 insertions(+)

Patch

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index bcb951a2ecf4b..4db48a6a04a90 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -23,6 +23,7 @@ 
 #include <linux/btf_ids.h>
 #include <linux/bpf_mem_alloc.h>
 #include <linux/kasan.h>
+#include <linux/bitops.h>
 
 #include "../../lib/kstrtox.h"
 
@@ -2542,6 +2543,11 @@  __bpf_kfunc void bpf_throw(u64 cookie)
 	WARN(1, "A call to BPF exception callback should never return\n");
 }
 
+__bpf_kfunc unsigned long bpf_ffs64(u64 word)
+{
+	return __ffs64(word);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_SET8_START(generic_btf_ids)
@@ -2573,6 +2579,7 @@  BTF_ID_FLAGS(func, bpf_task_get_cgroup1, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
 #endif
 BTF_ID_FLAGS(func, bpf_task_from_pid, KF_ACQUIRE | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_throw)
+BTF_ID_FLAGS(func, bpf_ffs64)
 BTF_SET8_END(generic_btf_ids)
 
 static const struct btf_kfunc_id_set generic_kfunc_set = {