From patchwork Sun Sep 1 13:38:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Hwang X-Patchwork-Id: 13786453 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-178.mta0.migadu.com (out-178.mta0.migadu.com [91.218.175.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 54CFA143880 for ; Sun, 1 Sep 2024 13:41:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.178 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725198069; cv=none; b=GHXZlT+MLhT8jOic3Sg6cjRA0xmsA30NQ5FKobLdQ5ItLNoIFwjVt52bo2cRJl/kFWoG7T6IyUlegMmaalDDIDP1zPeKuUZD2d3tq3co6AKSOf1TLRP4AhFHYRlw+VcDbtJnWpHkcZOXq3hkCgnGbhd7FXzXsypPdsev32ZuuzI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725198069; c=relaxed/simple; bh=ANyPNj9nDVgRBY78OaPZekGNmr78hQPSczgo96lUxqA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=jB/eAUyPPH/xLQxaaHTZw8JmXJyl9EH6fD3UHb7/sTRXOe3aZIxgzxckNTROzwhWTnYhjO0NznjBU1qdeMR5UA7s6VIX79pxuJ32D9eHSi0y1btAgALR0XKP5qJFYE0BrSa91VulXe7Hgvl8fMIhKd5IW1mToqWMlGFHoRaYbnQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=bw/TqWKk; arc=none smtp.client-ip=91.218.175.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="bw/TqWKk" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1725198062; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8HCqzwOX+eH0kqWdRTP/yCjOIus8cQ72D7HFP04MvtA=; b=bw/TqWKkg2e/XcKO3lIZnsOsIM7buVIfxjiki8t9LQw2qBB0vnvGBgUOfUHSgVZOwXL1Zd kKVs6yOqPI3OeMNemSkYv3jJdUIB0zHOAUuaOh5eehlH5+Usm75nW0/oq9bR4Rzogkylcp uaHnMQgk+vK+bRDGr625PIAM+GaAzEU= From: Leon Hwang To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, toke@redhat.com, martin.lau@kernel.org, yonghong.song@linux.dev, puranjay@kernel.org, xukuohai@huaweicloud.com, eddyz87@gmail.com, iii@linux.ibm.com, leon.hwang@linux.dev, kernel-patches-bot@fb.com Subject: [PATCH bpf-next v2 1/4] bpf, x64: Fix tailcall infinite loop caused by freplace Date: Sun, 1 Sep 2024 21:38:53 +0800 Message-ID: <20240901133856.64367-2-leon.hwang@linux.dev> In-Reply-To: <20240901133856.64367-1-leon.hwang@linux.dev> References: <20240901133856.64367-1-leon.hwang@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net This patch fixes a tailcall infinite loop issue caused by freplace. Since commit 1c123c567fb1 ("bpf: Resolve fext program type when checking map compatibility"), freplace prog is allowed to tail call its target prog. Then, when a freplace prog attaches to its target prog's subprog and tail calls its target prog, kernel will panic. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include \#include __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog_tc(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include \#include struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. Kernel will panic: [ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____) (stack is (____ptrval____)..(____ptrval____)) [ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI [ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE 6.10.0-rc6-g026dcdae8d3e-dirty #72 [ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Call Trace: [ 15.310490] <#DF> [ 15.310490] ? die+0x36/0x90 [ 15.310490] ? handle_stack_overflow+0x4d/0x60 [ 15.310490] ? exc_double_fault+0x117/0x1a0 [ 15.310490] ? asm_exc_double_fault+0x23/0x30 [ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] [ 15.310490] [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] ... [ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64 [ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b [ 15.310490] bpf_test_run+0x210/0x370 [ 15.310490] ? bpf_test_run+0x128/0x370 [ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0 [ 15.310490] __sys_bpf+0xdbf/0x2c40 [ 15.310490] ? clockevents_program_event+0x52/0xf0 [ 15.310490] ? lock_release+0xbf/0x290 [ 15.310490] __x64_sys_bpf+0x1e/0x30 [ 15.310490] do_syscall_64+0x68/0x140 [ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 15.310490] RIP: 0033:0x7f133b52725d [ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48 [ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX: 0000000000000141 [ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX: 00007f133b52725d [ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI: 000000000000000a [ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09: 00007ffddbc102a0 [ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000004 [ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15: 00007f133b6ed000 [ 15.310490] [ 15.310490] Modules linked in: bpf_testmod(OE) [ 15.310490] ---[ end trace 0000000000000000 ]--- [ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53 [ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c [ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202 [ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX: 0000000000008cb5 [ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI: ffff9c98808b7e00 [ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09: 0000000000000000 [ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12: ffffb500c01af000 [ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15: 0000000000000000 [ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000) knlGS:0000000000000000 [ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4: 00000000000006f0 [ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt [ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) This patch fixes the issue by initializing tail_call_cnt at the prologue of entry_tc. Next, when call subprog_tc, the tail_call_cnt_ptr is propagated to subprog_tc by rax. Next, when jump to entry_freplace, the tail_call_cnt_ptr will be reused to count tailcall in freplace prog. Next, when call subprog_tail, the tail_call_cnt_ptr is propagated to subprog_tail by rax. Next, while tail calling to entry_tc, the tail_call_cnt on the stack of entry_tc increments via the tail_call_cnt_ptr. The whole procedure shows as the following JITed prog dumping. bpftool p d j n entry_tc: int entry_tc(struct __sk_buff * skb): bpf_prog_1c515f389a9059b4_entry_tc: ; return subprog_tc(skb); 0: endbr64 4: xorq %rax, %rax ;; rax = 0 (tail_call_cnt) 7: nopl (%rax,%rax) c: pushq %rbp d: movq %rsp, %rbp 10: endbr64 14: cmpq $33, %rax ;; if rax > 33, rax = tcc_ptr 18: ja 0x20 ;; if rax > 33 goto 0x20 ---+ 1a: pushq %rax ;; [rbp - 8] = rax = 0 | 1b: movq %rsp, %rax ;; rax = rbp - 8 | 1e: jmp 0x21 ;; ---------+ | 20: pushq %rax ;; <--------|---------------+ 21: pushq %rax ;; <--------+ [rbp - 16] = rax 22: movq -16(%rbp), %rax ;; rax = tcc_ptr 29: callq 0x70 ;; call subprog_tc() ; return subprog_tc(skb); 2e: leave 2f: retq int subprog_tc(struct __sk_buff * skb): bpf_prog_1e8f76e2374a0607_subprog_tc: ; return skb->len * 2; 0: endbr64 4: nopl (%rax) ;; do not touch tail_call_cnt 7: jmp 0x108 ;; jump to entry_freplace() c: pushq %rbp d: movq %rsp, %rbp 10: endbr64 14: pushq %rax 15: pushq %rax 16: movl 112(%rdi), %eax ; return skb->len * 2; 19: shll %eax ; return skb->len * 2; 1b: leave 1c: retq bpftool p d j n entry_freplace: int entry_freplace(struct __sk_buff * skb): bpf_prog_85781a698094722f_entry_freplace: ; int entry_freplace(struct __sk_buff *skb) 0: endbr64 4: nopl (%rax) ;; do not touch tail_call_cnt 7: nopl (%rax,%rax) c: pushq %rbp d: movq %rsp, %rbp 10: endbr64 14: cmpq $33, %rax ;; if rax > 33, rax = tcc_ptr 18: ja 0x20 ;; if rax > 33 goto 0x20 ---+ 1a: pushq %rax ;; [rbp - 8] = rax = 0 | 1b: movq %rsp, %rax ;; rax = rbp - 8 | 1e: jmp 0x21 ;; ---------+ | 20: pushq %rax ;; <--------|---------------+ 21: pushq %rax ;; <--------+ [rbp - 16] = rax 22: pushq %rbx ;; callee saved 23: pushq %r13 ;; callee saved 25: movq %rdi, %rbx ;; rbx = skb (callee saved) ; count++; 28: movabsq $-123406219759616, %r13 32: movl (%r13), %edi 36: addl $1, %edi 39: movl %edi, (%r13) ; subprog_tail(skb); 3d: movq %rbx, %rdi ;; rdi = skb 40: movq -16(%rbp), %rax ;; rax = tcc_ptr 47: callq 0x80 ;; call subprog_tail() ; subprog_tail(skb); 4c: movq %rbx, %rdi ;; rdi = skb 4f: movq -16(%rbp), %rax ;; rax = tcc_ptr 56: callq 0x80 ;; call subprog_tail() ; return count; 5b: movl (%r13), %eax ; return count; 5f: popq %r13 61: popq %rbx 62: leave 63: retq int subprog_tail(struct __sk_buff * skb): bpf_prog_3a140cef239a4b4f_subprog_tail: ; int subprog_tail(struct __sk_buff *skb) 0: endbr64 4: nopl (%rax) ;; do not touch tail_call_cnt 7: nopl (%rax,%rax) c: pushq %rbp d: movq %rsp, %rbp 10: endbr64 14: pushq %rax ;; [rbp - 8] = rax (tcc_ptr) 15: pushq %rax ;; [rbp - 16] = rax (tcc_ptr) 16: pushq %rbx ;; callee saved 17: pushq %r13 ;; callee saved 19: movq %rdi, %rbx ;; rbx = skb ; asm volatile("r1 = %[ctx]\n\t" 1c: movabsq $-128494642337280, %r13 ;; r13 = jmp_table 26: movq %rbx, %rdi ;; 1st arg, skb 29: movq %r13, %rsi ;; 2nd arg, jmp_table 2c: xorl %edx, %edx ;; 3rd arg, index = 0 2e: movq -16(%rbp), %rax ;; rax = [rbp - 16] (tcc_ptr) 35: cmpq $33, (%rax) 39: jae 0x4e ;; if *tcc_ptr >= 33 goto 0x4e --------+ 3b: nopl (%rax,%rax) ;; jmp bypass, toggled by poking | 40: addq $1, (%rax) ;; (*tcc_ptr)++ | 44: popq %r13 ;; callee saved | 46: popq %rbx ;; callee saved | 47: popq %rax ;; undo rbp-16 push | 48: popq %rax ;; undo rbp-8 push | 49: jmp 0xfffffffffffffe18 ;; tail call target, toggled by poking | ; return 0; ;; | 4e: popq %r13 ;; restore callee saved <--------------+ 50: popq %rbx ;; restore callee saved 51: leave 52: retq As a result, the tail_call_cnt is stored on the stack of entry_tc. And the tail_call_cnt_ptr is propagated between subprog_tc, entry_freplace, subprog_tail and entry_tc. But wait, what if entry_freplace has tailcall and entry_tc has no tailcall? It's to disallow attaching this entry_freplace to this entry_tc in verifier. And, what if entry_freplace has tailcall and entry_tc has tailcall and entry_freplace attaches to entry_tc? In this patch, the tailcall info of entry_freplace inherits from its target. Therefore, it swaps the positions of nop5 and xor/nop3 in order to initialize tail_call_cnt at the prologue of entry_tc and then propagate the tail_call_cnt to entry_freplace. Fixes: 1c123c567fb1 ("bpf: Resolve fext program type when checking map compatibility") Signed-off-by: Leon Hwang --- arch/x86/net/bpf_jit_comp.c | 26 ++++++++++++++++++-------- kernel/bpf/verifier.c | 6 ++++++ 2 files changed, 24 insertions(+), 8 deletions(-) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 074b41fafbe3f..143974b3e953b 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -274,6 +274,8 @@ struct jit_context { #define X86_PATCH_SIZE 5 /* Number of bytes that will be skipped on tailcall */ #define X86_TAIL_CALL_OFFSET (12 + ENDBR_INSN_SIZE) +/* Number of extra bytes that will be skipped on poke */ +#define X86_POKE_EXTRA 3 static void push_r12(u8 **pprog) { @@ -441,17 +443,13 @@ static void emit_prologue_tail_call(u8 **pprog, bool is_subprog) */ static void emit_prologue(u8 **pprog, u32 stack_depth, bool ebpf_from_cbpf, bool tail_call_reachable, bool is_subprog, - bool is_exception_cb) + bool is_exception_cb, bool is_extension) { u8 *prog = *pprog; emit_cfi(&prog, is_subprog ? cfi_bpf_subprog_hash : cfi_bpf_hash); - /* BPF trampoline can be made to work without these nops, - * but let's waste 5 bytes for now and optimize later - */ - emit_nops(&prog, X86_PATCH_SIZE); if (!ebpf_from_cbpf) { - if (tail_call_reachable && !is_subprog) + if (tail_call_reachable && !is_extension && !is_subprog) /* When it's the entry of the whole tailcall context, * zeroing rax means initialising tail_call_cnt. */ @@ -460,6 +458,10 @@ static void emit_prologue(u8 **pprog, u32 stack_depth, bool ebpf_from_cbpf, /* Keep the same instruction layout. */ emit_nops(&prog, 3); /* nop3 */ } + /* BPF trampoline can be made to work without these nops, + * but let's waste 5 bytes for now and optimize later + */ + emit_nops(&prog, X86_PATCH_SIZE); /* Exception callback receives FP as third parameter */ if (is_exception_cb) { EMIT3(0x48, 0x89, 0xF4); /* mov rsp, rsi */ @@ -573,10 +575,13 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t, /* * See emit_prologue(), for IBT builds the trampoline hook is preceded - * with an ENDBR instruction. + * with an ENDBR instruction and 3 bytes tail_call_cnt initialization + * instruction. */ if (is_endbr(*(u32 *)ip)) ip += ENDBR_INSN_SIZE; + if (is_bpf_text_address((long)ip)) + ip += X86_POKE_EXTRA; return __bpf_arch_text_poke(ip, t, old_addr, new_addr); } @@ -1366,6 +1371,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image int oldproglen, struct jit_context *ctx, bool jmp_padding) { bool tail_call_reachable = bpf_prog->aux->tail_call_reachable; + bool is_extension = bpf_prog->type == BPF_PROG_TYPE_EXT; struct bpf_insn *insn = bpf_prog->insnsi; bool callee_regs_used[4] = {}; int insn_cnt = bpf_prog->len; @@ -1384,7 +1390,8 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image emit_prologue(&prog, bpf_prog->aux->stack_depth, bpf_prog_was_classic(bpf_prog), tail_call_reachable, - bpf_is_subprog(bpf_prog), bpf_prog->aux->exception_cb); + bpf_is_subprog(bpf_prog), bpf_prog->aux->exception_cb, + is_extension); /* Exception callback will clobber callee regs for its own use, and * restore the original callee regs from main prog's stack frame. */ @@ -2923,6 +2930,8 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im */ if (is_endbr(*(u32 *)orig_call)) orig_call += ENDBR_INSN_SIZE; + if (is_bpf_text_address((long)orig_call)) + orig_call += X86_POKE_EXTRA; orig_call += X86_PATCH_SIZE; } @@ -3025,6 +3034,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im /* remember return value in a stack for bpf prog to access */ emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8); im->ip_after_call = image + (prog - (u8 *)rw_image); + emit_nops(&prog, X86_POKE_EXTRA); emit_nops(&prog, X86_PATCH_SIZE); } diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 217eb0eafa2a6..0a92fd4e0401e 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -21898,6 +21898,12 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, bpf_log(log, "Cannot extend fentry/fexit\n"); return -EINVAL; } + if (prog_extension && prog->aux->tail_call_reachable && + !(aux->func ? aux->func[subprog]->aux->tail_call_reachable : + aux->tail_call_reachable)) { + bpf_log(log, "Cannot extend no-tailcall target with tailcall ext prog\n"); + return -EINVAL; + } } else { if (prog_extension) { bpf_log(log, "Cannot replace kernel functions\n"); From patchwork Sun Sep 1 13:38:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Hwang X-Patchwork-Id: 13786454 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-189.mta0.migadu.com (out-189.mta0.migadu.com [91.218.175.189]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9CCFA17E01A for ; Sun, 1 Sep 2024 13:41:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.189 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725198073; cv=none; b=RAHVyJhjOHkGcC6hB/xlreREyZoye/TNTw4RFGWS+nZ0DRIA/TJ9o2WfQXcvRr4KN6T6VEY0oAuOV12EJDsFeA9l5fs+C2NBVm7RdPYD++X2CAP0iIVuP1u/kz0sDfTDFM8IfH6hmB9BU+IMvZtEK7YcnbsmRh3YiK80Y5kXksc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725198073; c=relaxed/simple; bh=hl/3D5IoLkxUm/uJLAtbhp84PmcSNO046+P3bpIZVdc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gkFrgSY26WeCooU5eMB/bVYXRsbWY9vDDbN4YSx3XhmQLMDCUcXeSCjbcWW95dGktaUHCgIvNVm9PoZ3GJLOs/yMRoV/8RKhwWUaKzvnxAUaVWnbj7iUAXjEjONXJC3rS9N5tJ3Qclkq5GYZURBPC+OQeCP2a8diDQDXfUdmXE4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=gZzV+B1h; arc=none smtp.client-ip=91.218.175.189 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="gZzV+B1h" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1725198068; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gVrNiBR2Rj0lYUi1R/vLWhIcSmCKwjLMWxkO6agL3ZI=; b=gZzV+B1hgOOTVH9LWo5ade+/Oe9un5ow3lxuCxsFd3UumiUVKb4YLcVp7G8F523A6NiT3l ZZHaNtPUQqVg7+dwR+P8HtGgGoBkS8UYrz/hktbB3YDi9++r7lpOgAeWssBemgq5e9zJ0r a3gbnTEMsToDQp/2Kd6cSsV/INkuVQg= From: Leon Hwang To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, toke@redhat.com, martin.lau@kernel.org, yonghong.song@linux.dev, puranjay@kernel.org, xukuohai@huaweicloud.com, eddyz87@gmail.com, iii@linux.ibm.com, leon.hwang@linux.dev, kernel-patches-bot@fb.com Subject: [PATCH bpf-next v2 2/4] bpf, arm64: Fix tailcall infinite loop caused by freplace Date: Sun, 1 Sep 2024 21:38:54 +0800 Message-ID: <20240901133856.64367-3-leon.hwang@linux.dev> In-Reply-To: <20240901133856.64367-1-leon.hwang@linux.dev> References: <20240901133856.64367-1-leon.hwang@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net Like "bpf, x64: Fix tailcall infinite loop caused by freplace", the same issue happens on arm64, too. For example: tc_bpf2bpf.c: // SPDX-License-Identifier: GPL-2.0 \#include \#include __noinline int subprog_tc(struct __sk_buff *skb) { return skb->len * 2; } SEC("tc") int entry_tc(struct __sk_buff *skb) { return subprog(skb); } char __license[] SEC("license") = "GPL"; tailcall_bpf2bpf_hierarchy_freplace.c: // SPDX-License-Identifier: GPL-2.0 \#include \#include struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 1); __uint(key_size, sizeof(__u32)); __uint(value_size, sizeof(__u32)); } jmp_table SEC(".maps"); int count = 0; static __noinline int subprog_tail(struct __sk_buff *skb) { bpf_tail_call_static(skb, &jmp_table, 0); return 0; } SEC("freplace") int entry_freplace(struct __sk_buff *skb) { count++; subprog_tail(skb); subprog_tail(skb); return count; } char __license[] SEC("license") = "GPL"; The attach target of entry_freplace is subprog_tc, and the tail callee in subprog_tail is entry_tc. Then, the infinite loop will be entry_tc -> entry_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count from zero for every time of entry_freplace execution. This patch fixes the issue by avoiding touching tail_call_cnt at prologue when it's subprog or freplace prog. Then, when freplace prog attaches to entry_tc, it has to initialize tail_call_cnt and tail_call_cnt_ptr, because its target is main prog and its target's prologue hasn't initialize them before the attach hook. So, this patch uses x7 register to tell freplace prog that its target prog is main prog or not. Meanwhile, while tail calling to a freplace prog, it is required to reset x7 register to prevent re-initializing tail_call_cnt at freplace prog's prologue. Fixes: 1c123c567fb1 ("bpf: Resolve fext program type when checking map compatibility") Signed-off-by: Leon Hwang --- arch/arm64/net/bpf_jit_comp.c | 44 +++++++++++++++++++++++++++++++---- 1 file changed, 39 insertions(+), 5 deletions(-) diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index 8aa32cb140b9e..cdc12dd83c4be 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -277,6 +277,7 @@ static bool is_lsi_offset(int offset, int scale) /* generated main prog prologue: * bti c // if CONFIG_ARM64_BTI_KERNEL * mov x9, lr + * mov x7, 1 // if not-freplace main prog * nop // POKE_OFFSET * paciasp // if CONFIG_ARM64_PTR_AUTH_KERNEL * stp x29, lr, [sp, #-16]! @@ -288,15 +289,19 @@ static bool is_lsi_offset(int offset, int scale) */ static void prepare_bpf_tail_call_cnt(struct jit_ctx *ctx) { + const bool is_ext = ctx->prog->type == BPF_PROG_TYPE_EXT; const bool is_main_prog = !bpf_is_subprog(ctx->prog); const u8 ptr = bpf2a64[TCCNT_PTR]; - if (is_main_prog) { + if (is_main_prog && !is_ext) { /* Initialize tail_call_cnt. */ emit(A64_PUSH(A64_ZR, ptr, A64_SP), ctx); emit(A64_MOV(1, ptr, A64_SP), ctx); - } else + } else { + /* Keep the same insn layout for freplace main prog. */ emit(A64_PUSH(ptr, ptr, A64_SP), ctx); + emit(A64_NOP, ctx); + } } static void find_used_callee_regs(struct jit_ctx *ctx) @@ -416,16 +421,20 @@ static void pop_callee_regs(struct jit_ctx *ctx) #define PAC_INSNS (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL) ? 1 : 0) /* Offset of nop instruction in bpf prog entry to be poked */ -#define POKE_OFFSET (BTI_INSNS + 1) +#define POKE_OFFSET (BTI_INSNS + 2) /* Tail call offset to jump into */ -#define PROLOGUE_OFFSET (BTI_INSNS + 2 + PAC_INSNS + 4) +#define PROLOGUE_OFFSET (BTI_INSNS + 3 + PAC_INSNS + 4) static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf) { const struct bpf_prog *prog = ctx->prog; + const bool is_ext = prog->type == BPF_PROG_TYPE_EXT; const bool is_main_prog = !bpf_is_subprog(prog); + const u8 r0 = bpf2a64[BPF_REG_0]; const u8 fp = bpf2a64[BPF_REG_FP]; + const u8 ptr = bpf2a64[TCCNT_PTR]; + const u8 tmp = bpf2a64[TMP_REG_1]; const u8 arena_vm_base = bpf2a64[ARENA_VM_START]; const int idx0 = ctx->idx; int cur_offset; @@ -462,6 +471,10 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf) emit_bti(A64_BTI_JC, ctx); emit(A64_MOV(1, A64_R(9), A64_LR), ctx); + if (!is_ext) + emit(A64_MOVZ(1, r0, is_main_prog, 0), ctx); + else + emit(A64_NOP, ctx); emit(A64_NOP, ctx); if (!prog->aux->exception_cb) { @@ -485,6 +498,18 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf) /* BTI landing pad for the tail call, done with a BR */ emit_bti(A64_BTI_J, ctx); } + /* If freplace's target prog is main prog, it has to make x26 as + * tail_call_cnt_ptr, and then initialize tail_call_cnt via the + * tail_call_cnt_ptr. + */ + if (is_main_prog && is_ext) { + emit(A64_MOVZ(1, tmp, 1, 0), ctx); + emit(A64_CMP(1, r0, tmp), ctx); + emit(A64_B_(A64_COND_NE, 4), ctx); + emit(A64_MOV(1, ptr, A64_SP), ctx); + emit(A64_MOVZ(1, r0, 0, 0), ctx); + emit(A64_STR64I(r0, ptr, 0), ctx); + } push_callee_regs(ctx); } else { /* @@ -521,6 +546,7 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf) static int emit_bpf_tail_call(struct jit_ctx *ctx) { + const u8 r0 = bpf2a64[BPF_REG_0]; /* bpf_tail_call(void *prog_ctx, struct bpf_array *array, u64 index) */ const u8 r2 = bpf2a64[BPF_REG_2]; const u8 r3 = bpf2a64[BPF_REG_3]; @@ -579,6 +605,12 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx) pop_callee_regs(ctx); + /* When freplace prog tail calls freplace prog, setting r0 as 0 is to + * prevent re-initializing tail_call_cnt at the prologue of target + * freplace prog. + */ + emit(A64_MOVZ(1, r0, 0, 0), ctx); + /* goto *(prog->bpf_func + prologue_offset); */ off = offsetof(struct bpf_prog, bpf_func); emit_a64_mov_i64(tmp, off, ctx); @@ -2189,9 +2221,10 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, emit(A64_RET(A64_R(10)), ctx); /* store return value */ emit(A64_STR64I(A64_R(0), A64_SP, retval_off), ctx); - /* reserve a nop for bpf_tramp_image_put */ + /* reserve two nops for bpf_tramp_image_put */ im->ip_after_call = ctx->ro_image + ctx->idx; emit(A64_NOP, ctx); + emit(A64_NOP, ctx); } /* update the branches saved in invoke_bpf_mod_ret with cbnz */ @@ -2474,6 +2507,7 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type, /* skip to the nop instruction in bpf prog entry: * bti c // if BTI enabled * mov x9, x30 + * mov x7, 1 // if not-freplace main prog * nop */ ip = image + POKE_OFFSET * AARCH64_INSN_SIZE; From patchwork Sun Sep 1 13:38:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Hwang X-Patchwork-Id: 13786455 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-180.mta0.migadu.com (out-180.mta0.migadu.com [91.218.175.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 89CD417DFEC for ; Sun, 1 Sep 2024 13:41:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725198076; cv=none; b=bK7lMDI2ny6YiDKaxN80h3O9ryrRIWdN6BtHcX0aUpARJ3jOhembIERIbirAzW7ec3c2Bc/KflUYM+vsNfkjniR33oASumCsR5cz9w14Dx6Xcw0lvRalSoizNjInXYQDrzI0V1R11gchxHXy0GagzzV0pTsi9WlXm/9iLiXepUQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725198076; c=relaxed/simple; bh=CB9krOCCZUydcFRPKL9eAZfHHhNNTrwagbgeo9IorfE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ePxyt3+cixtzfcCM8JRy9pQY/hLl+VoDGvCveyL/vDN6B1n9BLBN/Fx3AgbGD0s5X+XsTkt106N7t1f1NG2k+1wp0ptVGexL2xLOxEmuvDEzo0Lmgxf9iIRxsY5sBFIRcUBE0kwRwL9DRV0L90l4JM6y1GE8uyTvwoIjtxfRmNI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=QWp8ZvcY; arc=none smtp.client-ip=91.218.175.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="QWp8ZvcY" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1725198072; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=u/fw2V2lsSEY6Zw7Gjns+M8Uy3pcrWYbLrye98HThtE=; b=QWp8ZvcYQPQiZTfoR56dpQ/pt/83p20IwXghAQHb+RO//X7RXcOA3MHCOjWqt3VTdvTQtM AYGoPinxTGKvni6s6UXu4Z+LHkYsYzPvoysH4IZJC6HV6gWmP9b0iToApA/A/VNM/Q6F07 NemNeBqoH5rNdGEJcFf5OEesTYWjCY4= From: Leon Hwang To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, toke@redhat.com, martin.lau@kernel.org, yonghong.song@linux.dev, puranjay@kernel.org, xukuohai@huaweicloud.com, eddyz87@gmail.com, iii@linux.ibm.com, leon.hwang@linux.dev, kernel-patches-bot@fb.com Subject: [PATCH bpf-next v2 3/4] selftests/bpf: Add testcases for another tailcall infinite loop fixing Date: Sun, 1 Sep 2024 21:38:55 +0800 Message-ID: <20240901133856.64367-4-leon.hwang@linux.dev> In-Reply-To: <20240901133856.64367-1-leon.hwang@linux.dev> References: <20240901133856.64367-1-leon.hwang@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net The issue has been fixed on both x64 and arm64. On x64: cd tools/testing/selftests/bpf; ./test_progs -t tailcalls 334/26 tailcalls/tailcall_bpf2bpf_freplace:OK 334/27 tailcalls/tailcall_bpf2bpf_hierarchy_freplace_1:OK 334/28 tailcalls/tailcall_bpf2bpf_hierarchy_freplace_2:OK 334/29 tailcalls/tailcall_bpf2bpf_hierarchy_freplace_3:OK 334/30 tailcalls/tailcall_bpf2bpf_hierarchy_freplace_4:OK 334/31 tailcalls/tailcall_bpf2bpf_hierarchy_freplace_5:OK 334 tailcalls:OK Summary: 1/31 PASSED, 0 SKIPPED, 0 FAILED on arm64: cd tools/testing/selftests/bpf; ./test_progs -t tailcalls 334/26 tailcalls/tailcall_bpf2bpf_freplace:OK 334/27 tailcalls/tailcall_bpf2bpf_hierarchy_freplace_1:OK 334/28 tailcalls/tailcall_bpf2bpf_hierarchy_freplace_2:OK 334/29 tailcalls/tailcall_bpf2bpf_hierarchy_freplace_3:OK 334/30 tailcalls/tailcall_bpf2bpf_hierarchy_freplace_4:OK 334/31 tailcalls/tailcall_bpf2bpf_hierarchy_freplace_5:OK 334 tailcalls:OK Summary: 1/31 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Leon Hwang --- .../selftests/bpf/prog_tests/tailcalls.c | 216 +++++++++++++++++- .../tailcall_bpf2bpf_hierarchy_freplace.c | 30 +++ .../testing/selftests/bpf/progs/tc_bpf2bpf.c | 37 ++- 3 files changed, 279 insertions(+), 4 deletions(-) create mode 100644 tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy_freplace.c diff --git a/tools/testing/selftests/bpf/prog_tests/tailcalls.c b/tools/testing/selftests/bpf/prog_tests/tailcalls.c index 21c5a37846ade..0faa0f57f71d4 100644 --- a/tools/testing/selftests/bpf/prog_tests/tailcalls.c +++ b/tools/testing/selftests/bpf/prog_tests/tailcalls.c @@ -5,6 +5,7 @@ #include "tailcall_poke.skel.h" #include "tailcall_bpf2bpf_hierarchy2.skel.h" #include "tailcall_bpf2bpf_hierarchy3.skel.h" +#include "tailcall_bpf2bpf_hierarchy_freplace.skel.h" #include "tailcall_freplace.skel.h" #include "tc_bpf2bpf.skel.h" @@ -1525,7 +1526,8 @@ static void test_tailcall_freplace(void) prog_fd = bpf_program__fd(tc_skel->progs.entry_tc); freplace_prog = freplace_skel->progs.entry_freplace; - err = bpf_program__set_attach_target(freplace_prog, prog_fd, "subprog"); + err = bpf_program__set_attach_target(freplace_prog, prog_fd, + "subprog_tailcall_tc"); if (!ASSERT_OK(err, "set_attach_target")) goto out; @@ -1534,7 +1536,7 @@ static void test_tailcall_freplace(void) goto out; freplace_link = bpf_program__attach_freplace(freplace_prog, prog_fd, - "subprog"); + "subprog_tailcall_tc"); if (!ASSERT_OK_PTR(freplace_link, "attach_freplace")) goto out; @@ -1556,6 +1558,204 @@ static void test_tailcall_freplace(void) tailcall_freplace__destroy(freplace_skel); } +/* test_tailcall_bpf2bpf_freplace checks that the count value of the tail + * call limit enforcement matches with expectation for this case: + * + * entry_tc --> subprog_tailcall_tc --jump-> entry_freplace --tailcall-> entry_tc + * + * Meanwhile, check the failure that fails to attach tailcall-reachable freplace + * prog to not-tailcall-reachable subprog. + */ +static void test_tailcall_bpf2bpf_freplace(void) +{ + struct tailcall_freplace *freplace_skel = NULL; + struct bpf_link *freplace_link = NULL; + struct tc_bpf2bpf *tc_skel = NULL; + char buff[128] = {}; + int prog_fd, map_fd; + int err, i; + + LIBBPF_OPTS(bpf_test_run_opts, topts, + .data_in = buff, + .data_size_in = sizeof(buff), + .repeat = 1, + ); + + tc_skel = tc_bpf2bpf__open_and_load(); + if (!ASSERT_OK_PTR(tc_skel, "tc_bpf2bpf__open_and_load")) + goto out; + + prog_fd = bpf_program__fd(tc_skel->progs.entry_tc); + freplace_skel = tailcall_freplace__open(); + if (!ASSERT_OK_PTR(freplace_skel, "tailcall_freplace__open")) + goto out; + + err = bpf_program__set_attach_target(freplace_skel->progs.entry_freplace, + prog_fd, "subprog_tailcall_tc"); + if (!ASSERT_OK(err, "set_attach_target")) + goto out; + + err = tailcall_freplace__load(freplace_skel); + if (!ASSERT_OK(err, "tailcall_freplace__load")) + goto out; + + i = 0; + map_fd = bpf_map__fd(freplace_skel->maps.jmp_table); + err = bpf_map_update_elem(map_fd, &i, &prog_fd, BPF_ANY); + if (!ASSERT_OK(err, "update jmp_table")) + goto out; + + freplace_link = bpf_program__attach_freplace(freplace_skel->progs.entry_freplace, + prog_fd, "subprog_tc"); + if (!ASSERT_ERR_PTR(freplace_link, "attach_freplace fail")) + goto out; + + freplace_link = bpf_program__attach_freplace(freplace_skel->progs.entry_freplace, + prog_fd, "subprog_tailcall_tc"); + if (!ASSERT_OK_PTR(freplace_link, "attach_freplace")) + goto out; + + err = bpf_prog_test_run_opts(prog_fd, &topts); + ASSERT_OK(err, "test_run_opts"); + ASSERT_EQ(topts.retval, 34, "test_run_opts retval"); + +out: + bpf_link__destroy(freplace_link); + tailcall_freplace__destroy(freplace_skel); + tc_bpf2bpf__destroy(tc_skel); +} + +static void test_tailcall_bpf2bpf_hierarchy_freplace(bool freplace_subprog, + bool tailcall_tc, + bool target_entry2, + int count1, int count2) +{ + struct tailcall_bpf2bpf_hierarchy_freplace *freplace_skel = NULL; + struct bpf_link *freplace_link = NULL; + int freplace_prog_fd, prog_fd, map_fd; + struct tc_bpf2bpf *tc_skel = NULL; + char buff[128] = {}; + int err, i, val; + + LIBBPF_OPTS(bpf_test_run_opts, topts, + .data_in = buff, + .data_size_in = sizeof(buff), + .repeat = 1, + ); + + tc_skel = tc_bpf2bpf__open_and_load(); + if (!ASSERT_OK_PTR(tc_skel, "tc_bpf2bpf__open_and_load")) + goto out; + + prog_fd = bpf_program__fd(target_entry2 ? tc_skel->progs.entry_tc_2 : + tc_skel->progs.entry_tc); + freplace_skel = tailcall_bpf2bpf_hierarchy_freplace__open(); + if (!ASSERT_OK_PTR(freplace_skel, "tailcall_bpf2bpf_hierarchy_freplace__open")) + goto out; + + err = bpf_program__set_attach_target(freplace_skel->progs.entry_freplace, + prog_fd, freplace_subprog ? + "subprog_tailcall_tc" : "entry_tc"); + if (!ASSERT_OK(err, "set_attach_target")) + goto out; + + err = tailcall_bpf2bpf_hierarchy_freplace__load(freplace_skel); + if (!ASSERT_OK(err, "tailcall_bpf2bpf_hierarchy_freplace__load")) + goto out; + + freplace_prog_fd = bpf_program__fd(freplace_skel->progs.entry_freplace); + map_fd = bpf_map__fd(freplace_skel->maps.jmp_table); + val = tailcall_tc ? prog_fd : freplace_prog_fd; + i = 0; + err = bpf_map_update_elem(map_fd, &i, &val, BPF_ANY); + if (!ASSERT_OK(err, "update jmp_table")) + goto out; + + freplace_link = bpf_program__attach_freplace(freplace_skel->progs.entry_freplace, + prog_fd, freplace_subprog ? + "subprog_tailcall_tc" : "entry_tc"); + if (!ASSERT_OK_PTR(freplace_link, "attach_freplace")) + goto out; + + err = bpf_prog_test_run_opts(prog_fd, &topts); + ASSERT_OK(err, "test_run_opts"); + ASSERT_EQ(topts.retval, count1, "test_run_opts retval"); + + i = 0; + err = bpf_map_delete_elem(map_fd, &i); + if (!ASSERT_OK(err, "delete_elem from jmp_table")) + goto out; + + err = bpf_prog_test_run_opts(prog_fd, &topts); + ASSERT_OK(err, "test_run_opts again"); + ASSERT_EQ(topts.retval, count2, "test_run_opts retval again"); + +out: + bpf_link__destroy(freplace_link); + tailcall_bpf2bpf_hierarchy_freplace__destroy(freplace_skel); + tc_bpf2bpf__destroy(tc_skel); +} + +/* test_tailcall_bpf2bpf_hierarchy_freplace_1 checks the count value of tail + * call limit enforcement matches with expectation for this case: + * + * subprog_tail --tailcall-> entry_freplace + * entry_tc --jump-> entry_freplace < + * subprog_tail --tailcall-> entry_freplace + */ +static void test_tailcall_bpf2bpf_hierarchy_freplace_1(void) +{ + test_tailcall_bpf2bpf_hierarchy_freplace(false, false, false, 34, 35); +} + +/* test_tailcall_bpf2bpf_hierarchy_freplace_2 checks the count value of tail + * call limit enforcement matches with expectation for this case: + * + * subprog_tail --tailcall-> entry_freplace + * entry_tc --> subprog_tailcall_tc --jump-> entry_freplace < + * subprog_tail --tailcall-> entry_freplace + */ +static void test_tailcall_bpf2bpf_hierarchy_freplace_2(void) +{ + test_tailcall_bpf2bpf_hierarchy_freplace(true, false, false, 34, 35); +} + +/* test_tailcall_bpf2bpf_hierarchy_freplace_3 checks the count value of tail + * call limit enforcement matches with expectation for this case: + * + * subprog_tail --tailcall-> entry_tc + * entry_tc --> subprog_tailcall_tc --jump-> entry_freplace < + * subprog_tail --tailcall-> entry_tc + */ +static void test_tailcall_bpf2bpf_hierarchy_freplace_3(void) +{ + test_tailcall_bpf2bpf_hierarchy_freplace(true, true, false, 34, 35); +} + +/* test_tailcall_bpf2bpf_hierarchy_freplace_4 checks the count value of tail + * call limit enforcement matches with expectation for this case: + * + * subprog_tail --tailcall-> entry_freplace + * entry_tc_2 --> subprog_tailcall_tc (call 10 times) --jump-> entry_freplace < + * subprog_tail --tailcall-> entry_freplace + */ +static void test_tailcall_bpf2bpf_hierarchy_freplace_4(void) +{ + test_tailcall_bpf2bpf_hierarchy_freplace(true, false, true, 43, 53); +} + +/* test_tailcall_bpf2bpf_hierarchy_freplace_5 checks the count value of tail + * call limit enforcement matches with expectation for this case: + * + * subprog_tail --tailcall-> entry_tc_2 + * entry_tc_2 --> subprog_tailcall_tc (call 10 times) --jump-> entry_freplace < + * subprog_tail --tailcall-> entry_tc_2 + */ +static void test_tailcall_bpf2bpf_hierarchy_freplace_5(void) +{ + test_tailcall_bpf2bpf_hierarchy_freplace(true, true, true, 340, 350); +} + void test_tailcalls(void) { if (test__start_subtest("tailcall_1")) @@ -1606,4 +1806,16 @@ void test_tailcalls(void) test_tailcall_bpf2bpf_hierarchy_3(); if (test__start_subtest("tailcall_freplace")) test_tailcall_freplace(); + if (test__start_subtest("tailcall_bpf2bpf_freplace")) + test_tailcall_bpf2bpf_freplace(); + if (test__start_subtest("tailcall_bpf2bpf_hierarchy_freplace_1")) + test_tailcall_bpf2bpf_hierarchy_freplace_1(); + if (test__start_subtest("tailcall_bpf2bpf_hierarchy_freplace_2")) + test_tailcall_bpf2bpf_hierarchy_freplace_2(); + if (test__start_subtest("tailcall_bpf2bpf_hierarchy_freplace_3")) + test_tailcall_bpf2bpf_hierarchy_freplace_3(); + if (test__start_subtest("tailcall_bpf2bpf_hierarchy_freplace_4")) + test_tailcall_bpf2bpf_hierarchy_freplace_4(); + if (test__start_subtest("tailcall_bpf2bpf_hierarchy_freplace_5")) + test_tailcall_bpf2bpf_hierarchy_freplace_5(); } diff --git a/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy_freplace.c b/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy_freplace.c new file mode 100644 index 0000000000000..6f7c1fac9ddb7 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy_freplace.c @@ -0,0 +1,30 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include + +struct { + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); + __uint(max_entries, 1); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(__u32)); +} jmp_table SEC(".maps"); + +int count = 0; + +static __noinline +int subprog_tail(struct __sk_buff *skb) +{ + bpf_tail_call_static(skb, &jmp_table, 0); + return 0; +} + +SEC("freplace") +int entry_freplace(struct __sk_buff *skb) +{ + count++; + subprog_tail(skb); + subprog_tail(skb); + return count; +} + +char __license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c b/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c index 8a0632c37839a..beacf60a52677 100644 --- a/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c +++ b/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c @@ -4,11 +4,30 @@ #include #include "bpf_misc.h" +struct { + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); + __uint(max_entries, 1); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(__u32)); +} jmp_table SEC(".maps"); + +__noinline +int subprog_tailcall_tc(struct __sk_buff *skb) +{ + int ret = 1; + + bpf_tail_call_static(skb, &jmp_table, 0); + __sink(skb); + __sink(ret); + return ret; +} + __noinline -int subprog(struct __sk_buff *skb) +int subprog_tc(struct __sk_buff *skb) { int ret = 1; + __sink(skb); __sink(ret); return ret; } @@ -16,7 +35,21 @@ int subprog(struct __sk_buff *skb) SEC("tc") int entry_tc(struct __sk_buff *skb) { - return subprog(skb); + subprog_tc(skb); + return subprog_tailcall_tc(skb); +} + +SEC("tc") +int entry_tc_2(struct __sk_buff *skb) +{ + int ret, i; + + for (i = 0; i < 10; i++) { + ret = subprog_tailcall_tc(skb); + __sink(ret); + } + + return ret; } char __license[] SEC("license") = "GPL"; From patchwork Sun Sep 1 13:38:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Hwang X-Patchwork-Id: 13786456 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-170.mta0.migadu.com (out-170.mta0.migadu.com [91.218.175.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 08C6A17DFEC for ; Sun, 1 Sep 2024 13:41:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725198081; cv=none; b=uZlRRLhHIEDfYRYb/EnroEmF5ARG6p/nMn2rY5ZJClA+ey8krNERMUvoBiJwQrULbrmDVmvwytnhIC27qvTDJlUbfUQDV8nUE4rxp1Vq6xQpfNw2EF7nF8uGBR+kNk72r1eTg7A3mTIaOupSt1c9gzZEZXVeTmuGiXgcaQlXDY4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725198081; c=relaxed/simple; bh=qPaXYSWgQRz7zQRtOwVm1O9VlecsIIWajI47YPesIEI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FtqJ2Ien3KMX/HDvLaGke5qCtY1MvXGvMzAwXbu+lCu5/966IsyBtUrTVPlWOmywRhrx3strXAhgRw8KMBIt0hAFMYVWPhP/PzCLIvqbPWolxU4XXuYzg/Q4znNMhbp6BJt7M8vVXT4kUqXybbLl0OjnEEc5r/dJfiXG3A3B64M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=SyfnEoA7; arc=none smtp.client-ip=91.218.175.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="SyfnEoA7" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1725198077; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=//XW1C5dZBrecgnjydaJAHSbDQqnYGyaX8YPs3KNj+Y=; b=SyfnEoA7j2BwLNmUTAPgpFF92l2p3r8AbQjpl5qgSx3vAfopvVTw7vz7HA46fB732/imiF g78oZPcbp1hdSDmcP4adXQiKg8502nYQT0gjhYIq26Xgbj4UE28AR4utYgSMCvdlL5/8Ir /jsHRBDYzPLu+7sWF6xCQgZYwGynkRQ= From: Leon Hwang To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, toke@redhat.com, martin.lau@kernel.org, yonghong.song@linux.dev, puranjay@kernel.org, xukuohai@huaweicloud.com, eddyz87@gmail.com, iii@linux.ibm.com, leon.hwang@linux.dev, kernel-patches-bot@fb.com Subject: [PATCH bpf-next v2 4/4] selftests/bpf: Fix verifier tailcall jit selftest Date: Sun, 1 Sep 2024 21:38:56 +0800 Message-ID: <20240901133856.64367-5-leon.hwang@linux.dev> In-Reply-To: <20240901133856.64367-1-leon.hwang@linux.dev> References: <20240901133856.64367-1-leon.hwang@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net The verifier tailcall jit selftest is broken, because the prologue of x86 bpf prog has been updated to fix the tailcall infinite loop issue caused by freplace. Signed-off-by: Leon Hwang --- tools/testing/selftests/bpf/progs/verifier_tailcall_jit.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/bpf/progs/verifier_tailcall_jit.c b/tools/testing/selftests/bpf/progs/verifier_tailcall_jit.c index 8d60c634a114f..60a6045e2d9ae 100644 --- a/tools/testing/selftests/bpf/progs/verifier_tailcall_jit.c +++ b/tools/testing/selftests/bpf/progs/verifier_tailcall_jit.c @@ -33,8 +33,8 @@ __success __arch_x86_64 /* program entry for main(), regular function prologue */ __jited(" endbr64") -__jited(" nopl (%rax,%rax)") __jited(" xorq %rax, %rax") +__jited(" nopl (%rax,%rax)") __jited(" pushq %rbp") __jited(" movq %rsp, %rbp") /* tail call prologue for program: @@ -63,8 +63,8 @@ __jited(" {{(retq|jmp 0x)}}") /* return or jump to rethunk */ __jited("...") /* subprogram entry for sub(), regular function prologue */ __jited(" endbr64") -__jited(" nopl (%rax,%rax)") __jited(" nopl (%rax)") +__jited(" nopl (%rax,%rax)") __jited(" pushq %rbp") __jited(" movq %rsp, %rbp") /* tail call prologue for subprogram address of tail call counter