diff mbox series

LoongArch: BPF: Fix off by one error in build_prologue()

Message ID 20250315080320.4193821-1-hengqi.chen@gmail.com (mailing list archive)
State New
Delegated to: BPF
Headers show
Series LoongArch: BPF: Fix off by one error in build_prologue() | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / GCC BPF
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for aarch64-gcc / veristat-meta
bpf/vmtest-bpf-next-VM_Test-11 success Logs for aarch64-gcc / veristat-kernel
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / GCC BPF
bpf/vmtest-bpf-next-VM_Test-14 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-19 success Logs for s390x-gcc / veristat-kernel
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / GCC BPF / GCC BPF
bpf/vmtest-bpf-next-VM_Test-21 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-20 success Logs for s390x-gcc / veristat-meta
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-gcc / veristat-kernel / x86_64-gcc veristat_kernel
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-gcc / veristat-meta / x86_64-gcc veristat_meta
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-17 / veristat-kernel
bpf/vmtest-bpf-next-VM_Test-42 success Logs for x86_64-llvm-18 / GCC BPF / GCC BPF
bpf/vmtest-bpf-next-VM_Test-44 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-17 / veristat-meta
bpf/vmtest-bpf-next-VM_Test-43 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-51 success Logs for x86_64-llvm-18 / veristat-meta
bpf/vmtest-bpf-next-VM_Test-50 success Logs for x86_64-llvm-18 / veristat-kernel
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / GCC BPF / GCC BPF
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-45 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-47 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-48 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-49 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-46 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18

Commit Message

Hengqi Chen March 15, 2025, 8:03 a.m. UTC
Vincent reported that running BPF progs with tailcalls on LoongArch
causes kernel hard lockup. Debugging the issues shows that the JITed
image missing a jirl instruction at the end of the epilogue.

There are two passes in JIT compile, the first pass set the flags and
the second pass generates JIT code based on those flags. With BPF progs
mixing bpf2bpf and tailcalls, build_prologue() generates N insns in the
first pass and then generates N+1 insns in the second pass. This makes
epilogue_offset off by one and we will jump to some unexpected insn and
cause lockup. Fix this by inserting a nop insn.

Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support")
Fixes: bb035ef0cc91 ("LoongArch: BPF: Support mixing bpf2bpf and tailcalls")
Reported-by: Vincent Li <vincent.mc.li@gmail.com>
Closes: https://lore.kernel.org/loongarch/CAK3+h2w6WESdBN3UCr3WKHByD7D6Q_Ve1EDAjotVrnx6Or_c8g@mail.gmail.com/
Closes: https://lore.kernel.org/bpf/CAK3+h2woEjG_N=-XzqEGaAeCmgu2eTCUc7p6bP4u8Q+DFHm-7g@mail.gmail.com/
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
---
 arch/loongarch/include/asm/inst.h | 5 +++++
 arch/loongarch/net/bpf_jit.c      | 2 ++
 2 files changed, 7 insertions(+)

Comments

Vincent Li March 15, 2025, 9:49 p.m. UTC | #1
On Sat, Mar 15, 2025 at 1:03 AM Hengqi Chen <hengqi.chen@gmail.com> wrote:
>
> Vincent reported that running BPF progs with tailcalls on LoongArch
> causes kernel hard lockup. Debugging the issues shows that the JITed
> image missing a jirl instruction at the end of the epilogue.
>
> There are two passes in JIT compile, the first pass set the flags and
> the second pass generates JIT code based on those flags. With BPF progs
> mixing bpf2bpf and tailcalls, build_prologue() generates N insns in the
> first pass and then generates N+1 insns in the second pass. This makes
> epilogue_offset off by one and we will jump to some unexpected insn and
> cause lockup. Fix this by inserting a nop insn.
>
> Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support")
> Fixes: bb035ef0cc91 ("LoongArch: BPF: Support mixing bpf2bpf and tailcalls")
> Reported-by: Vincent Li <vincent.mc.li@gmail.com>
> Closes: https://lore.kernel.org/loongarch/CAK3+h2w6WESdBN3UCr3WKHByD7D6Q_Ve1EDAjotVrnx6Or_c8g@mail.gmail.com/
> Closes: https://lore.kernel.org/bpf/CAK3+h2woEjG_N=-XzqEGaAeCmgu2eTCUc7p6bP4u8Q+DFHm-7g@mail.gmail.com/
> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
> ---
>  arch/loongarch/include/asm/inst.h | 5 +++++
>  arch/loongarch/net/bpf_jit.c      | 2 ++
>  2 files changed, 7 insertions(+)
>
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index 3089785ca97e..d61b356170fe 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -695,6 +695,11 @@ static inline void emit_jirl(union loongarch_instruction *insn,
>         insn->reg2i16_format.rj = rj;
>  }
>
> +static inline void emit_nop(union loongarch_instruction *insn)
> +{
> +       insn->word = INSN_NOP;
> +}
> +
>  #define DEF_EMIT_REG2BSTRD_FORMAT(NAME, OP)                            \
>  static inline void emit_##NAME(union loongarch_instruction *insn,      \
>                                enum loongarch_gpr rd,                   \
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index ea357a3edc09..2346c0b55043 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -142,6 +142,8 @@ static void build_prologue(struct jit_ctx *ctx)
>          */
>         if (seen_tail_call(ctx) && seen_call(ctx))
>                 move_reg(ctx, TCC_SAVED, REG_TCC);
> +       else
> +               emit_insn(ctx, nop);
>
>         ctx->stack_size = stack_adjust;
>  }
> --
> 2.43.5
>

Thanks Hengqi for the patch, I tested the patch on the LoongArch
platform with the loxilb tc BPF program, it does not lockup up the
kernel anymore.

attach loxilb tc BPF program to loongfire firewall green0 interface

[root@loongfire ~]# loxilb --whitelist="green0"
loxilb start
...
14:43:13 DEBUG common_libbpf.c:161: tc: autoload sec tc_packet_hook1
prog tc_packet_func
14:43:13 DEBUG common_libbpf.c:161: tc: autoload sec tc_packet_hook2
prog tc_packet_func_slow
14:43:13 DEBUG common_libbpf.c:161: tc: autoload sec tc_packet_hook3
prog tc_packet_func_fw
14:43:13 DEBUG common_libbpf.c:161: tc: autoload sec tc_packet_hook4
prog tc_csum_func1
14:43:13 DEBUG common_libbpf.c:161: tc: autoload sec tc_packet_hook5
prog tc_csum_func2
14:43:13 DEBUG common_libbpf.c:161: tc: autoload sec tc_packet_hook6
prog tc_slow_unp_func
14:43:13 DEBUG common_libbpf.c:161: tc: autoload sec tc_packet_hook7
prog tc_packet_func_masq
libbpf: Error in bpf_create_map_xattr(nh_map):Invalid argument(-22).
Retrying without BTF.
libbpf: Error in bpf_create_map_xattr(xctk):Operation not
supported(-95). Retrying without BTF.
2025-03-15 14:43:14 Get xsync()
14:43:15 INFO  common_libbpf.c:204: tc: bpf attach OK for green0
<===============
14:43:15 DEBUG loxilb_libdp.c:3311: llb_link_prop_add: IF-green0 added
idx 1 type 2
2025-03-15 14:43:15 ebpf intfmap added - 5 vlan 0 -> 1
2025-03-15 14:43:15 ebpf txintfmap added - 1 -> 5

Above log shows the program loaded and attached to the green0
interface, no lockup.

One strange thing is that bpftool net does not show the tc BPF program
attached to green0 interface, attached to virtual interface llb0 ok, I
guess this should be irrelevant to kernel lockup issue.

[root@loongfire ~]# bpftool net
xdp:
llb0(7) generic id 55

tc:
llb0(7) clsact/ingress tc_packet_func_:[59] id 59

flow_dissector:

netfilter:
diff mbox series

Patch

diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
index 3089785ca97e..d61b356170fe 100644
--- a/arch/loongarch/include/asm/inst.h
+++ b/arch/loongarch/include/asm/inst.h
@@ -695,6 +695,11 @@  static inline void emit_jirl(union loongarch_instruction *insn,
 	insn->reg2i16_format.rj = rj;
 }
 
+static inline void emit_nop(union loongarch_instruction *insn)
+{
+	insn->word = INSN_NOP;
+}
+
 #define DEF_EMIT_REG2BSTRD_FORMAT(NAME, OP)				\
 static inline void emit_##NAME(union loongarch_instruction *insn,	\
 			       enum loongarch_gpr rd,			\
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index ea357a3edc09..2346c0b55043 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -142,6 +142,8 @@  static void build_prologue(struct jit_ctx *ctx)
 	 */
 	if (seen_tail_call(ctx) && seen_call(ctx))
 		move_reg(ctx, TCC_SAVED, REG_TCC);
+	else
+		emit_insn(ctx, nop);
 
 	ctx->stack_size = stack_adjust;
 }