Message ID | 20210429212834.82621-1-jolsa@kernel.org (mailing list archive) |
---|---|
State | RFC |
Delegated to: | BPF |
Series | [RFC] bpf: Fix trampoline for functions with variable arguments |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Thu, Apr 29, 2021 at 11:28:34PM +0200, Jiri Olsa wrote:
> For functions with variable arguments like:
>
>   void set_worker_desc(const char *fmt, ...)
>
> the BTF data contains a void argument at the end:
>
>   [4061] FUNC_PROTO '(anon)' ret_type_id=0 vlen=2
>           'fmt' type_id=3
>           '(anon)' type_id=0
>
> When attaching to a function with this void argument,
> btf_distill_func_proto sets the last btf_func_model argument to
> size 0, which causes an extra loop iteration in the
> save_regs/restore_regs functions and generates trampoline code like:
>
>   55             push   %rbp
>   48 89 e5       mov    %rsp,%rbp
>   48 83 ec 10    sub    $0x10,%rsp
>   53             push   %rbx
>   48 89 7d f0    mov    %rdi,-0x10(%rbp)
>   75 f8          jne    0xffffffffa00cf007
>                  ^^^ extra jump
>
> This causes soft lockups or crashes, probably depending on the context
> in which the attached function is called, as for set_worker_desc:
>
>   watchdog: BUG: soft lockup - CPU#16 stuck for 22s! [kworker/u40:4:239]
>   CPU: 16 PID: 239 Comm: kworker/u40:4 Not tainted 5.12.0-rc4qemu+ #178
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1.fc33 04/01/2014
>   Workqueue: writeback wb_workfn
>   RIP: 0010:bpf_trampoline_6442464853_0+0xa/0x1000
>   Code: Unable to access opcode bytes at RIP 0xffffffffa3597fe0.
>   RSP: 0018:ffffc90000687da8 EFLAGS: 00000217
>   Call Trace:
>    set_worker_desc+0x5/0xb0
>    wb_workfn+0x48/0x4d0
>    ? psi_group_change+0x41/0x210
>    ? __bpf_prog_exit+0x15/0x20
>    ? bpf_trampoline_6442458903_0+0x3b/0x1000
>    ? update_pasid+0x5/0x90
>    ? __switch_to+0x187/0x450
>    process_one_work+0x1e7/0x380
>    worker_thread+0x50/0x3b0
>    ? rescuer_thread+0x380/0x380
>    kthread+0x11b/0x140
>    ? __kthread_bind_mask+0x60/0x60
>    ret_from_fork+0x22/0x30
>
> This patch removes the void argument from struct btf_func_model
> in btf_distill_func_proto, but perhaps we should also check for this
> in the JIT's save_regs/restore_regs functions.

actually looks like we need to disable functions with variable arguments
completely, because we don't know how many arguments to save

I tried to disable them in pahole and it's an easy fix, will post a new fix

jirka
On Sun, May 2, 2021 at 2:17 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Thu, Apr 29, 2021 at 11:28:34PM +0200, Jiri Olsa wrote:

SNIP

> > This patch removes the void argument from struct btf_func_model
> > in btf_distill_func_proto, but perhaps we should also check for this
> > in the JIT's save_regs/restore_regs functions.
>
> actually looks like we need to disable functions with variable arguments
> completely, because we don't know how many arguments to save
>
> I tried to disable them in pahole and it's an easy fix, will post a new fix

Can we still allow access to fixed arguments for such functions and
just disallow the vararg ones?

>
> jirka
>
On Mon, May 03, 2021 at 03:32:34PM -0700, Andrii Nakryiko wrote:
> On Sun, May 2, 2021 at 2:17 PM Jiri Olsa <jolsa@redhat.com> wrote:

SNIP

> > actually looks like we need to disable functions with variable arguments
> > completely, because we don't know how many arguments to save
> >
> > I tried to disable them in pahole and it's an easy fix, will post a new fix
>
> Can we still allow access to fixed arguments for such functions and
> just disallow the vararg ones?

the problem is that we should save all the registers for arguments,
which is probably doable.. but if the caller uses more than 6 arguments,
we need stack data, which will be wrong because of the extra stack
frame we create in the bpf trampoline.. so we could crash

the patch below prevents attaching to these functions directly in the
kernel, so we could keep these functions in BTF

jirka


---
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 0600ed325fa0..f9709dc08c44 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5213,6 +5213,13 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 					tname, i, btf_kind_str[BTF_INFO_KIND(t->info)]);
 			return -EINVAL;
 		}
+		if (ret == 0) {
+			bpf_log(log,
+				"The function %s has variable args, it's unsupported.\n",
+				tname);
+			return -EINVAL;
+
+		}
 		m->arg_size[i] = ret;
 	}
 	m->nr_args = nargs;
On Tue, May 4, 2021 at 6:27 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Mon, May 03, 2021 at 03:32:34PM -0700, Andrii Nakryiko wrote:

SNIP

> > Can we still allow access to fixed arguments for such functions and
> > just disallow the vararg ones?
>
> the problem is that we should save all the registers for arguments,
> which is probably doable.. but if the caller uses more than 6 arguments,
> we need stack data, which will be wrong because of the extra stack
> frame we create in the bpf trampoline.. so we could crash
>
> the patch below prevents attaching to these functions directly in the
> kernel, so we could keep these functions in BTF
>
> jirka
>
>
> ---
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 0600ed325fa0..f9709dc08c44 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -5213,6 +5213,13 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
>  					tname, i, btf_kind_str[BTF_INFO_KIND(t->info)]);
>  			return -EINVAL;
>  		}
> +		if (ret == 0) {
> +			bpf_log(log,
> +				"The function %s has variable args, it's unsupported.\n",
> +				tname);
> +			return -EINVAL;
> +
> +		}

this will work, but the explicit check for vararg should be
`i == nargs - 1 && args[i].type == 0`. Everything else (if it happens)
is probably bad BTF data.

>  		m->arg_size[i] = ret;
>  	}
>  	m->nr_args = nargs;
>
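The explicit check suggested in this reply can be sketched in isolation as follows. This is only a minimal illustration, not kernel code: `struct param` is a simplified stand-in for the kernel's `struct btf_param`, and `is_vararg_marker` is a hypothetical helper name chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct btf_param (assumption). */
struct param {
	uint32_t name_off;	/* offset of the parameter name, 0 if anonymous */
	uint32_t type;		/* BTF type id, 0 means void */
};

/*
 * Hypothetical helper: returns true when parameter i is the trailing void
 * entry that BTF uses to encode '...'. A void parameter in any other
 * position is not treated as a vararg marker; per the discussion it would
 * more likely point to bad BTF data.
 */
bool is_vararg_marker(const struct param *args, uint32_t nargs, uint32_t i)
{
	assert(args != NULL);

	return nargs > 0 && i == nargs - 1 && args[i].type == 0;
}
```

For the `set_worker_desc` prototype from the report (`fmt` with type_id=3 followed by an anonymous entry with type_id=0), only the second parameter would be flagged.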
On Tue, May 4, 2021 at 3:37 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Tue, May 4, 2021 at 6:27 AM Jiri Olsa <jolsa@redhat.com> wrote:

SNIP

> > the patch below prevents attaching to these functions directly in the
> > kernel, so we could keep these functions in BTF
> >
> > jirka
> >
> >
> > ---
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index 0600ed325fa0..f9709dc08c44 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -5213,6 +5213,13 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
> >  					tname, i, btf_kind_str[BTF_INFO_KIND(t->info)]);
> >  			return -EINVAL;
> >  		}
> > +		if (ret == 0) {
> > +			bpf_log(log,
> > +				"The function %s has variable args, it's unsupported.\n",
> > +				tname);
> > +			return -EINVAL;
> > +
> > +		}
>
> this will work, but the explicit check for vararg should be
> `i == nargs - 1 && args[i].type == 0`. Everything else (if it happens)
> is probably bad BTF data.

Jiri,
could you please resubmit with the check like Andrii suggested?
Thanks!
On Tue, May 04, 2021 at 09:11:26PM -0700, Alexei Starovoitov wrote:

SNIP

> > > > >
> > > > > actually looks like we need to disable functions with variable arguments
> > > > > completely, because we don't know how many arguments to save
> > > > >
> > > > > I tried to disable them in pahole and it's an easy fix, will post a new fix
> > > >
> > > > Can we still allow access to fixed arguments for such functions and
> > > > just disallow the vararg ones?
> > >
> > > the problem is that we should save all the registers for arguments,
> > > which is probably doable.. but if the caller uses more than 6 arguments,
> > > we need stack data, which will be wrong because of the extra stack
> > > frame we create in the bpf trampoline.. so we could crash
> > >
> > > the patch below prevents attaching to these functions directly in the
> > > kernel, so we could keep these functions in BTF
> > >
> > > jirka
> > >
> > >
> > > ---
> > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > > index 0600ed325fa0..f9709dc08c44 100644
> > > --- a/kernel/bpf/btf.c
> > > +++ b/kernel/bpf/btf.c
> > > @@ -5213,6 +5213,13 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
> > >  					tname, i, btf_kind_str[BTF_INFO_KIND(t->info)]);
> > >  			return -EINVAL;
> > >  		}
> > > +		if (ret == 0) {
> > > +			bpf_log(log,
> > > +				"The function %s has variable args, it's unsupported.\n",
> > > +				tname);
> > > +			return -EINVAL;
> > > +
> > > +		}
> >
> > this will work, but the explicit check for vararg should be
> > `i == nargs - 1 && args[i].type == 0`. Everything else (if it happens)
> > is probably bad BTF data.
>
> Jiri,
> could you please resubmit with the check like Andrii suggested?
> Thanks!
>

yes, will send it later today

jirka
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index b1a76fe046cb..017a80324139 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5133,6 +5133,11 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 					tname, i, btf_kind_str[BTF_INFO_KIND(t->info)]);
 			return -EINVAL;
 		}
+		/* void at the end of args means '...' argument, skip it */
+		if (!ret && (i + 1 == nargs)) {
+			nargs--;
+			break;
+		}
 		m->arg_size[i] = ret;
 	}
 	m->nr_args = nargs;
For functions with variable arguments like:

  void set_worker_desc(const char *fmt, ...)

the BTF data contains a void argument at the end:

  [4061] FUNC_PROTO '(anon)' ret_type_id=0 vlen=2
          'fmt' type_id=3
          '(anon)' type_id=0

When attaching to a function with this void argument,
btf_distill_func_proto sets the last btf_func_model argument to
size 0, which causes an extra loop iteration in the
save_regs/restore_regs functions and generates trampoline code like:

  55             push   %rbp
  48 89 e5       mov    %rsp,%rbp
  48 83 ec 10    sub    $0x10,%rsp
  53             push   %rbx
  48 89 7d f0    mov    %rdi,-0x10(%rbp)
  75 f8          jne    0xffffffffa00cf007
                 ^^^ extra jump

This causes soft lockups or crashes, probably depending on the context
in which the attached function is called, as for set_worker_desc:

  watchdog: BUG: soft lockup - CPU#16 stuck for 22s! [kworker/u40:4:239]
  CPU: 16 PID: 239 Comm: kworker/u40:4 Not tainted 5.12.0-rc4qemu+ #178
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1.fc33 04/01/2014
  Workqueue: writeback wb_workfn
  RIP: 0010:bpf_trampoline_6442464853_0+0xa/0x1000
  Code: Unable to access opcode bytes at RIP 0xffffffffa3597fe0.
  RSP: 0018:ffffc90000687da8 EFLAGS: 00000217
  Call Trace:
   set_worker_desc+0x5/0xb0
   wb_workfn+0x48/0x4d0
   ? psi_group_change+0x41/0x210
   ? __bpf_prog_exit+0x15/0x20
   ? bpf_trampoline_6442458903_0+0x3b/0x1000
   ? update_pasid+0x5/0x90
   ? __switch_to+0x187/0x450
   process_one_work+0x1e7/0x380
   worker_thread+0x50/0x3b0
   ? rescuer_thread+0x380/0x380
   kthread+0x11b/0x140
   ? __kthread_bind_mask+0x60/0x60
   ret_from_fork+0x22/0x30

This patch removes the void argument from struct btf_func_model
in btf_distill_func_proto, but perhaps we should also check for this
in the JIT's save_regs/restore_regs functions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/bpf/btf.c | 5 +++++
 1 file changed, 5 insertions(+)
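The effect of skipping the trailing void argument can be sketched outside the kernel roughly as below. This is a simplified model under stated assumptions, not the kernel implementation: `struct func_model` stands in for `struct btf_func_model`, `distill_model` and `MAX_ARGS` are hypothetical names, argument sizes are passed in directly (0 meaning void) instead of being resolved from BTF, and rejecting a void argument in a non-final position is an extra check added only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ARGS 5	/* hypothetical limit used only by this sketch */

/* Simplified stand-in for struct btf_func_model (assumption). */
struct func_model {
	uint8_t nr_args;
	uint8_t arg_size[MAX_ARGS];
};

/*
 * Fill the model from a list of argument sizes, where size 0 stands for a
 * void parameter. Returns 0 on success and -1 on invalid input.
 */
int distill_model(const uint8_t *sizes, uint32_t nargs, struct func_model *m)
{
	uint32_t i;

	assert(sizes != NULL && m != NULL);

	if (nargs > MAX_ARGS)
		return -1;

	for (i = 0; i < nargs; i++) {
		/* void at the end of args means '...': drop it from the model */
		if (sizes[i] == 0 && i + 1 == nargs) {
			nargs--;
			break;
		}
		/* void anywhere else: treated as malformed (sketch-only check) */
		if (sizes[i] == 0)
			return -1;
		m->arg_size[i] = sizes[i];
	}
	m->nr_args = (uint8_t)nargs;
	return 0;
}
```

With the sizes `{8, 0}` for `set_worker_desc` (one pointer followed by the void marker), the resulting model has `nr_args == 1`, so no zero-sized argument is left for a register save/restore loop to iterate over.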