diff mbox series

[bpf-next,v1,07/10] bpf: Ensure IP is within prog->jited_length for bpf_throw calls

Message ID 20230713023232.1411523-8-memxor@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series Exceptions - 1/2 | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 fail Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 fail Logs for test_progs_no_alu32 on s390x with gcc
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1451 this patch: 1450
netdev/cc_maintainers warning 16 maintainers not CCed: tglx@linutronix.de hpa@zytor.com dsahern@kernel.org mingo@redhat.com kpsingh@kernel.org x86@kernel.org john.fastabend@gmail.com sdf@google.com yhs@fb.com netdev@vger.kernel.org song@kernel.org dave.hansen@linux.intel.com davem@davemloft.net jolsa@kernel.org haoluo@google.com bp@alien8.de
netdev/build_clang fail Errors and warnings before: 180 this patch: 180
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1446 this patch: 1445
netdev/checkpatch warning CHECK: extern prototypes should be avoided in .h files
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-29 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-11 fail Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ${{ matrix.test }} on ${{ matrix.arch }} with ${{ matrix.toolchain_full }}
bpf/vmtest-bpf-next-VM_Test-2 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-8 success Logs for veristat

Commit Message

Kumar Kartikeya Dwivedi July 13, 2023, 2:32 a.m. UTC
Now that we allow exception throwing using bpf_throw kfunc, it can
appear as the final instruction in a prog. When this happens, and we
begin to unwind the stack using arch_bpf_stack_walk, the instruction
pointer (IP) may appear to lie outside the JITed instructions. This
happens because the return address is the instruction following the
call, but the bpf_throw never returns to the program, so the JIT
considers instruction ending at the bpf_throw call as the final JITed
instruction and end of the jited_length for the program.

This becomes a problem when we search the IP using is_bpf_text_address
and bpf_prog_ksym_find, both of which use bpf_ksym_find under the hood,
and it rightfully considers addr == ksym.end to be outside the program's
boundaries.

Insert a dummy 'int3' instruction which will never be hit to bump the
jited_length and allow us to handle programs with their final
isntruction being a call to bpf_throw.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 arch/x86/net/bpf_jit_comp.c | 11 +++++++++++
 include/linux/bpf.h         |  2 ++
 2 files changed, 13 insertions(+)

Comments

Alexei Starovoitov July 14, 2023, 10:39 p.m. UTC | #1
On Thu, Jul 13, 2023 at 08:02:29AM +0530, Kumar Kartikeya Dwivedi wrote:
> Now that we allow exception throwing using bpf_throw kfunc, it can
> appear as the final instruction in a prog. When this happens, and we
> begin to unwind the stack using arch_bpf_stack_walk, the instruction
> pointer (IP) may appear to lie outside the JITed instructions. This
> happens because the return address is the instruction following the
> call, but the bpf_throw never returns to the program, so the JIT
> considers instruction ending at the bpf_throw call as the final JITed
> instruction and end of the jited_length for the program.
> 
> This becomes a problem when we search the IP using is_bpf_text_address
> and bpf_prog_ksym_find, both of which use bpf_ksym_find under the hood,
> and it rightfully considers addr == ksym.end to be outside the program's
> boundaries.
> 
> Insert a dummy 'int3' instruction which will never be hit to bump the
> jited_length and allow us to handle programs with their final
> isntruction being a call to bpf_throw.
> 
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  arch/x86/net/bpf_jit_comp.c | 11 +++++++++++
>  include/linux/bpf.h         |  2 ++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 8d97c6a60f9a..052230cc7f50 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1579,6 +1579,17 @@ st:			if (is_imm8(insn->off))
>  			}
>  			if (emit_call(&prog, func, image + addrs[i - 1] + offs))
>  				return -EINVAL;
> +			/* Similar to BPF_EXIT_INSN, call for bpf_throw may be
> +			 * the final instruction in the program. Insert an int3
> +			 * following the call instruction so that we can still
> +			 * detect pc to be part of the bpf_prog in
> +			 * bpf_ksym_find, otherwise when this is the last
> +			 * instruction (as allowed by verifier, similar to exit
> +			 * and jump instructions), pc will be == ksym.end,
> +			 * leading to bpf_throw failing to unwind the stack.
> +			 */
> +			if (func == (u8 *)&bpf_throw)
> +				EMIT1(0xCC); /* int3 */

Probably worth explaining that this happens because bpf_throw is marked
__attribute__((noreturn)) and compiler can emit it last without BPF_EXIT insn.
Meaing the program might not have BPF_EXIT at all.

I wonder though whether this self-inflicted pain is worth it.
May be it shouldn't be marked as noreturn.
What do we gain by marking?
Kumar Kartikeya Dwivedi July 17, 2023, 4:36 p.m. UTC | #2
On Sat, 15 Jul 2023 at 04:09, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Jul 13, 2023 at 08:02:29AM +0530, Kumar Kartikeya Dwivedi wrote:
> > Now that we allow exception throwing using bpf_throw kfunc, it can
> > appear as the final instruction in a prog. When this happens, and we
> > begin to unwind the stack using arch_bpf_stack_walk, the instruction
> > pointer (IP) may appear to lie outside the JITed instructions. This
> > happens because the return address is the instruction following the
> > call, but the bpf_throw never returns to the program, so the JIT
> > considers instruction ending at the bpf_throw call as the final JITed
> > instruction and end of the jited_length for the program.
> >
> > This becomes a problem when we search the IP using is_bpf_text_address
> > and bpf_prog_ksym_find, both of which use bpf_ksym_find under the hood,
> > and it rightfully considers addr == ksym.end to be outside the program's
> > boundaries.
> >
> > Insert a dummy 'int3' instruction which will never be hit to bump the
> > jited_length and allow us to handle programs with their final
> > isntruction being a call to bpf_throw.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  arch/x86/net/bpf_jit_comp.c | 11 +++++++++++
> >  include/linux/bpf.h         |  2 ++
> >  2 files changed, 13 insertions(+)
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index 8d97c6a60f9a..052230cc7f50 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -1579,6 +1579,17 @@ st:                    if (is_imm8(insn->off))
> >                       }
> >                       if (emit_call(&prog, func, image + addrs[i - 1] + offs))
> >                               return -EINVAL;
> > +                     /* Similar to BPF_EXIT_INSN, call for bpf_throw may be
> > +                      * the final instruction in the program. Insert an int3
> > +                      * following the call instruction so that we can still
> > +                      * detect pc to be part of the bpf_prog in
> > +                      * bpf_ksym_find, otherwise when this is the last
> > +                      * instruction (as allowed by verifier, similar to exit
> > +                      * and jump instructions), pc will be == ksym.end,
> > +                      * leading to bpf_throw failing to unwind the stack.
> > +                      */
> > +                     if (func == (u8 *)&bpf_throw)
> > +                             EMIT1(0xCC); /* int3 */
>
> Probably worth explaining that this happens because bpf_throw is marked
> __attribute__((noreturn)) and compiler can emit it last without BPF_EXIT insn.
> Meaing the program might not have BPF_EXIT at all.

Yes, sorry about omitting that. I will add it to the commit message in v2.

>
> I wonder though whether this self-inflicted pain is worth it.
> May be it shouldn't be marked as noreturn.
> What do we gain by marking?

It felt like the obvious thing to do to me. The cost on the kernel
side is negligible (atleast in my opinion), we just have to allow it
as final instruction in the program. If it's heavily used it allows
the compiler to better optimize the code (marking anything after it
unreachable, no need to save registers etc., although this may not be
a persuasive point for you).

Regardless of this noreturn attribute, I was thinking whether we
should always emit an extra instruction so that any IP (say one past
last instruction) we get for a BPF prog can always be seen as
belonging to it. It probably is only a problem surfaced by this
bpf_throw call at the end, but I was wondering whether doing it
unconditionally makes sense.
Alexei Starovoitov July 17, 2023, 5:45 p.m. UTC | #3
On Mon, Jul 17, 2023 at 9:36 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Sat, 15 Jul 2023 at 04:09, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Jul 13, 2023 at 08:02:29AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > Now that we allow exception throwing using bpf_throw kfunc, it can
> > > appear as the final instruction in a prog. When this happens, and we
> > > begin to unwind the stack using arch_bpf_stack_walk, the instruction
> > > pointer (IP) may appear to lie outside the JITed instructions. This
> > > happens because the return address is the instruction following the
> > > call, but the bpf_throw never returns to the program, so the JIT
> > > considers instruction ending at the bpf_throw call as the final JITed
> > > instruction and end of the jited_length for the program.
> > >
> > > This becomes a problem when we search the IP using is_bpf_text_address
> > > and bpf_prog_ksym_find, both of which use bpf_ksym_find under the hood,
> > > and it rightfully considers addr == ksym.end to be outside the program's
> > > boundaries.
> > >
> > > Insert a dummy 'int3' instruction which will never be hit to bump the
> > > jited_length and allow us to handle programs with their final
> > > isntruction being a call to bpf_throw.
> > >
> > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > ---
> > >  arch/x86/net/bpf_jit_comp.c | 11 +++++++++++
> > >  include/linux/bpf.h         |  2 ++
> > >  2 files changed, 13 insertions(+)
> > >
> > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > index 8d97c6a60f9a..052230cc7f50 100644
> > > --- a/arch/x86/net/bpf_jit_comp.c
> > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > @@ -1579,6 +1579,17 @@ st:                    if (is_imm8(insn->off))
> > >                       }
> > >                       if (emit_call(&prog, func, image + addrs[i - 1] + offs))
> > >                               return -EINVAL;
> > > +                     /* Similar to BPF_EXIT_INSN, call for bpf_throw may be
> > > +                      * the final instruction in the program. Insert an int3
> > > +                      * following the call instruction so that we can still
> > > +                      * detect pc to be part of the bpf_prog in
> > > +                      * bpf_ksym_find, otherwise when this is the last
> > > +                      * instruction (as allowed by verifier, similar to exit
> > > +                      * and jump instructions), pc will be == ksym.end,
> > > +                      * leading to bpf_throw failing to unwind the stack.
> > > +                      */
> > > +                     if (func == (u8 *)&bpf_throw)
> > > +                             EMIT1(0xCC); /* int3 */
> >
> > Probably worth explaining that this happens because bpf_throw is marked
> > __attribute__((noreturn)) and compiler can emit it last without BPF_EXIT insn.
> > Meaing the program might not have BPF_EXIT at all.
>
> Yes, sorry about omitting that. I will add it to the commit message in v2.
>
> >
> > I wonder though whether this self-inflicted pain is worth it.
> > May be it shouldn't be marked as noreturn.
> > What do we gain by marking?
>
> It felt like the obvious thing to do to me. The cost on the kernel
> side is negligible (atleast in my opinion), we just have to allow it
> as final instruction in the program. If it's heavily used it allows
> the compiler to better optimize the code (marking anything after it
> unreachable, no need to save registers etc., although this may not be
> a persuasive point for you).

"no need to save registers"... "optimize"... that's the thing that worries me.
I think it's better to drop noreturn attribute.
bpf has implicit prolog/epilogue that only apply to bpf_exit insn.
bpf_call insn that doesn't return is exploiting undefined logic
in the compiler, since we never fully clarified our hidden prologue/epilogue
rules. Saying it differently, bpf_tail_call is also noreturn,
but if we mark it as such all kinds of things will break.
We still need to add alloca(). It doesn't play well with the current BPF ISA.
I think it's better to treat 'noreturn' as broken in the compiler,
since its behavior may change.

> Regardless of this noreturn attribute, I was thinking whether we
> should always emit an extra instruction so that any IP (say one past
> last instruction) we get for a BPF prog can always be seen as
> belonging to it. It probably is only a problem surfaced by this
> bpf_throw call at the end, but I was wondering whether doing it
> unconditionally makes sense.

I think it's a corner case of this 'noreturn' from bpf_call logic.
The bpf prog is padded with 0xcc before and after already.
What you're suggesting is to add one of 0xcc to the body of the prog.
That doesn't sound right.
Kumar Kartikeya Dwivedi July 17, 2023, 7:09 p.m. UTC | #4
On Mon, 17 Jul 2023 at 23:15, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Jul 17, 2023 at 9:36 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > On Sat, 15 Jul 2023 at 04:09, Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Jul 13, 2023 at 08:02:29AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > Now that we allow exception throwing using bpf_throw kfunc, it can
> > > > appear as the final instruction in a prog. When this happens, and we
> > > > begin to unwind the stack using arch_bpf_stack_walk, the instruction
> > > > pointer (IP) may appear to lie outside the JITed instructions. This
> > > > happens because the return address is the instruction following the
> > > > call, but the bpf_throw never returns to the program, so the JIT
> > > > considers instruction ending at the bpf_throw call as the final JITed
> > > > instruction and end of the jited_length for the program.
> > > >
> > > > This becomes a problem when we search the IP using is_bpf_text_address
> > > > and bpf_prog_ksym_find, both of which use bpf_ksym_find under the hood,
> > > > and it rightfully considers addr == ksym.end to be outside the program's
> > > > boundaries.
> > > >
> > > > Insert a dummy 'int3' instruction which will never be hit to bump the
> > > > jited_length and allow us to handle programs with their final
> > > > isntruction being a call to bpf_throw.
> > > >
> > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > > ---
> > > >  arch/x86/net/bpf_jit_comp.c | 11 +++++++++++
> > > >  include/linux/bpf.h         |  2 ++
> > > >  2 files changed, 13 insertions(+)
> > > >
> > > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > > index 8d97c6a60f9a..052230cc7f50 100644
> > > > --- a/arch/x86/net/bpf_jit_comp.c
> > > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > > @@ -1579,6 +1579,17 @@ st:                    if (is_imm8(insn->off))
> > > >                       }
> > > >                       if (emit_call(&prog, func, image + addrs[i - 1] + offs))
> > > >                               return -EINVAL;
> > > > +                     /* Similar to BPF_EXIT_INSN, call for bpf_throw may be
> > > > +                      * the final instruction in the program. Insert an int3
> > > > +                      * following the call instruction so that we can still
> > > > +                      * detect pc to be part of the bpf_prog in
> > > > +                      * bpf_ksym_find, otherwise when this is the last
> > > > +                      * instruction (as allowed by verifier, similar to exit
> > > > +                      * and jump instructions), pc will be == ksym.end,
> > > > +                      * leading to bpf_throw failing to unwind the stack.
> > > > +                      */
> > > > +                     if (func == (u8 *)&bpf_throw)
> > > > +                             EMIT1(0xCC); /* int3 */
> > >
> > > Probably worth explaining that this happens because bpf_throw is marked
> > > __attribute__((noreturn)) and compiler can emit it last without BPF_EXIT insn.
> > > Meaing the program might not have BPF_EXIT at all.
> >
> > Yes, sorry about omitting that. I will add it to the commit message in v2.
> >
> > >
> > > I wonder though whether this self-inflicted pain is worth it.
> > > May be it shouldn't be marked as noreturn.
> > > What do we gain by marking?
> >
> > It felt like the obvious thing to do to me. The cost on the kernel
> > side is negligible (atleast in my opinion), we just have to allow it
> > as final instruction in the program. If it's heavily used it allows
> > the compiler to better optimize the code (marking anything after it
> > unreachable, no need to save registers etc., although this may not be
> > a persuasive point for you).
>
> "no need to save registers"... "optimize"... that's the thing that worries me.
> I think it's better to drop noreturn attribute.
> bpf has implicit prolog/epilogue that only apply to bpf_exit insn.
> bpf_call insn that doesn't return is exploiting undefined logic
> in the compiler, since we never fully clarified our hidden prologue/epilogue
> rules. Saying it differently, bpf_tail_call is also noreturn,
> but if we mark it as such all kinds of things will break.
> We still need to add alloca(). It doesn't play well with the current BPF ISA.
> I think it's better to treat 'noreturn' as broken in the compiler,
> since its behavior may change.
>

Ok, I think then let's drop this patch and the noreturn attribute on bpf_throw.

> > Regardless of this noreturn attribute, I was thinking whether we
> > should always emit an extra instruction so that any IP (say one past
> > last instruction) we get for a BPF prog can always be seen as
> > belonging to it. It probably is only a problem surfaced by this
> > bpf_throw call at the end, but I was wondering whether doing it
> > unconditionally makes sense.
>
> I think it's a corner case of this 'noreturn' from bpf_call logic.
> The bpf prog is padded with 0xcc before and after already.
> What you're suggesting is to add one of 0xcc to the body of the prog.
> That doesn't sound right.

Actually this patch was added because I caught the case where the
reported ip during unwinding was lying outside jited_length (==
ksym.end).
So including an extra instruction would prevent that. But it's true
it's a side effect of the noreturn attribute, it wouldn't occur
otherwise.

Let's drop this patch then.
diff mbox series

Patch

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 8d97c6a60f9a..052230cc7f50 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1579,6 +1579,17 @@  st:			if (is_imm8(insn->off))
 			}
 			if (emit_call(&prog, func, image + addrs[i - 1] + offs))
 				return -EINVAL;
+			/* Similar to BPF_EXIT_INSN, call for bpf_throw may be
+			 * the final instruction in the program. Insert an int3
+			 * following the call instruction so that we can still
+			 * detect pc to be part of the bpf_prog in
+			 * bpf_ksym_find, otherwise when this is the last
+			 * instruction (as allowed by verifier, similar to exit
+			 * and jump instructions), pc will be == ksym.end,
+			 * leading to bpf_throw failing to unwind the stack.
+			 */
+			if (func == (u8 *)&bpf_throw)
+				EMIT1(0xCC); /* int3 */
 			break;
 		}
 
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 61cdb291311f..1652d184ee7f 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -3111,4 +3111,6 @@  static inline gfp_t bpf_memcg_flags(gfp_t flags)
 	return flags;
 }
 
+extern void bpf_throw(u64);
+
 #endif /* _LINUX_BPF_H */