[bpf-next,0/7] Fix MAX_TAIL_CALL_CNT handling in eBPF JITs

Message ID 20210809093437.876558-1-johan.almbladh@anyfinetworks.com (mailing list archive)

Message

Johan Almbladh Aug. 9, 2021, 9:34 a.m. UTC
A new test of tail call count limiting revealed that the interpreter
did in fact allow up to MAX_TAIL_CALL_CNT + 1 tail calls, whereas the
x86 JITs stopped at the intended MAX_TAIL_CALL_CNT. The interpreter was
fixed in commit b61a28cf11d61f512172e673b8f8c4a6c789b425 ("bpf: Fix
off-by-one in tail call count limiting"). This patch set fixes all
arch-specific JITs except for RISC-V.
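To make the off-by-one concrete, here is a minimal user-space sketch, not kernel or JIT code. It assumes the limit check is a comparison against a running counter and that the only difference between implementations is whether that comparison is strict. The names `limit_check`, `CHECK_GT`, `CHECK_GE` and `count_tail_calls` are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical names for the two ways the limit check can be written. */
enum limit_check {
    CHECK_GT, /* refuse the tail call when count >  limit */
    CHECK_GE  /* refuse the tail call when count >= limit */
};

/*
 * Model a program that keeps tail-calling itself and return how many
 * tail calls succeed before the limit check stops the chain.
 */
int count_tail_calls(int limit, enum limit_check check)
{
    int tail_call_cnt = 0; /* counter kept by the runtime */
    int executed = 0;      /* tail calls that actually happened */

    assert(limit >= 0);

    for (;;) {
        int over = (check == CHECK_GT) ? (tail_call_cnt > limit)
                                       : (tail_call_cnt >= limit);
        if (over)
            break;         /* tail call refused, chain ends */
        tail_call_cnt++;
        executed++;        /* jump into the next program */
    }
    return executed;
}
```

With a limit of 32, `count_tail_calls(32, CHECK_GT)` returns 33 (limit + 1), while `count_tail_calls(32, CHECK_GE)` returns 32. That one-call difference is the discrepancy described above.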

For each of the affected JITs, the incorrect behaviour was verified
by running the test_bpf test suite in QEMU. After the fixes, the JITs
pass the tail call count limiting test.

I have not been able to test the RISC-V JITs due to the lack of a
working toolchain and QEMU setup. It is likely that the RISC-V JITs
have the off-by-one behaviour too. I have not verified any of the NIC JITs.

Link: https://lore.kernel.org/bpf/20210728164741.350370-1-johan.almbladh@anyfinetworks.com/

Johan Almbladh (7):
  arm: bpf: Fix off-by-one in tail call count limiting
  arm64: bpf: Fix off-by-one in tail call count limiting
  powerpc: bpf: Fix off-by-one in tail call count limiting
  s390: bpf: Fix off-by-one in tail call count limiting
  sparc: bpf: Fix off-by-one in tail call count limiting
  mips: bpf: Fix off-by-one in tail call count limiting
  x86: bpf: Fix comments on tail call count limiting

 arch/arm/net/bpf_jit_32.c         | 6 +++---
 arch/arm64/net/bpf_jit_comp.c     | 4 ++--
 arch/mips/net/ebpf_jit.c          | 4 ++--
 arch/powerpc/net/bpf_jit_comp32.c | 4 ++--
 arch/powerpc/net/bpf_jit_comp64.c | 4 ++--
 arch/s390/net/bpf_jit_comp.c      | 6 +++---
 arch/sparc/net/bpf_jit_comp_64.c  | 2 +-
 arch/x86/net/bpf_jit_comp32.c     | 6 +++---
 8 files changed, 18 insertions(+), 18 deletions(-)

Comments

Paul Chaignon Aug. 12, 2021, 4:36 p.m. UTC | #1
On Mon, Aug 09, 2021 at 11:34:30AM +0200, Johan Almbladh wrote:
> A new test of tail call count limiting revealed that the interpreter
> did in fact allow up to MAX_TAIL_CALL_CNT + 1 tail calls, whereas the
> x86 JITs stopped at the intended MAX_TAIL_CALL_CNT. The interpreter was
> fixed in commit b61a28cf11d61f512172e673b8f8c4a6c789b425 ("bpf: Fix
> off-by-one in tail call count limiting"). This patch set fixes all
> arch-specific JITs except for RISC-V.

I'm a bit surprised by this because I had previously tested the tail
call limit of several JIT compilers and found it to be 33 (i.e.,
allowing chains of up to 34 programs). I've just extended a test program
I had to validate this again on the x86-64 JIT and found a limit of 33
tail calls again [1].

Also note we had previously changed the RISC-V and MIPS JITs to allow up
to 33 tail calls [2, 3], for consistency with other JITs and with the
interpreter. We had decided to increase these two to 33 rather than
decrease the other JITs to 32 for backward compatibility, though that
probably doesn't matter much as I'd expect few people to actually use 33
tail calls :-)

1 - https://github.com/pchaigno/tail-call-bench/commit/ae7887482985b4b1745c9b2ef7ff9ae506c82886
2 - 96bc4432 ("bpf, riscv: Limit to 33 tail calls")
3 - e49e6f6d ("bpf, mips: Limit to 33 tail calls")

>
> For each of the affected JITs, the incorrect behaviour was verified
> by running the test_bpf test suite in QEMU. After the fixes, the JITs
> pass the tail call count limiting test.

If you are referring to test_tailcall_3 and its associated BPF program
tailcall3, then as far as I can tell, it checks that 33 tail calls are
allowed. The counter is incremented before each tail call except the
first one. The last tail call is rejected because we reach the limit, so
a counter value of 33 (as checked in the test code) means we've
successfully executed 33 tail calls.
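The counting argument above can be checked with a small user-space model. This is a sketch under assumptions, not the selftest itself: it assumes an entry program that performs the first tail call without touching the counter, followed by a counting program that increments the counter and then tail-calls itself. The function name `selftest_counter` and its parameter are hypothetical.

```c
#include <assert.h>

/*
 * Return the final counter value when the runtime permits exactly
 * `allowed_tail_calls` successful tail calls.
 */
int selftest_counter(int allowed_tail_calls)
{
    int counter = 0; /* value the test inspects afterwards */
    int done = 0;    /* successful tail calls so far */

    assert(allowed_tail_calls >= 0);

    /* Entry program: first tail call, counter not incremented. */
    if (done >= allowed_tail_calls)
        return counter;
    done++;

    /* Counting program: increment, then try to tail-call itself. */
    for (;;) {
        counter++;
        if (done >= allowed_tail_calls)
            break;   /* last attempt is rejected */
        done++;
    }
    return counter;
}
```

In this model the counter always equals the number of successful tail calls, so `selftest_counter(33)` returns 33 and `selftest_counter(32)` returns 32. A checked value of 33 therefore corresponds to 33 executed tail calls, as argued above.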

--
Paul

Johan Almbladh Aug. 16, 2021, 7:17 a.m. UTC | #2
On Thu, Aug 12, 2021 at 6:37 PM Paul Chaignon <paul.chaignon@gmail.com> wrote:
> On Mon, Aug 09, 2021 at 11:34:30AM +0200, Johan Almbladh wrote:
> > A new test of tail call count limiting revealed that the interpreter
> > did in fact allow up to MAX_TAIL_CALL_CNT + 1 tail calls, whereas the
> > x86 JITs stopped at the intended MAX_TAIL_CALL_CNT. The interpreter was
> > fixed in commit b61a28cf11d61f512172e673b8f8c4a6c789b425 ("bpf: Fix
> > off-by-one in tail call count limiting"). This patch set fixes all
> > arch-specific JITs except for RISC-V.
>
> I'm a bit surprised by this because I had previously tested the tail
> call limit of several JIT compilers and found it to be 33 (i.e.,
> allowing chains of up to 34 programs). I've just extended a test program
> I had to validate this again on the x86-64 JIT and found a limit of 33
> tail calls again [1].

Hmm, that was surprising. I have been working on a MIPS32 JIT, and as
a part of that I have been extending the in-kernel test suite in
lib/test_bpf.c. The additional tests include a suite for testing tail
calls and associated error paths. The tests were merged to bpf-next
[1].

The tail call limit test is a very simple BPF program that increments
R1, sets R0 to R1, and then calls itself again with a tail call. Since
the program is called with R1=0, the return value R0 will then be 1 +
number of tail calls executed. When I ran this on x86 I got the
following result.

Interpreter: 34
x86_64 JIT: 33
i386 JIT: 33
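The relation between the reported values and the number of tail calls can be sketched in user space as follows. This is only a model of the test as described above, not the code in lib/test_bpf.c. It assumes R1 carries over from one tail call to the next, and the function name `limit_test_return_value` is hypothetical.

```c
#include <assert.h>

/*
 * Model of the test program: R1 += 1; R0 = R1; tail call to self.
 * Returns R0 when the runtime permits `allowed_tail_calls` tail calls.
 */
long limit_test_return_value(int allowed_tail_calls)
{
    long r0 = 0;
    long r1 = 0;  /* program is entered with R1 = 0 */
    int done = 0; /* successful tail calls so far */

    assert(allowed_tail_calls >= 0);

    for (;;) {
        r1 += 1;  /* increment R1 */
        r0 = r1;  /* set R0 to R1 */
        if (done >= allowed_tail_calls)
            break; /* tail call refused, program exits with R0 */
        done++;    /* tail call succeeds, program runs again */
    }
    return r0;
}
```

Here `limit_test_return_value(33)` returns 34 and `limit_test_return_value(32)` returns 33, so a result of 34 means 33 tail calls were executed and a result of 33 means 32 were.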

So, the interpreter and the x86 JITs had different behaviours. It was
then decided to change the interpreter to allow 32 tail calls to match
the behaviour of the x86 JITs [2]. As a follow-up, I tested
the other JITs except RISC-V in the same way, and found that they too
allowed one more tail call than the now-updated [3] interpreter. This
patch set updates the behaviour of those JITs as well.

[1] https://lore.kernel.org/bpf/20210809091829.810076-1-johan.almbladh@anyfinetworks.com/
[2] https://lore.kernel.org/bpf/5afe26c6-7ab1-88ab-a3e0-eb007256a856@iogearbox.net/
[3] b61a28cf1 ("bpf: Fix off-by-one in tail call count limiting")

> Also note we had previously changed the RISC-V and MIPS JITs to allow up
> to 33 tail calls [2, 3], for consistency with other JITs and with the
> interpreter. We had decided to increase these two to 33 rather than
> decrease the other JITs to 32 for backward compatibility, though that
> probably doesn't matter much as I'd expect few people to actually use 33
> tail calls :-)

Right, backwards compatibility is a valid point, even though I don't
think anyone would be near that limit. :-)

Whether the limit is 32 or 33 really doesn't matter. My only concern
here is that the limit should be the same across all JIT
implementations and the interpreter. We could instead change the x86
JITs and revert the interpreter change to let the limit be 33, if that
would be a better solution.

> 1 - https://github.com/pchaigno/tail-call-bench/commit/ae7887482985b4b1745c9b2ef7ff9ae506c82886
> 2 - 96bc4432 ("bpf, riscv: Limit to 33 tail calls")
> 3 - e49e6f6d ("bpf, mips: Limit to 33 tail calls")
>
> >
> > For each of the affected JITs, the incorrect behaviour was verified
> > by running the test_bpf test suite in QEMU. After the fixes, the JITs
> > pass the tail call count limiting test.
>
> If you are referring to test_tailcall_3 and its associated BPF program
> tailcall3, then as far as I can tell, it checks that 33 tail calls are
> allowed. The counter is incremented before each tail call except the
> first one. The last tail call is rejected because we reach the limit, so
> a counter value of 33 (as checked in the test code) means we've
> successfully executed 33 tail calls.

My test setup can build for all architectures included in this patch
set and some more, and then boot the kernel in QEMU with a
statically-linked busybox as userspace. I can easily run the kernel's
BPF test suite on all those architectures, but since I don't have a
full-fledged userspace I have not been able to run the selftests in
the same way.

We need to be able to determine what the tail call limit actually is
for the different implementations. I don't understand why you get
different results when testing from userspace compared to testing the
JIT itself. Either one of the tests is faulty, or there is some other
mechanism at play here.

Johan
