mbox series

[bpf-next,v4,0/4] bpf: Fix tailcall infinite loop caused by freplace

Message ID 20240929132757.79826-1-leon.hwang@linux.dev (mailing list archive)
Headers show
Series bpf: Fix tailcall infinite loop caused by freplace | expand

Message

Leon Hwang Sept. 29, 2024, 1:27 p.m. UTC
Previously, I fixed a tailcall infinite loop issue caused by trampoline[0].

At this time, I fix a tailcall infinite loop issue caused by tailcall and
freplace combination by preventing updating extended prog to prog_array map
and preventing extending tail callee prog with freplace:

1. If a prog or its subprog has been extended by freplace prog, the prog
   can not be updated to prog_array map.
2. If a prog has been updated to prog_array map, it or its subprog can not
   be extended by freplace prog.

v3 -> v4:
  * Address comments from Eduard:
    * Rename 'tail_callee_cnt' to 'prog_array_member_cnt'.
    * Add comment to 'prog_array_member_cnt'.
    * Use a mutex to protect 'is_extended' and 'prog_array_member_cnt'.

v2 -> v3:
  * Address comments from Alexei:
    * Stop hacking JIT.
    * Prevent the specific use case at attach/update time.

v1 -> v2:
  * Address comment from Eduard:
    * Explain why nop5 and xor/nop3 are swapped at prologue.
  * Address comment from Alexei:
    * Disallow attaching tail_call_reachable freplace prog to
      not-tail_call_reachable target in verifier.
  * Update "bpf, arm64: Fix tailcall infinite loop caused by freplace" with
    latest arm64 JIT code.

Links:
[0] https://lore.kernel.org/bpf/20230912150442.2009-1-hffilwlqm@gmail.com/

Leon Hwang (4):
  bpf: Prevent updating extended prog to prog_array map
  bpf: Prevent extending tail callee prog with freplace prog
  selftests/bpf: Add a test case to confirm a tailcall infinite loop
    issue has been prevented
  selftests/bpf: Add cases to test tailcall in freplace

 include/linux/bpf.h                           |   3 +
 kernel/bpf/arraymap.c                         |  30 ++-
 kernel/bpf/core.c                             |   1 +
 kernel/bpf/syscall.c                          |  53 ++++-
 .../selftests/bpf/prog_tests/tailcalls.c      | 196 +++++++++++++++++-
 .../tailcall_bpf2bpf_hierarchy_freplace.c     |  30 +++
 .../testing/selftests/bpf/progs/tc_bpf2bpf.c  |  37 +++-
 7 files changed, 335 insertions(+), 15 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy_freplace.c

Comments

Eduard Zingerman Oct. 1, 2024, 11:14 a.m. UTC | #1
On Sun, 2024-09-29 at 21:27 +0800, Leon Hwang wrote:
> Previously, I fixed a tailcall infinite loop issue caused by trampoline[0].
> 
> At this time, I fix a tailcall infinite loop issue caused by tailcall and
> freplace combination by preventing updating extended prog to prog_array map
> and preventing extending tail callee prog with freplace:
> 
> 1. If a prog or its subprog has been extended by freplace prog, the prog
>    can not be updated to prog_array map.
> 2. If a prog has been updated to prog_array map, it or its subprog can not
>    be extended by freplace prog.

So, once this series is applied we essentially have:
- three variables:
  - tgt_prog->aux->is_extended
  - tgt_prog->aux->prog_array_member_cnt
  - trampoline->extension_prog
- four operations:
  - link/attach extension program 'prog' using trampoline 'tr'
  - unlink/detach extension program 'prog' using trampoline 'tr'
  - put program 'tgt_prog' into prog array
  - remove program 'tgt_prog' from prog array

And above four operations have the following pseudo-code with regards
to update of the variables:

- link/attach extension program 'prog' using trampoline 'tr':

    with lock(tgt_prog->ext_mutex):
      if tgt_prog->aux->prog_array_member_cnt:
         return error
      if tr->extension_prog:
         return error
      tr->extension_prog = prog
      tgt_prog->is_extended = true

- unlink/detach extension program 'prog' using trampoline 'tr':

    with lock(tgt_prog->ext_mutex):
      tr->extension_prog = NULL
      tgt_prog->is_extended = false

- put program 'tgt_prog' into prog array:

    with lock(tgt_prog->ext_mutex):
      if tgt_prog->aux->is_extended:
         return error
      tgt_prog->aux->prog_array_member_cnt++

- remove program 'tgt_prog' from prog array:

    with lock(tgt_prog->ext_mutex):
      tgt_prog->aux->prog_array_member_cnt--

I think this is correct, would be great if someone with more
concurrency related experience would take a look.

[...]