[v3,1/2] bpf: restore the ebpf program ID for BPF_AUDIT_UNLOAD and PERF_BPF_EVENT_PROG_UNLOAD

Message ID 20230106154400.74211-1-paul@paul-moore.com (mailing list archive)
State Accepted
Commit ef01f4e25c1760920e2c94f1c232350277ace69b
Delegated to: BPF

Checks

Context Check Description
netdev/tree_selection success Guessed tree name to be net-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers fail 4 blamed authors not CCed: toke@redhat.com andrii@kernel.org martin.lau@linux.dev daniel@iogearbox.net; 10 maintainers not CCed: kpsingh@kernel.org haoluo@google.com song@kernel.org yhs@fb.com daniel@iogearbox.net toke@redhat.com andrii@kernel.org martin.lau@linux.dev john.fastabend@gmail.com jolsa@kernel.org
netdev/build_clang success Errors and warnings before: 1 this patch: 1
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 2 this patch: 2
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 33 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-8 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_progs_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-36 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-37 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on s390x with gcc

Commit Message

Paul Moore Jan. 6, 2023, 3:43 p.m. UTC
When changing the ebpf program put() routines to support being called
from within IRQ context the program ID was reset to zero prior to
calling the perf event and audit UNLOAD record generators, which
resulted in problems as the ebpf program ID was bogus (always zero).
This patch addresses this problem by removing an unnecessary call to
bpf_prog_free_id() in __bpf_prog_offload_destroy() and adjusting
__bpf_prog_put() to only call bpf_prog_free_id() after audit and perf
have finished their bpf program unload tasks in
bpf_prog_put_deferred().  For the record, no one can determine, or
remember, why it was necessary to free the program ID, and remove it
from the IDR, prior to executing bpf_prog_put_deferred();
regardless, both Stanislav and Alexei agree that the approach in this
patch should be safe.
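
The ordering problem described above can be illustrated with a small
userspace model.  This is only a sketch: struct model_prog and every
model_*() helper below are hypothetical stand-ins invented for
illustration, not the kernel's structures or functions, and the model
assumes only what the text states, namely that freeing the ID makes it
read back as zero.

```c
/* Userspace model only: NOT the kernel's structures or functions. */
#include <assert.h>

struct model_prog {
	unsigned int id;         /* stands in for the ebpf program ID */
	unsigned int perf_seen;  /* ID observed by the modeled perf hook */
	unsigned int audit_seen; /* ID observed by the modeled audit hook */
};

/* Models bpf_prog_free_id(): the ID is released and reads back as zero. */
static void model_free_id(struct model_prog *prog)
{
	prog->id = 0;
}

/* Models the perf and audit UNLOAD generators reporting the current ID. */
static void model_unload_hooks(struct model_prog *prog)
{
	prog->perf_seen = prog->id;
	prog->audit_seen = prog->id;
}

/* Ordering before the patch: the ID is freed first, so the hooks see zero. */
static unsigned int model_put_old(unsigned int id)
{
	struct model_prog prog = { .id = id };

	model_free_id(&prog);
	model_unload_hooks(&prog);
	return prog.audit_seen;
}

/* Ordering after the patch: the hooks run first and see the real ID. */
static unsigned int model_put_new(unsigned int id)
{
	struct model_prog prog = { .id = id };

	model_unload_hooks(&prog);
	model_free_id(&prog);
	assert(prog.id == 0);	/* the ID is still released in the end */
	return prog.audit_seen;
}
```

In this model, model_put_old(42) returns 0 while model_put_new(42)
returns 42, which mirrors the bogus versus restored ID in the UNLOAD
records.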

It is worth noting that when moving the bpf_prog_free_id() call, the
do_idr_lock parameter was forced to true, as the ebpf devs determined
that do_idr_lock should always be true.  The do_idr_lock parameter
will be removed in a follow-up patch, but it was kept here to keep
the patch small in an effort to ease any stable backports.

I also modified the bpf_audit_prog() logic used to associate the
AUDIT_BPF record with other associated records, e.g. @ctx != NULL.
Instead of keying off the operation, it now keys off the execution
context, e.g. '!in_irq() && !irqs_disabled()', which is much more
appropriate and should help better connect the UNLOAD operations with
the associated audit state (other audit records).
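
The change in how the audit context is chosen can be sketched as two
predicates.  This is a userspace illustration under stated assumptions:
enum model_op, struct model_exec_ctx and both model_use_ctx_*() helpers
are hypothetical names, and the two booleans merely stand in for the
results of in_irq() and irqs_disabled().

```c
/* Userspace model only: NOT the kernel's bpf_audit_prog(). */
#include <assert.h>	/* only needed for the checks in the note below */
#include <stdbool.h>

enum model_op { MODEL_AUDIT_LOAD, MODEL_AUDIT_UNLOAD };

struct model_exec_ctx {
	bool in_irq;        /* stands in for in_irq() */
	bool irqs_disabled; /* stands in for irqs_disabled() */
};

/* Old rule: only LOAD records were tied to the task's audit context. */
static bool model_use_ctx_old(enum model_op op, struct model_exec_ctx e)
{
	(void)e;	/* execution context was not consulted */
	return op == MODEL_AUDIT_LOAD;
}

/* New rule: any operation outside IRQ context uses the audit context. */
static bool model_use_ctx_new(enum model_op op, struct model_exec_ctx e)
{
	(void)op;	/* operation is no longer consulted */
	return !e.in_irq && !e.irqs_disabled;
}
```

For an UNLOAD issued from ordinary task context (both flags false), the
old predicate is false and the new one is true, so in this model the
UNLOAD record is now associated with the task's other audit records.
With in_irq set, the new predicate is false for either operation.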

Cc: stable@vger.kernel.org
Fixes: d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq context.")
Reported-by: Burn Alting <burn.alting@iinet.net.au>
Reported-by: Jiri Olsa <olsajiri@gmail.com>
Suggested-by: Stanislav Fomichev <sdf@google.com>
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>

---
* v3
- abandon most of the changes in v2
- move bpf_prog_free_id() after the audit/perf unload hooks
- remove bpf_prog_free_id() from __bpf_prog_offload_destroy()
- added stable tag
* v2
- change subj
- add mention of the perf regression
- drop the dedicated program audit ID
- add the bpf_prog::valid_id flag, bpf_prog_get_id() getter
- convert prog ID users to new ID getter
* v1
- subj was: "bpf: restore the ebpf audit UNLOAD id field"
- initial draft
---
 kernel/bpf/offload.c | 3 ---
 kernel/bpf/syscall.c | 6 ++----
 2 files changed, 2 insertions(+), 7 deletions(-)

Comments

Stanislav Fomichev Jan. 6, 2023, 7:45 p.m. UTC | #1
On Fri, Jan 6, 2023 at 7:44 AM Paul Moore <paul@paul-moore.com> wrote:
>
> When changing the ebpf program put() routines to support being called
> from within IRQ context the program ID was reset to zero prior to
> calling the perf event and audit UNLOAD record generators, which
> resulted in problems as the ebpf program ID was bogus (always zero).
> This patch addresses this problem by removing an unnecessary call to
> bpf_prog_free_id() in __bpf_prog_offload_destroy() and adjusting
> __bpf_prog_put() to only call bpf_prog_free_id() after audit and perf
> have finished their bpf program unload tasks in
> bpf_prog_put_deferred().  For the record, no one can determine, or
> remember, why it was necessary to free the program ID, and remove it
> from the IDR, prior to executing bpf_prog_put_deferred();
> regardless, both Stanislav and Alexei agree that the approach in this
> patch should be safe.
>
> It is worth noting that when moving the bpf_prog_free_id() call, the
> do_idr_lock parameter was forced to true as the ebpf devs determined
> this was the correct as the do_idr_lock should always be true.  The
> do_idr_lock parameter will be removed in a follow-up patch, but it
> was kept here to keep the patch small in an effort to ease any stable
> backports.
>
> I also modified the bpf_audit_prog() logic used to associate the
> AUDIT_BPF record with other associated records, e.g. @ctx != NULL.
> Instead of keying off the operation, it now keys off the execution
> context, e.g. '!in_irg && !irqs_disabled()', which is much more
> appropriate and should help better connect the UNLOAD operations with
> the associated audit state (other audit records).
>
> Cc: stable@vger.kernel.org
> Fixes: d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq context.")
> Reported-by: Burn Alting <burn.alting@iinet.net.au>
> Reported-by: Jiri Olsa <olsajiri@gmail.com>
> Suggested-by: Stanislav Fomichev <sdf@google.com>
> Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Signed-off-by: Paul Moore <paul@paul-moore.com>

Acked-by: Stanislav Fomichev <sdf@google.com>

Thank you! There might be a chance it breaks test_offload.py (I don't
remember whether it checks this prog-is-removed-from-id part or not),
but I don't think it's fair to ask to address it :-)
Since it doesn't trigger in CI, I'll take another look next week when
doing a respin of my 'xdp-hints' series.


> ---
> * v3
> - abandon most of the changes in v2
> - move bpf_prog_free_id() after the audit/perf unload hooks
> - remove bpf_prog_free_id() from __bpf_prog_offload_destroy()
> - added stable tag
> * v2
> - change subj
> - add mention of the perf regression
> - drop the dedicated program audit ID
> - add the bpf_prog::valid_id flag, bpf_prog_get_id() getter
> - convert prog ID users to new ID getter
> * v1
> - subj was: "bpf: restore the ebpf audit UNLOAD id field"
> - initial draft
> ---
>  kernel/bpf/offload.c | 3 ---
>  kernel/bpf/syscall.c | 6 ++----
>  2 files changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c
> index 13e4efc971e6..190d9f9dc987 100644
> --- a/kernel/bpf/offload.c
> +++ b/kernel/bpf/offload.c
> @@ -216,9 +216,6 @@ static void __bpf_prog_offload_destroy(struct bpf_prog *prog)
>         if (offload->dev_state)
>                 offload->offdev->ops->destroy(prog);
>
> -       /* Make sure BPF_PROG_GET_NEXT_ID can't find this dead program */
> -       bpf_prog_free_id(prog, true);
> -
>         list_del_init(&offload->offloads);
>         kfree(offload);
>         prog->aux->offload = NULL;
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 64131f88c553..61bb19e81b9c 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1972,7 +1972,7 @@ static void bpf_audit_prog(const struct bpf_prog *prog, unsigned int op)
>                 return;
>         if (audit_enabled == AUDIT_OFF)
>                 return;
> -       if (op == BPF_AUDIT_LOAD)
> +       if (!in_irq() && !irqs_disabled())
>                 ctx = audit_context();
>         ab = audit_log_start(ctx, GFP_ATOMIC, AUDIT_BPF);
>         if (unlikely(!ab))
> @@ -2067,6 +2067,7 @@ static void bpf_prog_put_deferred(struct work_struct *work)
>         prog = aux->prog;
>         perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_UNLOAD, 0);
>         bpf_audit_prog(prog, BPF_AUDIT_UNLOAD);
> +       bpf_prog_free_id(prog, true);
>         __bpf_prog_put_noref(prog, true);
>  }
>
> @@ -2075,9 +2076,6 @@ static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock)
>         struct bpf_prog_aux *aux = prog->aux;
>
>         if (atomic64_dec_and_test(&aux->refcnt)) {
> -               /* bpf_prog_free_id() must be called first */
> -               bpf_prog_free_id(prog, do_idr_lock);
> -
>                 if (in_irq() || irqs_disabled()) {
>                         INIT_WORK(&aux->work, bpf_prog_put_deferred);
>                         schedule_work(&aux->work);
> --
> 2.39.0
>
Paul Moore Jan. 9, 2023, 4:54 p.m. UTC | #2
On Fri, Jan 6, 2023 at 2:45 PM Stanislav Fomichev <sdf@google.com> wrote:
> On Fri, Jan 6, 2023 at 7:44 AM Paul Moore <paul@paul-moore.com> wrote:
> >
> > When changing the ebpf program put() routines to support being called
> > from within IRQ context the program ID was reset to zero prior to
> > calling the perf event and audit UNLOAD record generators, which
> > resulted in problems as the ebpf program ID was bogus (always zero).
> > This patch addresses this problem by removing an unnecessary call to
> > bpf_prog_free_id() in __bpf_prog_offload_destroy() and adjusting
> > __bpf_prog_put() to only call bpf_prog_free_id() after audit and perf
> > have finished their bpf program unload tasks in
> > bpf_prog_put_deferred().  For the record, no one can determine, or
> > remember, why it was necessary to free the program ID, and remove it
> > from the IDR, prior to executing bpf_prog_put_deferred();
> > regardless, both Stanislav and Alexei agree that the approach in this
> > patch should be safe.
> >
> > It is worth noting that when moving the bpf_prog_free_id() call, the
> > do_idr_lock parameter was forced to true as the ebpf devs determined
> > this was the correct as the do_idr_lock should always be true.  The
> > do_idr_lock parameter will be removed in a follow-up patch, but it
> > was kept here to keep the patch small in an effort to ease any stable
> > backports.
> >
> > I also modified the bpf_audit_prog() logic used to associate the
> > AUDIT_BPF record with other associated records, e.g. @ctx != NULL.
> > Instead of keying off the operation, it now keys off the execution
> > context, e.g. '!in_irg && !irqs_disabled()', which is much more
> > appropriate and should help better connect the UNLOAD operations with
> > the associated audit state (other audit records).
> >
> > Cc: stable@vger.kernel.org
> > Fixes: d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq context.")
> > Reported-by: Burn Alting <burn.alting@iinet.net.au>
> > Reported-by: Jiri Olsa <olsajiri@gmail.com>
> > Suggested-by: Stanislav Fomichev <sdf@google.com>
> > Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > Signed-off-by: Paul Moore <paul@paul-moore.com>
>
> Acked-by: Stanislav Fomichev <sdf@google.com>
>
> Thank you! There might be a chance it breaks test_offload.py (I don't
> remember whether it checks this prog-is-removed-from-id part or not),
> but I don't think it's fair to ask to address it :-)
> Since it doesn't trigger in CI, I'll take another look next week when
> doing a respin of my 'xdp-hints' series.

No problem, I'm glad we found a solution that works for everyone; and
thank you for chasing down any test changes that may be necessary.

I'd like to get this patch into Linus' tree sooner rather than later
as it fixes a kinda ugly problem, would you be okay if this went in
via the bpf tree?  With the appropriate ACKs I could send it to Linus
via the audit tree, but I think it would be much better to send it via
the bpf/netdev tree.
Stanislav Fomichev Jan. 9, 2023, 6:04 p.m. UTC | #3
On 01/09, Paul Moore wrote:
> On Fri, Jan 6, 2023 at 2:45 PM Stanislav Fomichev <sdf@google.com> wrote:
> > On Fri, Jan 6, 2023 at 7:44 AM Paul Moore <paul@paul-moore.com> wrote:
> > >
> > > When changing the ebpf program put() routines to support being called
> > > from within IRQ context the program ID was reset to zero prior to
> > > calling the perf event and audit UNLOAD record generators, which
> > > resulted in problems as the ebpf program ID was bogus (always zero).
> > > This patch addresses this problem by removing an unnecessary call to
> > > bpf_prog_free_id() in __bpf_prog_offload_destroy() and adjusting
> > > __bpf_prog_put() to only call bpf_prog_free_id() after audit and perf
> > > have finished their bpf program unload tasks in
> > > bpf_prog_put_deferred().  For the record, no one can determine, or
> > > remember, why it was necessary to free the program ID, and remove it
> > > from the IDR, prior to executing bpf_prog_put_deferred();
> > > regardless, both Stanislav and Alexei agree that the approach in this
> > > patch should be safe.
> > >
> > > It is worth noting that when moving the bpf_prog_free_id() call, the
> > > do_idr_lock parameter was forced to true as the ebpf devs determined
> > > this was the correct as the do_idr_lock should always be true.  The
> > > do_idr_lock parameter will be removed in a follow-up patch, but it
> > > was kept here to keep the patch small in an effort to ease any stable
> > > backports.
> > >
> > > I also modified the bpf_audit_prog() logic used to associate the
> > > AUDIT_BPF record with other associated records, e.g. @ctx != NULL.
> > > Instead of keying off the operation, it now keys off the execution
> > > context, e.g. '!in_irg && !irqs_disabled()', which is much more
> > > appropriate and should help better connect the UNLOAD operations with
> > > the associated audit state (other audit records).
> > >
> > > Cc: stable@vger.kernel.org
> > > Fixes: d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq context.")
> > > Reported-by: Burn Alting <burn.alting@iinet.net.au>
> > > Reported-by: Jiri Olsa <olsajiri@gmail.com>
> > > Suggested-by: Stanislav Fomichev <sdf@google.com>
> > > Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > > Signed-off-by: Paul Moore <paul@paul-moore.com>
> >
> > Acked-by: Stanislav Fomichev <sdf@google.com>
> >
> > Thank you! There might be a chance it breaks test_offload.py (I don't
> > remember whether it checks this prog-is-removed-from-id part or not),
> > but I don't think it's fair to ask to address it :-)
> > Since it doesn't trigger in CI, I'll take another look next week when
> > doing a respin of my 'xdp-hints' series.

> No problem, I'm glad we found a solution that works for everyone; and
> thank you for chasing down any test changes that may be necessary.

> I'd like to get this patch into Linus' tree sooner rather than later
> as it fixes a kinda ugly problem, would you be okay if this went in
> via the bpf tree?  With the appropriate ACKs I could send it to Linus
> via the audit tree, but I think it would be much better to send it via
> the bpf/netdev tree.

I don't see any reason why this should go via bpf-next, so going via
the bpf tree should be fine.


> --
> paul-moore.com
patchwork-bot+netdevbpf@kernel.org Jan. 10, 2023, 4 a.m. UTC | #4
Hello:

This series was applied to bpf/bpf.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Fri,  6 Jan 2023 10:43:59 -0500 you wrote:
> When changing the ebpf program put() routines to support being called
> from within IRQ context the program ID was reset to zero prior to
> calling the perf event and audit UNLOAD record generators, which
> resulted in problems as the ebpf program ID was bogus (always zero).
> This patch addresses this problem by removing an unnecessary call to
> bpf_prog_free_id() in __bpf_prog_offload_destroy() and adjusting
> __bpf_prog_put() to only call bpf_prog_free_id() after audit and perf
> have finished their bpf program unload tasks in
> bpf_prog_put_deferred().  For the record, no one can determine, or
> remember, why it was necessary to free the program ID, and remove it
> from the IDR, prior to executing bpf_prog_put_deferred();
> regardless, both Stanislav and Alexei agree that the approach in this
> patch should be safe.
> 
> [...]

Here is the summary with links:
  - [v3,1/2] bpf: restore the ebpf program ID for BPF_AUDIT_UNLOAD and PERF_BPF_EVENT_PROG_UNLOAD
    https://git.kernel.org/bpf/bpf/c/ef01f4e25c17
  - [v3,2/2] bpf: remove the do_idr_lock parameter from bpf_prog_free_id()
    https://git.kernel.org/bpf/bpf/c/e7895f017b79

You are awesome, thank you!
Jiri Olsa Jan. 10, 2023, 9:10 a.m. UTC | #5
On Fri, Jan 06, 2023 at 10:43:59AM -0500, Paul Moore wrote:
> When changing the ebpf program put() routines to support being called
> from within IRQ context the program ID was reset to zero prior to
> calling the perf event and audit UNLOAD record generators, which
> resulted in problems as the ebpf program ID was bogus (always zero).
> This patch addresses this problem by removing an unnecessary call to
> bpf_prog_free_id() in __bpf_prog_offload_destroy() and adjusting
> __bpf_prog_put() to only call bpf_prog_free_id() after audit and perf
> have finished their bpf program unload tasks in
> bpf_prog_put_deferred().  For the record, no one can determine, or
> remember, why it was necessary to free the program ID, and remove it
> from the IDR, prior to executing bpf_prog_put_deferred();
> regardless, both Stanislav and Alexei agree that the approach in this
> patch should be safe.
> 
> It is worth noting that when moving the bpf_prog_free_id() call, the
> do_idr_lock parameter was forced to true as the ebpf devs determined
> this was the correct as the do_idr_lock should always be true.  The
> do_idr_lock parameter will be removed in a follow-up patch, but it
> was kept here to keep the patch small in an effort to ease any stable
> backports.
> 
> I also modified the bpf_audit_prog() logic used to associate the
> AUDIT_BPF record with other associated records, e.g. @ctx != NULL.
> Instead of keying off the operation, it now keys off the execution
> context, e.g. '!in_irg && !irqs_disabled()', which is much more
> appropriate and should help better connect the UNLOAD operations with
> the associated audit state (other audit records).
> 
> Cc: stable@vger.kernel.org
> Fixes: d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq context.")
> Reported-by: Burn Alting <burn.alting@iinet.net.au>
> Reported-by: Jiri Olsa <olsajiri@gmail.com>
> Suggested-by: Stanislav Fomichev <sdf@google.com>
> Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Signed-off-by: Paul Moore <paul@paul-moore.com>
> 
> ---
> * v3
> - abandon most of the changes in v2
> - move bpf_prog_free_id() after the audit/perf unload hooks
> - remove bpf_prog_free_id() from __bpf_prog_offload_destroy()
> - added stable tag

fwiw I checked and the perf UNLOAD events have proper id now
thanks for fixing this

jirka


> * v2
> - change subj
> - add mention of the perf regression
> - drop the dedicated program audit ID
> - add the bpf_prog::valid_id flag, bpf_prog_get_id() getter
> - convert prog ID users to new ID getter
> * v1
> - subj was: "bpf: restore the ebpf audit UNLOAD id field"
> - initial draft
> ---
>  kernel/bpf/offload.c | 3 ---
>  kernel/bpf/syscall.c | 6 ++----
>  2 files changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c
> index 13e4efc971e6..190d9f9dc987 100644
> --- a/kernel/bpf/offload.c
> +++ b/kernel/bpf/offload.c
> @@ -216,9 +216,6 @@ static void __bpf_prog_offload_destroy(struct bpf_prog *prog)
>  	if (offload->dev_state)
>  		offload->offdev->ops->destroy(prog);
>  
> -	/* Make sure BPF_PROG_GET_NEXT_ID can't find this dead program */
> -	bpf_prog_free_id(prog, true);
> -
>  	list_del_init(&offload->offloads);
>  	kfree(offload);
>  	prog->aux->offload = NULL;
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 64131f88c553..61bb19e81b9c 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1972,7 +1972,7 @@ static void bpf_audit_prog(const struct bpf_prog *prog, unsigned int op)
>  		return;
>  	if (audit_enabled == AUDIT_OFF)
>  		return;
> -	if (op == BPF_AUDIT_LOAD)
> +	if (!in_irq() && !irqs_disabled())
>  		ctx = audit_context();
>  	ab = audit_log_start(ctx, GFP_ATOMIC, AUDIT_BPF);
>  	if (unlikely(!ab))
> @@ -2067,6 +2067,7 @@ static void bpf_prog_put_deferred(struct work_struct *work)
>  	prog = aux->prog;
>  	perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_UNLOAD, 0);
>  	bpf_audit_prog(prog, BPF_AUDIT_UNLOAD);
> +	bpf_prog_free_id(prog, true);
>  	__bpf_prog_put_noref(prog, true);
>  }
>  
> @@ -2075,9 +2076,6 @@ static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock)
>  	struct bpf_prog_aux *aux = prog->aux;
>  
>  	if (atomic64_dec_and_test(&aux->refcnt)) {
> -		/* bpf_prog_free_id() must be called first */
> -		bpf_prog_free_id(prog, do_idr_lock);
> -
>  		if (in_irq() || irqs_disabled()) {
>  			INIT_WORK(&aux->work, bpf_prog_put_deferred);
>  			schedule_work(&aux->work);
> -- 
> 2.39.0
>
Paul Moore Jan. 10, 2023, 4:55 p.m. UTC | #6
On Tue, Jan 10, 2023 at 4:10 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> On Fri, Jan 06, 2023 at 10:43:59AM -0500, Paul Moore wrote:
> > When changing the ebpf program put() routines to support being called
> > from within IRQ context the program ID was reset to zero prior to
> > calling the perf event and audit UNLOAD record generators, which
> > resulted in problems as the ebpf program ID was bogus (always zero).
> > This patch addresses this problem by removing an unnecessary call to
> > bpf_prog_free_id() in __bpf_prog_offload_destroy() and adjusting
> > __bpf_prog_put() to only call bpf_prog_free_id() after audit and perf
> > have finished their bpf program unload tasks in
> > bpf_prog_put_deferred().  For the record, no one can determine, or
> > remember, why it was necessary to free the program ID, and remove it
> > from the IDR, prior to executing bpf_prog_put_deferred();
> > regardless, both Stanislav and Alexei agree that the approach in this
> > patch should be safe.
> >
> > It is worth noting that when moving the bpf_prog_free_id() call, the
> > do_idr_lock parameter was forced to true as the ebpf devs determined
> > this was the correct as the do_idr_lock should always be true.  The
> > do_idr_lock parameter will be removed in a follow-up patch, but it
> > was kept here to keep the patch small in an effort to ease any stable
> > backports.
> >
> > I also modified the bpf_audit_prog() logic used to associate the
> > AUDIT_BPF record with other associated records, e.g. @ctx != NULL.
> > Instead of keying off the operation, it now keys off the execution
> > context, e.g. '!in_irg && !irqs_disabled()', which is much more
> > appropriate and should help better connect the UNLOAD operations with
> > the associated audit state (other audit records).
> >
> > Cc: stable@vger.kernel.org
> > Fixes: d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq context.")
> > Reported-by: Burn Alting <burn.alting@iinet.net.au>
> > Reported-by: Jiri Olsa <olsajiri@gmail.com>
> > Suggested-by: Stanislav Fomichev <sdf@google.com>
> > Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > Signed-off-by: Paul Moore <paul@paul-moore.com>
> >
> > ---
> > * v3
> > - abandon most of the changes in v2
> > - move bpf_prog_free_id() after the audit/perf unload hooks
> > - remove bpf_prog_free_id() from __bpf_prog_offload_destroy()
> > - added stable tag
>
> fwiw I checked and the perf UNLOAD events have proper id now
> thanks for fixing this

No problem, thanks for verifying that this solves the perf problem too.
Patch

diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c
index 13e4efc971e6..190d9f9dc987 100644
--- a/kernel/bpf/offload.c
+++ b/kernel/bpf/offload.c
@@ -216,9 +216,6 @@  static void __bpf_prog_offload_destroy(struct bpf_prog *prog)
 	if (offload->dev_state)
 		offload->offdev->ops->destroy(prog);
 
-	/* Make sure BPF_PROG_GET_NEXT_ID can't find this dead program */
-	bpf_prog_free_id(prog, true);
-
 	list_del_init(&offload->offloads);
 	kfree(offload);
 	prog->aux->offload = NULL;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 64131f88c553..61bb19e81b9c 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1972,7 +1972,7 @@  static void bpf_audit_prog(const struct bpf_prog *prog, unsigned int op)
 		return;
 	if (audit_enabled == AUDIT_OFF)
 		return;
-	if (op == BPF_AUDIT_LOAD)
+	if (!in_irq() && !irqs_disabled())
 		ctx = audit_context();
 	ab = audit_log_start(ctx, GFP_ATOMIC, AUDIT_BPF);
 	if (unlikely(!ab))
@@ -2067,6 +2067,7 @@  static void bpf_prog_put_deferred(struct work_struct *work)
 	prog = aux->prog;
 	perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_UNLOAD, 0);
 	bpf_audit_prog(prog, BPF_AUDIT_UNLOAD);
+	bpf_prog_free_id(prog, true);
 	__bpf_prog_put_noref(prog, true);
 }
 
@@ -2075,9 +2076,6 @@  static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock)
 	struct bpf_prog_aux *aux = prog->aux;
 
 	if (atomic64_dec_and_test(&aux->refcnt)) {
-		/* bpf_prog_free_id() must be called first */
-		bpf_prog_free_id(prog, do_idr_lock);
-
 		if (in_irq() || irqs_disabled()) {
 			INIT_WORK(&aux->work, bpf_prog_put_deferred);
 			schedule_work(&aux->work);