Message ID | 20221115095043.1249776-1-jolsa@kernel.org (mailing list archive) |
---|---|
State | RFC |
Delegated to: | BPF |
Headers | show |
Series | [RFC,bpf-next] bpf: Fix perf bpf event and audit prog id logging | expand |
On 11/15/22 10:50 AM, Jiri Olsa wrote: > hi, > perf_event_bpf_event and bpf_audit_prog calls currently report zero > program id for unload path. > > It's because of the [1] change moved those audit calls into work queue > and they are executed after the id is zeroed in bpf_prog_free_id. > > I originally made a change that added 'id_audit' field to struct > bpf_prog, which would be initialized as id, untouched and used > in audit callbacks. > > Then I realized we might actually not need to zero prog->aux->id > in bpf_prog_free_id. It seems to be called just once on release > paths. Tests seems ok with that. > > thoughts? > > thanks, > jirka > > > [1] d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq context.") > --- > kernel/bpf/syscall.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index fdbae52f463f..426529355c29 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -1991,7 +1991,6 @@ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) > __acquire(&prog_idr_lock); > > idr_remove(&prog_idr, prog->aux->id); > - prog->aux->id = 0; This would trigger a race when offloaded progs are used, see also ad8ad79f4f60 ("bpf: offload: free program id when device disappears"). __bpf_prog_offload_destroy() calls it, and my read is that later bpf_prog_free_id() then hits the early !prog->aux->id return path. Is there a reason for irq context to not defer the bpf_prog_free_id()? > if (do_idr_lock) > spin_unlock_irqrestore(&prog_idr_lock, flags); >
On Tue, Nov 15, 2022 at 01:49:54PM +0100, Daniel Borkmann wrote: > On 11/15/22 10:50 AM, Jiri Olsa wrote: > > hi, > > perf_event_bpf_event and bpf_audit_prog calls currently report zero > > program id for unload path. > > > > It's because of the [1] change moved those audit calls into work queue > > and they are executed after the id is zeroed in bpf_prog_free_id. > > > > I originally made a change that added 'id_audit' field to struct > > bpf_prog, which would be initialized as id, untouched and used > > in audit callbacks. > > > > Then I realized we might actually not need to zero prog->aux->id > > in bpf_prog_free_id. It seems to be called just once on release > > paths. Tests seems ok with that. > > > > thoughts? > > > > thanks, > > jirka > > > > > > [1] d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq context.") > > --- > > kernel/bpf/syscall.c | 1 - > > 1 file changed, 1 deletion(-) > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > > index fdbae52f463f..426529355c29 100644 > > --- a/kernel/bpf/syscall.c > > +++ b/kernel/bpf/syscall.c > > @@ -1991,7 +1991,6 @@ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) > > __acquire(&prog_idr_lock); > > idr_remove(&prog_idr, prog->aux->id); > > - prog->aux->id = 0; > > This would trigger a race when offloaded progs are used, see also ad8ad79f4f60 ("bpf: > offload: free program id when device disappears"). __bpf_prog_offload_destroy() calls > it, and my read is that later bpf_prog_free_id() then hits the early !prog->aux->id > return path. Is there a reason for irq context to not defer the bpf_prog_free_id()? there's comment saying: /* bpf_prog_free_id() must be called first */ it was added together with the BPF_(PROG|MAP)_GET_NEXT_ID api: 34ad5580f8f9 bpf: Add BPF_(PROG|MAP)_GET_NEXT_ID command Martin, any idea? while looking on this I noticed we can remove the do_idr_lock argument (patch below) because it's always true and I think we need to upgrade all the prog_idr_lock locks to spin_lock_irqsave because bpf_prog_put could be called from irq context because of d809e134be7a thanks, jirka --- diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 49f9d2bec401..50c71ce698d0 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1761,7 +1761,7 @@ void bpf_prog_inc(struct bpf_prog *prog); struct bpf_prog * __must_check bpf_prog_inc_not_zero(struct bpf_prog *prog); void bpf_prog_put(struct bpf_prog *prog); -void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock); +void bpf_prog_free_id(struct bpf_prog *prog); void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock); struct btf_field *btf_record_find(const struct btf_record *rec, diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c index 13e4efc971e6..327dab644200 100644 --- a/kernel/bpf/offload.c +++ b/kernel/bpf/offload.c @@ -217,7 +217,7 @@ static void __bpf_prog_offload_destroy(struct bpf_prog *prog) offload->offdev->ops->destroy(prog); /* Make sure BPF_PROG_GET_NEXT_ID can't find this dead program */ - bpf_prog_free_id(prog, true); + bpf_prog_free_id(prog); list_del_init(&offload->offloads); kfree(offload); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index fdbae52f463f..9b929d8ab06d 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1973,7 +1973,7 @@ static int bpf_prog_alloc_id(struct bpf_prog *prog) return id > 0 ? 0 : id; } -void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) +void bpf_prog_free_id(struct bpf_prog *prog) { unsigned long flags; @@ -1985,18 +1985,12 @@ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) if (!prog->aux->id) return; - if (do_idr_lock) - spin_lock_irqsave(&prog_idr_lock, flags); - else - __acquire(&prog_idr_lock); + spin_lock_irqsave(&prog_idr_lock, flags); idr_remove(&prog_idr, prog->aux->id); prog->aux->id = 0; - if (do_idr_lock) - spin_unlock_irqrestore(&prog_idr_lock, flags); - else - __release(&prog_idr_lock); + spin_unlock_irqrestore(&prog_idr_lock, flags); } static void __bpf_prog_put_rcu(struct rcu_head *rcu) @@ -2048,7 +2042,7 @@ static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock) if (atomic64_dec_and_test(&aux->refcnt)) { /* bpf_prog_free_id() must be called first */ - bpf_prog_free_id(prog, do_idr_lock); + bpf_prog_free_id(prog); if (in_irq() || irqs_disabled()) { INIT_WORK(&aux->work, bpf_prog_put_deferred);
On Tue, Nov 15, 2022 at 3:26 PM Jiri Olsa <olsajiri@gmail.com> wrote: > > On Tue, Nov 15, 2022 at 01:49:54PM +0100, Daniel Borkmann wrote: > > On 11/15/22 10:50 AM, Jiri Olsa wrote: > > > hi, > > > perf_event_bpf_event and bpf_audit_prog calls currently report zero > > > program id for unload path. > > > > > > It's because of the [1] change moved those audit calls into work queue > > > and they are executed after the id is zeroed in bpf_prog_free_id. > > > > > > I originally made a change that added 'id_audit' field to struct > > > bpf_prog, which would be initialized as id, untouched and used > > > in audit callbacks. > > > > > > Then I realized we might actually not need to zero prog->aux->id > > > in bpf_prog_free_id. It seems to be called just once on release > > > paths. Tests seems ok with that. > > > > > > thoughts? > > > > > > thanks, > > > jirka > > > > > > > > > [1] d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq context.") > > > --- > > > kernel/bpf/syscall.c | 1 - > > > 1 file changed, 1 deletion(-) > > > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > > > index fdbae52f463f..426529355c29 100644 > > > --- a/kernel/bpf/syscall.c > > > +++ b/kernel/bpf/syscall.c > > > @@ -1991,7 +1991,6 @@ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) > > > __acquire(&prog_idr_lock); > > > idr_remove(&prog_idr, prog->aux->id); > > > - prog->aux->id = 0; > > > > This would trigger a race when offloaded progs are used, see also ad8ad79f4f60 ("bpf: > > offload: free program id when device disappears"). __bpf_prog_offload_destroy() calls > > it, and my read is that later bpf_prog_free_id() then hits the early !prog->aux->id > > return path. Is there a reason for irq context to not defer the bpf_prog_free_id()? > > there's comment saying: > /* bpf_prog_free_id() must be called first */ > > it was added together with the BPF_(PROG|MAP)_GET_NEXT_ID api: > 34ad5580f8f9 bpf: Add BPF_(PROG|MAP)_GET_NEXT_ID command > > Martin, any idea? > > while looking on this I noticed we can remove the do_idr_lock argument > (patch below) because it's always true and I think we need to upgrade > all the prog_idr_lock locks to spin_lock_irqsave because bpf_prog_put > could be called from irq context because of d809e134be7a before we dive into rabbit hole let's understand the severity of "perf_event_bpf_event and bpf_audit_prog calls currently report zero program id for unload path" and why we cannot leave it as-is.
On Tue, Nov 15, 2022 at 03:34:55PM -0800, Alexei Starovoitov wrote: > On Tue, Nov 15, 2022 at 3:26 PM Jiri Olsa <olsajiri@gmail.com> wrote: > > > > On Tue, Nov 15, 2022 at 01:49:54PM +0100, Daniel Borkmann wrote: > > > On 11/15/22 10:50 AM, Jiri Olsa wrote: > > > > hi, > > > > perf_event_bpf_event and bpf_audit_prog calls currently report zero > > > > program id for unload path. > > > > > > > > It's because of the [1] change moved those audit calls into work queue > > > > and they are executed after the id is zeroed in bpf_prog_free_id. > > > > > > > > I originally made a change that added 'id_audit' field to struct > > > > bpf_prog, which would be initialized as id, untouched and used > > > > in audit callbacks. > > > > > > > > Then I realized we might actually not need to zero prog->aux->id > > > > in bpf_prog_free_id. It seems to be called just once on release > > > > paths. Tests seems ok with that. > > > > > > > > thoughts? > > > > > > > > thanks, > > > > jirka > > > > > > > > > > > > [1] d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq context.") > > > > --- > > > > kernel/bpf/syscall.c | 1 - > > > > 1 file changed, 1 deletion(-) > > > > > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > > > > index fdbae52f463f..426529355c29 100644 > > > > --- a/kernel/bpf/syscall.c > > > > +++ b/kernel/bpf/syscall.c > > > > @@ -1991,7 +1991,6 @@ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) > > > > __acquire(&prog_idr_lock); > > > > idr_remove(&prog_idr, prog->aux->id); > > > > - prog->aux->id = 0; > > > > > > This would trigger a race when offloaded progs are used, see also ad8ad79f4f60 ("bpf: > > > offload: free program id when device disappears"). __bpf_prog_offload_destroy() calls > > > it, and my read is that later bpf_prog_free_id() then hits the early !prog->aux->id > > > return path. Is there a reason for irq context to not defer the bpf_prog_free_id()? > > > > there's comment saying: > > /* bpf_prog_free_id() must be called first */ > > > > it was added together with the BPF_(PROG|MAP)_GET_NEXT_ID api: > > 34ad5580f8f9 bpf: Add BPF_(PROG|MAP)_GET_NEXT_ID command > > > > Martin, any idea? > > > > while looking on this I noticed we can remove the do_idr_lock argument > > (patch below) because it's always true and I think we need to upgrade > > all the prog_idr_lock locks to spin_lock_irqsave because bpf_prog_put > > could be called from irq context because of d809e134be7a > > before we dive into rabbit hole let's understand > the severity of > "perf_event_bpf_event and bpf_audit_prog calls currently report zero > program id for unload path" > > and why we cannot leave it as-is. I found this because I wanted use perf bpf-event to monitor bpf programs loads/unloads.. and I need 'id' on the unload event perf_event_bpf_event is irq safe so quick fix for me would be to move it back: if (atomic64_dec_and_test(&aux->refcnt)) { + perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_UNLOAD, 0); /* bpf_prog_free_id() must be called first */ bpf_prog_free_id(prog, do_idr_lock); FWIW I also used fix below for a while for testing jirka --- diff --git a/include/linux/bpf.h b/include/linux/bpf.h index d88e0741b381..6b752d4815e6 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1163,6 +1163,7 @@ struct bpf_prog_aux { u32 max_tp_access; u32 stack_depth; u32 id; + u32 id_audit; u32 func_cnt; /* used by non-func prog as the number of func progs */ u32 func_idx; /* 0 for non-func prog, the index in func array for func prog */ u32 attach_btf_id; /* in-kernel BTF type id to attach to */ diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index d6081e8336c6..1f4fdf0aaac6 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1930,7 +1930,7 @@ static void bpf_audit_prog(const struct bpf_prog *prog, unsigned int op) if (unlikely(!ab)) return; audit_log_format(ab, "prog-id=%u op=%s", - prog->aux->id, bpf_audit_str[op]); + prog->aux->id_audit, bpf_audit_str[op]); audit_log_end(ab); } @@ -1942,7 +1942,7 @@ static int bpf_prog_alloc_id(struct bpf_prog *prog) spin_lock_bh(&prog_idr_lock); id = idr_alloc_cyclic(&prog_idr, prog, 1, INT_MAX, GFP_ATOMIC); if (id > 0) - prog->aux->id = id; + prog->aux->id = prog->aux->id_audit = id; spin_unlock_bh(&prog_idr_lock); idr_preload_end(); diff --git a/kernel/events/core.c b/kernel/events/core.c index 0f0bfcf5adce..32edb3a4360e 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -9065,7 +9065,7 @@ void perf_event_bpf_event(struct bpf_prog *prog, }, .type = type, .flags = flags, - .id = prog->aux->id, + .id = prog->aux->id_audit, }, };
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index fdbae52f463f..426529355c29 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1991,7 +1991,6 @@ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) __acquire(&prog_idr_lock); idr_remove(&prog_idr, prog->aux->id); - prog->aux->id = 0; if (do_idr_lock) spin_unlock_irqrestore(&prog_idr_lock, flags);