Message ID | 20221206024554.3826186-1-sdf@google.com (mailing list archive) |
---|---|
Headers | show |
Series | xdp: hints via kfuncs | expand |
Stanislav Fomichev <sdf@google.com> writes: > Please see the first patch in the series for the overall > design and use-cases. > > Changes since v3: > > - Rework prog->bound_netdev refcounting (Jakub/Marin) > > Now it's based on the offload.c framework. It mostly fits, except > I had to automatically insert a HT entry for the netdev. In the > offloaded case, the netdev is added via a call to > bpf_offload_dev_netdev_register from the driver init path; with > a dev-bound programs, we have to manually add (and remove) the entry. > > As suggested by Toke, I'm also prohibiting putting dev-bound programs > into prog-array map; essentially prohibiting tail calling into it. > I'm also disabling freplace of the dev-bound programs. Both of those > restrictions can be loosened up eventually. I thought it would be a shame that we don't support at least freplace programs from the get-go (as that would exclude libxdp from taking advantage of this). So see below for a patch implementing this :) -Toke commit 3abb333e5fd2e8a0920b77013499bdae0ee3db43 Author: Toke Høiland-Jørgensen <toke@redhat.com> Date: Thu Dec 8 23:10:54 2022 +0100 bpf: Support consuming XDP HW metadata from fext programs Instead of rejecting the attaching of PROG_TYPE_EXT programs to XDP programs that consume HW metadata, implement support for propagating the offload information. The extension program doesn't need to set a flag or ifindex, it these will just be propagated from the target by the verifier. We need to create a separate offload object for the extension program, though, since it can be reattached to a different program later (which means we can't just inhering the offload information from the target). An additional check is added on attach that the new target is compatible with the offload information in the extension prog. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> diff --git a/include/linux/bpf.h b/include/linux/bpf.h index b46b60f4eae1..cfa5c847cf2c 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2482,6 +2482,7 @@ void *bpf_offload_resolve_kfunc(struct bpf_prog *prog, u32 func_id); void unpriv_ebpf_notify(int new_state); #if defined(CONFIG_NET) && defined(CONFIG_BPF_SYSCALL) +int __bpf_prog_offload_init(struct bpf_prog *prog, struct net_device *netdev); int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr); void bpf_offload_bound_netdev_unregister(struct net_device *dev); diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c index bad8bab916eb..b059a7b53457 100644 --- a/kernel/bpf/offload.c +++ b/kernel/bpf/offload.c @@ -83,36 +83,25 @@ bpf_offload_find_netdev(struct net_device *netdev) return rhashtable_lookup_fast(&offdevs, &netdev, offdevs_params); } -int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr) +int __bpf_prog_offload_init(struct bpf_prog *prog, struct net_device *netdev) { struct bpf_offload_netdev *ondev; struct bpf_prog_offload *offload; int err; - if (attr->prog_type != BPF_PROG_TYPE_SCHED_CLS && - attr->prog_type != BPF_PROG_TYPE_XDP) + if (!netdev) return -EINVAL; - if (attr->prog_flags & ~BPF_F_XDP_HAS_METADATA) - return -EINVAL; + err = __bpf_offload_init(); + if (err) + return err; offload = kzalloc(sizeof(*offload), GFP_USER); if (!offload) return -ENOMEM; - err = __bpf_offload_init(); - if (err) - return err; - offload->prog = prog; - - offload->netdev = dev_get_by_index(current->nsproxy->net_ns, - attr->prog_ifindex); - err = bpf_dev_offload_check(offload->netdev); - if (err) - goto err_maybe_put; - - prog->aux->offload_requested = !(attr->prog_flags & BPF_F_XDP_HAS_METADATA); + offload->netdev = netdev; down_write(&bpf_devs_lock); ondev = bpf_offload_find_netdev(offload->netdev); @@ -135,19 +124,46 @@ int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr) offload->offdev = ondev->offdev; prog->aux->offload = offload; list_add_tail(&offload->offloads, &ondev->progs); - dev_put(offload->netdev); up_write(&bpf_devs_lock); return 0; err_unlock: up_write(&bpf_devs_lock); -err_maybe_put: - if (offload->netdev) - dev_put(offload->netdev); kfree(offload); return err; } +int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr) +{ + struct net_device *netdev; + int err; + + if (attr->prog_type != BPF_PROG_TYPE_SCHED_CLS && + attr->prog_type != BPF_PROG_TYPE_XDP) + return -EINVAL; + + if (attr->prog_flags & ~BPF_F_XDP_HAS_METADATA) + return -EINVAL; + + netdev = dev_get_by_index(current->nsproxy->net_ns, attr->prog_ifindex); + if (!netdev) + return -EINVAL; + + err = bpf_dev_offload_check(netdev); + if (err) + goto out; + + prog->aux->offload_requested = !(attr->prog_flags & BPF_F_XDP_HAS_METADATA); + + err = __bpf_prog_offload_init(prog, netdev); + if (err) + goto out; + +out: + dev_put(netdev); + return err; +} + int bpf_prog_offload_verifier_prep(struct bpf_prog *prog) { struct bpf_prog_offload *offload; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index b345a273f7d0..606e6de5f716 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -3021,6 +3021,14 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog, goto out_put_prog; } + if (bpf_prog_is_dev_bound(tgt_prog->aux) && + (bpf_prog_is_offloaded(tgt_prog->aux) || + !bpf_prog_is_dev_bound(prog->aux) || + !bpf_offload_dev_match(prog, tgt_prog->aux->offload->netdev))) { + err = -EINVAL; + goto out_put_prog; + } + key = bpf_trampoline_compute_key(tgt_prog, NULL, btf_id); } diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index bc8d9b8d4f47..d92e28dd220e 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -16379,11 +16379,6 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, if (tgt_prog) { struct bpf_prog_aux *aux = tgt_prog->aux; - if (bpf_prog_is_dev_bound(tgt_prog->aux)) { - bpf_log(log, "Replacing device-bound programs not supported\n"); - return -EINVAL; - } - for (i = 0; i < aux->func_info_cnt; i++) if (aux->func_info[i].type_id == btf_id) { subprog = i; @@ -16644,10 +16639,22 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) if (tgt_prog && prog->type == BPF_PROG_TYPE_EXT) { /* to make freplace equivalent to their targets, they need to * inherit env->ops and expected_attach_type for the rest of the - * verification + * verification; we also need to propagate the prog offload data + * for resolving kfuncs. */ env->ops = bpf_verifier_ops[tgt_prog->type]; prog->expected_attach_type = tgt_prog->expected_attach_type; + + if (bpf_prog_is_dev_bound(tgt_prog->aux)) { + if (bpf_prog_is_offloaded(tgt_prog->aux)) + return -EINVAL; + + prog->aux->dev_bound = true; + ret = __bpf_prog_offload_init(prog, + tgt_prog->aux->offload->netdev); + if (ret) + return ret; + } } /* store info about the attachment target that will be used later */
On Thu, Dec 8, 2022 at 2:29 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Stanislav Fomichev <sdf@google.com> writes: > > > Please see the first patch in the series for the overall > > design and use-cases. > > > > Changes since v3: > > > > - Rework prog->bound_netdev refcounting (Jakub/Marin) > > > > Now it's based on the offload.c framework. It mostly fits, except > > I had to automatically insert a HT entry for the netdev. In the > > offloaded case, the netdev is added via a call to > > bpf_offload_dev_netdev_register from the driver init path; with > > a dev-bound programs, we have to manually add (and remove) the entry. > > > > As suggested by Toke, I'm also prohibiting putting dev-bound programs > > into prog-array map; essentially prohibiting tail calling into it. > > I'm also disabling freplace of the dev-bound programs. Both of those > > restrictions can be loosened up eventually. > > I thought it would be a shame that we don't support at least freplace > programs from the get-go (as that would exclude libxdp from taking > advantage of this). So see below for a patch implementing this :) > > -Toke Damn, now I need to write a selftest :-) But seriously, thank you for taking care of this, will try to include preserving SoB! > commit 3abb333e5fd2e8a0920b77013499bdae0ee3db43 > Author: Toke Høiland-Jørgensen <toke@redhat.com> > Date: Thu Dec 8 23:10:54 2022 +0100 > > bpf: Support consuming XDP HW metadata from fext programs > > Instead of rejecting the attaching of PROG_TYPE_EXT programs to XDP > programs that consume HW metadata, implement support for propagating the > offload information. The extension program doesn't need to set a flag or > ifindex, it these will just be propagated from the target by the verifier. > We need to create a separate offload object for the extension program, > though, since it can be reattached to a different program later (which > means we can't just inhering the offload information from the target). > > An additional check is added on attach that the new target is compatible > with the offload information in the extension prog. > > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > index b46b60f4eae1..cfa5c847cf2c 100644 > --- a/include/linux/bpf.h > +++ b/include/linux/bpf.h > @@ -2482,6 +2482,7 @@ void *bpf_offload_resolve_kfunc(struct bpf_prog *prog, u32 func_id); > void unpriv_ebpf_notify(int new_state); > > #if defined(CONFIG_NET) && defined(CONFIG_BPF_SYSCALL) > +int __bpf_prog_offload_init(struct bpf_prog *prog, struct net_device *netdev); > int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr); > void bpf_offload_bound_netdev_unregister(struct net_device *dev); > > diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c > index bad8bab916eb..b059a7b53457 100644 > --- a/kernel/bpf/offload.c > +++ b/kernel/bpf/offload.c > @@ -83,36 +83,25 @@ bpf_offload_find_netdev(struct net_device *netdev) > return rhashtable_lookup_fast(&offdevs, &netdev, offdevs_params); > } > > -int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr) > +int __bpf_prog_offload_init(struct bpf_prog *prog, struct net_device *netdev) > { > struct bpf_offload_netdev *ondev; > struct bpf_prog_offload *offload; > int err; > > - if (attr->prog_type != BPF_PROG_TYPE_SCHED_CLS && > - attr->prog_type != BPF_PROG_TYPE_XDP) > + if (!netdev) > return -EINVAL; > > - if (attr->prog_flags & ~BPF_F_XDP_HAS_METADATA) > - return -EINVAL; > + err = __bpf_offload_init(); > + if (err) > + return err; > > offload = kzalloc(sizeof(*offload), GFP_USER); > if (!offload) > return -ENOMEM; > > - err = __bpf_offload_init(); > - if (err) > - return err; > - > offload->prog = prog; > - > - offload->netdev = dev_get_by_index(current->nsproxy->net_ns, > - attr->prog_ifindex); > - err = bpf_dev_offload_check(offload->netdev); > - if (err) > - goto err_maybe_put; > - > - prog->aux->offload_requested = !(attr->prog_flags & BPF_F_XDP_HAS_METADATA); > + offload->netdev = netdev; > > down_write(&bpf_devs_lock); > ondev = bpf_offload_find_netdev(offload->netdev); > @@ -135,19 +124,46 @@ int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr) > offload->offdev = ondev->offdev; > prog->aux->offload = offload; > list_add_tail(&offload->offloads, &ondev->progs); > - dev_put(offload->netdev); > up_write(&bpf_devs_lock); > > return 0; > err_unlock: > up_write(&bpf_devs_lock); > -err_maybe_put: > - if (offload->netdev) > - dev_put(offload->netdev); > kfree(offload); > return err; > } > > +int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr) > +{ > + struct net_device *netdev; > + int err; > + > + if (attr->prog_type != BPF_PROG_TYPE_SCHED_CLS && > + attr->prog_type != BPF_PROG_TYPE_XDP) > + return -EINVAL; > + > + if (attr->prog_flags & ~BPF_F_XDP_HAS_METADATA) > + return -EINVAL; > + > + netdev = dev_get_by_index(current->nsproxy->net_ns, attr->prog_ifindex); > + if (!netdev) > + return -EINVAL; > + > + err = bpf_dev_offload_check(netdev); > + if (err) > + goto out; > + > + prog->aux->offload_requested = !(attr->prog_flags & BPF_F_XDP_HAS_METADATA); > + > + err = __bpf_prog_offload_init(prog, netdev); > + if (err) > + goto out; > + > +out: > + dev_put(netdev); > + return err; > +} > + > int bpf_prog_offload_verifier_prep(struct bpf_prog *prog) > { > struct bpf_prog_offload *offload; > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index b345a273f7d0..606e6de5f716 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -3021,6 +3021,14 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog, > goto out_put_prog; > } > > + if (bpf_prog_is_dev_bound(tgt_prog->aux) && > + (bpf_prog_is_offloaded(tgt_prog->aux) || > + !bpf_prog_is_dev_bound(prog->aux) || > + !bpf_offload_dev_match(prog, tgt_prog->aux->offload->netdev))) { > + err = -EINVAL; > + goto out_put_prog; > + } > + > key = bpf_trampoline_compute_key(tgt_prog, NULL, btf_id); > } > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index bc8d9b8d4f47..d92e28dd220e 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -16379,11 +16379,6 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, > if (tgt_prog) { > struct bpf_prog_aux *aux = tgt_prog->aux; > > - if (bpf_prog_is_dev_bound(tgt_prog->aux)) { > - bpf_log(log, "Replacing device-bound programs not supported\n"); > - return -EINVAL; > - } > - > for (i = 0; i < aux->func_info_cnt; i++) > if (aux->func_info[i].type_id == btf_id) { > subprog = i; > @@ -16644,10 +16639,22 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) > if (tgt_prog && prog->type == BPF_PROG_TYPE_EXT) { > /* to make freplace equivalent to their targets, they need to > * inherit env->ops and expected_attach_type for the rest of the > - * verification > + * verification; we also need to propagate the prog offload data > + * for resolving kfuncs. > */ > env->ops = bpf_verifier_ops[tgt_prog->type]; > prog->expected_attach_type = tgt_prog->expected_attach_type; > + > + if (bpf_prog_is_dev_bound(tgt_prog->aux)) { > + if (bpf_prog_is_offloaded(tgt_prog->aux)) > + return -EINVAL; > + > + prog->aux->dev_bound = true; > + ret = __bpf_prog_offload_init(prog, > + tgt_prog->aux->offload->netdev); > + if (ret) > + return ret; > + } > } > > /* store info about the attachment target that will be used later */ >
Stanislav Fomichev <sdf@google.com> writes: > On Thu, Dec 8, 2022 at 2:29 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Stanislav Fomichev <sdf@google.com> writes: >> >> > Please see the first patch in the series for the overall >> > design and use-cases. >> > >> > Changes since v3: >> > >> > - Rework prog->bound_netdev refcounting (Jakub/Marin) >> > >> > Now it's based on the offload.c framework. It mostly fits, except >> > I had to automatically insert a HT entry for the netdev. In the >> > offloaded case, the netdev is added via a call to >> > bpf_offload_dev_netdev_register from the driver init path; with >> > a dev-bound programs, we have to manually add (and remove) the entry. >> > >> > As suggested by Toke, I'm also prohibiting putting dev-bound programs >> > into prog-array map; essentially prohibiting tail calling into it. >> > I'm also disabling freplace of the dev-bound programs. Both of those >> > restrictions can be loosened up eventually. >> >> I thought it would be a shame that we don't support at least freplace >> programs from the get-go (as that would exclude libxdp from taking >> advantage of this). So see below for a patch implementing this :) >> >> -Toke > > Damn, now I need to write a selftest :-) > But seriously, thank you for taking care of this, will try to include > preserving SoB! Cool, thanks! I just realised I made on mistake in the attach check, though: [...] >> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c >> index b345a273f7d0..606e6de5f716 100644 >> --- a/kernel/bpf/syscall.c >> +++ b/kernel/bpf/syscall.c >> @@ -3021,6 +3021,14 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog, >> goto out_put_prog; >> } >> >> + if (bpf_prog_is_dev_bound(tgt_prog->aux) && >> + (bpf_prog_is_offloaded(tgt_prog->aux) || >> + !bpf_prog_is_dev_bound(prog->aux) || >> + !bpf_offload_dev_match(prog, tgt_prog->aux->offload->netdev))) { This should switch the order of the is_dev_bound() checks, like: + if (bpf_prog_is_dev_bound(prog->aux) && + (bpf_prog_is_offloaded(tgt_prog->aux) || + !bpf_prog_is_dev_bound(tgt_prog->aux) || + !bpf_offload_dev_match(prog, tgt_prog->aux->offload->netdev))) { I.e., first check bpf_prog_is_dev_bound(prog->aux) (the program being attached), and only perform the other checks if we're attaching something that has been verified as being dev-bound. It should be fine to attach a non-devbound function to a devbound parent program (since that non-devbound function can't call any of the kfuncs). -Toke