Message ID | 20250220134503.835224-3-maciej.fijalkowski@intel.com (mailing list archive) |
---|---|
State | New |
Delegated to: | BPF |
Headers | show |
Series | bpf: introduce skb refcount kfuncs | expand |
On 2/20/2025 5:45 AM, Maciej Fijalkowski wrote: > These have been mostly taken from Amery Hung's work related to bpf qdisc > implementation. bpf_skb_{acquire,release}() are for increment/decrement > sk_buff::users whereas bpf_skb_destroy() is called for map entries that > have not been released and map is being wiped out from system. > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > --- > net/core/filter.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 62 insertions(+) > > diff --git a/net/core/filter.c b/net/core/filter.c > index 2ec162dd83c4..9bd2701be088 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -12064,6 +12064,56 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk, > > __bpf_kfunc_end_defs(); > > +__diag_push(); > +__diag_ignore_all("-Wmissing-prototypes", > + "Global functions as their definitions will be in vmlinux BTF"); > + > +/* bpf_skb_acquire - Acquire a reference to an skb. An skb acquired by this > + * kfunc which is not stored in a map as a kptr, must be released by calling > + * bpf_skb_release(). > + * @skb: The skb on which a reference is being acquired. > + */ > +__bpf_kfunc struct sk_buff *bpf_skb_acquire(struct sk_buff *skb) > +{ > + if (refcount_inc_not_zero(&skb->users)) > + return skb; > + return NULL; > +} > + > +/* bpf_skb_release - Release the reference acquired on an skb. > + * @skb: The skb on which a reference is being released. > + */ > +__bpf_kfunc void bpf_skb_release(struct sk_buff *skb) > +{ > + skb_unref(skb); > +} > + > +/* bpf_skb_destroy - Release an skb reference acquired and exchanged into > + * an allocated object or a map. > + * @skb: The skb on which a reference is being released. > + */ > +__bpf_kfunc void bpf_skb_destroy(struct sk_buff *skb) > +{ > + (void)skb_unref(skb); > + consume_skb(skb); > +} > + > +__diag_pop(); > + > +BTF_KFUNCS_START(skb_kfunc_btf_ids) > +BTF_ID_FLAGS(func, bpf_skb_acquire, KF_ACQUIRE | KF_RET_NULL) > +BTF_ID_FLAGS(func, bpf_skb_release, KF_RELEASE) > +BTF_KFUNCS_END(skb_kfunc_btf_ids) > + > +static const struct btf_kfunc_id_set skb_kfunc_set = { > + .owner = THIS_MODULE, > + .set = &skb_kfunc_btf_ids, > +}; > + > +BTF_ID_LIST(skb_kfunc_dtor_ids) > +BTF_ID(struct, sk_buff) > +BTF_ID_FLAGS(func, bpf_skb_destroy, KF_RELEASE) > + > int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags, > struct bpf_dynptr *ptr__uninit) > { > @@ -12117,6 +12167,13 @@ static const struct btf_kfunc_id_set bpf_kfunc_set_tcp_reqsk = { > > static int __init bpf_kfunc_init(void) > { > + const struct btf_id_dtor_kfunc skb_kfunc_dtors[] = { > + { > + .btf_id = skb_kfunc_dtor_ids[0], > + .kfunc_btf_id = skb_kfunc_dtor_ids[1] > + }, > + }; > + > int ret; > > ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_skb); > @@ -12133,6 +12190,11 @@ static int __init bpf_kfunc_init(void) > ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp); > ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SOCK_ADDR, > &bpf_kfunc_set_sock_addr); > + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &skb_kfunc_set); > + > + ret = ret ?: register_btf_id_dtor_kfuncs(skb_kfunc_dtors, > + ARRAY_SIZE(skb_kfunc_dtors), > + THIS_MODULE); I think we will need to deal with two versions of skb dtors here. Both qdisc and cls will register dtor associated for skb. The qdisc one just call kfree_skb(). While only one can exist for a specific btf id in the kernel if I understand correctly. Is it possible to have one that work for both use cases? > return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_tcp_reqsk); > } > late_initcall(bpf_kfunc_init);
On Thu, Feb 20, 2025 at 03:25:03PM -0800, Amery Hung wrote: > > > On 2/20/2025 5:45 AM, Maciej Fijalkowski wrote: > > These have been mostly taken from Amery Hung's work related to bpf qdisc > > implementation. bpf_skb_{acquire,release}() are for increment/decrement > > sk_buff::users whereas bpf_skb_destroy() is called for map entries that > > have not been released and map is being wiped out from system. > > > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > > --- > > net/core/filter.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 62 insertions(+) > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > index 2ec162dd83c4..9bd2701be088 100644 > > --- a/net/core/filter.c > > +++ b/net/core/filter.c > > @@ -12064,6 +12064,56 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk, > > __bpf_kfunc_end_defs(); > > +__diag_push(); > > +__diag_ignore_all("-Wmissing-prototypes", > > + "Global functions as their definitions will be in vmlinux BTF"); > > + > > +/* bpf_skb_acquire - Acquire a reference to an skb. An skb acquired by this > > + * kfunc which is not stored in a map as a kptr, must be released by calling > > + * bpf_skb_release(). > > + * @skb: The skb on which a reference is being acquired. > > + */ > > +__bpf_kfunc struct sk_buff *bpf_skb_acquire(struct sk_buff *skb) > > +{ > > + if (refcount_inc_not_zero(&skb->users)) > > + return skb; > > + return NULL; > > +} > > + > > +/* bpf_skb_release - Release the reference acquired on an skb. > > + * @skb: The skb on which a reference is being released. > > + */ > > +__bpf_kfunc void bpf_skb_release(struct sk_buff *skb) > > +{ > > + skb_unref(skb); > > +} > > + > > +/* bpf_skb_destroy - Release an skb reference acquired and exchanged into > > + * an allocated object or a map. > > + * @skb: The skb on which a reference is being released. > > + */ > > +__bpf_kfunc void bpf_skb_destroy(struct sk_buff *skb) > > +{ > > + (void)skb_unref(skb); > > + consume_skb(skb); > > +} > > + > > +__diag_pop(); > > + > > +BTF_KFUNCS_START(skb_kfunc_btf_ids) > > +BTF_ID_FLAGS(func, bpf_skb_acquire, KF_ACQUIRE | KF_RET_NULL) > > +BTF_ID_FLAGS(func, bpf_skb_release, KF_RELEASE) > > +BTF_KFUNCS_END(skb_kfunc_btf_ids) > > + > > +static const struct btf_kfunc_id_set skb_kfunc_set = { > > + .owner = THIS_MODULE, > > + .set = &skb_kfunc_btf_ids, > > +}; > > + > > +BTF_ID_LIST(skb_kfunc_dtor_ids) > > +BTF_ID(struct, sk_buff) > > +BTF_ID_FLAGS(func, bpf_skb_destroy, KF_RELEASE) > > + > > int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags, > > struct bpf_dynptr *ptr__uninit) > > { > > @@ -12117,6 +12167,13 @@ static const struct btf_kfunc_id_set bpf_kfunc_set_tcp_reqsk = { > > static int __init bpf_kfunc_init(void) > > { > > + const struct btf_id_dtor_kfunc skb_kfunc_dtors[] = { > > + { > > + .btf_id = skb_kfunc_dtor_ids[0], > > + .kfunc_btf_id = skb_kfunc_dtor_ids[1] > > + }, > > + }; > > + > > int ret; > > ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_skb); > > @@ -12133,6 +12190,11 @@ static int __init bpf_kfunc_init(void) > > ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp); > > ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SOCK_ADDR, > > &bpf_kfunc_set_sock_addr); > > + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &skb_kfunc_set); > > + > > + ret = ret ?: register_btf_id_dtor_kfuncs(skb_kfunc_dtors, > > + ARRAY_SIZE(skb_kfunc_dtors), > > + THIS_MODULE); > > I think we will need to deal with two versions of skb dtors here. Both qdisc > and cls will register dtor associated for skb. The qdisc one just call > kfree_skb(). While only one can exist for a specific btf id in the kernel if > I understand correctly. Is it possible to have one that work > for both use cases? Looking at the current code it seems bpf_find_btf_id() (which btf_parse_kptr() calls) will go through modules and return the first match against sk_buff btf but that's currently a wild guess from my side. So your concern stands as we have no mechanism that would distinguish the dtors for same btf id. I would have to take a deeper look at btf_parse_kptr() and find some way to associate dtor with its module during registering and then use it within btf_find_dtor_kfunc(). Would this be sufficient? > > > return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_tcp_reqsk); > > } > > late_initcall(bpf_kfunc_init); >
On Fri, Feb 21, 2025 at 6:57 AM Maciej Fijalkowski <maciej.fijalkowski@intel.com> wrote: > > On Thu, Feb 20, 2025 at 03:25:03PM -0800, Amery Hung wrote: > > > > > > On 2/20/2025 5:45 AM, Maciej Fijalkowski wrote: > > > These have been mostly taken from Amery Hung's work related to bpf qdisc > > > implementation. bpf_skb_{acquire,release}() are for increment/decrement > > > sk_buff::users whereas bpf_skb_destroy() is called for map entries that > > > have not been released and map is being wiped out from system. > > > > > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > > > --- > > > net/core/filter.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++ > > > 1 file changed, 62 insertions(+) > > > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > > index 2ec162dd83c4..9bd2701be088 100644 > > > --- a/net/core/filter.c > > > +++ b/net/core/filter.c > > > @@ -12064,6 +12064,56 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk, > > > __bpf_kfunc_end_defs(); > > > +__diag_push(); > > > +__diag_ignore_all("-Wmissing-prototypes", > > > + "Global functions as their definitions will be in vmlinux BTF"); > > > + > > > +/* bpf_skb_acquire - Acquire a reference to an skb. An skb acquired by this > > > + * kfunc which is not stored in a map as a kptr, must be released by calling > > > + * bpf_skb_release(). > > > + * @skb: The skb on which a reference is being acquired. > > > + */ > > > +__bpf_kfunc struct sk_buff *bpf_skb_acquire(struct sk_buff *skb) > > > +{ > > > + if (refcount_inc_not_zero(&skb->users)) > > > + return skb; > > > + return NULL; > > > +} > > > + > > > +/* bpf_skb_release - Release the reference acquired on an skb. > > > + * @skb: The skb on which a reference is being released. > > > + */ > > > +__bpf_kfunc void bpf_skb_release(struct sk_buff *skb) > > > +{ > > > + skb_unref(skb); > > > +} > > > + > > > +/* bpf_skb_destroy - Release an skb reference acquired and exchanged into > > > + * an allocated object or a map. > > > + * @skb: The skb on which a reference is being released. > > > + */ > > > +__bpf_kfunc void bpf_skb_destroy(struct sk_buff *skb) > > > +{ > > > + (void)skb_unref(skb); > > > + consume_skb(skb); > > > +} > > > + > > > +__diag_pop(); > > > + > > > +BTF_KFUNCS_START(skb_kfunc_btf_ids) > > > +BTF_ID_FLAGS(func, bpf_skb_acquire, KF_ACQUIRE | KF_RET_NULL) > > > +BTF_ID_FLAGS(func, bpf_skb_release, KF_RELEASE) > > > +BTF_KFUNCS_END(skb_kfunc_btf_ids) > > > + > > > +static const struct btf_kfunc_id_set skb_kfunc_set = { > > > + .owner = THIS_MODULE, > > > + .set = &skb_kfunc_btf_ids, > > > +}; > > > + > > > +BTF_ID_LIST(skb_kfunc_dtor_ids) > > > +BTF_ID(struct, sk_buff) > > > +BTF_ID_FLAGS(func, bpf_skb_destroy, KF_RELEASE) > > > + > > > int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags, > > > struct bpf_dynptr *ptr__uninit) > > > { > > > @@ -12117,6 +12167,13 @@ static const struct btf_kfunc_id_set bpf_kfunc_set_tcp_reqsk = { > > > static int __init bpf_kfunc_init(void) > > > { > > > + const struct btf_id_dtor_kfunc skb_kfunc_dtors[] = { > > > + { > > > + .btf_id = skb_kfunc_dtor_ids[0], > > > + .kfunc_btf_id = skb_kfunc_dtor_ids[1] > > > + }, > > > + }; > > > + > > > int ret; > > > ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_skb); > > > @@ -12133,6 +12190,11 @@ static int __init bpf_kfunc_init(void) > > > ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp); > > > ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SOCK_ADDR, > > > &bpf_kfunc_set_sock_addr); > > > + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &skb_kfunc_set); > > > + > > > + ret = ret ?: register_btf_id_dtor_kfuncs(skb_kfunc_dtors, > > > + ARRAY_SIZE(skb_kfunc_dtors), > > > + THIS_MODULE); > > > > I think we will need to deal with two versions of skb dtors here. Both qdisc > > and cls will register dtor associated for skb. The qdisc one just call > > kfree_skb(). While only one can exist for a specific btf id in the kernel if > > I understand correctly. Is it possible to have one that work > > for both use cases? > > Looking at the current code it seems bpf_find_btf_id() (which > btf_parse_kptr() calls) will go through modules and return the first match > against sk_buff btf but that's currently a wild guess from my side. So > your concern stands as we have no mechanism that would distinguish the > dtors for same btf id. > > I would have to take a deeper look at btf_parse_kptr() and find some way > to associate dtor with its module during registering and then use it > within btf_find_dtor_kfunc(). Would this be sufficient? > That might not be enough. Ultimately, if the user configures both modules to be built-in, then there is no way to tell where a trusted skb kptr in a bpf program is from. Two possible ways to solve this: 1. Make the dtor be able to tell whether the skb is from qdisc or cls. Since we are both in the TC layer, maybe we can use skb->cb for this? 2. Associate KF_ACQUIRE kfuncs with the corresponding KF_RELEASE kfuncs somehow. Carry this additional information as the kptr propagates in the bpf world so that we know which dtor to call. This seems to be overly complicated. > > > > > return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_tcp_reqsk); > > > } > > > late_initcall(bpf_kfunc_init); > >
On Fri, Feb 21, 2025 at 11:11:27AM -0800, Amery Hung wrote: > On Fri, Feb 21, 2025 at 6:57 AM Maciej Fijalkowski > <maciej.fijalkowski@intel.com> wrote: > > > > On Thu, Feb 20, 2025 at 03:25:03PM -0800, Amery Hung wrote: > > > > > > > > > On 2/20/2025 5:45 AM, Maciej Fijalkowski wrote: > > > > These have been mostly taken from Amery Hung's work related to bpf qdisc > > > > implementation. bpf_skb_{acquire,release}() are for increment/decrement > > > > sk_buff::users whereas bpf_skb_destroy() is called for map entries that > > > > have not been released and map is being wiped out from system. > > > > > > > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > > > > --- > > > > net/core/filter.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++ > > > > 1 file changed, 62 insertions(+) > > > > > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > > > index 2ec162dd83c4..9bd2701be088 100644 > > > > --- a/net/core/filter.c > > > > +++ b/net/core/filter.c > > > > @@ -12064,6 +12064,56 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk, > > > > __bpf_kfunc_end_defs(); > > > > +__diag_push(); > > > > +__diag_ignore_all("-Wmissing-prototypes", > > > > + "Global functions as their definitions will be in vmlinux BTF"); > > > > + > > > > +/* bpf_skb_acquire - Acquire a reference to an skb. An skb acquired by this > > > > + * kfunc which is not stored in a map as a kptr, must be released by calling > > > > + * bpf_skb_release(). > > > > + * @skb: The skb on which a reference is being acquired. > > > > + */ > > > > +__bpf_kfunc struct sk_buff *bpf_skb_acquire(struct sk_buff *skb) > > > > +{ > > > > + if (refcount_inc_not_zero(&skb->users)) > > > > + return skb; > > > > + return NULL; > > > > +} > > > > + > > > > +/* bpf_skb_release - Release the reference acquired on an skb. > > > > + * @skb: The skb on which a reference is being released. > > > > + */ > > > > +__bpf_kfunc void bpf_skb_release(struct sk_buff *skb) > > > > +{ > > > > + skb_unref(skb); > > > > +} > > > > + > > > > +/* bpf_skb_destroy - Release an skb reference acquired and exchanged into > > > > + * an allocated object or a map. > > > > + * @skb: The skb on which a reference is being released. > > > > + */ > > > > +__bpf_kfunc void bpf_skb_destroy(struct sk_buff *skb) > > > > +{ > > > > + (void)skb_unref(skb); > > > > + consume_skb(skb); > > > > +} > > > > + > > > > +__diag_pop(); > > > > + > > > > +BTF_KFUNCS_START(skb_kfunc_btf_ids) > > > > +BTF_ID_FLAGS(func, bpf_skb_acquire, KF_ACQUIRE | KF_RET_NULL) > > > > +BTF_ID_FLAGS(func, bpf_skb_release, KF_RELEASE) > > > > +BTF_KFUNCS_END(skb_kfunc_btf_ids) > > > > + > > > > +static const struct btf_kfunc_id_set skb_kfunc_set = { > > > > + .owner = THIS_MODULE, > > > > + .set = &skb_kfunc_btf_ids, > > > > +}; > > > > + > > > > +BTF_ID_LIST(skb_kfunc_dtor_ids) > > > > +BTF_ID(struct, sk_buff) > > > > +BTF_ID_FLAGS(func, bpf_skb_destroy, KF_RELEASE) > > > > + > > > > int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags, > > > > struct bpf_dynptr *ptr__uninit) > > > > { > > > > @@ -12117,6 +12167,13 @@ static const struct btf_kfunc_id_set bpf_kfunc_set_tcp_reqsk = { > > > > static int __init bpf_kfunc_init(void) > > > > { > > > > + const struct btf_id_dtor_kfunc skb_kfunc_dtors[] = { > > > > + { > > > > + .btf_id = skb_kfunc_dtor_ids[0], > > > > + .kfunc_btf_id = skb_kfunc_dtor_ids[1] > > > > + }, > > > > + }; > > > > + > > > > int ret; > > > > ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_skb); > > > > @@ -12133,6 +12190,11 @@ static int __init bpf_kfunc_init(void) > > > > ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp); > > > > ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SOCK_ADDR, > > > > &bpf_kfunc_set_sock_addr); > > > > + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &skb_kfunc_set); > > > > + > > > > + ret = ret ?: register_btf_id_dtor_kfuncs(skb_kfunc_dtors, > > > > + ARRAY_SIZE(skb_kfunc_dtors), > > > > + THIS_MODULE); > > > > > > I think we will need to deal with two versions of skb dtors here. Both qdisc > > > and cls will register dtor associated for skb. The qdisc one just call > > > kfree_skb(). While only one can exist for a specific btf id in the kernel if > > > I understand correctly. Is it possible to have one that work > > > for both use cases? > > > > Looking at the current code it seems bpf_find_btf_id() (which > > btf_parse_kptr() calls) will go through modules and return the first match > > against sk_buff btf but that's currently a wild guess from my side. So > > your concern stands as we have no mechanism that would distinguish the > > dtors for same btf id. > > > > I would have to take a deeper look at btf_parse_kptr() and find some way > > to associate dtor with its module during registering and then use it > > within btf_find_dtor_kfunc(). Would this be sufficient? > > > > That might not be enough. Ultimately, if the user configures both > modules to be built-in, then there is no way to tell where a trusted > skb kptr in a bpf program is from. Am I missing something or how are you going to use the kfuncs that are required for loading skb onto map as kptr without loaded module? Dtors are owned by same module as corresponding acquire/release kfuncs. > > Two possible ways to solve this: > > 1. Make the dtor be able to tell whether the skb is from qdisc or cls. > Since we are both in the TC layer, maybe we can use skb->cb for this? > > 2. Associate KF_ACQUIRE kfuncs with the corresponding KF_RELEASE > kfuncs somehow. Carry this additional information as the kptr > propagates in the bpf world so that we know which dtor to call. This > seems to be overly complicated. > > > > > > > > > return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_tcp_reqsk); > > > > } > > > > late_initcall(bpf_kfunc_init); > > >
On Fri, Feb 21, 2025 at 12:17 PM Maciej Fijalkowski <maciej.fijalkowski@intel.com> wrote: > > On Fri, Feb 21, 2025 at 11:11:27AM -0800, Amery Hung wrote: > > On Fri, Feb 21, 2025 at 6:57 AM Maciej Fijalkowski > > <maciej.fijalkowski@intel.com> wrote: > > > > > > On Thu, Feb 20, 2025 at 03:25:03PM -0800, Amery Hung wrote: > > > > > > > > > > > > On 2/20/2025 5:45 AM, Maciej Fijalkowski wrote: > > > > > These have been mostly taken from Amery Hung's work related to bpf qdisc > > > > > implementation. bpf_skb_{acquire,release}() are for increment/decrement > > > > > sk_buff::users whereas bpf_skb_destroy() is called for map entries that > > > > > have not been released and map is being wiped out from system. > > > > > > > > > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > > > > > --- > > > > > net/core/filter.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++ > > > > > 1 file changed, 62 insertions(+) > > > > > > > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > > > > index 2ec162dd83c4..9bd2701be088 100644 > > > > > --- a/net/core/filter.c > > > > > +++ b/net/core/filter.c > > > > > @@ -12064,6 +12064,56 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk, > > > > > __bpf_kfunc_end_defs(); > > > > > +__diag_push(); > > > > > +__diag_ignore_all("-Wmissing-prototypes", > > > > > + "Global functions as their definitions will be in vmlinux BTF"); > > > > > + > > > > > +/* bpf_skb_acquire - Acquire a reference to an skb. An skb acquired by this > > > > > + * kfunc which is not stored in a map as a kptr, must be released by calling > > > > > + * bpf_skb_release(). > > > > > + * @skb: The skb on which a reference is being acquired. > > > > > + */ > > > > > +__bpf_kfunc struct sk_buff *bpf_skb_acquire(struct sk_buff *skb) > > > > > +{ > > > > > + if (refcount_inc_not_zero(&skb->users)) > > > > > + return skb; > > > > > + return NULL; > > > > > +} > > > > > + > > > > > +/* bpf_skb_release - Release the reference acquired on an skb. > > > > > + * @skb: The skb on which a reference is being released. > > > > > + */ > > > > > +__bpf_kfunc void bpf_skb_release(struct sk_buff *skb) > > > > > +{ > > > > > + skb_unref(skb); > > > > > +} > > > > > + > > > > > +/* bpf_skb_destroy - Release an skb reference acquired and exchanged into > > > > > + * an allocated object or a map. > > > > > + * @skb: The skb on which a reference is being released. > > > > > + */ > > > > > +__bpf_kfunc void bpf_skb_destroy(struct sk_buff *skb) > > > > > +{ > > > > > + (void)skb_unref(skb); > > > > > + consume_skb(skb); > > > > > +} > > > > > + > > > > > +__diag_pop(); > > > > > + > > > > > +BTF_KFUNCS_START(skb_kfunc_btf_ids) > > > > > +BTF_ID_FLAGS(func, bpf_skb_acquire, KF_ACQUIRE | KF_RET_NULL) > > > > > +BTF_ID_FLAGS(func, bpf_skb_release, KF_RELEASE) > > > > > +BTF_KFUNCS_END(skb_kfunc_btf_ids) > > > > > + > > > > > +static const struct btf_kfunc_id_set skb_kfunc_set = { > > > > > + .owner = THIS_MODULE, > > > > > + .set = &skb_kfunc_btf_ids, > > > > > +}; > > > > > + > > > > > +BTF_ID_LIST(skb_kfunc_dtor_ids) > > > > > +BTF_ID(struct, sk_buff) > > > > > +BTF_ID_FLAGS(func, bpf_skb_destroy, KF_RELEASE) > > > > > + > > > > > int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags, > > > > > struct bpf_dynptr *ptr__uninit) > > > > > { > > > > > @@ -12117,6 +12167,13 @@ static const struct btf_kfunc_id_set bpf_kfunc_set_tcp_reqsk = { > > > > > static int __init bpf_kfunc_init(void) > > > > > { > > > > > + const struct btf_id_dtor_kfunc skb_kfunc_dtors[] = { > > > > > + { > > > > > + .btf_id = skb_kfunc_dtor_ids[0], > > > > > + .kfunc_btf_id = skb_kfunc_dtor_ids[1] > > > > > + }, > > > > > + }; > > > > > + > > > > > int ret; > > > > > ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_skb); > > > > > @@ -12133,6 +12190,11 @@ static int __init bpf_kfunc_init(void) > > > > > ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp); > > > > > ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SOCK_ADDR, > > > > > &bpf_kfunc_set_sock_addr); > > > > > + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &skb_kfunc_set); > > > > > + > > > > > + ret = ret ?: register_btf_id_dtor_kfuncs(skb_kfunc_dtors, > > > > > + ARRAY_SIZE(skb_kfunc_dtors), > > > > > + THIS_MODULE); > > > > > > > > I think we will need to deal with two versions of skb dtors here. Both qdisc > > > > and cls will register dtor associated for skb. The qdisc one just call > > > > kfree_skb(). While only one can exist for a specific btf id in the kernel if > > > > I understand correctly. Is it possible to have one that work > > > > for both use cases? > > > > > > Looking at the current code it seems bpf_find_btf_id() (which > > > btf_parse_kptr() calls) will go through modules and return the first match > > > against sk_buff btf but that's currently a wild guess from my side. So > > > your concern stands as we have no mechanism that would distinguish the > > > dtors for same btf id. > > > > > > I would have to take a deeper look at btf_parse_kptr() and find some way > > > to associate dtor with its module during registering and then use it > > > within btf_find_dtor_kfunc(). Would this be sufficient? > > > > > > > That might not be enough. Ultimately, if the user configures both > > modules to be built-in, then there is no way to tell where a trusted > > skb kptr in a bpf program is from. > > Am I missing something or how are you going to use the kfuncs that are > required for loading skb onto map as kptr without loaded module? Dtors are > owned by same module as corresponding acquire/release kfuncs. > bpf qdisc will be either built-in (CONFIG_NET_SCH_BPF=y) or not enabled (=n). classifier can be either y or m. Dtors are associated with (btf, btf_id). So in case both are built in, only one of them should be registered to (btf_vmlinux, struct skb). The current implementation does not check if a dtor of a btf id is already registered in register_btf_id_dtor_kfunc, but I don't think we should do it in the first place. } > > > > Two possible ways to solve this: > > > > 1. Make the dtor be able to tell whether the skb is from qdisc or cls. > > Since we are both in the TC layer, maybe we can use skb->cb for this? > > > > 2. Associate KF_ACQUIRE kfuncs with the corresponding KF_RELEASE > > kfuncs somehow. Carry this additional information as the kptr > > propagates in the bpf world so that we know which dtor to call. This > > seems to be overly complicated. > > > > > > > > > > > > > return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_tcp_reqsk); > > > > > } > > > > > late_initcall(bpf_kfunc_init); > > > >
On 2/20/2025 5:45 AM, Maciej Fijalkowski wrote: > These have been mostly taken from Amery Hung's work related to bpf qdisc > implementation. bpf_skb_{acquire,release}() are for increment/decrement > sk_buff::users whereas bpf_skb_destroy() is called for map entries that > have not been released and map is being wiped out from system. > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > --- > net/core/filter.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 62 insertions(+) > > diff --git a/net/core/filter.c b/net/core/filter.c > index 2ec162dd83c4..9bd2701be088 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -12064,6 +12064,56 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk, > > __bpf_kfunc_end_defs(); > > +__diag_push(); > +__diag_ignore_all("-Wmissing-prototypes", > + "Global functions as their definitions will be in vmlinux BTF"); > + > +/* bpf_skb_acquire - Acquire a reference to an skb. An skb acquired by this > + * kfunc which is not stored in a map as a kptr, must be released by calling > + * bpf_skb_release(). > + * @skb: The skb on which a reference is being acquired. > + */ > +__bpf_kfunc struct sk_buff *bpf_skb_acquire(struct sk_buff *skb) > +{ > + if (refcount_inc_not_zero(&skb->users)) Any reason to use refcount_inc_not_zero instead of refcount_inc here? > + return skb; > + return NULL; > +} > + > +/* bpf_skb_release - Release the reference acquired on an skb. > + * @skb: The skb on which a reference is being released. > + */ > +__bpf_kfunc void bpf_skb_release(struct sk_buff *skb) > +{ > + skb_unref(skb); > +} > + > +/* bpf_skb_destroy - Release an skb reference acquired and exchanged into > + * an allocated object or a map. > + * @skb: The skb on which a reference is being released. > + */ > +__bpf_kfunc void bpf_skb_destroy(struct sk_buff *skb) > +{ > + (void)skb_unref(skb); Actually, there might be a dtor that work for both cls and qdisc. This skb_unref() seems redundant, consume_skb() already unref once. > + consume_skb(skb); consume_skb() indicates that the skb is consumed, but if the skb here is being dropped by dtor. kfree_skb() or maybe kfree_skb_reason() with a proper reason should be better. Here is the comments for consume_skb() in skbuff.c * Functions identically to kfree_skb, but kfree_skb assumes that the frame * is being dropped after a failure and notes that So the dtor would be basically the same as the qdisc one. > +} > + > +__diag_pop(); > + > +BTF_KFUNCS_START(skb_kfunc_btf_ids) > +BTF_ID_FLAGS(func, bpf_skb_acquire, KF_ACQUIRE | KF_RET_NULL) > +BTF_ID_FLAGS(func, bpf_skb_release, KF_RELEASE) > +BTF_KFUNCS_END(skb_kfunc_btf_ids) > + > +static const struct btf_kfunc_id_set skb_kfunc_set = { > + .owner = THIS_MODULE, > + .set = &skb_kfunc_btf_ids, > +}; > + > +BTF_ID_LIST(skb_kfunc_dtor_ids) > +BTF_ID(struct, sk_buff) > +BTF_ID_FLAGS(func, bpf_skb_destroy, KF_RELEASE) > + > int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags, > struct bpf_dynptr *ptr__uninit) > { > @@ -12117,6 +12167,13 @@ static const struct btf_kfunc_id_set bpf_kfunc_set_tcp_reqsk = { > > static int __init bpf_kfunc_init(void) > { > + const struct btf_id_dtor_kfunc skb_kfunc_dtors[] = { > + { > + .btf_id = skb_kfunc_dtor_ids[0], > + .kfunc_btf_id = skb_kfunc_dtor_ids[1] > + }, > + }; > + > int ret; > > ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_skb); > @@ -12133,6 +12190,11 @@ static int __init bpf_kfunc_init(void) > ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp); > ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SOCK_ADDR, > &bpf_kfunc_set_sock_addr); > + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &skb_kfunc_set); > + > + ret = ret ?: register_btf_id_dtor_kfuncs(skb_kfunc_dtors, > + ARRAY_SIZE(skb_kfunc_dtors), > + THIS_MODULE); > return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_tcp_reqsk); > } > late_initcall(bpf_kfunc_init);
diff --git a/net/core/filter.c b/net/core/filter.c index 2ec162dd83c4..9bd2701be088 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -12064,6 +12064,56 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk, __bpf_kfunc_end_defs(); +__diag_push(); +__diag_ignore_all("-Wmissing-prototypes", + "Global functions as their definitions will be in vmlinux BTF"); + +/* bpf_skb_acquire - Acquire a reference to an skb. An skb acquired by this + * kfunc which is not stored in a map as a kptr, must be released by calling + * bpf_skb_release(). + * @skb: The skb on which a reference is being acquired. + */ +__bpf_kfunc struct sk_buff *bpf_skb_acquire(struct sk_buff *skb) +{ + if (refcount_inc_not_zero(&skb->users)) + return skb; + return NULL; +} + +/* bpf_skb_release - Release the reference acquired on an skb. + * @skb: The skb on which a reference is being released. + */ +__bpf_kfunc void bpf_skb_release(struct sk_buff *skb) +{ + skb_unref(skb); +} + +/* bpf_skb_destroy - Release an skb reference acquired and exchanged into + * an allocated object or a map. + * @skb: The skb on which a reference is being released. + */ +__bpf_kfunc void bpf_skb_destroy(struct sk_buff *skb) +{ + (void)skb_unref(skb); + consume_skb(skb); +} + +__diag_pop(); + +BTF_KFUNCS_START(skb_kfunc_btf_ids) +BTF_ID_FLAGS(func, bpf_skb_acquire, KF_ACQUIRE | KF_RET_NULL) +BTF_ID_FLAGS(func, bpf_skb_release, KF_RELEASE) +BTF_KFUNCS_END(skb_kfunc_btf_ids) + +static const struct btf_kfunc_id_set skb_kfunc_set = { + .owner = THIS_MODULE, + .set = &skb_kfunc_btf_ids, +}; + +BTF_ID_LIST(skb_kfunc_dtor_ids) +BTF_ID(struct, sk_buff) +BTF_ID_FLAGS(func, bpf_skb_destroy, KF_RELEASE) + int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags, struct bpf_dynptr *ptr__uninit) { @@ -12117,6 +12167,13 @@ static const struct btf_kfunc_id_set bpf_kfunc_set_tcp_reqsk = { static int __init bpf_kfunc_init(void) { + const struct btf_id_dtor_kfunc skb_kfunc_dtors[] = { + { + .btf_id = skb_kfunc_dtor_ids[0], + .kfunc_btf_id = skb_kfunc_dtor_ids[1] + }, + }; + int ret; ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_skb); @@ -12133,6 +12190,11 @@ static int __init bpf_kfunc_init(void) ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp); ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SOCK_ADDR, &bpf_kfunc_set_sock_addr); + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &skb_kfunc_set); + + ret = ret ?: register_btf_id_dtor_kfuncs(skb_kfunc_dtors, + ARRAY_SIZE(skb_kfunc_dtors), + THIS_MODULE); return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_tcp_reqsk); } late_initcall(bpf_kfunc_init);
These have been mostly taken from Amery Hung's work related to bpf qdisc implementation. bpf_skb_{acquire,release}() are for increment/decrement sk_buff::users whereas bpf_skb_destroy() is called for map entries that have not been released and map is being wiped out from system. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> --- net/core/filter.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 62 insertions(+)