Message ID | d77b08bf757a8ea8dab3a495885c7de6ff6678da.1639102791.git.asml.silence@gmail.com (mailing list archive)
---|---
State | Superseded
Delegated to: | BPF
Series | [BPF,for-next] cgroup/bpf: fast path for not loaded skb BPF filtering
Context | Check | Description |
---|---|---|
bpf/vmtest-bpf-next-PR | success | PR summary |
bpf/vmtest-bpf-next | success | VM_Test |
netdev/tree_selection | success | Guessing tree name failed - patch did not apply |
On Fri, Dec 10, 2021 at 02:23:34AM +0000, Pavel Begunkov wrote:
> cgroup_bpf_enabled_key static key guards from overhead in cases where
> no cgroup bpf program of a specific type is loaded in any cgroup. Turn
> out that's not always good enough, e.g. when there are many cgroups but
> ones that we're interesting in are without bpf. It's seen in server
> environments, but the problem seems to be even wider as apparently
> systemd loads some BPF affecting my laptop.
>
> Profiles for small packet or zerocopy transmissions over fast network
> show __cgroup_bpf_run_filter_skb() taking 2-3%, 1% of which is from
> migrate_disable/enable(), and similarly on the receiving side. Also
> got +4-5% of t-put for local testing.
>
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  include/linux/bpf-cgroup.h | 24 +++++++++++++++++++++---
>  kernel/bpf/cgroup.c        | 23 +++++++----------------
>  2 files changed, 28 insertions(+), 19 deletions(-)
>
> diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
> index 11820a430d6c..99b01201d7db 100644
> --- a/include/linux/bpf-cgroup.h
> +++ b/include/linux/bpf-cgroup.h
> @@ -141,6 +141,9 @@ struct cgroup_bpf {
>  	struct list_head progs[MAX_CGROUP_BPF_ATTACH_TYPE];
>  	u32 flags[MAX_CGROUP_BPF_ATTACH_TYPE];
>
> +	/* for each type tracks whether effective prog array is not empty */
> +	unsigned long enabled_mask;
> +
>  	/* list of cgroup shared storages */
>  	struct list_head storages;
>
> @@ -219,11 +222,25 @@ int bpf_percpu_cgroup_storage_copy(struct bpf_map *map, void *key, void *value);
>  int bpf_percpu_cgroup_storage_update(struct bpf_map *map, void *key,
>  				     void *value, u64 flags);
>
> +static inline bool __cgroup_bpf_type_enabled(struct cgroup_bpf *cgrp_bpf,
> +					     enum cgroup_bpf_attach_type atype)
> +{
> +	return test_bit(atype, &cgrp_bpf->enabled_mask);
> +}
> +
> +#define CGROUP_BPF_TYPE_ENABLED(sk, atype)				       \
> +({									       \
> +	struct cgroup *__cgrp = sock_cgroup_ptr(&(sk)->sk_cgrp_data);	       \
> +									       \
> +	__cgroup_bpf_type_enabled(&__cgrp->bpf, (atype));		       \
> +})

I think it should directly test if the array is empty or not instead of
adding another bit.

Can the existing __cgroup_bpf_prog_array_is_empty(cgrp, ...) test be used
instead?
On 12/11/21 00:38, Martin KaFai Lau wrote:
> On Fri, Dec 10, 2021 at 02:23:34AM +0000, Pavel Begunkov wrote:
>> [...]
> I think it should directly test if the array is empty or not instead of
> adding another bit.
>
> Can the existing __cgroup_bpf_prog_array_is_empty(cgrp, ...) test be used instead?

That was the first idea, but it's still heavier than I'd wish. 0.3%-0.7%
in profiles, something similar in reqs/s. rcu_read_lock/unlock() pair is
cheap but anyway adds 2 barrier()s, and with bitmasks we can inline
the check.
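Pavel's point about inlining can be illustrated with a small userspace model: with a per-type bitmask, the "any program attached for this type?" question compiles down to a single load and bit test, with no rcu_read_lock()/unlock() pair around it. The names below are simplified stand-ins for the kernel's types (no leading underscores, plain bit arithmetic instead of the kernel's test_bit()), not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's enum cgroup_bpf_attach_type. */
enum cgroup_bpf_attach_type {
	CGROUP_INET_INGRESS,
	CGROUP_INET_EGRESS,
	MAX_CGROUP_BPF_ATTACH_TYPE,
};

/* Only the field the fast path needs: bit N set means attach type N
 * has a non-empty effective prog array. */
struct cgroup_bpf {
	unsigned long enabled_mask;
};

/* Models __cgroup_bpf_type_enabled(): a single inlined bit test,
 * no RCU read-side section. */
static bool cgroup_bpf_type_enabled(const struct cgroup_bpf *bpf,
				    enum cgroup_bpf_attach_type atype)
{
	return bpf->enabled_mask & (1UL << atype);
}
```

The hot path then skips the filter entirely when the bit is clear, which is where the profiled rcu_read_lock()/barrier() cost disappears.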
On Sat, Dec 11, 2021 at 01:15:05AM +0000, Pavel Begunkov wrote:
> On 12/11/21 00:38, Martin KaFai Lau wrote:
> > On Fri, Dec 10, 2021 at 02:23:34AM +0000, Pavel Begunkov wrote:
> > > [...]
> > I think it should directly test if the array is empty or not instead of
> > adding another bit.
> >
> > Can the existing __cgroup_bpf_prog_array_is_empty(cgrp, ...) test be used instead?
>
> That was the first idea, but it's still heavier than I'd wish. 0.3%-0.7%
> in profiles, something similar in reqs/s. rcu_read_lock/unlock() pair is
> cheap but anyway adds 2 barrier()s, and with bitmasks we can inline
> the check.
It sounds like there is opportunity to optimize
__cgroup_bpf_prog_array_is_empty().

How about using rcu_access_pointer(), testing with &empty_prog_array.hdr,
and then inline it? The cgroup prog array cannot be all
dummy_bpf_prog.prog. If that could be the case, it should be replaced
with &empty_prog_array.hdr earlier, so please check.
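Martin's alternative can be sketched in userspace as follows: every cgroup with no programs for a type points its effective array at one shared empty sentinel, so "is it empty?" becomes a pointer comparison. In the kernel this would compare rcu_access_pointer(...) against &empty_prog_array.hdr with no rcu_read_lock(), since the pointer value is only compared, never dereferenced. The struct layout and names below are simplified assumptions for illustration, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for struct bpf_prog_array. */
struct bpf_prog_array {
	int nr_progs;
};

/* Single shared sentinel: every "no programs" slot points here,
 * modeling the kernel's empty_prog_array. */
static struct bpf_prog_array empty_prog_array;

struct cgroup_bpf {
	struct bpf_prog_array *effective; /* an RCU pointer in the kernel */
};

/* Emptiness by pointer identity: no lock, trivially inlinable. */
static bool prog_array_is_empty(const struct cgroup_bpf *bpf)
{
	return bpf->effective == &empty_prog_array;
}
```

The precondition Martin flags is the interesting part: this only works if an array of all dummy programs never occurs, i.e. such arrays are canonicalized to the sentinel when installed.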
On 12/11/21 01:56, Martin KaFai Lau wrote:
> On Sat, Dec 11, 2021 at 01:15:05AM +0000, Pavel Begunkov wrote:
>> On 12/11/21 00:38, Martin KaFai Lau wrote:
>>> [...]
>>> Can the existing __cgroup_bpf_prog_array_is_empty(cgrp, ...) test be used instead?
>>
>> That was the first idea, but it's still heavier than I'd wish. 0.3%-0.7%
>> in profiles, something similar in reqs/s. rcu_read_lock/unlock() pair is
>> cheap but anyway adds 2 barrier()s, and with bitmasks we can inline
>> the check.
> It sounds like there is opportunity to optimize
> __cgroup_bpf_prog_array_is_empty().
>
> How about using rcu_access_pointer(), testing with &empty_prog_array.hdr,
> and then inline it? The cgroup prog array cannot be all
> dummy_bpf_prog.prog. If that could be the case, it should be replaced
> with &empty_prog_array.hdr earlier, so please check.

I'd need to expose and export empty_prog_array, but that should do.
Will try it out, thanks
On 12/11, Pavel Begunkov wrote:
> On 12/11/21 01:56, Martin KaFai Lau wrote:
> > [...]
> > How about using rcu_access_pointer(), testing with &empty_prog_array.hdr,
> > and then inline it? The cgroup prog array cannot be all
> > dummy_bpf_prog.prog. If that could be the case, it should be replaced
> > with &empty_prog_array.hdr earlier, so please check.
> I'd need to expose and export empty_prog_array, but that should do.
> Will try it out, thanks

Note that we already use __cgroup_bpf_prog_array_is_empty in
__cgroup_bpf_run_filter_setsockopt/__cgroup_bpf_run_filter_getsockopt
for exactly the same purpose. If you happen to optimize it, pls
update these places as well.
On 12/14/21 17:54, sdf@google.com wrote:
> On 12/11, Pavel Begunkov wrote:
>> On 12/11/21 01:56, Martin KaFai Lau wrote:
>> > On Sat, Dec 11, 2021 at 01:15:05AM +0000, Pavel Begunkov wrote:
>> > > That was the first idea, but it's still heavier than I'd wish. 0.3%-0.7%
>> > > in profiles, something similar in reqs/s. rcu_read_lock/unlock() pair is
>> > > cheap but anyway adds 2 barrier()s, and with bitmasks we can inline
>> > > the check.
>> > It sounds like there is opportunity to optimize
>> > __cgroup_bpf_prog_array_is_empty().
>> >
>> > How about using rcu_access_pointer(), testing with &empty_prog_array.hdr,
>> > and then inline it? The cgroup prog array cannot be all
>> > dummy_bpf_prog.prog. If that could be the case, it should be replaced
>> > with &empty_prog_array.hdr earlier, so please check.
>
>> I'd need to expose and export empty_prog_array, but that should do.
>> Will try it out, thanks
>
> Note that we already use __cgroup_bpf_prog_array_is_empty in
> __cgroup_bpf_run_filter_setsockopt/__cgroup_bpf_run_filter_getsockopt
> for exactly the same purpose. If you happen to optimize it, pls
> update these places as well.

Just like it's already done in the patch? Or maybe you mean something else?
On Tue, Dec 14, 2021 at 10:00 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 12/14/21 17:54, sdf@google.com wrote:
> > On 12/11, Pavel Begunkov wrote:
> > [...]
> > Note that we already use __cgroup_bpf_prog_array_is_empty in
> > __cgroup_bpf_run_filter_setsockopt/__cgroup_bpf_run_filter_getsockopt
> > for exactly the same purpose. If you happen to optimize it, pls
> > update these places as well.
>
> Just like it's already done in the patch? Or maybe you mean something else?

Ah, you already did it, looks good! I didn't scroll all the way to the
bottom and got distracted by Martin's comment about
__cgroup_bpf_prog_array_is_empty :-[
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 11820a430d6c..99b01201d7db 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -141,6 +141,9 @@ struct cgroup_bpf {
 	struct list_head progs[MAX_CGROUP_BPF_ATTACH_TYPE];
 	u32 flags[MAX_CGROUP_BPF_ATTACH_TYPE];
 
+	/* for each type tracks whether effective prog array is not empty */
+	unsigned long enabled_mask;
+
 	/* list of cgroup shared storages */
 	struct list_head storages;
 
@@ -219,11 +222,25 @@ int bpf_percpu_cgroup_storage_copy(struct bpf_map *map, void *key, void *value);
 int bpf_percpu_cgroup_storage_update(struct bpf_map *map, void *key,
 				     void *value, u64 flags);
 
+static inline bool __cgroup_bpf_type_enabled(struct cgroup_bpf *cgrp_bpf,
+					     enum cgroup_bpf_attach_type atype)
+{
+	return test_bit(atype, &cgrp_bpf->enabled_mask);
+}
+
+#define CGROUP_BPF_TYPE_ENABLED(sk, atype)				       \
+({									       \
+	struct cgroup *__cgrp = sock_cgroup_ptr(&(sk)->sk_cgrp_data);	       \
+									       \
+	__cgroup_bpf_type_enabled(&__cgrp->bpf, (atype));		       \
+})
+
 /* Wrappers for __cgroup_bpf_run_filter_skb() guarded by cgroup_bpf_enabled. */
 #define BPF_CGROUP_RUN_PROG_INET_INGRESS(sk, skb)			       \
 ({									       \
 	int __ret = 0;							       \
-	if (cgroup_bpf_enabled(CGROUP_INET_INGRESS))			       \
+	if (cgroup_bpf_enabled(CGROUP_INET_INGRESS) && sk &&		       \
+	    CGROUP_BPF_TYPE_ENABLED((sk), CGROUP_INET_INGRESS))		       \
 		__ret = __cgroup_bpf_run_filter_skb(sk, skb,		       \
 						    CGROUP_INET_INGRESS);      \
 									       \
@@ -235,9 +252,10 @@ int bpf_percpu_cgroup_storage_update(struct bpf_map *map, void *key,
 	int __ret = 0;							       \
 	if (cgroup_bpf_enabled(CGROUP_INET_EGRESS) && sk && sk == skb->sk) {  \
 		typeof(sk) __sk = sk_to_full_sk(sk);			       \
-		if (sk_fullsock(__sk))					       \
+		if (sk_fullsock(__sk) &&				       \
+		    CGROUP_BPF_TYPE_ENABLED(__sk, CGROUP_INET_EGRESS))	       \
 			__ret = __cgroup_bpf_run_filter_skb(__sk, skb,	       \
-						      CGROUP_INET_EGRESS); \
+							    CGROUP_INET_EGRESS); \
 	}								       \
 	__ret;								       \
 })
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 2ca643af9a54..28c8d0d6ea45 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -272,6 +272,11 @@ static void activate_effective_progs(struct cgroup *cgrp,
 				     enum cgroup_bpf_attach_type atype,
 				     struct bpf_prog_array *old_array)
 {
+	if (!bpf_prog_array_is_empty(old_array))
+		set_bit(atype, &cgrp->bpf.enabled_mask);
+	else
+		clear_bit(atype, &cgrp->bpf.enabled_mask);
+
 	old_array = rcu_replace_pointer(cgrp->bpf.effective[atype], old_array,
 					lockdep_is_held(&cgroup_mutex));
 	/* free prog array after grace period, since __cgroup_bpf_run_*()
@@ -1354,20 +1359,6 @@ int __cgroup_bpf_run_filter_sysctl(struct ctl_table_header *head,
 }
 
 #ifdef CONFIG_NET
-static bool __cgroup_bpf_prog_array_is_empty(struct cgroup *cgrp,
-					     enum cgroup_bpf_attach_type attach_type)
-{
-	struct bpf_prog_array *prog_array;
-	bool empty;
-
-	rcu_read_lock();
-	prog_array = rcu_dereference(cgrp->bpf.effective[attach_type]);
-	empty = bpf_prog_array_is_empty(prog_array);
-	rcu_read_unlock();
-
-	return empty;
-}
-
 static int sockopt_alloc_buf(struct bpf_sockopt_kern *ctx, int max_optlen,
 			     struct bpf_sockopt_buf *buf)
 {
@@ -1430,7 +1421,7 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
 	 * attached to the hook so we don't waste time allocating
 	 * memory and locking the socket.
 	 */
-	if (__cgroup_bpf_prog_array_is_empty(cgrp, CGROUP_SETSOCKOPT))
+	if (!__cgroup_bpf_type_enabled(&cgrp->bpf, CGROUP_SETSOCKOPT))
 		return 0;
 
 	/* Allocate a bit more than the initial user buffer for
@@ -1526,7 +1517,7 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
 	 * attached to the hook so we don't waste time allocating
 	 * memory and locking the socket.
 	 */
-	if (__cgroup_bpf_prog_array_is_empty(cgrp, CGROUP_GETSOCKOPT))
+	if (!__cgroup_bpf_type_enabled(&cgrp->bpf, CGROUP_GETSOCKOPT))
 		return retval;
 
 	ctx.optlen = max_optlen;
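The bookkeeping side of the patch can be modeled end to end in userspace: the function that installs a new effective array also updates enabled_mask, so the fast-path test stays a single bit check. The types below are simplified stand-ins (plain bit arithmetic instead of the kernel's set_bit/clear_bit/test_bit, a plain pointer instead of an RCU pointer); note the kernel's parameter is named old_array because rcu_replace_pointer() reuses it to hold the outgoing array afterwards, while this sketch names it for what is passed in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct bpf_prog_array {
	size_t nr_progs;
};

enum { CGROUP_SETSOCKOPT, CGROUP_GETSOCKOPT, MAX_ATTACH_TYPE };

struct cgroup_bpf {
	unsigned long enabled_mask;
	struct bpf_prog_array *effective[MAX_ATTACH_TYPE];
};

/* Models activate_effective_progs(): the incoming array decides the bit
 * before the pointer is swapped in. */
static void install_effective_progs(struct cgroup_bpf *bpf, int atype,
				    struct bpf_prog_array *new_array)
{
	if (new_array && new_array->nr_progs)
		bpf->enabled_mask |= 1UL << atype;    /* set_bit() */
	else
		bpf->enabled_mask &= ~(1UL << atype); /* clear_bit() */
	bpf->effective[atype] = new_array; /* rcu_replace_pointer() in-kernel */
}

/* Models __cgroup_bpf_type_enabled(): the inlined fast-path check. */
static bool cgroup_bpf_type_enabled(const struct cgroup_bpf *bpf, int atype)
{
	return bpf->enabled_mask & (1UL << atype); /* test_bit() */
}
```

With this in place, __cgroup_bpf_run_filter_setsockopt()/getsockopt() can bail out on the bit test alone, which is why the patch deletes the RCU-based __cgroup_bpf_prog_array_is_empty() helper.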
The cgroup_bpf_enabled_key static key guards from overhead in cases where
no cgroup bpf program of a specific type is loaded in any cgroup. Turns
out that's not always good enough, e.g. when there are many cgroups but
the ones we're interested in are without bpf. It's seen in server
environments, but the problem seems to be even wider, as apparently
systemd loads some BPF affecting my laptop.

Profiles for small packet or zerocopy transmissions over a fast network
show __cgroup_bpf_run_filter_skb() taking 2-3%, 1% of which is from
migrate_disable/enable(), and similarly on the receiving side. Also
got +4-5% of t-put for local testing.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/bpf-cgroup.h | 24 +++++++++++++++++++++---
 kernel/bpf/cgroup.c        | 23 +++++++----------------
 2 files changed, 28 insertions(+), 19 deletions(-)