Message ID | 20230827072057.1591929-2-zhouchuyi@bytedance.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | BPF |
Headers | show |
Series | Add Open-coded process and css iters | expand |
On Sun, Aug 27, 2023 at 12:21 AM Chuyi Zhou <zhouchuyi@bytedance.com> wrote: > > This Patch adds kfuncs bpf_iter_css_task_{new,next,destroy} which allow > creation and manipulation of struct bpf_iter_css_task in open-coded > iterator style. These kfuncs actually wrapps > css_task_iter_{start,next,end}. BPF programs can use these kfuncs through > bpf_for_each macro for iteration of all tasks under a css. > > Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com> > --- > include/uapi/linux/bpf.h | 4 ++++ > kernel/bpf/helpers.c | 3 +++ > kernel/bpf/task_iter.c | 39 ++++++++++++++++++++++++++++++++++ > tools/include/uapi/linux/bpf.h | 4 ++++ > tools/lib/bpf/bpf_helpers.h | 7 ++++++ > 5 files changed, 57 insertions(+) > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index 60a9d59beeab..2a6e9b99564b 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -7195,4 +7195,8 @@ struct bpf_iter_num { > __u64 __opaque[1]; > } __attribute__((aligned(8))); > > +struct bpf_iter_css_task { > + __u64 __opaque[1]; > +} __attribute__((aligned(8))); > + > #endif /* _UAPI__LINUX_BPF_H__ */ > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > index 9e80efa59a5d..cf113ad24837 100644 > --- a/kernel/bpf/helpers.c > +++ b/kernel/bpf/helpers.c > @@ -2455,6 +2455,9 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL) > BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW) > BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL) > BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY) > +BTF_ID_FLAGS(func, bpf_iter_css_task_new, KF_ITER_NEW) > +BTF_ID_FLAGS(func, bpf_iter_css_task_next, KF_ITER_NEXT | KF_RET_NULL) > +BTF_ID_FLAGS(func, bpf_iter_css_task_destroy, KF_ITER_DESTROY) > BTF_ID_FLAGS(func, bpf_dynptr_adjust) > BTF_ID_FLAGS(func, bpf_dynptr_is_null) > BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) > diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c > index c4ab9d6cdbe9..b1bdba40b684 100644 > --- a/kernel/bpf/task_iter.c > +++ b/kernel/bpf/task_iter.c > @@ -823,6 +823,45 @@ const struct bpf_func_proto bpf_find_vma_proto = { > .arg5_type = ARG_ANYTHING, > }; > > +struct bpf_iter_css_task_kern { > + struct css_task_iter *css_it; > +} __attribute__((aligned(8))); > + > +__bpf_kfunc int bpf_iter_css_task_new(struct bpf_iter_css_task *it, > + struct cgroup_subsys_state *css, unsigned int flags) > +{ > + struct bpf_iter_css_task_kern *kit = (void *)it; > + > + BUILD_BUG_ON(sizeof(struct bpf_iter_css_task_kern) != sizeof(struct bpf_iter_css_task)); > + BUILD_BUG_ON(__alignof__(struct bpf_iter_css_task_kern) != > + __alignof__(struct bpf_iter_css_task)); > + > + kit->css_it = kzalloc(sizeof(struct css_task_iter), GFP_KERNEL); > + if (!kit->css_it) > + return -ENOMEM; > + css_task_iter_start(css, flags, kit->css_it); Some of the flags are internal. Like CSS_TASK_ITER_SKIPPED. The kfunc should probably only allow CSS_TASK_ITER_PROCS | CSS_TASK_ITER_THREADED, and not CSS_TASK_ITER_THREADED alone. Since they're #define-s it's not easy for bpf prog to use them. I think would be good to have a pre-patch that converts them to enum, so that bpf prog can take them from vmlinux.h. But the main issue of the patch that it adds this iter to common kfuncs. That's not safe, since css_task_iter_*() does spin_unlock_irq() which might screw up irq flags depending on the context where bpf prog is running. Can css_task_iter internals switch to irqsave/irqrestore? css_set_lock is also global, so the bpf side has to be careful in where it allows to use this iter. bpf_lsm hooks are safe, most of bpf iter-s are safe too. Future bpf-oom hooks are probably safe as well. We probably need an allowlist here.
Hello, 在 2023/9/6 03:02, Alexei Starovoitov 写道: > On Sun, Aug 27, 2023 at 12:21 AM Chuyi Zhou <zhouchuyi@bytedance.com> wrote: >> >> This Patch adds kfuncs bpf_iter_css_task_{new,next,destroy} which allow >> creation and manipulation of struct bpf_iter_css_task in open-coded >> iterator style. These kfuncs actually wrapps >> css_task_iter_{start,next,end}. BPF programs can use these kfuncs through >> bpf_for_each macro for iteration of all tasks under a css. >> >> Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com> >> --- >> include/uapi/linux/bpf.h | 4 ++++ >> kernel/bpf/helpers.c | 3 +++ >> kernel/bpf/task_iter.c | 39 ++++++++++++++++++++++++++++++++++ >> tools/include/uapi/linux/bpf.h | 4 ++++ >> tools/lib/bpf/bpf_helpers.h | 7 ++++++ >> 5 files changed, 57 insertions(+) >> >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h >> index 60a9d59beeab..2a6e9b99564b 100644 >> --- a/include/uapi/linux/bpf.h >> +++ b/include/uapi/linux/bpf.h >> @@ -7195,4 +7195,8 @@ struct bpf_iter_num { >> __u64 __opaque[1]; >> } __attribute__((aligned(8))); >> >> +struct bpf_iter_css_task { >> + __u64 __opaque[1]; >> +} __attribute__((aligned(8))); >> + >> #endif /* _UAPI__LINUX_BPF_H__ */ >> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c >> index 9e80efa59a5d..cf113ad24837 100644 >> --- a/kernel/bpf/helpers.c >> +++ b/kernel/bpf/helpers.c >> @@ -2455,6 +2455,9 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL) >> BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW) >> BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL) >> BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY) >> +BTF_ID_FLAGS(func, bpf_iter_css_task_new, KF_ITER_NEW) >> +BTF_ID_FLAGS(func, bpf_iter_css_task_next, KF_ITER_NEXT | KF_RET_NULL) >> +BTF_ID_FLAGS(func, bpf_iter_css_task_destroy, KF_ITER_DESTROY) >> BTF_ID_FLAGS(func, bpf_dynptr_adjust) >> BTF_ID_FLAGS(func, bpf_dynptr_is_null) >> BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) >> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c >> index c4ab9d6cdbe9..b1bdba40b684 100644 >> --- a/kernel/bpf/task_iter.c >> +++ b/kernel/bpf/task_iter.c >> @@ -823,6 +823,45 @@ const struct bpf_func_proto bpf_find_vma_proto = { >> .arg5_type = ARG_ANYTHING, >> }; >> >> +struct bpf_iter_css_task_kern { >> + struct css_task_iter *css_it; >> +} __attribute__((aligned(8))); >> + >> +__bpf_kfunc int bpf_iter_css_task_new(struct bpf_iter_css_task *it, >> + struct cgroup_subsys_state *css, unsigned int flags) >> +{ >> + struct bpf_iter_css_task_kern *kit = (void *)it; >> + >> + BUILD_BUG_ON(sizeof(struct bpf_iter_css_task_kern) != sizeof(struct bpf_iter_css_task)); >> + BUILD_BUG_ON(__alignof__(struct bpf_iter_css_task_kern) != >> + __alignof__(struct bpf_iter_css_task)); >> + >> + kit->css_it = kzalloc(sizeof(struct css_task_iter), GFP_KERNEL); >> + if (!kit->css_it) >> + return -ENOMEM; >> + css_task_iter_start(css, flags, kit->css_it); > > Some of the flags are internal. Like CSS_TASK_ITER_SKIPPED. > The kfunc should probably only allow CSS_TASK_ITER_PROCS | > CSS_TASK_ITER_THREADED, > and not CSS_TASK_ITER_THREADED alone. > > Since they're #define-s it's not easy for bpf prog to use them. > I think would be good to have a pre-patch that converts them to enum, > so that bpf prog can take them from vmlinux.h. > > > But the main issue of the patch that it adds this iter to common kfuncs. > That's not safe, since css_task_iter_*() does spin_unlock_irq() which > might screw up irq flags depending on the context where bpf prog is running. > Can css_task_iter internals switch to irqsave/irqrestore? Yes, I think so. Switching to irqsave/irqrestore is no harm. > css_set_lock is also global, so the bpf side has to be careful in > where it allows to use this iter. > bpf_lsm hooks are safe, most of bpf iter-s are safe too. > Future bpf-oom hooks are probably safe as well. > We probably need an allowlist here. What should we do if we want to make a allowlist? Do you mean we need to check prog_type or attach_type when we call these kfuncs in BPF verifier? If so, we should add a new attach_type or prog_type for bpf-oom in the feature so we can know the current BPF program is hooking for OOM Policy.
On Wed, Sep 6, 2023 at 5:37 AM Chuyi Zhou <zhouchuyi@bytedance.com> wrote: > > Hello, > > 在 2023/9/6 03:02, Alexei Starovoitov 写道: > > On Sun, Aug 27, 2023 at 12:21 AM Chuyi Zhou <zhouchuyi@bytedance.com> wrote: > >> > >> This Patch adds kfuncs bpf_iter_css_task_{new,next,destroy} which allow > >> creation and manipulation of struct bpf_iter_css_task in open-coded > >> iterator style. These kfuncs actually wrapps > >> css_task_iter_{start,next,end}. BPF programs can use these kfuncs through > >> bpf_for_each macro for iteration of all tasks under a css. > >> > >> Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com> > >> --- > >> include/uapi/linux/bpf.h | 4 ++++ > >> kernel/bpf/helpers.c | 3 +++ > >> kernel/bpf/task_iter.c | 39 ++++++++++++++++++++++++++++++++++ > >> tools/include/uapi/linux/bpf.h | 4 ++++ > >> tools/lib/bpf/bpf_helpers.h | 7 ++++++ > >> 5 files changed, 57 insertions(+) > >> > >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > >> index 60a9d59beeab..2a6e9b99564b 100644 > >> --- a/include/uapi/linux/bpf.h > >> +++ b/include/uapi/linux/bpf.h > >> @@ -7195,4 +7195,8 @@ struct bpf_iter_num { > >> __u64 __opaque[1]; > >> } __attribute__((aligned(8))); > >> > >> +struct bpf_iter_css_task { > >> + __u64 __opaque[1]; > >> +} __attribute__((aligned(8))); > >> + > >> #endif /* _UAPI__LINUX_BPF_H__ */ > >> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > >> index 9e80efa59a5d..cf113ad24837 100644 > >> --- a/kernel/bpf/helpers.c > >> +++ b/kernel/bpf/helpers.c > >> @@ -2455,6 +2455,9 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL) > >> BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW) > >> BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL) > >> BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY) > >> +BTF_ID_FLAGS(func, bpf_iter_css_task_new, KF_ITER_NEW) > >> +BTF_ID_FLAGS(func, bpf_iter_css_task_next, KF_ITER_NEXT | KF_RET_NULL) > >> +BTF_ID_FLAGS(func, bpf_iter_css_task_destroy, KF_ITER_DESTROY) > >> BTF_ID_FLAGS(func, bpf_dynptr_adjust) > >> BTF_ID_FLAGS(func, bpf_dynptr_is_null) > >> BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) > >> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c > >> index c4ab9d6cdbe9..b1bdba40b684 100644 > >> --- a/kernel/bpf/task_iter.c > >> +++ b/kernel/bpf/task_iter.c > >> @@ -823,6 +823,45 @@ const struct bpf_func_proto bpf_find_vma_proto = { > >> .arg5_type = ARG_ANYTHING, > >> }; > >> > >> +struct bpf_iter_css_task_kern { > >> + struct css_task_iter *css_it; > >> +} __attribute__((aligned(8))); > >> + > >> +__bpf_kfunc int bpf_iter_css_task_new(struct bpf_iter_css_task *it, > >> + struct cgroup_subsys_state *css, unsigned int flags) > >> +{ > >> + struct bpf_iter_css_task_kern *kit = (void *)it; > >> + > >> + BUILD_BUG_ON(sizeof(struct bpf_iter_css_task_kern) != sizeof(struct bpf_iter_css_task)); > >> + BUILD_BUG_ON(__alignof__(struct bpf_iter_css_task_kern) != > >> + __alignof__(struct bpf_iter_css_task)); > >> + > >> + kit->css_it = kzalloc(sizeof(struct css_task_iter), GFP_KERNEL); > >> + if (!kit->css_it) > >> + return -ENOMEM; > >> + css_task_iter_start(css, flags, kit->css_it); > > > > Some of the flags are internal. Like CSS_TASK_ITER_SKIPPED. > > The kfunc should probably only allow CSS_TASK_ITER_PROCS | > > CSS_TASK_ITER_THREADED, > > and not CSS_TASK_ITER_THREADED alone. > > > > Since they're #define-s it's not easy for bpf prog to use them. > > I think would be good to have a pre-patch that converts them to enum, > > so that bpf prog can take them from vmlinux.h. > > > > > > But the main issue of the patch that it adds this iter to common kfuncs. > > That's not safe, since css_task_iter_*() does spin_unlock_irq() which > > might screw up irq flags depending on the context where bpf prog is running. > > Can css_task_iter internals switch to irqsave/irqrestore? > > Yes, I think so. Switching to irqsave/irqrestore is no harm. > > > css_set_lock is also global, so the bpf side has to be careful in > > where it allows to use this iter. > > bpf_lsm hooks are safe, most of bpf iter-s are safe too. > > Future bpf-oom hooks are probably safe as well. > > We probably need an allowlist here. > > What should we do if we want to make a allowlist? > Do you mean we need to check prog_type or attach_type when we call these > kfuncs in BPF verifier? If so, we should add a new attach_type or > prog_type for bpf-oom in the feature so we can know the current BPF > program is hooking for OOM Policy. bpf-oom type can be added later. Let's make this one work for bpf-lsm and sleepable iter-s first. See SEC("iter.s
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 60a9d59beeab..2a6e9b99564b 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -7195,4 +7195,8 @@ struct bpf_iter_num { __u64 __opaque[1]; } __attribute__((aligned(8))); +struct bpf_iter_css_task { + __u64 __opaque[1]; +} __attribute__((aligned(8))); + #endif /* _UAPI__LINUX_BPF_H__ */ diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 9e80efa59a5d..cf113ad24837 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -2455,6 +2455,9 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL) BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW) BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY) +BTF_ID_FLAGS(func, bpf_iter_css_task_new, KF_ITER_NEW) +BTF_ID_FLAGS(func, bpf_iter_css_task_next, KF_ITER_NEXT | KF_RET_NULL) +BTF_ID_FLAGS(func, bpf_iter_css_task_destroy, KF_ITER_DESTROY) BTF_ID_FLAGS(func, bpf_dynptr_adjust) BTF_ID_FLAGS(func, bpf_dynptr_is_null) BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c index c4ab9d6cdbe9..b1bdba40b684 100644 --- a/kernel/bpf/task_iter.c +++ b/kernel/bpf/task_iter.c @@ -823,6 +823,45 @@ const struct bpf_func_proto bpf_find_vma_proto = { .arg5_type = ARG_ANYTHING, }; +struct bpf_iter_css_task_kern { + struct css_task_iter *css_it; +} __attribute__((aligned(8))); + +__bpf_kfunc int bpf_iter_css_task_new(struct bpf_iter_css_task *it, + struct cgroup_subsys_state *css, unsigned int flags) +{ + struct bpf_iter_css_task_kern *kit = (void *)it; + + BUILD_BUG_ON(sizeof(struct bpf_iter_css_task_kern) != sizeof(struct bpf_iter_css_task)); + BUILD_BUG_ON(__alignof__(struct bpf_iter_css_task_kern) != + __alignof__(struct bpf_iter_css_task)); + + kit->css_it = kzalloc(sizeof(struct css_task_iter), GFP_KERNEL); + if (!kit->css_it) + return -ENOMEM; + css_task_iter_start(css, flags, kit->css_it); + return 0; +} + +__bpf_kfunc struct task_struct *bpf_iter_css_task_next(struct bpf_iter_css_task *it) +{ + struct bpf_iter_css_task_kern *kit = (void *)it; + + if (!kit->css_it) + return NULL; + return css_task_iter_next(kit->css_it); +} + +__bpf_kfunc void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it) +{ + struct bpf_iter_css_task_kern *kit = (void *)it; + + if (!kit->css_it) + return; + css_task_iter_end(kit->css_it); + kfree(kit->css_it); +} + DEFINE_PER_CPU(struct mmap_unlock_irq_work, mmap_unlock_work); static void do_mmap_read_unlock(struct irq_work *entry) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 60a9d59beeab..2a6e9b99564b 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -7195,4 +7195,8 @@ struct bpf_iter_num { __u64 __opaque[1]; } __attribute__((aligned(8))); +struct bpf_iter_css_task { + __u64 __opaque[1]; +} __attribute__((aligned(8))); + #endif /* _UAPI__LINUX_BPF_H__ */ diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index bbab9ad9dc5a..f4d74b2aaddd 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -302,6 +302,13 @@ extern int bpf_iter_num_new(struct bpf_iter_num *it, int start, int end) __weak extern int *bpf_iter_num_next(struct bpf_iter_num *it) __weak __ksym; extern void bpf_iter_num_destroy(struct bpf_iter_num *it) __weak __ksym; +struct bpf_iter_css_task; +struct cgroup_subsys_state; +extern int bpf_iter_css_task_new(struct bpf_iter_css_task *it, + struct cgroup_subsys_state *css, unsigned int flags) __weak __ksym; +extern struct task_struct *bpf_iter_css_task_next(struct bpf_iter_css_task *it) __weak __ksym; +extern void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it) __weak __ksym; + #ifndef bpf_for_each /* bpf_for_each(iter_type, cur_elem, args...) provides generic construct for * using BPF open-coded iterators without having to write mundane explicit
This Patch adds kfuncs bpf_iter_css_task_{new,next,destroy} which allow creation and manipulation of struct bpf_iter_css_task in open-coded iterator style. These kfuncs actually wrapps css_task_iter_{start,next,end}. BPF programs can use these kfuncs through bpf_for_each macro for iteration of all tasks under a css. Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com> --- include/uapi/linux/bpf.h | 4 ++++ kernel/bpf/helpers.c | 3 +++ kernel/bpf/task_iter.c | 39 ++++++++++++++++++++++++++++++++++ tools/include/uapi/linux/bpf.h | 4 ++++ tools/lib/bpf/bpf_helpers.h | 7 ++++++ 5 files changed, 57 insertions(+)