Message ID | 20240319050302.1085006-1-josef@netflix.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [V3,bpf-next,1/2] bpf: add bpf_task_get_cgroup kfunc | expand |
On Mon, Mar 18, 2024 at 10:04 PM Jose Fernandez <josef@netflix.com> wrote: > > This patch enhances the BPF helpers by adding a kfunc to retrieve the > cgroup v2 of a task, addressing a previous limitation where only > bpf_task_get_cgroup1 was available for cgroup v1. The new kfunc is > particularly useful for scenarios where obtaining the cgroup ID of a > task other than the "current" one is necessary, which the existing > bpf_get_current_cgroup_id helper cannot accommodate. A specific use > case at Netflix involved the sched_switch tracepoint, where we had to > get the cgroup IDs of both the prev and next tasks. > > The bpf_task_get_cgroup kfunc acquires and returns a reference to a > task's default cgroup, ensuring thread-safe access by correctly > implementing RCU read locking and unlocking. It leverages the existing > cgroup.h helper, and cgroup_tryget to safely acquire a reference to it. > > Signed-off-by: Jose Fernandez <josef@netflix.com> > Reviewed-by: Tycho Andersen <tycho@tycho.pizza> > Acked-by: Yonghong Song <yonghong.song@linux.dev> > Acked-by: Stanislav Fomichev <sdf@google.com> > --- > V2 -> V3: No changes > V1 -> V2: Return a pointer to the cgroup instead of the cgroup ID > > kernel/bpf/helpers.c | 26 ++++++++++++++++++++++++++ > 1 file changed, 26 insertions(+) > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > index a89587859571..bbd19d5eedb6 100644 > --- a/kernel/bpf/helpers.c > +++ b/kernel/bpf/helpers.c > @@ -2266,6 +2266,31 @@ bpf_task_get_cgroup1(struct task_struct *task, int hierarchy_id) > return NULL; > return cgrp; > } > + > +/** > + * bpf_task_get_cgroup - Acquire a reference to the default cgroup of a task. > + * @task: The target task > + * > + * This function returns the task's default cgroup, primarily > + * designed for use with cgroup v2. In cgroup v1, the concept of default > + * cgroup varies by subsystem, and while this function will work with > + * cgroup v1, it's recommended to use bpf_task_get_cgroup1 instead. > + * A cgroup returned by this kfunc which is not subsequently stored in a > + * map, must be released by calling bpf_cgroup_release(). > + * > + * Return: On success, the cgroup is returned. On failure, NULL is returned. > + */ > +__bpf_kfunc struct cgroup *bpf_task_get_cgroup(struct task_struct *task) > +{ > + struct cgroup *cgrp; > + > + rcu_read_lock(); > + cgrp = task_dfl_cgroup(task); > + if (!cgroup_tryget(cgrp)) > + cgrp = NULL; > + rcu_read_unlock(); > + return cgrp; > +} Tejun, Does this patch look good to you?
On Mon, Mar 18, 2024 at 11:03:01PM -0600, Jose Fernandez wrote: > This patch enhances the BPF helpers by adding a kfunc to retrieve the > cgroup v2 of a task, addressing a previous limitation where only > bpf_task_get_cgroup1 was available for cgroup v1. The new kfunc is > particularly useful for scenarios where obtaining the cgroup ID of a > task other than the "current" one is necessary, which the existing > bpf_get_current_cgroup_id helper cannot accommodate. A specific use > case at Netflix involved the sched_switch tracepoint, where we had to > get the cgroup IDs of both the prev and next tasks. > > The bpf_task_get_cgroup kfunc acquires and returns a reference to a > task's default cgroup, ensuring thread-safe access by correctly > implementing RCU read locking and unlocking. It leverages the existing > cgroup.h helper, and cgroup_tryget to safely acquire a reference to it. > > Signed-off-by: Jose Fernandez <josef@netflix.com> > Reviewed-by: Tycho Andersen <tycho@tycho.pizza> > Acked-by: Yonghong Song <yonghong.song@linux.dev> > Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Tejun Heo <tj@kernel.org> but some questions below > +__bpf_kfunc struct cgroup *bpf_task_get_cgroup(struct task_struct *task) > +{ > + struct cgroup *cgrp; > + > + rcu_read_lock(); > + cgrp = task_dfl_cgroup(task); > + if (!cgroup_tryget(cgrp)) > + cgrp = NULL; > + rcu_read_unlock(); > + return cgrp; > +} So, as this is a lot easier in cgroup2, the above can probably be written directly in BPF (untested and not sure the necessary annotations are in place, so please take it with a big grain of salt): bpf_rcu_read_lock(); cgrp = task->cgroups->dfl_cgrp; cgrp = bpf_cgroup_from_id(cgrp->kn.id); bpf_rcu_read_unlock(); If all you need is ID, it's even simpler: bpf_rcu_read_lock(); cgrp_id = task->cgroups->dfl_cgrp->kn.id; bpf_rcu_read_unlock(); In the first example, it's not great that we go from task pointer to cgroup pointer to ID and then back to acquired cgroup pointer. I wonder whether what we really want is to support something like the following: bpf_rcu_read_lock(); cgrp = bpf_cgroup_tryget(task->cgroups->dfl_cgrp); bpf_rcu_read_unlock(); What do you think? Thanks.
On Wed, Mar 20, 2024 at 7:41 AM Tejun Heo <tj@kernel.org> wrote: > > On Mon, Mar 18, 2024 at 11:03:01PM -0600, Jose Fernandez wrote: > > This patch enhances the BPF helpers by adding a kfunc to retrieve the > > cgroup v2 of a task, addressing a previous limitation where only > > bpf_task_get_cgroup1 was available for cgroup v1. The new kfunc is > > particularly useful for scenarios where obtaining the cgroup ID of a > > task other than the "current" one is necessary, which the existing > > bpf_get_current_cgroup_id helper cannot accommodate. A specific use > > case at Netflix involved the sched_switch tracepoint, where we had to > > get the cgroup IDs of both the prev and next tasks. > > > > The bpf_task_get_cgroup kfunc acquires and returns a reference to a > > task's default cgroup, ensuring thread-safe access by correctly > > implementing RCU read locking and unlocking. It leverages the existing > > cgroup.h helper, and cgroup_tryget to safely acquire a reference to it. > > > > Signed-off-by: Jose Fernandez <josef@netflix.com> > > Reviewed-by: Tycho Andersen <tycho@tycho.pizza> > > Acked-by: Yonghong Song <yonghong.song@linux.dev> > > Acked-by: Stanislav Fomichev <sdf@google.com> > > Acked-by: Tejun Heo <tj@kernel.org> > > but some questions below > > > +__bpf_kfunc struct cgroup *bpf_task_get_cgroup(struct task_struct *task) > > +{ > > + struct cgroup *cgrp; > > + > > + rcu_read_lock(); > > + cgrp = task_dfl_cgroup(task); > > + if (!cgroup_tryget(cgrp)) > > + cgrp = NULL; > > + rcu_read_unlock(); > > + return cgrp; > > +} > > So, as this is a lot easier in cgroup2, the above can probably be written > directly in BPF (untested and not sure the necessary annotations are in > place, so please take it with a big grain of salt): > > bpf_rcu_read_lock(); > cgrp = task->cgroups->dfl_cgrp; > cgrp = bpf_cgroup_from_id(cgrp->kn.id); > bpf_rcu_read_unlock(); > > If all you need is ID, it's even simpler: > > bpf_rcu_read_lock(); > cgrp_id = task->cgroups->dfl_cgrp->kn.id; > bpf_rcu_read_unlock(); Good point. Looks like this kfunc isn't really needed. We even have the following in one of the selftests: /* simulate bpf_get_current_cgroup_id() helper */ bpf_rcu_read_lock(); cgroups = task->cgroups; if (!cgroups) goto unlock; cgroup_id = cgroups->dfl_cgrp->kn->id; unlock: bpf_rcu_read_unlock(); > In the first example, it's not great that we go from task pointer to cgroup > pointer to ID and then back to acquired cgroup pointer. I wonder whether > what we really want is to support something like the following: > > bpf_rcu_read_lock(); > cgrp = bpf_cgroup_tryget(task->cgroups->dfl_cgrp); > bpf_rcu_read_unlock(); this one should work already. that's existing bpf_cgroup_acquire(). pw-bot: cr
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index a89587859571..bbd19d5eedb6 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -2266,6 +2266,31 @@ bpf_task_get_cgroup1(struct task_struct *task, int hierarchy_id) return NULL; return cgrp; } + +/** + * bpf_task_get_cgroup - Acquire a reference to the default cgroup of a task. + * @task: The target task + * + * This function returns the task's default cgroup, primarily + * designed for use with cgroup v2. In cgroup v1, the concept of default + * cgroup varies by subsystem, and while this function will work with + * cgroup v1, it's recommended to use bpf_task_get_cgroup1 instead. + * A cgroup returned by this kfunc which is not subsequently stored in a + * map, must be released by calling bpf_cgroup_release(). + * + * Return: On success, the cgroup is returned. On failure, NULL is returned. + */ +__bpf_kfunc struct cgroup *bpf_task_get_cgroup(struct task_struct *task) +{ + struct cgroup *cgrp; + + rcu_read_lock(); + cgrp = task_dfl_cgroup(task); + if (!cgroup_tryget(cgrp)) + cgrp = NULL; + rcu_read_unlock(); + return cgrp; +} #endif /* CONFIG_CGROUPS */ /** @@ -2573,6 +2598,7 @@ BTF_ID_FLAGS(func, bpf_cgroup_ancestor, KF_ACQUIRE | KF_RCU | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_cgroup_from_id, KF_ACQUIRE | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_task_under_cgroup, KF_RCU) BTF_ID_FLAGS(func, bpf_task_get_cgroup1, KF_ACQUIRE | KF_RCU | KF_RET_NULL) +BTF_ID_FLAGS(func, bpf_task_get_cgroup, KF_ACQUIRE | KF_RCU | KF_RET_NULL) #endif BTF_ID_FLAGS(func, bpf_task_from_pid, KF_ACQUIRE | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_throw)