From patchwork Wed Oct 18 06:03:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chuyi Zhou X-Patchwork-Id: 13426390 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7F8528F59 for ; Wed, 18 Oct 2023 06:03:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="bAuwMM04" Received: from mail-pf1-x42f.google.com (mail-pf1-x42f.google.com [IPv6:2607:f8b0:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D4382F9 for ; Tue, 17 Oct 2023 23:03:34 -0700 (PDT) Received: by mail-pf1-x42f.google.com with SMTP id d2e1a72fcca58-6934202b8bdso5612152b3a.1 for ; Tue, 17 Oct 2023 23:03:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1697609014; x=1698213814; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=FNW2oYGtelVvgczyk2A13Ij/ocoUNYFTCyV6b+P6gh8=; b=bAuwMM04S+xXGF2mph5y8RlqZdBfuL+ajSBfC8QXGjoeoJLWhuDh9nA+BdUrXXdLWW rAcCpcs/Es1uO8VE3XLTgYaTkbQjVIEV9jUb7FbiWZw7IQFuWX4KPZshS/VpkfuxGY95 gG+7EPKiLPEgzcEEtBy0H+VoRbAHW+VPthYHyGF2s1f0FyHjGshSkGdg98/x2qOIMNX3 KzKzI0rP+BpU16oX2RhLPKA7HZ8oP3OY22w/9SvTGbbI+UBlJ4soHBifGHe1CQu6RrRg R3vJ9t8sdS0Zt4Wy4xat/oxRzcwuGr+5ANeSo+Tbq7RprU/9rGDHYL09ypIrKBbbatUI jiog== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697609014; x=1698213814; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=FNW2oYGtelVvgczyk2A13Ij/ocoUNYFTCyV6b+P6gh8=; b=wzpwo5KaXbhzPdpbHOxvH5ZAqbpAFSoJ5W0W/izoD5z3C3AMoEX0awxBeW5gnl4hDD 5UPyZZk0gxmeFzUuqQVdBGWl+ZDvbL4d/RgHlWyAxeIwwI2qcEiBQ6Mrlh3Q2Vu0hz1T o5EJ2rLH6YKtM47+DuWzfngBD+kFcqakFUUNxbd2OFkYc+wUfGyGHiGownEKte8huzVF 0vkyrgdm8Hwuyo6UPWiYc/B2qNAnahP+UfAFUEp3phWzCF+yfxgynM0eFUFeGrAhWuc1 YqUGYkge94KpBHNnubOYUawFZQpp6ZtVYTc89hzroxCtByvBbuI7cV072ShuQ+jjnMKt CMkQ== X-Gm-Message-State: AOJu0Yxn0oaVygptJKaxq2IarbHr7hmR4ANYMQNycLdgGiRBb8HdH3Ub BvTGscAnM32253A02Emm261jO1fAXQfkqMyRCD0= X-Google-Smtp-Source: AGHT+IHqopjsr9BP1ss3CSoo05FNLUC7HlYkBIJAwdcXGipKNcB3IBn5N91CU3D2e0oxtKbS3nDKUA== X-Received: by 2002:a05:6a20:72ab:b0:15e:bb88:b76e with SMTP id o43-20020a056a2072ab00b0015ebb88b76emr4544752pzk.14.1697609014178; Tue, 17 Oct 2023 23:03:34 -0700 (PDT) Received: from n37-019-243.byted.org ([180.184.51.40]) by smtp.gmail.com with ESMTPSA id b14-20020a17090acc0e00b00276cb03a0e9sm505782pju.46.2023.10.17.23.03.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Oct 2023 23:03:33 -0700 (PDT) From: Chuyi Zhou To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, tj@kernel.org, linux-kernel@vger.kernel.org, Chuyi Zhou Subject: [PATCH bpf-next v6 1/8] cgroup: Prepare for using css_task_iter_*() in BPF Date: Wed, 18 Oct 2023 14:03:11 +0800 Message-Id: <20231018060318.105524-2-zhouchuyi@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20231018060318.105524-1-zhouchuyi@bytedance.com> References: <20231018060318.105524-1-zhouchuyi@bytedance.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net This patch makes some preparations for using css_task_iter_*() in BPF Program. 1. Flags CSS_TASK_ITER_* are #define-s and it's not easy for bpf prog to use them. Convert them to enum so bpf prog can take them from vmlinux.h. 2. In the next patch we will add css_task_iter_*() in common kfuncs which is not safe. Since css_task_iter_*() does spin_unlock_irq() which might screw up irq flags depending on the context where bpf prog is running. So we should use irqsave/irqrestore here and the switching is harmless. Suggested-by: Alexei Starovoitov Signed-off-by: Chuyi Zhou Acked-by: Tejun Heo --- include/linux/cgroup.h | 12 +++++------- kernel/cgroup/cgroup.c | 18 ++++++++++++------ 2 files changed, 17 insertions(+), 13 deletions(-) diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index b307013b9c6c..0ef0af66080e 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -40,13 +40,11 @@ struct kernel_clone_args; #define CGROUP_WEIGHT_DFL 100 #define CGROUP_WEIGHT_MAX 10000 -/* walk only threadgroup leaders */ -#define CSS_TASK_ITER_PROCS (1U << 0) -/* walk all threaded css_sets in the domain */ -#define CSS_TASK_ITER_THREADED (1U << 1) - -/* internal flags */ -#define CSS_TASK_ITER_SKIPPED (1U << 16) +enum { + CSS_TASK_ITER_PROCS = (1U << 0), /* walk only threadgroup leaders */ + CSS_TASK_ITER_THREADED = (1U << 1), /* walk all threaded css_sets in the domain */ + CSS_TASK_ITER_SKIPPED = (1U << 16), /* internal flags */ +}; /* a css_task_iter should be treated as an opaque object */ struct css_task_iter { diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 1fb7f562289d..b6d64f3b8888 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -4917,9 +4917,11 @@ static void css_task_iter_advance(struct css_task_iter *it) void css_task_iter_start(struct cgroup_subsys_state *css, unsigned int flags, struct css_task_iter *it) { + unsigned long irqflags; + memset(it, 0, sizeof(*it)); - spin_lock_irq(&css_set_lock); + spin_lock_irqsave(&css_set_lock, irqflags); it->ss = css->ss; it->flags = flags; @@ -4933,7 +4935,7 @@ void css_task_iter_start(struct cgroup_subsys_state *css, unsigned int flags, css_task_iter_advance(it); - spin_unlock_irq(&css_set_lock); + spin_unlock_irqrestore(&css_set_lock, irqflags); } /** @@ -4946,12 +4948,14 @@ void css_task_iter_start(struct cgroup_subsys_state *css, unsigned int flags, */ struct task_struct *css_task_iter_next(struct css_task_iter *it) { + unsigned long irqflags; + if (it->cur_task) { put_task_struct(it->cur_task); it->cur_task = NULL; } - spin_lock_irq(&css_set_lock); + spin_lock_irqsave(&css_set_lock, irqflags); /* @it may be half-advanced by skips, finish advancing */ if (it->flags & CSS_TASK_ITER_SKIPPED) @@ -4964,7 +4968,7 @@ struct task_struct *css_task_iter_next(struct css_task_iter *it) css_task_iter_advance(it); } - spin_unlock_irq(&css_set_lock); + spin_unlock_irqrestore(&css_set_lock, irqflags); return it->cur_task; } @@ -4977,11 +4981,13 @@ struct task_struct *css_task_iter_next(struct css_task_iter *it) */ void css_task_iter_end(struct css_task_iter *it) { + unsigned long irqflags; + if (it->cur_cset) { - spin_lock_irq(&css_set_lock); + spin_lock_irqsave(&css_set_lock, irqflags); list_del(&it->iters_node); put_css_set_locked(it->cur_cset); - spin_unlock_irq(&css_set_lock); + spin_unlock_irqrestore(&css_set_lock, irqflags); } if (it->cur_dcset) From patchwork Wed Oct 18 06:03:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chuyi Zhou X-Patchwork-Id: 13426391 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 94A1C8F45 for ; Wed, 18 Oct 2023 06:03:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="lNlW4MHS" Received: from mail-pg1-x534.google.com (mail-pg1-x534.google.com [IPv6:2607:f8b0:4864:20::534]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 12435FA for ; Tue, 17 Oct 2023 23:03:38 -0700 (PDT) Received: by mail-pg1-x534.google.com with SMTP id 41be03b00d2f7-5a9d8f4388bso2907063a12.3 for ; Tue, 17 Oct 2023 23:03:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1697609017; x=1698213817; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=BbIp3xjsMf+vYU1+9/Ac/+6x9YCK8BhSfh7PegmcPCk=; b=lNlW4MHStfBXSQ1ZYeV+XIS3QOZLzJkZOHDX+SGQImWJR7wjGSf+LEEKSXWalAzDgq pTspO64J2sy8MGGCQfunb4OU9Th+nqO2tKCiVZepLEuinRBm1hRJaRk+3jwVLGo5GCLQ o7ZhwvrYdBBzmIknzQxpxRAjd/EyvkzZwcohPREpsN73nDrgmzTkbAlJm2r0ENKyjmrR 0sBrxTdId0UKqNabUoYw4oOqYO5MXGAw9XPv15KLdJUjnWwX4NVJh93nb3Fq9ZTdxnpY P+YQHXbNtVJwCu367oMJCHfzwXVr8i21f1tKDYt5KrcOPMeSbSwAh/Kj68yg2oEfNZwh r5CQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697609017; x=1698213817; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=BbIp3xjsMf+vYU1+9/Ac/+6x9YCK8BhSfh7PegmcPCk=; b=k/EdrKBmAVe6okooryS+CdXr2ej8jpBTO05/uTOBQNnbGhO9J8TRR5U0jTZgCxqXm0 NgAY4P4iePHVS0vKrnE6GW4HSaT5Eq+2Q4RNvlzkI1BBbQWtX1R6AQfqqXcVQ4NHo61e QoHBtPYks0X8wgeg0oJ7kWfN3QbNUZq/bYPdo0puC5EKP4zqfFfCJc4CXGIpoQQLyQpM w7isuNDbEvVNQMRJkX4hCp+0t8EJyhJ1wOwBy4yckNLXz8u+gHi6nMokbNMPydBjWpcu HDYlenIVY0wWL9sBHfSeDUgYv7KAStarBnxUxkiiH6p4YzMKhpOAEARuSC6wgKPS+3lD eH+Q== X-Gm-Message-State: AOJu0YzB5KbGJzKtmK/oK3KFZEyVngIjBC1K0o1aegevFea3SfycJLix ZpRtVtmFeSttSrLcb1kOpA2FLG38eJn8zGOwpu0= X-Google-Smtp-Source: AGHT+IFMdAqLCRZf5Wjki/UePWaDQXS6ZA3IGknhq+zJ6zAC4wWQO38tcJ+B2LN1oIJ3Nc21rJln/Q== X-Received: by 2002:a05:6a21:999c:b0:17a:f2ed:e921 with SMTP id ve28-20020a056a21999c00b0017af2ede921mr4145757pzb.55.1697609017344; Tue, 17 Oct 2023 23:03:37 -0700 (PDT) Received: from n37-019-243.byted.org ([180.184.51.40]) by smtp.gmail.com with ESMTPSA id b14-20020a17090acc0e00b00276cb03a0e9sm505782pju.46.2023.10.17.23.03.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Oct 2023 23:03:37 -0700 (PDT) From: Chuyi Zhou To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, tj@kernel.org, linux-kernel@vger.kernel.org, Chuyi Zhou Subject: [PATCH bpf-next v6 2/8] bpf: Introduce css_task open-coded iterator kfuncs Date: Wed, 18 Oct 2023 14:03:12 +0800 Message-Id: <20231018060318.105524-3-zhouchuyi@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20231018060318.105524-1-zhouchuyi@bytedance.com> References: <20231018060318.105524-1-zhouchuyi@bytedance.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net This patch adds kfuncs bpf_iter_css_task_{new,next,destroy} which allow creation and manipulation of struct bpf_iter_css_task in open-coded iterator style. These kfuncs actually wrapps css_task_iter_{start,next, end}. BPF programs can use these kfuncs through bpf_for_each macro for iteration of all tasks under a css. css_task_iter_*() would try to get the global spin-lock *css_set_lock*, so the bpf side has to be careful in where it allows to use this iter. Currently we only allow it in bpf_lsm and bpf iter-s. Signed-off-by: Chuyi Zhou Acked-by: Tejun Heo --- kernel/bpf/helpers.c | 3 + kernel/bpf/task_iter.c | 58 +++++++++++++++++++ kernel/bpf/verifier.c | 23 ++++++++ .../testing/selftests/bpf/bpf_experimental.h | 8 +++ 4 files changed, 92 insertions(+) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 61f51dee8448..c01441db9fd5 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -2560,6 +2560,9 @@ BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY) BTF_ID_FLAGS(func, bpf_iter_task_vma_new, KF_ITER_NEW | KF_RCU) BTF_ID_FLAGS(func, bpf_iter_task_vma_next, KF_ITER_NEXT | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_iter_task_vma_destroy, KF_ITER_DESTROY) +BTF_ID_FLAGS(func, bpf_iter_css_task_new, KF_ITER_NEW | KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_iter_css_task_next, KF_ITER_NEXT | KF_RET_NULL) +BTF_ID_FLAGS(func, bpf_iter_css_task_destroy, KF_ITER_DESTROY) BTF_ID_FLAGS(func, bpf_dynptr_adjust) BTF_ID_FLAGS(func, bpf_dynptr_is_null) BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c index fef17628341f..e4126698cecf 100644 --- a/kernel/bpf/task_iter.c +++ b/kernel/bpf/task_iter.c @@ -894,6 +894,64 @@ __bpf_kfunc void bpf_iter_task_vma_destroy(struct bpf_iter_task_vma *it) __diag_pop(); +struct bpf_iter_css_task { + __u64 __opaque[1]; +} __attribute__((aligned(8))); + +struct bpf_iter_css_task_kern { + struct css_task_iter *css_it; +} __attribute__((aligned(8))); + +__diag_push(); +__diag_ignore_all("-Wmissing-prototypes", + "Global functions as their definitions will be in vmlinux BTF"); + +__bpf_kfunc int bpf_iter_css_task_new(struct bpf_iter_css_task *it, + struct cgroup_subsys_state *css, unsigned int flags) +{ + struct bpf_iter_css_task_kern *kit = (void *)it; + + BUILD_BUG_ON(sizeof(struct bpf_iter_css_task_kern) != sizeof(struct bpf_iter_css_task)); + BUILD_BUG_ON(__alignof__(struct bpf_iter_css_task_kern) != + __alignof__(struct bpf_iter_css_task)); + kit->css_it = NULL; + switch (flags) { + case CSS_TASK_ITER_PROCS | CSS_TASK_ITER_THREADED: + case CSS_TASK_ITER_PROCS: + case 0: + break; + default: + return -EINVAL; + } + + kit->css_it = bpf_mem_alloc(&bpf_global_ma, sizeof(struct css_task_iter)); + if (!kit->css_it) + return -ENOMEM; + css_task_iter_start(css, flags, kit->css_it); + return 0; +} + +__bpf_kfunc struct task_struct *bpf_iter_css_task_next(struct bpf_iter_css_task *it) +{ + struct bpf_iter_css_task_kern *kit = (void *)it; + + if (!kit->css_it) + return NULL; + return css_task_iter_next(kit->css_it); +} + +__bpf_kfunc void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it) +{ + struct bpf_iter_css_task_kern *kit = (void *)it; + + if (!kit->css_it) + return; + css_task_iter_end(kit->css_it); + bpf_mem_free(&bpf_global_ma, kit->css_it); +} + +__diag_pop(); + DEFINE_PER_CPU(struct mmap_unlock_irq_work, mmap_unlock_work); static void do_mmap_read_unlock(struct irq_work *entry) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index bb58987e4844..974713185269 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -10472,6 +10472,7 @@ enum special_kfunc_type { KF_bpf_percpu_obj_new_impl, KF_bpf_percpu_obj_drop_impl, KF_bpf_throw, + KF_bpf_iter_css_task_new, }; BTF_SET_START(special_kfunc_set) @@ -10495,6 +10496,7 @@ BTF_ID(func, bpf_dynptr_clone) BTF_ID(func, bpf_percpu_obj_new_impl) BTF_ID(func, bpf_percpu_obj_drop_impl) BTF_ID(func, bpf_throw) +BTF_ID(func, bpf_iter_css_task_new) BTF_SET_END(special_kfunc_set) BTF_ID_LIST(special_kfunc_list) @@ -10520,6 +10522,7 @@ BTF_ID(func, bpf_dynptr_clone) BTF_ID(func, bpf_percpu_obj_new_impl) BTF_ID(func, bpf_percpu_obj_drop_impl) BTF_ID(func, bpf_throw) +BTF_ID(func, bpf_iter_css_task_new) static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta) { @@ -11050,6 +11053,20 @@ static int process_kf_arg_ptr_to_rbtree_node(struct bpf_verifier_env *env, &meta->arg_rbtree_root.field); } +static bool check_css_task_iter_allowlist(struct bpf_verifier_env *env) +{ + enum bpf_prog_type prog_type = resolve_prog_type(env->prog); + + switch (prog_type) { + case BPF_PROG_TYPE_LSM: + return true; + case BPF_TRACE_ITER: + return env->prog->aux->sleepable; + default: + return false; + } +} + static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_arg_meta *meta, int insn_idx) { @@ -11300,6 +11317,12 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ break; } case KF_ARG_PTR_TO_ITER: + if (meta->func_id == special_kfunc_list[KF_bpf_iter_css_task_new]) { + if (!check_css_task_iter_allowlist(env)) { + verbose(env, "css_task_iter is only allowed in bpf_lsm and bpf iter-s\n"); + return -EINVAL; + } + } ret = process_iter_arg(env, regno, insn_idx, meta); if (ret < 0) return ret; diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h index 2c8cb3f61529..6792ed2b45d7 100644 --- a/tools/testing/selftests/bpf/bpf_experimental.h +++ b/tools/testing/selftests/bpf/bpf_experimental.h @@ -458,4 +458,12 @@ extern void bpf_throw(u64 cookie) __ksym; __bpf_assert_op(LHS, <=, END, value, false); \ }) +struct bpf_iter_css_task; +struct cgroup_subsys_state; +extern int bpf_iter_css_task_new(struct bpf_iter_css_task *it, + struct cgroup_subsys_state *css, unsigned int flags) __weak __ksym; +extern struct task_struct *bpf_iter_css_task_next(struct bpf_iter_css_task *it) __weak __ksym; +extern void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it) __weak __ksym; + + #endif From patchwork Wed Oct 18 06:03:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chuyi Zhou X-Patchwork-Id: 13426392 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1335F8F45 for ; Wed, 18 Oct 2023 06:03:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="YUZoUS8m" Received: from mail-pg1-x52e.google.com (mail-pg1-x52e.google.com [IPv6:2607:f8b0:4864:20::52e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AC6DFEA for ; Tue, 17 Oct 2023 23:03:41 -0700 (PDT) Received: by mail-pg1-x52e.google.com with SMTP id 41be03b00d2f7-5aaebfac4b0so3066529a12.2 for ; Tue, 17 Oct 2023 23:03:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1697609021; x=1698213821; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=6pA+uWgAid8oSLs2nLYNpXTwEjADAAqZra1pWMMUUeE=; b=YUZoUS8mKtF9eJp0E5jNZsZoUzKff6RJGp2trsjygUpRzQoecB84eC6OpgwkKq159u EPg0Q4cKZiZn5vtq0RZYe422+jQYoGmRayS506CQnsfqbaIb+oVehf0+9PRproDfY0SW CoHMQMUF8OvYEdHfyRxM+BAKY9QVc3aIOgzBXqrr6Z+ZLSYNEr9Dnpr6Ca2lgjG2vG1C +IlJILhZIjqr7dlgTi0MHDu+6AoW4+iQ98Fnou/GRa7cvmxFRnzY5m3yvJxGh8QrEcri CjVV2vVvCX2TbaJ3ilq4MCd766F58DFFISb2mE+UG2LEHzj+lgft6JraT1jIxldcLgGy MitA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697609021; x=1698213821; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=6pA+uWgAid8oSLs2nLYNpXTwEjADAAqZra1pWMMUUeE=; b=MyiOU6gcjZOFrU1ukOzl/MFdLU+k0IrvFg3sybbsyDvkN503GtlLbhymvFCSfSfPAZ /yVrC78295CKRm5K2OBNanQVEx1BVJhCZIJXzgBCEMiaYDm/bgdzdKboXa9e55TMqPtw h9dgsNSNfPFwzCnAnSw9Z7kEn1jTafH9eETksLo102q7RgzCBh6B+6Xw2ilJ+THCJlXJ gS4ECZTQQkjpeHQn+FVF/CBB/PidxX46mPAMA3vgzOxfDcUapb25X1WqKIHXgaaevmsW Jl881Exu/AMxw5ejNoN20hyEH8a27Q+DtETscAXncXpuo4dx5Bu9VkfNGTp/GVCMLKYH FERw== X-Gm-Message-State: AOJu0YxBg03ny3vNIQtt9dTSxMDuDllm04PVT7HH9s4+zKMctH1zWlvQ iGjWxhCe8Z9vuCacqrZAqqsoi400wjDKraFUdwU= X-Google-Smtp-Source: AGHT+IEO+03WhZt4F6AHpfpw7A9KXjwU+OZEB7duCyd8ayOSyzXuyioFP+BNVpHWrGFf/YIaxP8YAw== X-Received: by 2002:a17:90a:1141:b0:27d:61ff:3d3b with SMTP id d1-20020a17090a114100b0027d61ff3d3bmr4503004pje.38.1697609020936; Tue, 17 Oct 2023 23:03:40 -0700 (PDT) Received: from n37-019-243.byted.org ([180.184.51.40]) by smtp.gmail.com with ESMTPSA id b14-20020a17090acc0e00b00276cb03a0e9sm505782pju.46.2023.10.17.23.03.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Oct 2023 23:03:40 -0700 (PDT) From: Chuyi Zhou To: bpf@vger.kernel.org Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, tj@kernel.org, linux-kernel@vger.kernel.org, Chuyi Zhou Subject: [PATCH bpf-next v6 3/8] bpf: Introduce task open coded iterator kfuncs Date: Wed, 18 Oct 2023 14:03:13 +0800 Message-Id: <20231018060318.105524-4-zhouchuyi@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20231018060318.105524-1-zhouchuyi@bytedance.com> References: <20231018060318.105524-1-zhouchuyi@bytedance.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net This patch adds kfuncs bpf_iter_task_{new,next,destroy} which allow creation and manipulation of struct bpf_iter_task in open-coded iterator style. BPF programs can use these kfuncs or through bpf_for_each macro to iterate all processes in the system. The API design keep consistent with SEC("iter/task"). bpf_iter_task_new() accepts a specific task and iterating type which allows: 1. iterating all process in the system (BPF_TASK_ITER_ALL_PROCS) 2. iterating all threads in the system (BPF_TASK_ITER_ALL_THREADS) 3. iterating all threads of a specific task (BPF_TASK_ITER_PROC_THREADS) Signed-off-by: Chuyi Zhou --- kernel/bpf/helpers.c | 3 + kernel/bpf/task_iter.c | 90 +++++++++++++++++++ .../testing/selftests/bpf/bpf_experimental.h | 5 ++ 3 files changed, 98 insertions(+) diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index c01441db9fd5..c25941531265 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -2563,6 +2563,9 @@ BTF_ID_FLAGS(func, bpf_iter_task_vma_destroy, KF_ITER_DESTROY) BTF_ID_FLAGS(func, bpf_iter_css_task_new, KF_ITER_NEW | KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_iter_css_task_next, KF_ITER_NEXT | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_iter_css_task_destroy, KF_ITER_DESTROY) +BTF_ID_FLAGS(func, bpf_iter_task_new, KF_ITER_NEW | KF_TRUSTED_ARGS) +BTF_ID_FLAGS(func, bpf_iter_task_next, KF_ITER_NEXT | KF_RET_NULL) +BTF_ID_FLAGS(func, bpf_iter_task_destroy, KF_ITER_DESTROY) BTF_ID_FLAGS(func, bpf_dynptr_adjust) BTF_ID_FLAGS(func, bpf_dynptr_is_null) BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly) diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c index e4126698cecf..faa1712c1df5 100644 --- a/kernel/bpf/task_iter.c +++ b/kernel/bpf/task_iter.c @@ -952,6 +952,96 @@ __bpf_kfunc void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it) __diag_pop(); +struct bpf_iter_task { + __u64 __opaque[3]; +} __attribute__((aligned(8))); + +struct bpf_iter_task_kern { + struct task_struct *task; + struct task_struct *pos; + unsigned int flags; +} __attribute__((aligned(8))); + +enum { + /* all process in the system */ + BPF_TASK_ITER_ALL_PROCS, + /* all threads in the system */ + BPF_TASK_ITER_ALL_THREADS, + /* all threads of a specific process */ + BPF_TASK_ITER_PROC_THREADS +}; + +__diag_push(); +__diag_ignore_all("-Wmissing-prototypes", + "Global functions as their definitions will be in vmlinux BTF"); + +__bpf_kfunc int bpf_iter_task_new(struct bpf_iter_task *it, + struct task_struct *task, unsigned int flags) +{ + struct bpf_iter_task_kern *kit = (void *)it; + + BUILD_BUG_ON(sizeof(struct bpf_iter_task_kern) > sizeof(struct bpf_iter_task)); + BUILD_BUG_ON(__alignof__(struct bpf_iter_task_kern) != + __alignof__(struct bpf_iter_task)); + + kit->task = kit->pos = NULL; + switch (flags) { + case BPF_TASK_ITER_ALL_THREADS: + case BPF_TASK_ITER_ALL_PROCS: + case BPF_TASK_ITER_PROC_THREADS: + break; + default: + return -EINVAL; + } + + if (flags == BPF_TASK_ITER_PROC_THREADS) + kit->task = task; + else + kit->task = &init_task; + kit->pos = kit->task; + kit->flags = flags; + return 0; +} + +__bpf_kfunc struct task_struct *bpf_iter_task_next(struct bpf_iter_task *it) +{ + struct bpf_iter_task_kern *kit = (void *)it; + struct task_struct *pos; + unsigned int flags; + + flags = kit->flags; + pos = kit->pos; + + if (!pos) + return pos; + + if (flags == BPF_TASK_ITER_ALL_PROCS) + goto get_next_task; + + kit->pos = next_thread(kit->pos); + if (kit->pos == kit->task) { + if (flags == BPF_TASK_ITER_PROC_THREADS) { + kit->pos = NULL; + return pos; + } + } else + return pos; + +get_next_task: + kit->pos = next_task(kit->pos); + kit->task = kit->pos; + if (kit->pos == &init_task) + kit->pos = NULL; + + return pos; +} + +__bpf_kfunc void bpf_iter_task_destroy(struct bpf_iter_task *it) +{ +} + +__diag_pop(); + DEFINE_PER_CPU(struct mmap_unlock_irq_work, mmap_unlock_work); static void do_mmap_read_unlock(struct irq_work *entry) diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h index 6792ed2b45d7..2f6c747aa874 100644 --- a/tools/testing/selftests/bpf/bpf_experimental.h +++ b/tools/testing/selftests/bpf/bpf_experimental.h @@ -465,5 +465,10 @@ extern int bpf_iter_css_task_new(struct bpf_iter_css_task *it, extern struct task_struct *bpf_iter_css_task_next(struct bpf_iter_css_task *it) __weak __ksym; extern void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it) __weak __ksym; +struct bpf_iter_task; +extern int bpf_iter_task_new(struct bpf_iter_task *it, + struct task_struct *task, unsigned int flags) __weak __ksym; +extern struct task_struct *bpf_iter_task_next(struct bpf_iter_task *it) __weak __ksym; +extern void bpf_iter_task_destroy(struct bpf_iter_task *it) __weak __ksym; #endif