From patchwork Thu Oct 10 23:25:03 2024
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13831301
From: Namhyung Kim
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
Cc: Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
 John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, LKML,
 bpf@vger.kernel.org, Andrew Morton, Christoph Lameter, Pekka Enberg,
 David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
 Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org,
 Arnaldo Carvalho de Melo, Kees Cook, "Paul E. McKenney"
Subject: [PATCH v5 bpf-next 1/3] bpf: Add kmem_cache iterator
Date: Thu, 10 Oct 2024 16:25:03 -0700
Message-ID: <20241010232505.1339892-2-namhyung@kernel.org>
In-Reply-To: <20241010232505.1339892-1-namhyung@kernel.org>
References: <20241010232505.1339892-1-namhyung@kernel.org>

The new "kmem_cache" iterator traverses the list of slab caches and calls
the attached BPF programs for each entry. A program should check whether
its argument (ctx.s) is NULL before using it.

The iteration now holds slab_mutex only while it traverses the list and
releases the mutex before it runs the BPF program. The kmem_cache entry
is protected by a refcount during the execution.

It includes the internal "mm/slab.h" header to access kmem_cache,
slab_caches and slab_mutex. Hope that's OK with the mm folks.

Signed-off-by: Namhyung Kim
Acked-by: Vlastimil Babka #slab
---
 include/linux/btf_ids.h      |   1 +
 kernel/bpf/Makefile          |   1 +
 kernel/bpf/kmem_cache_iter.c | 175 +++++++++++++++++++++++++++++++++++
 3 files changed, 177 insertions(+)
 create mode 100644 kernel/bpf/kmem_cache_iter.c

diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index c0e3e1426a82f5c4..139bdececdcfaefb 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -283,5 +283,6 @@ extern u32 btf_tracing_ids[];
 extern u32 bpf_cgroup_btf_id[];
 extern u32 bpf_local_storage_map_btf_id[];
 extern u32 btf_bpf_map_id[];
+extern u32 bpf_kmem_cache_btf_id[];
 
 #endif
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 9b9c151b5c826b31..105328f0b9c04e37 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -52,3 +52,4 @@ obj-$(CONFIG_BPF_PRELOAD) += preload/
 obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o
+obj-$(CONFIG_BPF_SYSCALL) += kmem_cache_iter.o
diff --git a/kernel/bpf/kmem_cache_iter.c b/kernel/bpf/kmem_cache_iter.c
new file mode 100644
index 0000000000000000..2de0682c6d4c773f
--- /dev/null
+++ b/kernel/bpf/kmem_cache_iter.c
@@ -0,0 +1,175 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2024 Google */
+#include
+#include
+#include
+#include
+#include
+
+#include "../../mm/slab.h" /* kmem_cache, slab_caches and slab_mutex */
+
+struct bpf_iter__kmem_cache {
+	__bpf_md_ptr(struct bpf_iter_meta *, meta);
+	__bpf_md_ptr(struct kmem_cache *, s);
+};
+
+static void *kmem_cache_iter_seq_start(struct seq_file *seq, loff_t *pos)
+{
+	loff_t cnt = 0;
+	bool found = false;
+	struct kmem_cache *s;
+
+	mutex_lock(&slab_mutex);
+
+	/* Find an entry at the given position in the slab_caches list instead
+	 * of keeping a reference (of the last visited entry, if any) out of
+	 * slab_mutex. It might miss something if one is deleted in the middle
+	 * while it releases the lock. But it should be rare and there's not
+	 * much we can do about it.
+	 */
+	list_for_each_entry(s, &slab_caches, list) {
+		if (cnt == *pos) {
+			/* Make sure this entry remains in the list by getting
+			 * a new reference count. Note that boot_cache entries
+			 * have a negative refcount, so don't touch them.
+			 */
+			if (s->refcount > 0)
+				s->refcount++;
+			found = true;
+			break;
+		}
+		cnt++;
+	}
+	mutex_unlock(&slab_mutex);
+
+	if (!found)
+		return NULL;
+
+	return s;
+}
+
+static void kmem_cache_iter_seq_stop(struct seq_file *seq, void *v)
+{
+	struct bpf_iter_meta meta;
+	struct bpf_iter__kmem_cache ctx = {
+		.meta = &meta,
+		.s = v,
+	};
+	struct bpf_prog *prog;
+	bool destroy = false;
+
+	meta.seq = seq;
+	prog = bpf_iter_get_info(&meta, true);
+	if (prog && !ctx.s)
+		bpf_iter_run_prog(prog, &ctx);
+
+	if (ctx.s == NULL)
+		return;
+
+	mutex_lock(&slab_mutex);
+
+	/* Skip kmem_cache_destroy() for active entries */
+	if (ctx.s->refcount > 1)
+		ctx.s->refcount--;
+	else if (ctx.s->refcount == 1)
+		destroy = true;
+
+	mutex_unlock(&slab_mutex);
+
+	if (destroy)
+		kmem_cache_destroy(ctx.s);
+}
+
+static void *kmem_cache_iter_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct kmem_cache *s = v;
+	struct kmem_cache *next = NULL;
+	bool destroy = false;
+
+	++*pos;
+
+	mutex_lock(&slab_mutex);
+
+	if (list_last_entry(&slab_caches, struct kmem_cache, list) != s) {
+		next = list_next_entry(s, list);
+
+		WARN_ON_ONCE(next->refcount == 0);
+
+		/* boot_caches have negative refcount, don't touch them */
+		if (next->refcount > 0)
+			next->refcount++;
+	}
+
+	/* Skip kmem_cache_destroy() for active entries */
+	if (s->refcount > 1)
+		s->refcount--;
+	else if (s->refcount == 1)
+		destroy = true;
+
+	mutex_unlock(&slab_mutex);
+
+	if (destroy)
+		kmem_cache_destroy(s);
+
+	return next;
+}
+
+static int kmem_cache_iter_seq_show(struct seq_file *seq, void *v)
+{
+	struct bpf_iter_meta meta;
+	struct bpf_iter__kmem_cache ctx = {
+		.meta = &meta,
+		.s = v,
+	};
+	struct bpf_prog *prog;
+	int ret = 0;
+
+	meta.seq = seq;
+	prog = bpf_iter_get_info(&meta, false);
+	if (prog)
+		ret = bpf_iter_run_prog(prog, &ctx);
+
+	return ret;
+}
+
+static const struct seq_operations kmem_cache_iter_seq_ops = {
+	.start = kmem_cache_iter_seq_start,
+	.next = kmem_cache_iter_seq_next,
+	.stop = kmem_cache_iter_seq_stop,
+	.show = kmem_cache_iter_seq_show,
+};
+
+BTF_ID_LIST_GLOBAL_SINGLE(bpf_kmem_cache_btf_id, struct, kmem_cache)
+
+static const struct bpf_iter_seq_info kmem_cache_iter_seq_info = {
+	.seq_ops = &kmem_cache_iter_seq_ops,
+};
+
+static void bpf_iter_kmem_cache_show_fdinfo(const struct bpf_iter_aux_info *aux,
+					    struct seq_file *seq)
+{
+	seq_puts(seq, "kmem_cache iter\n");
+}
+
+DEFINE_BPF_ITER_FUNC(kmem_cache, struct bpf_iter_meta *meta,
+		     struct kmem_cache *s)
+
+static struct bpf_iter_reg bpf_kmem_cache_reg_info = {
+	.target = "kmem_cache",
+	.feature = BPF_ITER_RESCHED,
+	.show_fdinfo = bpf_iter_kmem_cache_show_fdinfo,
+	.ctx_arg_info_size = 1,
+	.ctx_arg_info = {
+		{ offsetof(struct bpf_iter__kmem_cache, s),
+		  PTR_TO_BTF_ID_OR_NULL | PTR_UNTRUSTED },
+	},
+	.seq_info = &kmem_cache_iter_seq_info,
+};
+
+static int __init bpf_kmem_cache_iter_init(void)
+{
+	bpf_kmem_cache_reg_info.ctx_arg_info[0].btf_id = bpf_kmem_cache_btf_id[0];
+	return bpf_iter_reg_target(&bpf_kmem_cache_reg_info);
+}
+
+late_initcall(bpf_kmem_cache_iter_init);
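
Note for readers: the locking scheme described in the commit message
(hold slab_mutex only while stepping through the list, pin the current
entry with a refcount, run the BPF program with the mutex dropped) can be
sketched as a small userspace model. This is only an illustration under
stated assumptions, not kernel code: every model_* name is hypothetical,
a plain flag stands in for slab_mutex, a singly linked list stands in for
slab_caches, and "destroy" is only counted rather than performed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kmem_cache (not the kernel type). */
struct model_cache {
	const char *name;
	int refcount;			/* negative models a boot cache */
	struct model_cache *next;	/* NULL terminates the list */
};

/* A flag stands in for slab_mutex so the model stays single-threaded. */
static int model_lock_held;

static void model_lock(void)
{
	assert(!model_lock_held);
	model_lock_held = 1;
}

static void model_unlock(void)
{
	assert(model_lock_held);
	model_lock_held = 0;
}

/* Pin an entry so it stays on the list; boot caches are left alone. */
static void model_pin(struct model_cache *s)
{
	if (s->refcount > 0)
		s->refcount++;
}

/* Drop the pin. Returns 1 if the walker held the last reference. */
static int model_unpin(struct model_cache *s)
{
	if (s->refcount > 1) {
		s->refcount--;
		return 0;
	}
	return s->refcount == 1;
}

typedef void (*model_visit_fn)(struct model_cache *s, void *arg);

/*
 * Walk the list: take the lock only to pick and pin the next entry,
 * and call visit() with the lock dropped. Returns how many entries
 * the walker would have had to destroy.
 */
int model_walk(struct model_cache *head, model_visit_fn visit, void *arg)
{
	struct model_cache *s, *next;
	int destroyed = 0;

	model_lock();
	s = head;
	if (s)
		model_pin(s);
	model_unlock();

	while (s) {
		visit(s, arg);		/* the "BPF program" runs unlocked */

		model_lock();
		next = s->next;
		if (next)
			model_pin(next);
		destroyed += model_unpin(s);
		model_unlock();

		s = next;
	}
	return destroyed;
}

/* Example visitor: counts entries and checks the lock is not held. */
void model_count(struct model_cache *s, void *arg)
{
	(void)s;
	assert(!model_lock_held);
	++*(int *)arg;
}
```

Calling model_walk(head, model_count, &n) on a short list leaves every
refcount where it started, because each pin is matched by an unpin before
the walker moves on. The next entry is pinned before the current one is
released, which mirrors the order used in kmem_cache_iter_seq_next().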