X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 13820195
From: Namhyung Kim
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
Cc: Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
 John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, LKML,
 bpf@vger.kernel.org, Andrew Morton, Christoph Lameter, Pekka Enberg,
 David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
 Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org,
 Arnaldo Carvalho de Melo, Kees Cook
Subject: [PATCH v4 bpf-next 1/3] bpf: Add kmem_cache iterator
Date: Wed, 2 Oct 2024 11:09:54 -0700
Message-ID: <20241002180956.1781008-2-namhyung@kernel.org>
In-Reply-To: <20241002180956.1781008-1-namhyung@kernel.org>
References: <20241002180956.1781008-1-namhyung@kernel.org>

The new "kmem_cache" iterator traverses the list of slab caches and
calls the attached BPF program for each entry.  The program should
check whether its argument (ctx.s) is NULL before using it.

The iteration now holds the slab_mutex only while it traverses the list
and releases the mutex before it runs the BPF program.  The kmem_cache
entry is protected by a refcount during the execution.

It includes the internal "mm/slab.h" header to access kmem_cache,
slab_caches and slab_mutex.  Hope it's ok to mm folks.

Signed-off-by: Namhyung Kim
Acked-by: Vlastimil Babka #mm/slab
---
I've removed the Acked-by's from Roman and Vlastimil since the patch
has changed: it no longer holds the slab_mutex while running the
program and it now manages the refcount.  Please review this change
again!

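To make the NULL check mentioned in the commit message concrete, here is a minimal user-space sketch in plain C. It is not a BPF program and does not use the kernel types: `mock_kmem_cache`, `mock_iter_ctx` and `mock_dump_cache` are hypothetical names invented for this illustration, and the `object_size` field is only an assumption about what a real program might want to print.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mock_kmem_cache {
	const char *name;
	unsigned int object_size;
};

struct mock_iter_ctx {
	struct mock_kmem_cache *s;	/* may be NULL, like ctx.s in the patch */
};

/*
 * Model of an iterator program: format one line for the cache in ctx.
 * Returns the number of characters written, or 0 when there is no entry.
 */
static int mock_dump_cache(const struct mock_iter_ctx *ctx, char *buf, size_t len)
{
	const struct mock_kmem_cache *s;

	assert(ctx != NULL && buf != NULL);

	s = ctx->s;
	if (s == NULL)		/* the check the commit message asks for */
		return 0;

	return snprintf(buf, len, "%s: %u\n", s->name, s->object_size);
}
```

In this model, a context whose `s` is NULL produces no output and touches no cache fields, which is the behaviour a real iterator program would presumably want as well.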
 include/linux/btf_ids.h      |   1 +
 kernel/bpf/Makefile          |   1 +
 kernel/bpf/kmem_cache_iter.c | 174 +++++++++++++++++++++++++++++++++++
 3 files changed, 176 insertions(+)
 create mode 100644 kernel/bpf/kmem_cache_iter.c

diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h
index c0e3e1426a82f5c4..139bdececdcfaefb 100644
--- a/include/linux/btf_ids.h
+++ b/include/linux/btf_ids.h
@@ -283,5 +283,6 @@ extern u32 btf_tracing_ids[];
 extern u32 bpf_cgroup_btf_id[];
 extern u32 bpf_local_storage_map_btf_id[];
 extern u32 btf_bpf_map_id[];
+extern u32 bpf_kmem_cache_btf_id[];
 
 #endif
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 9b9c151b5c826b31..105328f0b9c04e37 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -52,3 +52,4 @@ obj-$(CONFIG_BPF_PRELOAD) += preload/
 obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o
+obj-$(CONFIG_BPF_SYSCALL) += kmem_cache_iter.o
diff --git a/kernel/bpf/kmem_cache_iter.c b/kernel/bpf/kmem_cache_iter.c
new file mode 100644
index 0000000000000000..e103d25175126ab0
--- /dev/null
+++ b/kernel/bpf/kmem_cache_iter.c
@@ -0,0 +1,174 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2024 Google */
+#include
+#include
+#include
+#include
+#include
+
+#include "../../mm/slab.h" /* kmem_cache, slab_caches and slab_mutex */
+
+struct bpf_iter__kmem_cache {
+	__bpf_md_ptr(struct bpf_iter_meta *, meta);
+	__bpf_md_ptr(struct kmem_cache *, s);
+};
+
+static void *kmem_cache_iter_seq_start(struct seq_file *seq, loff_t *pos)
+{
+	loff_t cnt = 0;
+	bool found = false;
+	struct kmem_cache *s;
+
+	mutex_lock(&slab_mutex);
+
+	/*
+	 * Find an entry at the given position in the slab_caches list instead
+	 * of keeping a reference (of the last visited entry, if any) out of
+	 * slab_mutex. It might miss something if one is deleted in the middle
+	 * while it releases the lock. But it should be rare and there's not
+	 * much we can do about it.
+	 */
+	list_for_each_entry(s, &slab_caches, list) {
+		if (cnt == *pos) {
+			/*
+			 * Make sure this entry remains in the list by getting
+			 * a new reference count. Note that boot_cache entries
+			 * have a negative refcount, so don't touch them.
+			 */
+			if (s->refcount > 0)
+				s->refcount++;
+			found = true;
+			break;
+		}
+		cnt++;
+	}
+	mutex_unlock(&slab_mutex);
+
+	if (!found)
+		return NULL;
+
+	++*pos;
+	return s;
+}
+
+static void kmem_cache_iter_seq_stop(struct seq_file *seq, void *v)
+{
+	struct bpf_iter_meta meta;
+	struct bpf_iter__kmem_cache ctx = {
+		.meta = &meta,
+		.s = v,
+	};
+	struct bpf_prog *prog;
+	bool destroy = false;
+
+	meta.seq = seq;
+	prog = bpf_iter_get_info(&meta, true);
+	if (prog)
+		bpf_iter_run_prog(prog, &ctx);
+
+	if (ctx.s == NULL)
+		return;
+
+	mutex_lock(&slab_mutex);
+
+	/* Skip kmem_cache_destroy() for active entries */
+	if (ctx.s->refcount > 1)
+		ctx.s->refcount--;
+	else if (ctx.s->refcount == 1)
+		destroy = true;
+
+	mutex_unlock(&slab_mutex);
+
+	if (destroy)
+		kmem_cache_destroy(ctx.s);
+}
+
+static void *kmem_cache_iter_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct kmem_cache *s = v;
+	struct kmem_cache *next = NULL;
+	bool destroy = false;
+
+	++*pos;
+
+	mutex_lock(&slab_mutex);
+
+	if (list_last_entry(&slab_caches, struct kmem_cache, list) != s) {
+		next = list_next_entry(s, list);
+		if (next->refcount > 0)
+			next->refcount++;
+	}
+
+	/* Skip kmem_cache_destroy() for active entries */
+	if (s->refcount > 1)
+		s->refcount--;
+	else if (s->refcount == 1)
+		destroy = true;
+
+	mutex_unlock(&slab_mutex);
+
+	if (destroy)
+		kmem_cache_destroy(s);
+
+	return next;
+}
+
+static int kmem_cache_iter_seq_show(struct seq_file *seq, void *v)
+{
+	struct bpf_iter_meta meta;
+	struct bpf_iter__kmem_cache ctx = {
+		.meta = &meta,
+		.s = v,
+	};
+	struct bpf_prog *prog;
+	int ret = 0;
+
+	meta.seq = seq;
+	prog = bpf_iter_get_info(&meta, false);
+	if (prog)
+		ret = bpf_iter_run_prog(prog, &ctx);
+
+	return ret;
+}
+
+static const struct seq_operations kmem_cache_iter_seq_ops = {
+	.start = kmem_cache_iter_seq_start,
+	.next = kmem_cache_iter_seq_next,
+	.stop = kmem_cache_iter_seq_stop,
+	.show = kmem_cache_iter_seq_show,
+};
+
+BTF_ID_LIST_GLOBAL_SINGLE(bpf_kmem_cache_btf_id, struct, kmem_cache)
+
+static const struct bpf_iter_seq_info kmem_cache_iter_seq_info = {
+	.seq_ops = &kmem_cache_iter_seq_ops,
+};
+
+static void bpf_iter_kmem_cache_show_fdinfo(const struct bpf_iter_aux_info *aux,
+					    struct seq_file *seq)
+{
+	seq_puts(seq, "kmem_cache iter\n");
+}
+
+DEFINE_BPF_ITER_FUNC(kmem_cache, struct bpf_iter_meta *meta,
+		     struct kmem_cache *s)
+
+static struct bpf_iter_reg bpf_kmem_cache_reg_info = {
+	.target = "kmem_cache",
+	.feature = BPF_ITER_RESCHED,
+	.show_fdinfo = bpf_iter_kmem_cache_show_fdinfo,
+	.ctx_arg_info_size = 1,
+	.ctx_arg_info = {
+		{ offsetof(struct bpf_iter__kmem_cache, s),
+		  PTR_TO_BTF_ID_OR_NULL | PTR_TRUSTED },
+	},
+	.seq_info = &kmem_cache_iter_seq_info,
+};
+
+static int __init bpf_kmem_cache_iter_init(void)
+{
+	bpf_kmem_cache_reg_info.ctx_arg_info[0].btf_id = bpf_kmem_cache_btf_id[0];
+	return bpf_iter_reg_target(&bpf_kmem_cache_reg_info);
+}
+
+late_initcall(bpf_kmem_cache_iter_init);
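
For readers who want to reason about the locking in the start/next/stop callbacks above, the following is a rough user-space model of the same idea: hold the list lock only while moving between entries, pin the current entry with a reference, and run the callback with the lock dropped. It is a sketch under assumptions, not kernel code. `mock_cache`, `mock_pin`, `mock_unpin`, `mock_walk` and `mock_mark_visited` are hypothetical names, a pthread mutex stands in for slab_mutex, and a singly linked list stands in for slab_caches.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kmem_cache; only what the model needs. */
struct mock_cache {
	struct mock_cache *next;
	int refcount;		/* <= 0 models boot caches that are never pinned */
	int visited;
};

static pthread_mutex_t mock_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take a reference; the caller holds the lock. Non-positive counts are skipped. */
static void mock_pin(struct mock_cache *c)
{
	if (c && c->refcount > 0)
		c->refcount++;
}

/* Drop our reference; returns 1 if only the last reference was left. */
static int mock_unpin(struct mock_cache *c)
{
	assert(c != NULL);
	if (c->refcount > 1) {
		c->refcount--;
		return 0;
	}
	return c->refcount == 1;
}

/*
 * Walk the list, holding the lock only while moving between entries.
 * The callback runs unlocked on a pinned entry. Returns how many entries
 * were left with only the last reference (the patch destroys those).
 */
static int mock_walk(struct mock_cache *head, void (*cb)(struct mock_cache *))
{
	struct mock_cache *cur, *next;
	int last_refs = 0;

	pthread_mutex_lock(&mock_list_lock);
	cur = head;
	mock_pin(cur);
	pthread_mutex_unlock(&mock_list_lock);

	while (cur) {
		cb(cur);			/* lock is not held here */

		pthread_mutex_lock(&mock_list_lock);
		next = cur->next;
		mock_pin(next);			/* pin the successor first */
		last_refs += mock_unpin(cur);
		pthread_mutex_unlock(&mock_list_lock);

		cur = next;
	}
	return last_refs;
}

/* Example callback: record that the entry was seen. */
static void mock_mark_visited(struct mock_cache *c)
{
	c->visited++;
}
```

Calling `mock_walk(&a, mock_mark_visited)` on a list whose entries start with refcounts 1, -1 and 2 visits each entry once and leaves all three refcounts at their starting values, because every pin is matched by an unpin and the negative count is never touched.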