From patchwork Fri Aug 9 07:33:06 2024
From: Kees Cook <kees@kernel.org>
To: Vlastimil Babka
Cc: Kees Cook <kees@kernel.org>, Suren Baghdasaryan, Kent Overstreet,
 Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
 Andrew Morton, Roman Gushchin, Hyeonggon Yoo <42.hyeyoo@gmail.com>,
 linux-mm@kvack.org, "GONG, Ruiqi", Jann Horn, Matteo Rizzo, jvoisin,
 Xiu Jianfeng, linux-kernel@vger.kernel.org,
 linux-hardening@vger.kernel.org
Subject: [PATCH 5/5] slab: Allocate and use per-call-site caches
Date: Fri, 9 Aug 2024 00:33:06 -0700
Message-Id: <20240809073309.2134488-5-kees@kernel.org>
In-Reply-To: <20240809072532.work.266-kees@kernel.org>
References: <20240809072532.work.266-kees@kernel.org>
MIME-Version: 1.0

Use separate per-call-site kmem_cache or kmem_buckets. These are
allocated on demand to avoid wasting memory for unused caches.
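
In outline, each tagged call site lazily creates its cache on first use
and publishes it with a compare-and-swap, destroying its copy if it
loses the race. A minimal userspace model of that publish-or-destroy
pattern (C11 atomics stand in for the kernel's try_cmpxchg(); all names
below are illustrative, not kernel API):

  /* Model: racing initializers may both create a cache, but only one
   * pointer is ever published; the loser destroys its copy. */
  #include <stdatomic.h>
  #include <stdio.h>
  #include <stdlib.h>

  static _Atomic(void *) site_cache;

  /* Stand-ins for kmem_cache_create_usercopy()/kmem_cache_destroy(). */
  static void *make_cache(void) { return malloc(64); }
  static void drop_cache(void *p) { free(p); }

  static void *get_site_cache(void)
  {
          void *old = atomic_load(&site_cache);

          if (old)        /* Already allocated? Use it. */
                  return old;

          void *p = make_cache();
          if (!p)         /* Caller falls back to the default buckets. */
                  return NULL;

          /* Publish; if another thread won the race, clean up our copy. */
          if (!atomic_compare_exchange_strong(&site_cache, &old, p)) {
                  drop_cache(p);
                  return old;
          }
          return p;
  }

  int main(void)
  {
          printf("cache at %p\n", get_site_cache());
          return 0;
  }
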
A few caches need to be allocated very early to support allocating the
caches themselves: kstrdup(), kvasprintf(), and pcpu_mem_zalloc(). Any
GFP_ATOMIC allocations are currently left to be allocated from
KMALLOC_NORMAL.

With a distro config, /proc/slabinfo grows from ~400 entries to ~2200.

Since this feature (CONFIG_SLAB_PER_SITE) is redundant with
CONFIG_RANDOM_KMALLOC_CACHES, mark them as incompatible. Add Kconfig
help text that compares the features.

Improvements needed:
- Retain call site gfp flags in alloc_tag meta field to:
  - pre-allocate all GFP_ATOMIC caches (since their caches cannot be
    allocated on demand unless we want them to be GFP_ATOMIC
    themselves...)
  - Separate MEMCG allocations as well
- Allocate individual caches within kmem_buckets on demand to further
  reduce memory usage overhead.

Signed-off-by: Kees Cook <kees@kernel.org>
---
Cc: Suren Baghdasaryan
Cc: Kent Overstreet
Cc: Vlastimil Babka
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Cc: Andrew Morton
Cc: Roman Gushchin
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: linux-mm@kvack.org
---
 include/linux/alloc_tag.h |   8 +++
 lib/alloc_tag.c           | 121 +++++++++++++++++++++++++++++++++++---
 mm/Kconfig                |  19 +++++-
 mm/slab_common.c          |   1 +
 mm/slub.c                 |  31 +++++++++-
 5 files changed, 170 insertions(+), 10 deletions(-)

diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h
index f5d8c5849b82..c95628f9b049 100644
--- a/include/linux/alloc_tag.h
+++ b/include/linux/alloc_tag.h
@@ -24,6 +24,7 @@ struct alloc_tag_counters {
 struct alloc_meta {
 	/* 0 means non-slab, SIZE_MAX means dynamic, and everything else is fixed-size. */
 	size_t sized;
+	void *cache;
 };
 #define ALLOC_META_INIT(_size)	{				\
 	.sized = (__builtin_constant_p(_size) ? (_size) : SIZE_MAX),	\
@@ -216,6 +217,13 @@ static inline void alloc_tag_sub(union codetag_ref *ref, size_t bytes) {}
 
 #endif /* CONFIG_MEM_ALLOC_PROFILING */
 
+#ifdef CONFIG_SLAB_PER_SITE
+void alloc_tag_early_walk(void);
+void alloc_tag_site_init(struct codetag *ct, bool ondemand);
+#else
+static inline void alloc_tag_early_walk(void) {}
+#endif
+
 #define alloc_hooks_tag(_tag, _do_alloc)				\
 ({									\
 	struct alloc_tag * __maybe_unused _old = alloc_tag_save(_tag);	\
diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index 6d2cb72bf269..e8a66a7c4a6b 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -157,6 +157,89 @@ static void __init procfs_init(void)
 	proc_create_seq("allocinfo", 0400, NULL, &allocinfo_seq_op);
 }
 
+#ifdef CONFIG_SLAB_PER_SITE
+static bool ondemand_ready;
+
+void alloc_tag_site_init(struct codetag *ct, bool ondemand)
+{
+	struct alloc_tag *tag = ct_to_alloc_tag(ct);
+	char *name;
+	void *p, *old;
+
+	/* Only handle kmalloc allocations. */
+	if (!tag->meta.sized)
+		return;
+
+	/* Must be ready for on-demand allocations. */
+	if (ondemand && !ondemand_ready)
+		return;
+
+	old = READ_ONCE(tag->meta.cache);
+	/* Already allocated? */
+	if (old)
+		return;
+
+	if (tag->meta.sized < SIZE_MAX) {
+		/* Fixed-size allocations. */
+		name = kasprintf(GFP_KERNEL, "f:%zu:%s:%d", tag->meta.sized, ct->function, ct->lineno);
+		if (WARN_ON_ONCE(!name))
+			return;
+		/*
+		 * As with KMALLOC_NORMAL, the entire allocation needs to be
+		 * open to usercopy access. :(
+		 */
+		p = kmem_cache_create_usercopy(name, tag->meta.sized, 0,
+					       SLAB_NO_MERGE, 0, tag->meta.sized,
+					       NULL);
+	} else {
+		/* Dynamically-sized allocations. */
+		name = kasprintf(GFP_KERNEL, "d:%s:%d", ct->function, ct->lineno);
+		if (WARN_ON_ONCE(!name))
+			return;
+		p = kmem_buckets_create(name, SLAB_NO_MERGE, 0, UINT_MAX, NULL);
+	}
+	if (p) {
+		if (unlikely(!try_cmpxchg(&tag->meta.cache, &old, p))) {
+			/* We lost the allocation race; clean up. */
+			if (tag->meta.sized < SIZE_MAX)
+				kmem_cache_destroy(p);
+			else
+				kmem_buckets_destroy(p);
+		}
+	}
+	kfree(name);
+}
+
+static void alloc_tag_site_init_early(struct codetag *ct)
+{
+	/* Explicitly initialize the caches needed to initialize caches. */
+	if (strcmp(ct->function, "kstrdup") == 0 ||
+	    strcmp(ct->function, "kvasprintf") == 0 ||
+	    strcmp(ct->function, "pcpu_mem_zalloc") == 0)
+		alloc_tag_site_init(ct, false);
+
+	/* TODO: pre-allocate GFP_ATOMIC caches here. */
+}
+#endif
+
+static void alloc_tag_module_load(struct codetag_type *cttype,
+				  struct codetag_module *cmod)
+{
+#ifdef CONFIG_SLAB_PER_SITE
+	struct codetag_iterator iter;
+	struct codetag *ct;
+
+	iter = codetag_get_ct_iter(cttype);
+	for (ct = codetag_next_ct(&iter); ct; ct = codetag_next_ct(&iter)) {
+		if (iter.cmod != cmod)
+			continue;
+
+		/* TODO: pre-allocate GFP_ATOMIC caches here. */
+		//alloc_tag_site_init(ct, false);
+	}
+#endif
+}
+
 static bool alloc_tag_module_unload(struct codetag_type *cttype,
 				    struct codetag_module *cmod)
 {
@@ -175,8 +258,21 @@ static bool alloc_tag_module_unload(struct codetag_type *cttype,
 
 		if (WARN(counter.bytes,
 			 "%s:%u module %s func:%s has %llu allocated at module unload",
-			 ct->filename, ct->lineno, ct->modname, ct->function, counter.bytes))
+			 ct->filename, ct->lineno, ct->modname, ct->function, counter.bytes)) {
 			module_unused = false;
+		}
+#ifdef CONFIG_SLAB_PER_SITE
+		else if (tag->meta.sized) {
+			/* Remove the allocated caches, if possible. */
+			void *p = READ_ONCE(tag->meta.cache);
+
+			WRITE_ONCE(tag->meta.cache, NULL);
+			if (tag->meta.sized < SIZE_MAX)
+				kmem_cache_destroy(p);
+			else
+				kmem_buckets_destroy(p);
+		}
+#endif
 	}
 
 	return module_unused;
@@ -260,15 +356,16 @@ static void __init sysctl_init(void)
 static inline void sysctl_init(void) {}
 #endif /* CONFIG_SYSCTL */
 
+static const struct codetag_type_desc alloc_tag_desc = {
+	.section	= "alloc_tags",
+	.tag_size	= sizeof(struct alloc_tag),
+	.module_load	= alloc_tag_module_load,
+	.module_unload	= alloc_tag_module_unload,
+};
+
 static int __init alloc_tag_init(void)
 {
-	const struct codetag_type_desc desc = {
-		.section	= "alloc_tags",
-		.tag_size	= sizeof(struct alloc_tag),
-		.module_unload	= alloc_tag_module_unload,
-	};
-
-	alloc_tag_cttype = codetag_register_type(&desc);
+	alloc_tag_cttype = codetag_register_type(&alloc_tag_desc);
 	if (IS_ERR(alloc_tag_cttype))
 		return PTR_ERR(alloc_tag_cttype);
 
@@ -278,3 +375,11 @@ static int __init alloc_tag_init(void)
 	return 0;
 }
 module_init(alloc_tag_init);
+
+#ifdef CONFIG_SLAB_PER_SITE
+void alloc_tag_early_walk(void)
+{
+	codetag_early_walk(&alloc_tag_desc, alloc_tag_site_init_early);
+	ondemand_ready = true;
+}
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index 855c63c3270d..4f01cb6dd32e 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -302,7 +302,20 @@ config SLAB_PER_SITE
 	default SLAB_FREELIST_HARDENED
 	select SLAB_BUCKETS
 	help
-	  Track sizes of kmalloc() call sites.
+	  As a defense against shared-cache "type confusion" use-after-free
+	  attacks, every kmalloc()-family call allocates from a separate
+	  kmem_cache (or when dynamically sized, kmem_buckets). Attackers
+	  will no longer be able to groom malicious objects via similarly
+	  sized allocations that share the same cache as the target object.
+
+	  This increases the "at rest" kmalloc slab memory usage by
+	  roughly 5x (around 7MiB), and adds the potential for greater
+	  long-term memory fragmentation. However, some workloads
+	  actually see performance improvements when single allocation
+	  sites are hot.
+
+	  For a similar defense, see CONFIG_RANDOM_KMALLOC_CACHES, which
+	  has less memory usage overhead, but is probabilistic.
 
 config SLUB_STATS
 	default n
@@ -331,6 +344,7 @@ config SLUB_CPU_PARTIAL
 config RANDOM_KMALLOC_CACHES
 	default n
 	depends on !SLUB_TINY
+	depends on !SLAB_PER_SITE
 	bool "Randomize slab caches for normal kmalloc"
 	help
 	  A hardening feature that creates multiple copies of slab caches for
@@ -345,6 +359,9 @@ config RANDOM_KMALLOC_CACHES
 	  limited degree of memory and CPU overhead that relates to hardware
 	  and system workload.
 
+	  For a similar defense, see CONFIG_SLAB_PER_SITE, which is
+	  deterministic, but has greater memory usage overhead.
+
 endmenu # Slab allocator options
 
 config SHUFFLE_PAGE_ALLOCATOR
diff --git a/mm/slab_common.c b/mm/slab_common.c
index fc698cba0ebe..09506bfa972c 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1040,6 +1040,7 @@ void __init create_kmalloc_caches(void)
 	kmem_buckets_cache = kmem_cache_create("kmalloc_buckets",
 					       sizeof(kmem_buckets),
 					       0, SLAB_NO_MERGE, NULL);
+	alloc_tag_early_walk();
 }
 
 /**
diff --git a/mm/slub.c b/mm/slub.c
index 3520acaf9afa..d14102c4b4d7 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4135,6 +4135,35 @@ void *__kmalloc_large_node_noprof(size_t size, gfp_t flags, int node)
 }
 EXPORT_SYMBOL(__kmalloc_large_node_noprof);
 
+static __always_inline
+struct kmem_cache *choose_slab(size_t size, kmem_buckets *b, gfp_t flags,
+			       unsigned long caller)
+{
+#ifdef CONFIG_SLAB_PER_SITE
+	struct alloc_tag *tag = current->alloc_tag;
+
+	if (!b && tag && tag->meta.sized &&
+	    kmalloc_type(flags, caller) == KMALLOC_NORMAL &&
+	    (flags & GFP_ATOMIC) != GFP_ATOMIC) {
+		void *p = READ_ONCE(tag->meta.cache);
+
+		if (!p && slab_state >= UP) {
+			alloc_tag_site_init(&tag->ct, true);
+			p = READ_ONCE(tag->meta.cache);
+		}
+
+		if (tag->meta.sized < SIZE_MAX) {
+			if (p)
+				return p;
+			/* Otherwise continue with default buckets. */
+		} else {
+			b = p;
+		}
+	}
+#endif
+	return kmalloc_slab(size, b, flags, caller);
+}
+
 static __always_inline
 void *__do_kmalloc_node(size_t size, kmem_buckets *b, gfp_t flags, int node,
 			unsigned long caller)
@@ -4152,7 +4181,7 @@ void *__do_kmalloc_node(size_t size, kmem_buckets *b, gfp_t flags, int node,
 	if (unlikely(!size))
 		return ZERO_SIZE_PTR;
 
-	s = kmalloc_slab(size, b, flags, caller);
+	s = choose_slab(size, b, flags, caller);
 
 	ret = slab_alloc_node(s, NULL, flags, node, caller, size);
 	ret = kasan_kmalloc(s, ret, size, flags);
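
Not part of the patch, but for anyone wanting to eyeball the result:
the per-site caches are named "f:<size>:<function>:<line>" (fixed-size)
and "d:<function>:<line>" (buckets) by the kasprintf() calls above, so
a quick count on a running system could look like this sketch (the
exact bucket naming in slabinfo is an assumption):

  /* count_site_caches.c: tally per-call-site caches in /proc/slabinfo. */
  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
          FILE *f = fopen("/proc/slabinfo", "r");
          char line[512];
          unsigned long fixed = 0, dyn = 0;

          if (!f) {
                  perror("/proc/slabinfo");
                  return 1;
          }
          while (fgets(line, sizeof(line), f)) {
                  if (!strncmp(line, "f:", 2))
                          fixed++;        /* fixed-size site caches */
                  else if (!strncmp(line, "d:", 2))
                          dyn++;          /* dynamically-sized site buckets */
          }
          fclose(f);
          printf("f: caches=%lu  d: caches=%lu\n", fixed, dyn);
          return 0;
  }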