From patchwork Mon Feb 3 10:28:04 2025
X-Patchwork-Submitter: Kevin Brodsky
X-Patchwork-Id: 13957242
From: Kevin Brodsky
To: linux-hardening@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Kevin Brodsky, Andrew Morton,
 Mark Brown, Catalin Marinas, Dave Hansen, David Howells,
 "Eric W. Biederman", Jann Horn, Jeff Xu, Joey Gouly, Kees Cook,
 Linus Walleij, Andy Lutomirski, Marc Zyngier, Peter Zijlstra,
 Pierre Langlois, Quentin Perret, "Mike Rapoport (IBM)", Ryan Roberts,
 Thomas Gleixner, Will Deacon, Matthew Wilcox, Qi Zheng,
 linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
 x86@kernel.org
Subject: [RFC PATCH 3/8] slab: Introduce SLAB_SET_PKEY
Date: Mon, 3 Feb 2025 10:28:04 +0000
Message-ID: <20250203102809.1223255-4-kevin.brodsky@arm.com>
In-Reply-To: <20250203102809.1223255-1-kevin.brodsky@arm.com>
References: <20250203102809.1223255-1-kevin.brodsky@arm.com>

Introduce the SLAB_SET_PKEY flag to request a kmem_cache whose slabs
are mapped with a non-default pkey, if kernel pkeys (kpkeys) are
supported. The pkey to be used is specified via a new pkey field in
struct kmem_cache_args.

The setting/resetting of the pkey is done directly at the slab level
(allocate_slab/__free_slab) to avoid having to propagate the pkey
value down to the page level.

Memory mapped with a non-default pkey cannot be written to at the
default kpkeys level. This is handled by switching to the
unrestricted kpkeys level (granting write access to all pkeys) when
writing to a slab with SLAB_SET_PKEY.

The merging of slabs with SLAB_SET_PKEY is conservatively prevented,
though it should be possible to merge slabs with the same configured
pkey.

Signed-off-by: Kevin Brodsky
---
 include/linux/slab.h | 21 ++++++++++++++++
 mm/slab.h            |  7 +++++-
 mm/slab_common.c     |  2 +-
 mm/slub.c            | 58 +++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 85 insertions(+), 3 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 09eedaecf120..cc2e757b16ec 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -58,6 +58,9 @@ enum _slab_flag_bits {
 	_SLAB_CMPXCHG_DOUBLE,
 #ifdef CONFIG_SLAB_OBJ_EXT
 	_SLAB_NO_OBJ_EXT,
+#endif
+#ifdef CONFIG_ARCH_HAS_KPKEYS
+	_SLAB_SET_PKEY,
 #endif
 	_SLAB_FLAGS_LAST_BIT
 };
@@ -234,6 +237,12 @@ enum _slab_flag_bits {
 #define SLAB_NO_OBJ_EXT		__SLAB_FLAG_UNUSED
 #endif
 
+#ifdef CONFIG_ARCH_HAS_KPKEYS
+#define SLAB_SET_PKEY		__SLAB_FLAG_BIT(_SLAB_SET_PKEY)
+#else
+#define SLAB_SET_PKEY		__SLAB_FLAG_UNUSED
+#endif
+
 /*
  * freeptr_t represents a SLUB freelist pointer, which might be encoded
  * and not dereferenceable if CONFIG_SLAB_FREELIST_HARDENED is enabled.
@@ -331,6 +340,18 @@ struct kmem_cache_args {
 	 * %NULL means no constructor.
 	 */
 	void (*ctor)(void *);
+	/**
+	 * @pkey: The pkey to map the allocated pages with.
+	 *
+	 * If the SLAB flags include SLAB_SET_PKEY, and if kernel pkeys are
+	 * supported, objects are allocated in pages mapped with the protection
+	 * key specified by @pkey. Otherwise, this field is ignored.
+	 *
+	 * Note that if @pkey is a non-default pkey, some overhead is incurred
+	 * when internal slab functions switch the pkey register to write to the
+	 * slab (e.g. setting a free pointer).
+	 */
+	int pkey;
 };
 
 struct kmem_cache *__kmem_cache_create_args(const char *name,
diff --git a/mm/slab.h b/mm/slab.h
index 1a081f50f947..d5cf5927634a 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -311,6 +311,10 @@ struct kmem_cache {
 	unsigned int usersize;		/* Usercopy region size */
 #endif
 
+#ifdef CONFIG_ARCH_HAS_KPKEYS
+	int pkey;
+#endif
+
 	struct kmem_cache_node *node[MAX_NUMNODES];
 };
 
@@ -462,7 +466,8 @@ static inline bool is_kmalloc_normal(struct kmem_cache *s)
 			      SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS | \
 			      SLAB_NOLEAKTRACE | SLAB_RECLAIM_ACCOUNT | \
 			      SLAB_TEMPORARY | SLAB_ACCOUNT | \
-			      SLAB_NO_USER_FLAGS | SLAB_KMALLOC | SLAB_NO_MERGE)
+			      SLAB_NO_USER_FLAGS | SLAB_KMALLOC | SLAB_NO_MERGE | \
+			      SLAB_SET_PKEY)
 
 #define SLAB_DEBUG_FLAGS (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER | \
 			  SLAB_TRACE | SLAB_CONSISTENCY_CHECKS)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 69f9afd85f9f..21323d2a108e 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -47,7 +47,7 @@ struct kmem_cache *kmem_cache;
  */
 #define SLAB_NEVER_MERGE (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER | \
 		SLAB_TRACE | SLAB_TYPESAFE_BY_RCU | SLAB_NOLEAKTRACE | \
-		SLAB_FAILSLAB | SLAB_NO_MERGE)
+		SLAB_FAILSLAB | SLAB_NO_MERGE | SLAB_SET_PKEY)
 
 #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
 			 SLAB_CACHE_DMA32 | SLAB_ACCOUNT)
diff --git a/mm/slub.c b/mm/slub.c
index 1f50129dcfb3..75b543e255d9 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -42,6 +42,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -459,6 +460,15 @@ static nodemask_t slab_nodes;
 static struct workqueue_struct *flushwq;
 #endif
 
+#ifdef CONFIG_ARCH_HAS_KPKEYS
+KPKEYS_GUARD_COND(kpkeys_slab_write,
+		  KPKEYS_LVL_UNRESTRICTED,
+		  unlikely(s->flags & SLAB_SET_PKEY),
+		  struct kmem_cache *s)
+#else
+KPKEYS_GUARD_NOOP(kpkeys_slab_write, struct kmem_cache *s)
+#endif
+
 /********************************************************************
  * 			Core slab cache functions
  *******************************************************************/
@@ -545,6 +555,8 @@ static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
 	BUG_ON(object == fp); /* naive detection of double free or corruption */
 #endif
 
+	guard(kpkeys_slab_write)(s);
+
 	freeptr_addr = (unsigned long)kasan_reset_tag((void *)freeptr_addr);
 	*(freeptr_t *)freeptr_addr = freelist_ptr_encode(s, fp, freeptr_addr);
 }
@@ -765,6 +777,8 @@ static inline void set_orig_size(struct kmem_cache *s,
 	p += get_info_end(s);
 	p += sizeof(struct track) * 2;
 
+	guard(kpkeys_slab_write)(s);
+
 	*(unsigned int *)p = orig_size;
 }
 
@@ -949,6 +963,8 @@ static void set_track_update(struct kmem_cache *s, void *object,
 {
 	struct track *p = get_track(s, object, alloc);
 
+	guard(kpkeys_slab_write)(s);
+
 #ifdef CONFIG_STACKDEPOT
 	p->handle = handle;
 #endif
@@ -973,6 +989,8 @@ static void init_tracking(struct kmem_cache *s, void *object)
 	if (!(s->flags & SLAB_STORE_USER))
 		return;
 
+	guard(kpkeys_slab_write)(s);
+
 	p = get_track(s, object, TRACK_ALLOC);
 	memset(p, 0, 2*sizeof(struct track));
 }
@@ -1137,6 +1155,8 @@ static void init_object(struct kmem_cache *s, void *object, u8 val)
 	u8 *p = kasan_reset_tag(object);
 	unsigned int poison_size = s->object_size;
 
+	guard(kpkeys_slab_write)(s);
+
 	if (s->flags & SLAB_RED_ZONE) {
 		/*
 		 * Here and below, avoid overwriting the KMSAN shadow. Keeping
@@ -2335,6 +2355,8 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init,
 		int rsize;
 		unsigned int inuse, orig_size;
 
+		guard(kpkeys_slab_write)(s);
+
 		inuse = get_info_end(s);
 		orig_size = get_orig_size(s, x);
 		if (!kasan_has_integrated_init())
@@ -2563,6 +2585,8 @@ static __always_inline void unaccount_slab(struct slab *slab, int order,
 			    -(PAGE_SIZE << order));
 }
 
+static void __free_slab(struct kmem_cache *s, struct slab *slab);
+
 static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
 	struct slab *slab;
@@ -2612,6 +2636,18 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 
 	setup_slab_debug(s, slab, start);
 
+#ifdef CONFIG_ARCH_HAS_KPKEYS
+	if (unlikely(s->flags & SLAB_SET_PKEY)) {
+		int ret = set_memory_pkey((unsigned long)start,
+					  1 << oo_order(oo), s->pkey);
+
+		if (WARN_ON(ret)) {
+			__free_slab(s, slab);
+			return NULL;
+		}
+	}
+#endif
+
 	shuffle = shuffle_freelist(s, slab);
 
 	if (!shuffle) {
@@ -2652,6 +2688,11 @@ static void __free_slab(struct kmem_cache *s, struct slab *slab)
 	__folio_clear_slab(folio);
 	mm_account_reclaimed_pages(pages);
 	unaccount_slab(slab, order, s);
+#ifdef CONFIG_ARCH_HAS_KPKEYS
+	if (unlikely(s->flags & SLAB_SET_PKEY))
+		WARN_ON(set_memory_pkey((unsigned long)folio_address(folio),
+					pages, 0));
+#endif
 	free_frozen_pages(&folio->page, order);
 }
 
@@ -4053,9 +4094,11 @@ static __always_inline void maybe_wipe_obj_freeptr(struct kmem_cache *s,
 						   void *obj)
 {
 	if (unlikely(slab_want_init_on_free(s)) && obj &&
-	    !freeptr_outside_object(s))
+	    !freeptr_outside_object(s)) {
+		guard(kpkeys_slab_write)(s);
 		memset((void *)((char *)kasan_reset_tag(obj) + s->offset),
 			0, sizeof(void *));
+	}
 }
 
 static __fastpath_inline
@@ -4798,6 +4841,7 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
 	/* Zero out spare memory. */
 	if (want_init_on_alloc(flags)) {
 		kasan_disable_current();
+		guard(kpkeys_slab_write)(s);
 		if (orig_size && orig_size < new_size)
 			memset(kasan_reset_tag(p) + orig_size, 0, new_size - orig_size);
 		else
@@ -4807,6 +4851,7 @@ __do_krealloc(const void *p, size_t new_size, gfp_t flags)
 
 	/* Setup kmalloc redzone when needed */
 	if (s && slub_debug_orig_size(s)) {
+		guard(kpkeys_slab_write)(s);
 		set_orig_size(s, (void *)p, new_size);
 		if (s->flags & SLAB_RED_ZONE && new_size < ks)
 			memset_no_sanitize_memory(kasan_reset_tag(p) + new_size,
@@ -6162,6 +6207,17 @@ int do_kmem_cache_create(struct kmem_cache *s, const char *name,
 	s->useroffset = args->useroffset;
 	s->usersize = args->usersize;
 #endif
+#ifdef CONFIG_ARCH_HAS_KPKEYS
+	s->pkey = args->pkey;
+
+	if (s->flags & SLAB_SET_PKEY) {
+		if (s->pkey >= arch_max_pkey())
+			goto out;
+
+		if (!arch_kpkeys_enabled() || s->pkey == KPKEYS_PKEY_DEFAULT)
+			s->flags &= ~SLAB_SET_PKEY;
+	}
+#endif
 
 	if (!calculate_sizes(args, s))
 		goto out;
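
Note on the guard pattern used in the mm/slub.c hunks: the
kpkeys_slab_write guard can be modelled in user space to see what the
scoped switch amounts to. This is only a sketch, not the kernel
implementation. KPKEYS_GUARD_COND is not defined in this patch, so the
sketch assumes it behaves like the kernel's other guard() helpers, which
are built on the compiler's cleanup attribute. Every name below
(current_level, struct cache, SLAB_WRITE_GUARD, write_freepointer) is a
hypothetical stand-in, and the pkey register is modelled as a plain
global variable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kpkeys levels named in the patch. */
enum { LVL_DEFAULT = 0, LVL_UNRESTRICTED = 1 };

/* Models the pkey register; in the kernel this is architecture state. */
static int current_level = LVL_DEFAULT;

/* Stand-in for struct kmem_cache: only the bit the guard tests. */
struct cache {
	bool set_pkey;	/* models s->flags & SLAB_SET_PKEY */
};

/* State saved by one guard instance. */
struct slab_write_guard {
	int saved_level;
	bool active;
};

/* Enter: switch level only if the cache asked for a pkey. */
static struct slab_write_guard slab_write_enter(const struct cache *s)
{
	struct slab_write_guard g = {
		.saved_level = current_level,
		.active = s->set_pkey,
	};

	if (g.active)
		current_level = LVL_UNRESTRICTED;
	return g;
}

/* Exit: restore the saved level; called automatically at scope exit. */
static void slab_write_exit(struct slab_write_guard *g)
{
	if (g->active)
		current_level = g->saved_level;
}

/* Assumed shape of a conditional guard: a variable with a cleanup hook. */
#define SLAB_WRITE_GUARD(s)						\
	struct slab_write_guard guard_					\
		__attribute__((cleanup(slab_write_exit), unused)) =	\
		slab_write_enter(s)

/* Models a slab-internal write; returns the level seen while writing. */
static int write_freepointer(const struct cache *s)
{
	SLAB_WRITE_GUARD(s);

	/* The return value is read before the cleanup hook runs. */
	return current_level;
}
```

In this model write_freepointer() observes LVL_UNRESTRICTED only for a
cache with set_pkey set, and current_level is back to its previous value
as soon as the function returns, on every exit path. Caches without the
flag pay only for the test of set_pkey, which is what the unlikely()
condition in the patch is meant to keep cheap.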