From patchwork Tue Aug 23 17:03:56 2022
From: Vlastimil Babka <vbabka@suse.cz>
To: Rongwei Wang, Christoph Lameter, Joonsoo Kim, David Rientjes,
    Pekka Enberg
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Roman Gushchin, linux-mm@kvack.org,
    Sebastian Andrzej Siewior, Thomas Gleixner, Mike Galbraith,
    Vlastimil Babka
Subject: [PATCH v2 1/5] mm/slub: move free_debug_processing() further
Date: Tue, 23 Aug 2022 19:03:56 +0200
Message-Id: <20220823170400.26546-2-vbabka@suse.cz>
In-Reply-To: <20220823170400.26546-1-vbabka@suse.cz>
References: <20220823170400.26546-1-vbabka@suse.cz>

In the following patch, the function free_debug_processing() will be
calling add_partial(), remove_partial() and discard_slab(), so move it
below their definitions to avoid forward declarations. To make review
easier, this patch separates the move from any functional changes.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Acked-by: David Rientjes
---
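To illustrate the motivation with a minimal standalone sketch (hypothetical
names, not taken from mm/slub.c): C requires a declaration to be visible
before a call site, so keeping free_debug_processing() above its new callees
would force forward declarations like the one below; moving the definition
past the callees makes such declarations unnecessary.

#include <stdio.h>

static void helper(void);	/* forward declaration, needed only because
				 * caller() precedes helper()'s definition */

static void caller(void)
{
	helper();
}

/* Defining helper() above caller() would make the declaration redundant. */
static void helper(void)
{
	puts("helper");
}

int main(void)
{
	caller();
	return 0;
}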
 mm/slub.c | 114 +++++++++++++++++++++++++++---------------------------
 1 file changed, 57 insertions(+), 57 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 862dbd9af4f5..87e794ab101a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1385,63 +1385,6 @@ static inline int free_consistency_checks(struct kmem_cache *s,
 	return 1;
 }
 
-/* Supports checking bulk free of a constructed freelist */
-static noinline int free_debug_processing(
-	struct kmem_cache *s, struct slab *slab,
-	void *head, void *tail, int bulk_cnt,
-	unsigned long addr)
-{
-	struct kmem_cache_node *n = get_node(s, slab_nid(slab));
-	void *object = head;
-	int cnt = 0;
-	unsigned long flags, flags2;
-	int ret = 0;
-	depot_stack_handle_t handle = 0;
-
-	if (s->flags & SLAB_STORE_USER)
-		handle = set_track_prepare();
-
-	spin_lock_irqsave(&n->list_lock, flags);
-	slab_lock(slab, &flags2);
-
-	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
-		if (!check_slab(s, slab))
-			goto out;
-	}
-
-next_object:
-	cnt++;
-
-	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
-		if (!free_consistency_checks(s, slab, object, addr))
-			goto out;
-	}
-
-	if (s->flags & SLAB_STORE_USER)
-		set_track_update(s, object, TRACK_FREE, addr, handle);
-	trace(s, slab, object, 0);
-	/* Freepointer not overwritten by init_object(), SLAB_POISON moved it */
-	init_object(s, object, SLUB_RED_INACTIVE);
-
-	/* Reached end of constructed freelist yet? */
-	if (object != tail) {
-		object = get_freepointer(s, object);
-		goto next_object;
-	}
-	ret = 1;
-
-out:
-	if (cnt != bulk_cnt)
-		slab_err(s, slab, "Bulk freelist count(%d) invalid(%d)\n",
-			 bulk_cnt, cnt);
-
-	slab_unlock(slab, &flags2);
-	spin_unlock_irqrestore(&n->list_lock, flags);
-	if (!ret)
-		slab_fix(s, "Object at 0x%p not freed", object);
-	return ret;
-}
-
 /*
  * Parse a block of slub_debug options. Blocks are delimited by ';'
  *
@@ -2788,6 +2731,63 @@ static inline unsigned long node_nr_objs(struct kmem_cache_node *n)
 {
 	return atomic_long_read(&n->total_objects);
 }
+
+/* Supports checking bulk free of a constructed freelist */
+static noinline int free_debug_processing(
+	struct kmem_cache *s, struct slab *slab,
+	void *head, void *tail, int bulk_cnt,
+	unsigned long addr)
+{
+	struct kmem_cache_node *n = get_node(s, slab_nid(slab));
+	void *object = head;
+	int cnt = 0;
+	unsigned long flags, flags2;
+	int ret = 0;
+	depot_stack_handle_t handle = 0;
+
+	if (s->flags & SLAB_STORE_USER)
+		handle = set_track_prepare();
+
+	spin_lock_irqsave(&n->list_lock, flags);
+	slab_lock(slab, &flags2);
+
+	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
+		if (!check_slab(s, slab))
+			goto out;
+	}
+
+next_object:
+	cnt++;
+
+	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
+		if (!free_consistency_checks(s, slab, object, addr))
+			goto out;
+	}
+
+	if (s->flags & SLAB_STORE_USER)
+		set_track_update(s, object, TRACK_FREE, addr, handle);
+	trace(s, slab, object, 0);
+	/* Freepointer not overwritten by init_object(), SLAB_POISON moved it */
+	init_object(s, object, SLUB_RED_INACTIVE);
+
+	/* Reached end of constructed freelist yet? */
+	if (object != tail) {
+		object = get_freepointer(s, object);
+		goto next_object;
+	}
+	ret = 1;
+
+out:
+	if (cnt != bulk_cnt)
+		slab_err(s, slab, "Bulk freelist count(%d) invalid(%d)\n",
+			 bulk_cnt, cnt);
+
+	slab_unlock(slab, &flags2);
+	spin_unlock_irqrestore(&n->list_lock, flags);
+	if (!ret)
+		slab_fix(s, "Object at 0x%p not freed", object);
+	return ret;
+}
 #endif /* CONFIG_SLUB_DEBUG */
 
 #if defined(CONFIG_SLUB_DEBUG) || defined(CONFIG_SYSFS)

From patchwork Tue Aug 23 17:03:57 2022
From: Vlastimil Babka <vbabka@suse.cz>
To: Rongwei Wang, Christoph Lameter, Joonsoo Kim, David Rientjes,
    Pekka Enberg
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Roman Gushchin, linux-mm@kvack.org,
    Sebastian Andrzej Siewior, Thomas Gleixner, Mike Galbraith,
    Vlastimil Babka
Subject: [PATCH v2 2/5] mm/slub: restrict sysfs validation to debug caches
 and make it safe
Date: Tue, 23 Aug 2022 19:03:57 +0200
Message-Id: <20220823170400.26546-3-vbabka@suse.cz>
In-Reply-To: <20220823170400.26546-1-vbabka@suse.cz>
References: <20220823170400.26546-1-vbabka@suse.cz>

Rongwei Wang reports [1] that cache validation triggered by writing to
/sys/kernel/slab/<cache>/validate is racy against normal cache
operations (e.g. freeing) in a way that can cause false positive
inconsistency reports for caches with debugging enabled. The problem is
that debugging actions that mark objects free or active and the actual
freelist operations are not atomic, so the validation can see an
inconsistent state.

For caches that do or don't have debugging enabled, additional races
involving n->nr_slabs are possible and result in false reports of wrong
slab counts.

This patch attempts to solve these issues while not adding overhead to
normal (especially fastpath) operations for caches that do not have
debugging enabled. Such overhead would not be justified merely to make
userspace-triggered validation safe.

Instead, disable the validation for caches that don't have debugging
enabled and make their sysfs validate handler return -EINVAL.

For caches that do have debugging enabled, we can instead extend the
existing approach of not using percpu freelists to force all alloc/free
operations to the slow paths, where the debugging flags are checked and
acted upon. There we can adjust the debug-specific paths to increase
n->list_lock coverage against concurrent validation as necessary.

The processing on free in free_debug_processing() already happens under
n->list_lock, so we can extend it to actually do the freeing as well and
thus make it atomic against concurrent validation. As observed by
Hyeonggon Yoo, we do not really need to take slab_lock() anymore here
because all paths we could race with are protected by n->list_lock under
the new scheme, so drop its usage here.

The processing on alloc in alloc_debug_processing() currently doesn't
take any locks, but we have to first allocate the object from a slab on
the partial list (as debugging caches have no percpu slabs) and thus
take the n->list_lock anyway. Add a function alloc_single_from_partial()
that grabs just the allocated object instead of the whole freelist, and
does the debug processing. The n->list_lock coverage again makes it
atomic against validation, and it is also ultimately more efficient than
the current grabbing of the freelist immediately followed by slab
deactivation.

To prevent races on n->nr_slabs updates, make sure that for caches with
debugging enabled, inc_slabs_node() or dec_slabs_node() is called under
n->list_lock. When allocating a new slab for a debug cache, handle the
allocation by a new function alloc_single_from_new_slab() instead of the
current forced deactivation path.

Neither of these changes affects the fast paths at all. The changes in
the slow paths are negligible for non-debug caches.

[1] https://lore.kernel.org/all/20220529081535.69275-1-rongwei.wang@linux.alibaba.com/

Reported-by: Rongwei Wang
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
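As an aside, the serialization idea can be modelled in userspace (a
simplified sketch with pthreads and made-up structures, not kernel code;
in the patch the racing parties are the freeing CPU and the sysfs
validate writer): the free path and the validator take the same per-node
lock, so the validator can never observe the debug "marked free" step
without the matching freelist/counter update.

#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Simplified stand-in for struct kmem_cache_node */
struct node {
	pthread_mutex_t list_lock;
	int inuse;		/* objects marked allocated */
	int on_freelist;	/* objects on the freelist */
	int total;
};

/* Free path: both updates happen under list_lock, as one atomic step */
static void free_one(struct node *n)
{
	pthread_mutex_lock(&n->list_lock);
	n->inuse--;
	n->on_freelist++;
	pthread_mutex_unlock(&n->list_lock);
}

/* Validation: also under list_lock, so the invariant always holds */
static void validate(struct node *n)
{
	pthread_mutex_lock(&n->list_lock);
	assert(n->inuse + n->on_freelist == n->total);
	pthread_mutex_unlock(&n->list_lock);
}

int main(void)
{
	struct node n = { PTHREAD_MUTEX_INITIALIZER, 32, 0, 32 };

	for (int i = 0; i < 32; i++) {
		free_one(&n);
		validate(&n);	/* could fire if the two updates in
				 * free_one() were not done under the
				 * same lock as the check */
	}
	printf("inuse=%d on_freelist=%d\n", n.inuse, n.on_freelist);
	return 0;
}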
 mm/slub.c | 231 ++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 179 insertions(+), 52 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 87e794ab101a..a5a913879871 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1324,17 +1324,14 @@ static inline int alloc_consistency_checks(struct kmem_cache *s,
 }
 
 static noinline int alloc_debug_processing(struct kmem_cache *s,
-					struct slab *slab,
-					void *object, unsigned long addr)
+					struct slab *slab, void *object)
 {
 	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
 		if (!alloc_consistency_checks(s, slab, object))
 			goto bad;
 	}
 
-	/* Success perform special debug activities for allocs */
-	if (s->flags & SLAB_STORE_USER)
-		set_track(s, object, TRACK_ALLOC, addr);
+	/* Success. Perform special debug activities for allocs */
 	trace(s, slab, object, 1);
 	init_object(s, object, SLUB_RED_ACTIVE);
 	return 1;
@@ -1604,16 +1601,18 @@ static inline void setup_slab_debug(struct kmem_cache *s,
 			struct slab *slab, void *addr) {}
 
 static inline int alloc_debug_processing(struct kmem_cache *s,
-	struct slab *slab, void *object, unsigned long addr) { return 0; }
+	struct slab *slab, void *object) { return 0; }
 
-static inline int free_debug_processing(
+static inline void free_debug_processing(
 	struct kmem_cache *s, struct slab *slab,
 	void *head, void *tail, int bulk_cnt,
-	unsigned long addr) { return 0; }
+	unsigned long addr) {}
 
 static inline void slab_pad_check(struct kmem_cache *s, struct slab *slab) {}
 static inline int check_object(struct kmem_cache *s, struct slab *slab,
 			void *object, u8 val) { return 1; }
+static inline void set_track(struct kmem_cache *s, void *object,
+			     enum track_item alloc, unsigned long addr) {}
 static inline void add_full(struct kmem_cache *s, struct kmem_cache_node *n,
 					struct slab *slab) {}
 static inline void remove_full(struct kmem_cache *s, struct kmem_cache_node *n,
@@ -1919,11 +1918,13 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 		 */
 		slab = alloc_slab_page(alloc_gfp, node, oo);
 		if (unlikely(!slab))
-			goto out;
+			return NULL;
 		stat(s, ORDER_FALLBACK);
 	}
 
 	slab->objects = oo_objects(oo);
+	slab->inuse = 0;
+	slab->frozen = 0;
 
 	account_slab(slab, oo_order(oo), s, flags);
 
@@ -1950,15 +1951,6 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 		set_freepointer(s, p, NULL);
 	}
 
-	slab->inuse = slab->objects;
-	slab->frozen = 1;
-
-out:
-	if (!slab)
-		return NULL;
-
-	inc_slabs_node(s, slab_nid(slab), slab->objects);
-
 	return slab;
 }
 
@@ -2045,6 +2037,75 @@ static inline void remove_partial(struct kmem_cache_node *n,
 	n->nr_partial--;
 }
 
+/*
+ * Called only for kmem_cache_debug() caches instead of acquire_slab(), with a
+ * slab from the n->partial list. Remove only a single object from the slab, do
+ * the alloc_debug_processing() checks and leave the slab on the list, or move
+ * it to full list if it was the last free object.
+ */
+static void *alloc_single_from_partial(struct kmem_cache *s,
+		struct kmem_cache_node *n, struct slab *slab)
+{
+	void *object;
+
+	lockdep_assert_held(&n->list_lock);
+
+	object = slab->freelist;
+	slab->freelist = get_freepointer(s, object);
+	slab->inuse++;
+
+	if (!alloc_debug_processing(s, slab, object)) {
+		remove_partial(n, slab);
+		return NULL;
+	}
+
+	if (slab->inuse == slab->objects) {
+		remove_partial(n, slab);
+		add_full(s, n, slab);
+	}
+
+	return object;
+}
+
+/*
+ * Called only for kmem_cache_debug() caches to allocate from a freshly
+ * allocated slab. Allocate a single object instead of whole freelist
+ * and put the slab to the partial (or full) list.
+ */
+static void *alloc_single_from_new_slab(struct kmem_cache *s,
+					struct slab *slab)
+{
+	int nid = slab_nid(slab);
+	struct kmem_cache_node *n = get_node(s, nid);
+	unsigned long flags;
+	void *object;
+
+
+	object = slab->freelist;
+	slab->freelist = get_freepointer(s, object);
+	slab->inuse = 1;
+
+	if (!alloc_debug_processing(s, slab, object))
+		/*
+		 * It's not really expected that this would fail on a
+		 * freshly allocated slab, but a concurrent memory
+		 * corruption in theory could cause that.
+		 */
+		return NULL;
+
+	spin_lock_irqsave(&n->list_lock, flags);
+
+	if (slab->inuse == slab->objects)
+		add_full(s, n, slab);
+	else
+		add_partial(n, slab, DEACTIVATE_TO_HEAD);
+
+	inc_slabs_node(s, nid, slab->objects);
+	spin_unlock_irqrestore(&n->list_lock, flags);
+
+	return object;
+}
+
 /*
  * Remove slab from the partial list, freeze it and
  * return the pointer to the freelist.
@@ -2125,6 +2186,13 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
 		if (!pfmemalloc_match(slab, gfpflags))
 			continue;
 
+		if (kmem_cache_debug(s)) {
+			object = alloc_single_from_partial(s, n, slab);
+			if (object)
+				break;
+			continue;
+		}
+
 		t = acquire_slab(s, n, slab, object == NULL);
 		if (!t)
 			break;
@@ -2733,31 +2801,39 @@ static inline unsigned long node_nr_objs(struct kmem_cache_node *n)
 }
 
 /* Supports checking bulk free of a constructed freelist */
-static noinline int free_debug_processing(
+static noinline void free_debug_processing(
 	struct kmem_cache *s, struct slab *slab,
 	void *head, void *tail, int bulk_cnt,
 	unsigned long addr)
 {
 	struct kmem_cache_node *n = get_node(s, slab_nid(slab));
+	struct slab *slab_free = NULL;
 	void *object = head;
 	int cnt = 0;
-	unsigned long flags, flags2;
-	int ret = 0;
+	unsigned long flags;
+	bool checks_ok = false;
 	depot_stack_handle_t handle = 0;
 
 	if (s->flags & SLAB_STORE_USER)
 		handle = set_track_prepare();
 
 	spin_lock_irqsave(&n->list_lock, flags);
-	slab_lock(slab, &flags2);
 
 	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
 		if (!check_slab(s, slab))
 			goto out;
 	}
 
+	if (slab->inuse < bulk_cnt) {
+		slab_err(s, slab, "Slab has %d allocated objects but %d are to be freed\n",
+			 slab->inuse, bulk_cnt);
+		goto out;
+	}
+
 next_object:
-	cnt++;
+
+	if (++cnt > bulk_cnt)
+		goto out_cnt;
 
 	if (s->flags & SLAB_CONSISTENCY_CHECKS) {
 		if (!free_consistency_checks(s, slab, object, addr))
@@ -2775,18 +2851,56 @@ static noinline int free_debug_processing(
 		object = get_freepointer(s, object);
 		goto next_object;
 	}
-	ret = 1;
+	checks_ok = true;
 
-out:
+out_cnt:
 	if (cnt != bulk_cnt)
-		slab_err(s, slab, "Bulk freelist count(%d) invalid(%d)\n",
+		slab_err(s, slab, "Bulk free expected %d objects but found %d\n",
 			 bulk_cnt, cnt);
 
-	slab_unlock(slab, &flags2);
+out:
+	if (checks_ok) {
+		void *prior = slab->freelist;
+
+		/* Perform the actual freeing while we still hold the locks */
+		slab->inuse -= cnt;
+		set_freepointer(s, tail, prior);
+		slab->freelist = head;
+
+		/* Do we need to remove the slab from full or partial list? */
+		if (!prior) {
+			remove_full(s, n, slab);
+		} else if (slab->inuse == 0) {
+			remove_partial(n, slab);
+			stat(s, FREE_REMOVE_PARTIAL);
+		}
+
+		/* Do we need to discard the slab or add to partial list? */
+		if (slab->inuse == 0) {
+			slab_free = slab;
+		} else if (!prior) {
+			add_partial(n, slab, DEACTIVATE_TO_TAIL);
+			stat(s, FREE_ADD_PARTIAL);
+		}
+	}
+
+	if (slab_free) {
+		/*
+		 * Update the counters while still holding n->list_lock to
+		 * prevent spurious validation warnings
+		 */
+		dec_slabs_node(s, slab_nid(slab_free), slab_free->objects);
+	}
+
 	spin_unlock_irqrestore(&n->list_lock, flags);
-	if (!ret)
+
+	if (!checks_ok)
 		slab_fix(s, "Object at 0x%p not freed", object);
-	return ret;
+
+	if (slab_free) {
+		stat(s, FREE_SLAB);
+		free_slab(s, slab_free);
+	}
 }
 #endif /* CONFIG_SLUB_DEBUG */
 
@@ -3036,36 +3150,52 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 		return NULL;
 	}
 
+	stat(s, ALLOC_SLAB);
+
+	if (kmem_cache_debug(s)) {
+		freelist = alloc_single_from_new_slab(s, slab);
+
+		if (unlikely(!freelist))
+			goto new_objects;
+
+		if (s->flags & SLAB_STORE_USER)
+			set_track(s, freelist, TRACK_ALLOC, addr);
+
+		return freelist;
+	}
+
 	/*
 	 * No other reference to the slab yet so we can
 	 * muck around with it freely without cmpxchg
 	 */
 	freelist = slab->freelist;
 	slab->freelist = NULL;
+	slab->inuse = slab->objects;
+	slab->frozen = 1;
 
-	stat(s, ALLOC_SLAB);
+	inc_slabs_node(s, slab_nid(slab), slab->objects);
 
 check_new_slab:
 
 	if (kmem_cache_debug(s)) {
-		if (!alloc_debug_processing(s, slab, freelist, addr)) {
-			/* Slab failed checks. Next slab needed */
-			goto new_slab;
-		} else {
-			/*
-			 * For debug case, we don't load freelist so that all
-			 * allocations go through alloc_debug_processing()
-			 */
-			goto return_single;
-		}
+		/*
+		 * For debug caches here we had to go through
+		 * alloc_single_from_partial() so just store the tracking info
+		 * and return the object
+		 */
+		if (s->flags & SLAB_STORE_USER)
+			set_track(s, freelist, TRACK_ALLOC, addr);
+
+		return freelist;
 	}
 
-	if (unlikely(!pfmemalloc_match(slab, gfpflags)))
+	if (unlikely(!pfmemalloc_match(slab, gfpflags))) {
 		/*
 		 * For !pfmemalloc_match() case we don't load freelist so that
 		 * we don't make further mismatched allocations easier.
 		 */
-		goto return_single;
+		deactivate_slab(s, slab, get_freepointer(s, freelist));
+		return freelist;
+	}
 
 retry_load_slab:
 
@@ -3089,11 +3219,6 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 
 	c->slab = slab;
 	goto load_freelist;
-
-return_single:
-
-	deactivate_slab(s, slab, get_freepointer(s, freelist));
-	return freelist;
 }
 
 /*
@@ -3341,9 +3466,10 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
 	if (kfence_free(head))
 		return;
 
-	if (kmem_cache_debug(s) &&
-	    !free_debug_processing(s, slab, head, tail, cnt, addr))
+	if (kmem_cache_debug(s)) {
+		free_debug_processing(s, slab, head, tail, cnt, addr);
 		return;
+	}
 
 	do {
 		if (unlikely(n)) {
@@ -3936,6 +4062,7 @@ static void early_kmem_cache_node_alloc(int node)
 	slab = new_slab(kmem_cache_node, GFP_NOWAIT, node);
 
 	BUG_ON(!slab);
+	inc_slabs_node(kmem_cache_node, slab_nid(slab), slab->objects);
 	if (slab_nid(slab) != node) {
 		pr_err("SLUB: Unable to allocate memory from node %d\n", node);
 		pr_err("SLUB: Allocating a useless per node structure in order to be able to continue\n");
@@ -3950,7 +4077,6 @@ static void early_kmem_cache_node_alloc(int node)
 	n = kasan_slab_alloc(kmem_cache_node, n, GFP_KERNEL, false);
 	slab->freelist = get_freepointer(kmem_cache_node, n);
 	slab->inuse = 1;
-	slab->frozen = 0;
 	kmem_cache_node->node[node] = n;
 	init_kmem_cache_node(n);
 	inc_slabs_node(kmem_cache_node, node, slab->objects);
@@ -4611,6 +4737,7 @@ static int __kmem_cache_do_shrink(struct kmem_cache *s)
 			if (free == slab->objects) {
 				list_move(&slab->slab_list, &discard);
 				n->nr_partial--;
+				dec_slabs_node(s, node, slab->objects);
 			} else if (free <= SHRINK_PROMOTE_MAX)
 				list_move(&slab->slab_list, promote + free - 1);
 		}
@@ -4626,7 +4753,7 @@ static int __kmem_cache_do_shrink(struct kmem_cache *s)
 
 		/* Release empty slabs */
 		list_for_each_entry_safe(slab, t, &discard, slab_list)
-			discard_slab(s, slab);
+			free_slab(s, slab);
 
 		if (slabs_node(s, node))
 			ret = 1;
@@ -5601,7 +5728,7 @@ static ssize_t validate_store(struct kmem_cache *s,
 {
 	int ret = -EINVAL;
 
-	if (buf[0] == '1') {
+	if (buf[0] == '1' && kmem_cache_debug(s)) {
 		ret = validate_slab_cache(s);
 		if (ret >= 0)
 			ret = length;

From patchwork Tue Aug 23 17:03:58 2022
From: Vlastimil Babka <vbabka@suse.cz>
To: Rongwei Wang, Christoph Lameter, Joonsoo Kim, David Rientjes,
    Pekka Enberg
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Roman Gushchin, linux-mm@kvack.org,
    Sebastian Andrzej Siewior, Thomas Gleixner, Mike Galbraith,
    Vlastimil Babka
Subject: [PATCH v2 3/5] mm/slub: remove slab_lock() usage for debug operations
Date: Tue, 23 Aug 2022 19:03:58 +0200
Message-Id: <20220823170400.26546-4-vbabka@suse.cz>
In-Reply-To: <20220823170400.26546-1-vbabka@suse.cz>
References: <20220823170400.26546-1-vbabka@suse.cz>

All alloc and free operations on debug caches are now serialized by
n->list_lock, so we can remove slab_lock() usage in validate_slab() and
list_slab_objects() as those also happen under n->list_lock.

Note the usage in list_slab_objects() could happen even on non-debug
caches, but only during cache shutdown time, so there should not be any
parallel freeing activity anymore. Except for buggy slab users, but in
that case the slab_lock() would not help against the common cmpxchg
based fast paths (in non-debug caches) anyway.

Also adjust the documentation comments accordingly.

Suggested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Acked-by: David Rientjes
---
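A standalone sketch of the resulting pattern (hypothetical names; the
kernel uses lockdep_assert_held() where this userspace model uses a plain
assert on a bookkeeping flag): the inner lock disappears and the function
instead checks that the caller already serialized it with the outer lock.

#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static bool list_lock_held;	/* stand-in for lockdep bookkeeping */

/* Analogous to validate_slab() after this patch: no inner slab_lock(),
 * just a debug check that the caller holds the outer lock. */
static void validate_object(int obj)
{
	assert(list_lock_held);	/* ~ lockdep_assert_held(&n->list_lock) */
	printf("object %d consistent\n", obj);
}

static void validate_all(void)
{
	pthread_mutex_lock(&list_lock);
	list_lock_held = true;

	for (int i = 0; i < 3; i++)
		validate_object(i);

	list_lock_held = false;
	pthread_mutex_unlock(&list_lock);
}

int main(void)
{
	validate_all();
	return 0;
}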
 mm/slub.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index a5a913879871..b4065e892f7c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -50,7 +50,7 @@
  *   1. slab_mutex (Global Mutex)
  *   2. node->list_lock (Spinlock)
  *   3. kmem_cache->cpu_slab->lock (Local lock)
- *   4. slab_lock(slab) (Only on some arches or for debugging)
+ *   4. slab_lock(slab) (Only on some arches)
  *   5. object_map_lock (Only for debugging)
  *
  *   slab_mutex
@@ -64,8 +64,9 @@
  *   The slab_lock is a wrapper around the page lock, thus it is a bit
  *   spinlock.
  *
- *   The slab_lock is only used for debugging and on arches that do not
- *   have the ability to do a cmpxchg_double. It only protects:
+ *   The slab_lock is only used on arches that do not have the ability
+ *   to do a cmpxchg_double. It only protects:
+ *
  *	A. slab->freelist	-> List of free objects in a slab
  *	B. slab->inuse		-> Number of objects in use
  *	C. slab->objects	-> Number of objects in slab
@@ -94,6 +95,9 @@
  *   allocating a long series of objects that fill up slabs does not require
  *   the list lock.
  *
+ *   For debug caches, all allocations are forced to go through a list_lock
+ *   protected region to serialize against concurrent validation.
+ *
  *   cpu_slab->lock local lock
  *
  *   This locks protect slowpath manipulation of all kmem_cache_cpu fields
@@ -4368,7 +4372,6 @@ static void list_slab_objects(struct kmem_cache *s, struct slab *slab,
 	void *p;
 
 	slab_err(s, slab, text, s->name);
-	slab_lock(slab, &flags);
 
 	map = get_map(s, slab);
 	for_each_object(p, s, addr, slab->objects) {
@@ -4379,7 +4382,6 @@ static void list_slab_objects(struct kmem_cache *s, struct slab *slab,
 		}
 	}
 	put_map(map);
-	slab_unlock(slab, &flags);
 #endif
 }
 
@@ -5107,12 +5109,9 @@ static void validate_slab(struct kmem_cache *s, struct slab *slab,
 {
 	void *p;
 	void *addr = slab_address(slab);
-	unsigned long flags;
-
-	slab_lock(slab, &flags);
 
 	if (!check_slab(s, slab) || !on_freelist(s, slab, NULL))
-		goto unlock;
+		return;
 
 	/* Now we know that a valid freelist exists */
 	__fill_map(obj_map, s, slab);
@@ -5123,8 +5122,6 @@ static void validate_slab(struct kmem_cache *s, struct slab *slab,
 		if (!check_object(s, slab, p, val))
 			break;
 	}
-unlock:
-	slab_unlock(slab, &flags);
 }
 
 static int validate_slab_node(struct kmem_cache *s,

From patchwork Tue Aug 23 17:03:59 2022
From: Vlastimil Babka <vbabka@suse.cz>
To: Rongwei Wang, Christoph Lameter, Joonsoo Kim, David Rientjes,
    Pekka Enberg
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Roman Gushchin, linux-mm@kvack.org,
    Sebastian Andrzej Siewior, Thomas Gleixner, Mike Galbraith,
    Vlastimil Babka
Subject: [PATCH v2 4/5] mm/slub: convert object_map_lock to non-raw spinlock
Date: Tue, 23 Aug 2022 19:03:59 +0200
Message-Id: <20220823170400.26546-5-vbabka@suse.cz>
In-Reply-To: <20220823170400.26546-1-vbabka@suse.cz>
References: <20220823170400.26546-1-vbabka@suse.cz>

The only remaining user of object_map_lock is list_slab_objects().
Obtaining the lock there used to happen under slab_lock(), which implied
disabling irqs on PREEMPT_RT, thus it was a raw_spinlock. With the
slab_lock() removed, we can convert it to a normal spinlock.

Also remove the get_map()/put_map() wrappers, as list_slab_objects()
became their only remaining user.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Sebastian Andrzej Siewior
---
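The scheme kept here can be sketched in plain C (a userspace model with
pthreads and invented helper names, not the kernel implementation): one
global scratch bitmap whose fill-and-scan happens entirely inside a single
lock section, which is what the open-coded spin_lock()/__fill_map()/
spin_unlock() sequence in list_slab_objects() does after this patch.

#include <limits.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define MAX_OBJS 64
#define BITS_PER_LONG_ (sizeof(unsigned long) * CHAR_BIT)

/* One global scratch bitmap shared by all users, like object_map */
static unsigned long object_map[(MAX_OBJS + BITS_PER_LONG_ - 1) / BITS_PER_LONG_];
static pthread_mutex_t object_map_lock = PTHREAD_MUTEX_INITIALIZER;

static void set_bit_(int nr, unsigned long *map)
{
	map[nr / BITS_PER_LONG_] |= 1UL << (nr % BITS_PER_LONG_);
}

static int test_bit_(int nr, const unsigned long *map)
{
	return !!(map[nr / BITS_PER_LONG_] & (1UL << (nr % BITS_PER_LONG_)));
}

/* Open-coded equivalent of the old get_map()/put_map() pair: fill and
 * scan the shared map inside one lock/unlock section. */
static void list_free_objects(const int *free, int nfree, int nobjs)
{
	pthread_mutex_lock(&object_map_lock);

	memset(object_map, 0, sizeof(object_map));
	for (int i = 0; i < nfree; i++)
		set_bit_(free[i], object_map);	/* ~ __fill_map() */

	for (int i = 0; i < nobjs; i++)
		if (!test_bit_(i, object_map))
			printf("object %d still allocated\n", i);

	pthread_mutex_unlock(&object_map_lock);
}

int main(void)
{
	int free[] = { 0, 2, 3 };

	list_free_objects(free, 3, 5);
	return 0;
}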
 mm/slub.c | 36 ++++++------------------------------
 1 file changed, 6 insertions(+), 30 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index b4065e892f7c..0444a2ba4f12 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -565,7 +565,7 @@ static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct slab *slab,
 
 #ifdef CONFIG_SLUB_DEBUG
 static unsigned long object_map[BITS_TO_LONGS(MAX_OBJS_PER_PAGE)];
-static DEFINE_RAW_SPINLOCK(object_map_lock);
+static DEFINE_SPINLOCK(object_map_lock);
 
 static void __fill_map(unsigned long *obj_map, struct kmem_cache *s,
 		       struct slab *slab)
@@ -599,30 +599,6 @@ static bool slab_add_kunit_errors(void)
 static inline bool slab_add_kunit_errors(void) { return false; }
 #endif
 
-/*
- * Determine a map of objects in use in a slab.
- *
- * Node listlock must be held to guarantee that the slab does
- * not vanish from under us.
- */
-static unsigned long *get_map(struct kmem_cache *s, struct slab *slab)
-	__acquires(&object_map_lock)
-{
-	VM_BUG_ON(!irqs_disabled());
-
-	raw_spin_lock(&object_map_lock);
-
-	__fill_map(object_map, s, slab);
-
-	return object_map;
-}
-
-static void put_map(unsigned long *map) __releases(&object_map_lock)
-{
-	VM_BUG_ON(map != object_map);
-	raw_spin_unlock(&object_map_lock);
-}
-
 static inline unsigned int size_from_object(struct kmem_cache *s)
 {
 	if (s->flags & SLAB_RED_ZONE)
@@ -4367,21 +4343,21 @@ static void list_slab_objects(struct kmem_cache *s, struct slab *slab,
 {
 #ifdef CONFIG_SLUB_DEBUG
 	void *addr = slab_address(slab);
-	unsigned long flags;
-	unsigned long *map;
 	void *p;
 
 	slab_err(s, slab, text, s->name);
 
-	map = get_map(s, slab);
+	spin_lock(&object_map_lock);
+	__fill_map(object_map, s, slab);
+
 	for_each_object(p, s, addr, slab->objects) {
-		if (!test_bit(__obj_to_index(s, addr, p), map)) {
+		if (!test_bit(__obj_to_index(s, addr, p), object_map)) {
 			pr_err("Object 0x%p @offset=%tu\n", p, p - addr);
 			print_tracking(s, p);
 		}
 	}
-	put_map(map);
+	spin_unlock(&object_map_lock);
 #endif
 }

From patchwork Tue Aug 23 17:04:00 2022
From: Vlastimil Babka <vbabka@suse.cz>
To: Rongwei Wang, Christoph Lameter, Joonsoo Kim, David Rientjes,
    Pekka Enberg
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Roman Gushchin, linux-mm@kvack.org,
    Sebastian Andrzej Siewior, Thomas Gleixner, Mike Galbraith,
    Vlastimil Babka
Subject: [PATCH v2 5/5] mm/slub: simplify __cmpxchg_double_slab() and
 slab_[un]lock()
Date: Tue, 23 Aug 2022 19:04:00 +0200
Message-Id: <20220823170400.26546-6-vbabka@suse.cz>
In-Reply-To: <20220823170400.26546-1-vbabka@suse.cz>
References: <20220823170400.26546-1-vbabka@suse.cz>

The PREEMPT_RT specific disabling of irqs in __cmpxchg_double_slab()
(through slab_[un]lock()) is unnecessary, as bit_spin_lock() disables
preemption and that's sufficient on RT, where interrupts are threaded.

That means we no longer need the slab_[un]lock() wrappers, so delete
them and rename the current __slab_[un]lock() to slab_[un]lock().

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Sebastian Andrzej Siewior
---
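For reference, a bit spinlock of the kind slab_lock() wraps can be
modelled in userspace C11 (a sketch with simplified assumptions; the
kernel's bit_spin_lock() operates on PG_locked inside page->flags and
additionally disables preemption, which is exactly what makes it
sufficient on RT):

#include <stdatomic.h>
#include <stdio.h>

/* Spin until bit `nr` of *word is atomically flipped from 0 to 1 */
static void bit_spin_lock_(int nr, _Atomic unsigned long *word)
{
	unsigned long mask = 1UL << nr;

	while (atomic_fetch_or_explicit(word, mask, memory_order_acquire) & mask)
		;	/* spin while somebody else holds the bit */
}

static void bit_spin_unlock_(int nr, _Atomic unsigned long *word)
{
	atomic_fetch_and_explicit(word, ~(1UL << nr), memory_order_release);
}

int main(void)
{
	_Atomic unsigned long flags = 0;

	bit_spin_lock_(0, &flags);
	printf("flags=%lx (bit 0 held)\n", (unsigned long)flags);
	bit_spin_unlock_(0, &flags);
	printf("flags=%lx (released)\n", (unsigned long)flags);
	return 0;
}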
 mm/slub.c | 39 ++++++++++++---------------------------
 1 file changed, 12 insertions(+), 27 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 0444a2ba4f12..bb8c1292d7e8 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -446,7 +446,7 @@ slub_set_cpu_partial(struct kmem_cache *s, unsigned int nr_objects)
 /*
  * Per slab locking using the pagelock
  */
-static __always_inline void __slab_lock(struct slab *slab)
+static __always_inline void slab_lock(struct slab *slab)
 {
 	struct page *page = slab_page(slab);
 
@@ -454,7 +454,7 @@ static __always_inline void __slab_lock(struct slab *slab)
 	bit_spin_lock(PG_locked, &page->flags);
 }
 
-static __always_inline void __slab_unlock(struct slab *slab)
+static __always_inline void slab_unlock(struct slab *slab)
 {
 	struct page *page = slab_page(slab);
 
@@ -462,24 +462,12 @@ static __always_inline void __slab_unlock(struct slab *slab)
 	__bit_spin_unlock(PG_locked, &page->flags);
 }
 
-static __always_inline void slab_lock(struct slab *slab, unsigned long *flags)
-{
-	if (IS_ENABLED(CONFIG_PREEMPT_RT))
-		local_irq_save(*flags);
-	__slab_lock(slab);
-}
-
-static __always_inline void slab_unlock(struct slab *slab, unsigned long *flags)
-{
-	__slab_unlock(slab);
-	if (IS_ENABLED(CONFIG_PREEMPT_RT))
-		local_irq_restore(*flags);
-}
-
 /*
  * Interrupts must be disabled (for the fallback code to work right), typically
- * by an _irqsave() lock variant. Except on PREEMPT_RT where locks are different
- * so we disable interrupts as part of slab_[un]lock().
+ * by an _irqsave() lock variant. Except on PREEMPT_RT where these variants do
+ * not actually disable interrupts. On the other hand the migrate_disable()
+ * done by bit_spin_lock() is sufficient on PREEMPT_RT thanks to its threaded
+ * interrupts.
 */
 static inline bool __cmpxchg_double_slab(struct kmem_cache *s, struct slab *slab,
 		void *freelist_old, unsigned long counters_old,
@@ -498,18 +486,15 @@ static inline bool __cmpxchg_double_slab(struct kmem_cache *s, struct slab *slab,
 	} else
 #endif
 	{
-		/* init to 0 to prevent spurious warnings */
-		unsigned long flags = 0;
-
-		slab_lock(slab, &flags);
+		slab_lock(slab);
 		if (slab->freelist == freelist_old &&
 		    slab->counters == counters_old) {
 			slab->freelist = freelist_new;
 			slab->counters = counters_new;
-			slab_unlock(slab, &flags);
+			slab_unlock(slab);
 			return true;
 		}
-		slab_unlock(slab, &flags);
+		slab_unlock(slab);
 	}
 
 	cpu_relax();
@@ -540,16 +525,16 @@ static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct slab *slab,
 		unsigned long flags;
 
 		local_irq_save(flags);
-		__slab_lock(slab);
+		slab_lock(slab);
 		if (slab->freelist == freelist_old &&
 		    slab->counters == counters_old) {
 			slab->freelist = freelist_new;
 			slab->counters = counters_new;
-			__slab_unlock(slab);
+			slab_unlock(slab);
 			local_irq_restore(flags);
 			return true;
 		}
-		__slab_unlock(slab);
+		slab_unlock(slab);
 		local_irq_restore(flags);
 	}

From patchwork Thu Aug 25 07:51:36 2022
From patchwork Thu Aug 25 07:51:36 2022
X-Patchwork-Submitter: Sebastian Andrzej Siewior
X-Patchwork-Id: 12954313
Date: Thu, 25 Aug 2022 09:51:36 +0200
From: Sebastian Andrzej Siewior
To: Vlastimil Babka
Cc: Rongwei Wang, Christoph Lameter, Joonsoo Kim, David Rientjes,
 Pekka Enberg, Hyeonggon Yoo <42.hyeyoo@gmail.com>, Roman Gushchin,
 linux-mm@kvack.org, Thomas Gleixner, Mike Galbraith, Andrew Morton
Subject: [PATCH 6/5] slub: Make PREEMPT_RT support less convoluted
References: <20220823170400.26546-1-vbabka@suse.cz>
In-Reply-To: <20220823170400.26546-1-vbabka@suse.cz>

From: Thomas Gleixner

The slub code already has a few helpers depending on PREEMPT_RT. Add a few
more and get rid of the CONFIG_PREEMPT_RT conditionals all over the place.

No functional change.

Signed-off-by: Thomas Gleixner
Cc: Andrew Morton
Cc: Christoph Lameter
Cc: David Rientjes
Cc: Joonsoo Kim
Cc: Pekka Enberg
Cc: Vlastimil Babka
Cc: linux-mm@kvack.org
Signed-off-by: Sebastian Andrzej Siewior
Acked-by: Peter Zijlstra (Intel)
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
Vlastimil, does it work for you to include this patch in your series? It now
depends on your series :) It has the USE_LOCKLESS_FAST_PATH() Linus asked
about, so we should be good.

 mm/slub.c | 56 ++++++++++++++++++++++++--------------------------------
 1 file changed, 24 insertions(+), 32 deletions(-)
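A note on the shape of this cleanup before the diff: USE_LOCKLESS_FAST_PATH()
exists so that both the lockless and the locked flavour are always compiled,
and a plain `if` on a compile-time constant selects between them, instead of
duplicating logic under #ifdef CONFIG_PREEMPT_RT. A self-contained sketch of
that pattern follows; CONFIG_FOO, USE_FAST_PATH() and the helper functions are
invented for the example.

#include <stdbool.h>
#include <stdio.h>

#ifndef CONFIG_FOO
#define USE_FAST_PATH()	(true)
#else
#define USE_FAST_PATH()	(false)
#endif

static void fast_path(void)   { puts("lockless fast path"); }
static void locked_path(void) { puts("locked path"); }

static void do_operation(void)
{
	/*
	 * Both branches are always seen by the compiler, so neither can
	 * silently bit-rot; because USE_FAST_PATH() expands to a constant,
	 * the dead branch is optimized away just as an #ifdef would be.
	 */
	if (USE_FAST_PATH())
		fast_path();
	else
		locked_path();
}

int main(void)
{
	do_operation();
	return 0;
}

This mirrors the kernel's IS_ENABLED() idiom; the gain over #ifdef is that
both branches stay visible and type-checked in every configuration.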
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -104,9 +104,11 @@
  * except the stat counters. This is a percpu structure manipulated only by
  * the local cpu, so the lock protects against being preempted or interrupted
  * by an irq. Fast path operations rely on lockless operations instead.
- * On PREEMPT_RT, the local lock does not actually disable irqs (and thus
- * prevent the lockless operations), so fastpath operations also need to take
- * the lock and are no longer lockless.
+ *
+ * On PREEMPT_RT, the local lock neither disables interrupts nor preemption,
+ * which means the lockless fastpath cannot be used as it might interfere with
+ * an in-progress slow path operation. In this case the local lock is always
+ * taken, but it still utilizes the freelist for the common operations.
  *
  * lockless fastpaths
  *
@@ -167,8 +169,9 @@
  * function call even on !PREEMPT_RT, use inline preempt_disable() there.
  */
 #ifndef CONFIG_PREEMPT_RT
-#define slub_get_cpu_ptr(var)	get_cpu_ptr(var)
-#define slub_put_cpu_ptr(var)	put_cpu_ptr(var)
+#define slub_get_cpu_ptr(var)		get_cpu_ptr(var)
+#define slub_put_cpu_ptr(var)		put_cpu_ptr(var)
+#define USE_LOCKLESS_FAST_PATH()	(true)
 #else
 #define slub_get_cpu_ptr(var)		\
 ({					\
@@ -180,6 +183,7 @@ do { \
 	(void)(var);			\
 	migrate_enable();		\
 } while (0)
+#define USE_LOCKLESS_FAST_PATH()	(false)
 #endif
 
 #ifdef CONFIG_SLUB_DEBUG
@@ -474,7 +478,7 @@ static inline bool __cmpxchg_double_slab
 		void *freelist_new, unsigned long counters_new,
 		const char *n)
 {
-	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+	if (USE_LOCKLESS_FAST_PATH())
 		lockdep_assert_irqs_disabled();
 #if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \
     defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE)
@@ -3287,14 +3291,8 @@ static __always_inline void *slab_alloc_
 	object = c->freelist;
 	slab = c->slab;
-	/*
-	 * We cannot use the lockless fastpath on PREEMPT_RT because if a
-	 * slowpath has taken the local_lock_irqsave(), it is not protected
-	 * against a fast path operation in an irq handler. So we need to take
-	 * the slow path which uses local_lock. It is still relatively fast if
-	 * there is a suitable cpu freelist.
-	 */
-	if (IS_ENABLED(CONFIG_PREEMPT_RT) ||
+
+	if (!USE_LOCKLESS_FAST_PATH() ||
 	    unlikely(!object || !slab || !node_match(slab, node))) {
 		object = __slab_alloc(s, gfpflags, node, addr, c);
 	} else {
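Between the hunks, a brief aside on what the lockless fastpath actually does:
it pops the first object off a per-cpu freelist with a compare-and-swap and
retries on contention. The userspace sketch below models only that idea, with
invented names and a single-width CAS; note that this naive version is exposed
to the ABA problem, which is precisely why the real fastpath pairs the
freelist pointer with a transaction id (tid) in a double-width cmpxchg.

#include <stdatomic.h>
#include <stdio.h>

struct object {
	struct object *next;	/* free objects are chained through themselves */
};

static _Atomic(struct object *) freelist;

/* Lockless pop: reload the head and retry whenever the CAS loses a race. */
static struct object *alloc_fast(void)
{
	struct object *old, *new;

	do {
		old = atomic_load_explicit(&freelist, memory_order_acquire);
		if (!old)
			return NULL;	/* empty: a real allocator falls back */
		new = old->next;
	} while (!atomic_compare_exchange_weak_explicit(&freelist, &old, new,
							memory_order_acq_rel,
							memory_order_acquire));
	return old;
}

int main(void)
{
	struct object a = { .next = NULL };
	struct object b = { .next = &a };

	atomic_store(&freelist, &b);
	printf("popped %p then %p\n", (void *)alloc_fast(), (void *)alloc_fast());
	return 0;
}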
@@ -3554,6 +3552,7 @@ static __always_inline void do_slab_free
 	void *tail_obj = tail ? : head;
 	struct kmem_cache_cpu *c;
 	unsigned long tid;
+	void **freelist;
 
 redo:
 	/*
@@ -3568,9 +3567,13 @@ static __always_inline void do_slab_free
 	/* Same with comment on barrier() in slab_alloc_node() */
 	barrier();
 
-	if (likely(slab == c->slab)) {
-#ifndef CONFIG_PREEMPT_RT
-		void **freelist = READ_ONCE(c->freelist);
+	if (unlikely(slab != c->slab)) {
+		__slab_free(s, slab, head, tail_obj, cnt, addr);
+		return;
+	}
+
+	if (USE_LOCKLESS_FAST_PATH()) {
+		freelist = READ_ONCE(c->freelist);
 
 		set_freepointer(s, tail_obj, freelist);
 
@@ -3582,16 +3585,8 @@ static __always_inline void do_slab_free
 			note_cmpxchg_failure("slab_free", s, tid);
 			goto redo;
 		}
-#else /* CONFIG_PREEMPT_RT */
-		/*
-		 * We cannot use the lockless fastpath on PREEMPT_RT because if
-		 * a slowpath has taken the local_lock_irqsave(), it is not
-		 * protected against a fast path operation in an irq handler. So
-		 * we need to take the local_lock. We shouldn't simply defer to
-		 * __slab_free() as that wouldn't use the cpu freelist at all.
-		 */
-		void **freelist;
-
+	} else {
+		/* Update the free list under the local lock */
 		local_lock(&s->cpu_slab->lock);
 		c = this_cpu_ptr(s->cpu_slab);
 		if (unlikely(slab != c->slab)) {
@@ -3606,11 +3601,8 @@ static __always_inline void do_slab_free
 		c->tid = next_tid(tid);
 		local_unlock(&s->cpu_slab->lock);
-#endif
-		stat(s, FREE_FASTPATH);
-	} else
-		__slab_free(s, slab, head, tail_obj, cnt, addr);
-
+	}
+	stat(s, FREE_FASTPATH);
 }
 
 static __always_inline void slab_free(struct kmem_cache *s, struct slab *slab,
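To close, the do_slab_free() rework above is mostly a control-flow
transformation: handle the slab mismatch early and return, pick the lockless
or locked flavour with a constant condition instead of an #ifdef, and let both
flavours share the final FREE_FASTPATH accounting. A compilable stand-alone
sketch of that final shape, with invented stand-in helpers:

#include <stdbool.h>
#include <stdio.h>

#define USE_LOCKLESS_FAST_PATH()	(true)	/* (false) on PREEMPT_RT */

static bool slab_matches_cpu_slab(void)  { return true; }	/* stand-in */
static void slow_free(void)       { puts("__slab_free() slow path"); }
static void lockless_free(void)   { puts("cmpxchg-based lockless free"); }
static void locked_free(void)     { puts("free under local_lock"); }
static void count_fast_free(void) { puts("stat(s, FREE_FASTPATH)"); }

static void do_free(void)
{
	/* Uncommon case first: defer to the slow path and return early. */
	if (!slab_matches_cpu_slab()) {
		slow_free();
		return;
	}

	if (USE_LOCKLESS_FAST_PATH())
		lockless_free();
	else
		locked_free();

	/* Shared by both fastpath flavours after the restructuring. */
	count_fast_free();
}

int main(void)
{
	do_free();
	return 0;
}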