From patchwork Wed Nov 29 09:53:26 2023
From: Vlastimil Babka
Date: Wed, 29 Nov 2023 10:53:26 +0100
Subject: [PATCH RFC v3 1/9] mm/slub: fix bulk alloc and free stats
Message-Id: <20231129-slub-percpu-caches-v3-1-6bcf536772bc@suse.cz>
References: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
In-Reply-To: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
 Matthew Wilcox, "Liam R. Howlett"
Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo <42.hyeyoo@gmail.com>,
 Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org,
 kasan-dev@googlegroups.com, Vlastimil Babka
X-Mailer: b4 0.12.4

The SLUB sysfs stats enabled by CONFIG_SLUB_STATS have two deficiencies
identified wrt bulk alloc/free operations:

- Bulk allocations from the cpu freelist are not counted. Add the
  ALLOC_FASTPATH counter there.

- Bulk fastpath freeing will count a list of multiple objects with a
  single FREE_FASTPATH inc. Add a stat_add() variant to count them all.

Signed-off-by: Vlastimil Babka
---
 mm/slub.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index 63d281dfacdb..f0cd55bb4e11 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -341,6 +341,14 @@ static inline void stat(const struct kmem_cache *s, enum stat_item si)
 #endif
 }
 
+static inline void stat_add(const struct kmem_cache *s, enum stat_item si, int v)
+{
+#ifdef CONFIG_SLUB_STATS
+        raw_cpu_add(s->cpu_slab->stat[si], v);
+#endif
+}
+
+
 /*
  * Tracks for which NUMA nodes we have kmem_cache_nodes allocated.
  * Corresponds to node_state[N_NORMAL_MEMORY], but can temporarily
@@ -3784,7 +3792,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
                 local_unlock(&s->cpu_slab->lock);
         }
 
-        stat(s, FREE_FASTPATH);
+        stat_add(s, FREE_FASTPATH, cnt);
 }
 #else /* CONFIG_SLUB_TINY */
 static void do_slab_free(struct kmem_cache *s,
@@ -3986,6 +3994,7 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
                 c->freelist = get_freepointer(s, object);
                 p[i] = object;
                 maybe_wipe_obj_freeptr(s, p[i]);
+                stat(s, ALLOC_FASTPATH);
         }
         c->tid = next_tid(c->tid);
         local_unlock_irqrestore(&s->cpu_slab->lock, irqflags);
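
For illustration, a minimal sketch (not part of the patch) of the accounting
this fixes; the cache name and object size are made up, and it assumes a
kernel built with CONFIG_SLUB_STATS. After this change, a bulk alloc/free of
N objects through the fastpaths advances the per-cache alloc_fastpath and
free_fastpath counters under /sys/kernel/slab/ by N, instead of not at all
(alloc) or by one (free):

/* Illustrative sketch only, assuming CONFIG_SLUB_STATS=y. */
#include <linux/slab.h>

#define DEMO_OBJS 16

static int demo_bulk_stats(void)
{
        struct kmem_cache *c;
        void *objs[DEMO_OBJS];
        int n;

        /* SLAB_NO_MERGE keeps this cache's own sysfs stats observable. */
        c = kmem_cache_create("demo_bulk", 64, 0, SLAB_NO_MERGE, NULL);
        if (!c)
                return -ENOMEM;

        /* Each object served from the cpu freelist now counts ALLOC_FASTPATH. */
        n = kmem_cache_alloc_bulk(c, GFP_KERNEL, DEMO_OBJS, objs);

        /* A fastpath bulk free now adds the whole list length to
         * FREE_FASTPATH via stat_add(), not a single increment. */
        kmem_cache_free_bulk(c, n, objs);

        kmem_cache_destroy(c);
        return 0;
}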
From patchwork Wed Nov 29 09:53:27 2023
From: Vlastimil Babka
Date: Wed, 29 Nov 2023 10:53:27 +0100
Subject: [PATCH RFC v3 2/9] mm/slub: introduce __kmem_cache_free_bulk() without free hooks
Message-Id: <20231129-slub-percpu-caches-v3-2-6bcf536772bc@suse.cz>
References: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
In-Reply-To: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
 Matthew Wilcox, "Liam R. Howlett"
Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo <42.hyeyoo@gmail.com>,
 Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org,
 kasan-dev@googlegroups.com, Vlastimil Babka
X-Mailer: b4 0.12.4

Currently, when __kmem_cache_alloc_bulk() fails, it frees back the objects
that were allocated before the failure, using kmem_cache_free_bulk().
Because kmem_cache_free_bulk() calls the free hooks (KASAN etc.) and those
expect objects that were processed by the post-alloc hooks,
slab_post_alloc_hook() is called before kmem_cache_free_bulk(). This is
wasteful, although not a big concern in practice for the very rare error
path.
But in order to efficiently handle percpu array batch refill and free in
the following patch, we will also need a variant of kmem_cache_free_bulk()
that avoids the free hooks. So introduce it first and use it in the error
path too. As a consequence, __kmem_cache_alloc_bulk() no longer needs the
objcg parameter, so remove it.

Signed-off-by: Vlastimil Babka
---
 mm/slub.c | 33 ++++++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index f0cd55bb4e11..16748aeada8f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3919,6 +3919,27 @@ int build_detached_freelist(struct kmem_cache *s, size_t size,
         return same;
 }
 
+/*
+ * Internal bulk free of objects that were not initialised by the post alloc
+ * hooks and thus should not be processed by the free hooks
+ */
+static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
+{
+        if (!size)
+                return;
+
+        do {
+                struct detached_freelist df;
+
+                size = build_detached_freelist(s, size, p, &df);
+                if (!df.slab)
+                        continue;
+
+                do_slab_free(df.s, df.slab, df.freelist, df.tail, df.cnt,
+                             _RET_IP_);
+        } while (likely(size));
+}
+
 /* Note that interrupts must be enabled when calling this function. */
 void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
 {
@@ -3940,7 +3961,7 @@ EXPORT_SYMBOL(kmem_cache_free_bulk);
 
 #ifndef CONFIG_SLUB_TINY
 static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
-                        size_t size, void **p, struct obj_cgroup *objcg)
+                        size_t size, void **p)
 {
         struct kmem_cache_cpu *c;
         unsigned long irqflags;
@@ -4004,14 +4025,13 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
 
 error:
         slub_put_cpu_ptr(s->cpu_slab);
-        slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
-        kmem_cache_free_bulk(s, i, p);
+        __kmem_cache_free_bulk(s, i, p);
         return 0;
 
 }
 #else /* CONFIG_SLUB_TINY */
 static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
-                        size_t size, void **p, struct obj_cgroup *objcg)
+                        size_t size, void **p)
 {
         int i;
 
@@ -4034,8 +4054,7 @@ static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
         return i;
 
 error:
-        slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
-        kmem_cache_free_bulk(s, i, p);
+        __kmem_cache_free_bulk(s, i, p);
         return 0;
 }
 #endif /* CONFIG_SLUB_TINY */
@@ -4055,7 +4074,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
         if (unlikely(!s))
                 return 0;
 
-        i = __kmem_cache_alloc_bulk(s, flags, size, p, objcg);
+        i = __kmem_cache_alloc_bulk(s, flags, size, p);
 
         /*
          * memcg and kmem_cache debug support and memory initialization.
From patchwork Wed Nov 29 09:53:28 2023
From: Vlastimil Babka
Date: Wed, 29 Nov 2023 10:53:28 +0100
Subject: [PATCH RFC v3 3/9] mm/slub: handle bulk and single object freeing separately
Message-Id: <20231129-slub-percpu-caches-v3-3-6bcf536772bc@suse.cz>
References: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
In-Reply-To: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
 Matthew Wilcox, "Liam R. Howlett"
Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo <42.hyeyoo@gmail.com>,
 Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org,
 kasan-dev@googlegroups.com, Vlastimil Babka
X-Mailer: b4 0.12.4

Until now we have a single function slab_free() handling both single
object freeing and bulk freeing with the necessary hooks, the latter case
requiring slab_free_freelist_hook(). It is, however, better to distinguish
the two scenarios for the following reasons:

- code simpler to follow for the single object case

- better code generation - although inlining should eliminate the
  slab_free_freelist_hook() in case no debugging options are enabled, it
  seems it's not perfect. When e.g. KASAN is enabled, we're imposing
  additional unnecessary overhead for single object freeing.

- preparation to add percpu array caches in later patches

Therefore, simplify slab_free() for the single object case by dropping the
unnecessary parameters and calling only slab_free_hook() instead of
slab_free_freelist_hook(). Rename the bulk variant to slab_free_bulk() and
adjust the callers accordingly.

While at it, flip (and document) the slab_free_hook() return value so that
it returns true when freeing can proceed, which matches the logic of
slab_free_freelist_hook() and is not confusingly the opposite.
Additionally, we can simplify a bit by changing the tail parameter of
do_slab_free() when freeing a single object - instead of NULL we can set
it equal to head.

bloat-o-meter shows a small code reduction with a .config that has KASAN
etc. disabled:

add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-118 (-118)
Function                                     old     new   delta
kmem_cache_alloc_bulk                       1203    1196      -7
kmem_cache_free                              861     835     -26
__kmem_cache_free                            741     704     -37
kmem_cache_free_bulk                         911     863     -48

Signed-off-by: Vlastimil Babka
---
 mm/slub.c | 57 ++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 34 insertions(+), 23 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 16748aeada8f..7d23f10d42e6 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1770,9 +1770,12 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab,
 /*
  * Hooks for other subsystems that check memory allocations. In a typical
  * production configuration these hooks all should produce no code at all.
+ *
+ * Returns true if freeing of the object can proceed, false if its reuse
+ * was delayed by KASAN quarantine.
  */
-static __always_inline bool slab_free_hook(struct kmem_cache *s,
-                                                void *x, bool init)
+static __always_inline
+bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
 {
         kmemleak_free_recursive(x, s->flags);
         kmsan_slab_free(s, x);
@@ -1805,7 +1808,7 @@ static __always_inline bool slab_free_hook(struct kmem_cache *s,
                        s->size - s->inuse - rsize);
         }
         /* KASAN might put x into memory quarantine, delaying its reuse. */
-        return kasan_slab_free(s, x, init);
+        return !kasan_slab_free(s, x, init);
 }
 
 static inline bool slab_free_freelist_hook(struct kmem_cache *s,
@@ -1815,7 +1818,7 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
 
         void *object;
         void *next = *head;
-        void *old_tail = *tail ? *tail : *head;
+        void *old_tail = *tail;
 
         if (is_kfence_address(next)) {
                 slab_free_hook(s, next, false);
@@ -1831,7 +1834,7 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
                 next = get_freepointer(s, object);
 
                 /* If object's reuse doesn't have to be delayed */
-                if (!slab_free_hook(s, object, slab_want_init_on_free(s))) {
+                if (slab_free_hook(s, object, slab_want_init_on_free(s))) {
                         /* Move object to the new freelist */
                         set_freepointer(s, object, *head);
                         *head = object;
@@ -1846,9 +1849,6 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
                 }
         } while (object != old_tail);
 
-        if (*head == *tail)
-                *tail = NULL;
-
         return *head != NULL;
 }
 
@@ -3743,7 +3743,6 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
                                 struct slab *slab, void *head, void *tail,
                                 int cnt, unsigned long addr)
 {
-        void *tail_obj = tail ? : head;
         struct kmem_cache_cpu *c;
         unsigned long tid;
         void **freelist;
@@ -3762,14 +3761,14 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
         barrier();
 
         if (unlikely(slab != c->slab)) {
-                __slab_free(s, slab, head, tail_obj, cnt, addr);
+                __slab_free(s, slab, head, tail, cnt, addr);
                 return;
         }
 
         if (USE_LOCKLESS_FAST_PATH()) {
                 freelist = READ_ONCE(c->freelist);
 
-                set_freepointer(s, tail_obj, freelist);
+                set_freepointer(s, tail, freelist);
 
                 if (unlikely(!__update_cpu_freelist_fast(s, freelist, head, tid))) {
                         note_cmpxchg_failure("slab_free", s, tid);
@@ -3786,7 +3785,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
                 tid = c->tid;
                 freelist = c->freelist;
 
-                set_freepointer(s, tail_obj, freelist);
+                set_freepointer(s, tail, freelist);
                 c->freelist = head;
                 c->tid = next_tid(tid);
 
@@ -3799,15 +3798,27 @@ static void do_slab_free(struct kmem_cache *s,
                         struct slab *slab, void *head, void *tail,
                         int cnt, unsigned long addr)
 {
-        void *tail_obj = tail ? : head;
-
-        __slab_free(s, slab, head, tail_obj, cnt, addr);
+        __slab_free(s, slab, head, tail, cnt, addr);
 }
 #endif /* CONFIG_SLUB_TINY */
 
-static __fastpath_inline void slab_free(struct kmem_cache *s, struct slab *slab,
-                                        void *head, void *tail, void **p, int cnt,
-                                        unsigned long addr)
+static __fastpath_inline
+void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
+               unsigned long addr)
+{
+        bool init;
+
+        memcg_slab_free_hook(s, slab, &object, 1);
+
+        init = !is_kfence_address(object) && slab_want_init_on_free(s);
+
+        if (likely(slab_free_hook(s, object, init)))
+                do_slab_free(s, slab, object, object, 1, addr);
+}
+
+static __fastpath_inline
+void slab_free_bulk(struct kmem_cache *s, struct slab *slab, void *head,
+                    void *tail, void **p, int cnt, unsigned long addr)
 {
         memcg_slab_free_hook(s, slab, p, cnt);
         /*
@@ -3821,13 +3832,13 @@ static __fastpath_inline void slab_free(struct kmem_cache *s, struct slab *slab,
 #ifdef CONFIG_KASAN_GENERIC
 void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr)
 {
-        do_slab_free(cache, virt_to_slab(x), x, NULL, 1, addr);
+        do_slab_free(cache, virt_to_slab(x), x, x, 1, addr);
 }
 #endif
 
 void __kmem_cache_free(struct kmem_cache *s, void *x, unsigned long caller)
 {
-        slab_free(s, virt_to_slab(x), x, NULL, &x, 1, caller);
+        slab_free(s, virt_to_slab(x), x, caller);
 }
 
 void kmem_cache_free(struct kmem_cache *s, void *x)
@@ -3836,7 +3847,7 @@ void kmem_cache_free(struct kmem_cache *s, void *x)
         if (!s)
                 return;
         trace_kmem_cache_free(_RET_IP_, x, s);
-        slab_free(s, virt_to_slab(x), x, NULL, &x, 1, _RET_IP_);
+        slab_free(s, virt_to_slab(x), x, _RET_IP_);
 }
 EXPORT_SYMBOL(kmem_cache_free);
 
@@ -3953,8 +3964,8 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
                 if (!df.slab)
                         continue;
 
-                slab_free(df.s, df.slab, df.freelist, df.tail, &p[size], df.cnt,
-                          _RET_IP_);
+                slab_free_bulk(df.s, df.slab, df.freelist, df.tail, &p[size],
+                               df.cnt, _RET_IP_);
         } while (likely(size));
 }
 EXPORT_SYMBOL(kmem_cache_free_bulk);
From patchwork Wed Nov 29 09:53:29 2023
From: Vlastimil Babka
Date: Wed, 29 Nov 2023 10:53:29 +0100
Subject: [PATCH RFC v3 4/9] mm/slub: free KFENCE objects in slab_free_hook()
Message-Id: <20231129-slub-percpu-caches-v3-4-6bcf536772bc@suse.cz>
References: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
In-Reply-To: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
 Matthew Wilcox, "Liam R. Howlett"
Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo <42.hyeyoo@gmail.com>,
 Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org,
 kasan-dev@googlegroups.com, Vlastimil Babka
X-Mailer: b4 0.12.4

When freeing an object that was allocated from KFENCE, we do that in the
slowpath __slab_free(), relying on the fact that a KFENCE "slab" cannot be
the cpu slab, so the fastpath has to fall back to the slowpath.
This optimization doesn't help much though, because is_kfence_address() is
checked earlier anyway during the free hook processing or detached
freelist building. Thus we can simplify the code by making
slab_free_hook() free the KFENCE object immediately, similarly to the
KASAN quarantine.

In slab_free_hook() we can place kfence_free() above the init processing,
as callers have been making sure to set init to false for KFENCE objects.
This simplifies slab_free(). This also places it above kasan_slab_free(),
which is OK as that skips KFENCE objects anyway.

While at it, also determine the init value in slab_free_freelist_hook()
outside of the loop.

This change will also make introducing per cpu array caches easier.

Signed-off-by: Vlastimil Babka
Tested-by: Marco Elver
---
 mm/slub.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 7d23f10d42e6..59912a376c6d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1772,7 +1772,7 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab,
  * production configuration these hooks all should produce no code at all.
  *
  * Returns true if freeing of the object can proceed, false if its reuse
- * was delayed by KASAN quarantine.
+ * was delayed by KASAN quarantine, or it was returned to KFENCE.
  */
 static __always_inline
 bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
@@ -1790,6 +1790,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
         __kcsan_check_access(x, s->object_size,
                              KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
 
+        if (kfence_free(kasan_reset_tag(x)))
+                return false;
+
         /*
          * As memory initialization might be integrated into KASAN,
          * kasan_slab_free and initialization memset's must be
@@ -1819,22 +1822,25 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
         void *object;
         void *next = *head;
         void *old_tail = *tail;
+        bool init;
 
         if (is_kfence_address(next)) {
                 slab_free_hook(s, next, false);
-                return true;
+                return false;
         }
 
         /* Head and tail of the reconstructed freelist */
         *head = NULL;
         *tail = NULL;
 
+        init = slab_want_init_on_free(s);
+
         do {
                 object = next;
                 next = get_freepointer(s, object);
 
                 /* If object's reuse doesn't have to be delayed */
-                if (slab_free_hook(s, object, slab_want_init_on_free(s))) {
+                if (slab_free_hook(s, object, init)) {
                         /* Move object to the new freelist */
                         set_freepointer(s, object, *head);
                         *head = object;
@@ -3619,9 +3625,6 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
 
         stat(s, FREE_SLOWPATH);
 
-        if (kfence_free(head))
-                return;
-
         if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
                 free_to_partial_list(s, slab, head, tail, cnt, addr);
                 return;
@@ -3806,13 +3809,9 @@ static __fastpath_inline
 void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
                unsigned long addr)
 {
-        bool init;
-
         memcg_slab_free_hook(s, slab, &object, 1);
 
-        init = !is_kfence_address(object) && slab_want_init_on_free(s);
-
-        if (likely(slab_free_hook(s, object, init)))
+        if (likely(slab_free_hook(s, object, slab_want_init_on_free(s))))
                 do_slab_free(s, slab, object, object, 1, addr);
 }
From patchwork Wed Nov 29 09:53:30 2023
From: Vlastimil Babka
Date: Wed, 29 Nov 2023 10:53:30 +0100
Subject: [PATCH RFC v3 5/9] mm/slub: add opt-in percpu array cache of objects
Message-Id: <20231129-slub-percpu-caches-v3-5-6bcf536772bc@suse.cz>
References: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
In-Reply-To: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
 Matthew Wilcox, "Liam R. Howlett"
Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo <42.hyeyoo@gmail.com>,
 Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org,
 kasan-dev@googlegroups.com, Vlastimil Babka
X-Mailer: b4 0.12.4

kmem_cache_setup_percpu_array() will allocate a per-cpu array for caching
alloc/free objects of the given size for the cache. The cache has to be
created with the SLAB_NO_MERGE flag.

When empty, half of the array is filled by an internal bulk alloc
operation. When full, half of the array is flushed by an internal bulk
free operation.

The array does not distinguish NUMA locality of the cached objects. If an
allocation is requested with kmem_cache_alloc_node() with a numa node not
equal to NUMA_NO_NODE, the array is bypassed.

The bulk operations exposed to slab users also try to utilize the array
when possible, but leave the array empty or full and use the bulk
alloc/free only to finish the operation itself. If kmemcg is enabled and
active, bulk freeing skips the array completely as it would be less
efficient to use it.

The locking scheme is copied from the page allocator's pcplists, based on
embedded spin locks. Interrupts are not disabled, only preemption (cpu
migration on RT). Trylock is attempted to avoid deadlock due to an
interrupt; trylock failure means the array is bypassed.

Sysfs stat counters alloc_cpu_cache and free_cpu_cache count objects
allocated or freed using the percpu array; counters cpu_cache_refill and
cpu_cache_flush count objects refilled or flushed from the array.

kmem_cache_prefill_percpu_array() can be called to pre-fill the array on
the current cpu to at least the given number of objects. However, this is
only opportunistic as there's no cpu pinning between the prefill and
usage, and trylocks may fail when the usage is in an irq handler.
Therefore allocations cannot rely on the array for success even after the
prefill. But misses should be rare enough that e.g. GFP_ATOMIC allocations
should be acceptable after the refill.

When slub_debug is enabled for a cache with a percpu array, the objects in
the array are considered as allocated from the slub_debug perspective, and
the alloc/free debugging hooks occur when moving the objects between the
array and slab pages. This means that e.g. a use-after-free that occurs
for an object cached in the array is undetected. Collected alloc/free
stacktraces might also be less useful. This limitation could be changed in
the future.

On the other hand, KASAN, kmemcg and other hooks are executed on actual
allocations and frees by kmem_cache users even if those use the array, so
their debugging or accounting accuracy should be unaffected.

Signed-off-by: Vlastimil Babka
---
 include/linux/slab.h     |   4 +
 include/linux/slub_def.h |  12 ++
 mm/Kconfig               |   1 +
 mm/slub.c                | 457 ++++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 468 insertions(+), 6 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index d6d6ffeeb9a2..fe0c0981be59 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -197,6 +197,8 @@ struct kmem_cache *kmem_cache_create_usercopy(const char *name,
 void kmem_cache_destroy(struct kmem_cache *s);
 int kmem_cache_shrink(struct kmem_cache *s);
 
+int kmem_cache_setup_percpu_array(struct kmem_cache *s, unsigned int count);
+
 /*
  * Please use this macro to create slab caches. Simply specify the
  * name of the structure and maybe some flags that are listed above.
@@ -512,6 +514,8 @@ void kmem_cache_free(struct kmem_cache *s, void *objp);
 void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p);
 int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, void **p);
 
+int kmem_cache_prefill_percpu_array(struct kmem_cache *s, unsigned int count, gfp_t gfp);
+
 static __always_inline void kfree_bulk(size_t size, void **p)
 {
         kmem_cache_free_bulk(NULL, size, p);
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index deb90cf4bffb..2083aa849766 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -13,8 +13,10 @@
 #include
 
 enum stat_item {
+        ALLOC_PCA,              /* Allocation from percpu array cache */
         ALLOC_FASTPATH,         /* Allocation from cpu slab */
         ALLOC_SLOWPATH,         /* Allocation by getting a new cpu slab */
+        FREE_PCA,               /* Free to percpu array cache */
         FREE_FASTPATH,          /* Free to cpu slab */
         FREE_SLOWPATH,          /* Freeing not to cpu slab */
         FREE_FROZEN,            /* Freeing to frozen slab */
@@ -39,6 +41,8 @@ enum stat_item {
         CPU_PARTIAL_FREE,       /* Refill cpu partial on free */
         CPU_PARTIAL_NODE,       /* Refill cpu partial from node partial */
         CPU_PARTIAL_DRAIN,      /* Drain cpu partial to node partial */
+        PCA_REFILL,             /* Refilling empty percpu array cache */
+        PCA_FLUSH,              /* Flushing full percpu array cache */
         NR_SLUB_STAT_ITEMS
 };
 
@@ -66,6 +70,13 @@ struct kmem_cache_cpu {
 };
 #endif /* CONFIG_SLUB_TINY */
 
+struct slub_percpu_array {
+        spinlock_t lock;
+        unsigned int count;
+        unsigned int used;
+        void * objects[];
+};
+
 #ifdef CONFIG_SLUB_CPU_PARTIAL
 #define slub_percpu_partial(c)          ((c)->partial)
 
@@ -99,6 +110,7 @@ struct kmem_cache {
 #ifndef CONFIG_SLUB_TINY
         struct kmem_cache_cpu __percpu *cpu_slab;
 #endif
+        struct slub_percpu_array __percpu *cpu_array;
         /* Used for retrieving partial slabs, etc.
*/ slab_flags_t flags; unsigned long min_partial; diff --git a/mm/Kconfig b/mm/Kconfig index 89971a894b60..aa53c51bb4a6 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -237,6 +237,7 @@ choice config SLAB_DEPRECATED bool "SLAB (DEPRECATED)" depends on !PREEMPT_RT + depends on BROKEN help Deprecated and scheduled for removal in a few cycles. Replaced by SLUB. diff --git a/mm/slub.c b/mm/slub.c index 59912a376c6d..f08bd71c244f 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -188,6 +188,79 @@ do { \ #define USE_LOCKLESS_FAST_PATH() (false) #endif +/* copy/pasted from mm/page_alloc.c */ + +#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT) +/* + * On SMP, spin_trylock is sufficient protection. + * On PREEMPT_RT, spin_trylock is equivalent on both SMP and UP. + */ +#define pcp_trylock_prepare(flags) do { } while (0) +#define pcp_trylock_finish(flag) do { } while (0) +#else + +/* UP spin_trylock always succeeds so disable IRQs to prevent re-entrancy. */ +#define pcp_trylock_prepare(flags) local_irq_save(flags) +#define pcp_trylock_finish(flags) local_irq_restore(flags) +#endif + +/* + * Locking a pcp requires a PCP lookup followed by a spinlock. To avoid + * a migration causing the wrong PCP to be locked and remote memory being + * potentially allocated, pin the task to the CPU for the lookup+lock. + * preempt_disable is used on !RT because it is faster than migrate_disable. + * migrate_disable is used on RT because otherwise RT spinlock usage is + * interfered with and a high priority task cannot preempt the allocator. + */ +#ifndef CONFIG_PREEMPT_RT +#define pcpu_task_pin() preempt_disable() +#define pcpu_task_unpin() preempt_enable() +#else +#define pcpu_task_pin() migrate_disable() +#define pcpu_task_unpin() migrate_enable() +#endif + +/* + * Generic helper to lookup and a per-cpu variable with an embedded spinlock. + * Return value should be used with equivalent unlock helper. + */ +#define pcpu_spin_lock(type, member, ptr) \ +({ \ + type *_ret; \ + pcpu_task_pin(); \ + _ret = this_cpu_ptr(ptr); \ + spin_lock(&_ret->member); \ + _ret; \ +}) + +#define pcpu_spin_trylock(type, member, ptr) \ +({ \ + type *_ret; \ + pcpu_task_pin(); \ + _ret = this_cpu_ptr(ptr); \ + if (!spin_trylock(&_ret->member)) { \ + pcpu_task_unpin(); \ + _ret = NULL; \ + } \ + _ret; \ +}) + +#define pcpu_spin_unlock(member, ptr) \ +({ \ + spin_unlock(&ptr->member); \ + pcpu_task_unpin(); \ +}) + +/* struct slub_percpu_array specific helpers. 
*/ +#define pca_spin_lock(ptr) \ + pcpu_spin_lock(struct slub_percpu_array, lock, ptr) + +#define pca_spin_trylock(ptr) \ + pcpu_spin_trylock(struct slub_percpu_array, lock, ptr) + +#define pca_spin_unlock(ptr) \ + pcpu_spin_unlock(lock, ptr) + #ifndef CONFIG_SLUB_TINY #define __fastpath_inline __always_inline #else @@ -3454,6 +3527,78 @@ static __always_inline void maybe_wipe_obj_freeptr(struct kmem_cache *s, 0, sizeof(void *)); } +static bool refill_pca(struct kmem_cache *s, unsigned int count, gfp_t gfp); + +static __fastpath_inline +void *alloc_from_pca(struct kmem_cache *s, gfp_t gfp) +{ + unsigned long __maybe_unused UP_flags; + struct slub_percpu_array *pca; + void *object; + +retry: + pcp_trylock_prepare(UP_flags); + pca = pca_spin_trylock(s->cpu_array); + + if (unlikely(!pca)) { + pcp_trylock_finish(UP_flags); + return NULL; + } + + if (unlikely(pca->used == 0)) { + unsigned int batch = pca->count / 2; + + pca_spin_unlock(pca); + pcp_trylock_finish(UP_flags); + + if (!gfpflags_allow_blocking(gfp) || in_irq()) + return NULL; + + if (refill_pca(s, batch, gfp)) + goto retry; + + return NULL; + } + + object = pca->objects[--pca->used]; + + pca_spin_unlock(pca); + pcp_trylock_finish(UP_flags); + + stat(s, ALLOC_PCA); + + return object; +} + +static __fastpath_inline +int alloc_from_pca_bulk(struct kmem_cache *s, size_t size, void **p) +{ + unsigned long __maybe_unused UP_flags; + struct slub_percpu_array *pca; + + pcp_trylock_prepare(UP_flags); + pca = pca_spin_trylock(s->cpu_array); + + if (unlikely(!pca)) { + size = 0; + goto failed; + } + + if (pca->used < size) + size = pca->used; + + for (int i = size; i > 0;) { + p[--i] = pca->objects[--pca->used]; + } + + pca_spin_unlock(pca); + stat_add(s, ALLOC_PCA, size); + +failed: + pcp_trylock_finish(UP_flags); + return size; +} + /* * Inlined fastpath so that allocation functions (kmalloc, kmem_cache_alloc) * have the fastpath folded into their functions. 
So no function call @@ -3479,7 +3624,11 @@ static __fastpath_inline void *slab_alloc_node(struct kmem_cache *s, struct list if (unlikely(object)) goto out; - object = __slab_alloc_node(s, gfpflags, node, addr, orig_size); + if (s->cpu_array && (node == NUMA_NO_NODE)) + object = alloc_from_pca(s, gfpflags); + + if (!object) + object = __slab_alloc_node(s, gfpflags, node, addr, orig_size); maybe_wipe_obj_freeptr(s, object); init = slab_want_init_on_alloc(gfpflags, s); @@ -3726,6 +3875,81 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab, discard_slab(s, slab); } +static bool flush_pca(struct kmem_cache *s, unsigned int count); + +static __fastpath_inline +bool free_to_pca(struct kmem_cache *s, void *object) +{ + unsigned long __maybe_unused UP_flags; + struct slub_percpu_array *pca; + +retry: + pcp_trylock_prepare(UP_flags); + pca = pca_spin_trylock(s->cpu_array); + + if (!pca) { + pcp_trylock_finish(UP_flags); + return false; + } + + if (pca->used == pca->count) { + unsigned int batch = pca->count / 2; + + pca_spin_unlock(pca); + pcp_trylock_finish(UP_flags); + + if (in_irq()) + return false; + + if (!flush_pca(s, batch)) + return false; + + goto retry; + } + + pca->objects[pca->used++] = object; + + pca_spin_unlock(pca); + pcp_trylock_finish(UP_flags); + + stat(s, FREE_PCA); + + return true; +} + +static __fastpath_inline +size_t free_to_pca_bulk(struct kmem_cache *s, size_t size, void **p) +{ + unsigned long __maybe_unused UP_flags; + struct slub_percpu_array *pca; + bool init; + + pcp_trylock_prepare(UP_flags); + pca = pca_spin_trylock(s->cpu_array); + + if (unlikely(!pca)) { + size = 0; + goto failed; + } + + if (pca->count - pca->used < size) + size = pca->count - pca->used; + + init = slab_want_init_on_free(s); + + for (size_t i = 0; i < size; i++) { + if (likely(slab_free_hook(s, p[i], init))) + pca->objects[pca->used++] = p[i]; + } + + pca_spin_unlock(pca); + stat_add(s, FREE_PCA, size); + +failed: + pcp_trylock_finish(UP_flags); + return size; +} + #ifndef CONFIG_SLUB_TINY /* * Fastpath with forced inlining to produce a kfree and kmem_cache_free that @@ -3811,7 +4035,12 @@ void slab_free(struct kmem_cache *s, struct slab *slab, void *object, { memcg_slab_free_hook(s, slab, &object, 1); - if (likely(slab_free_hook(s, object, slab_want_init_on_free(s)))) + if (unlikely(!slab_free_hook(s, object, slab_want_init_on_free(s)))) + return; + + if (s->cpu_array) + free_to_pca(s, object); + else do_slab_free(s, slab, object, object, 1, addr); } @@ -3956,6 +4185,26 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p) if (!size) return; + /* + * In case the objects might need memcg_slab_free_hook(), skip the array + * because the hook is not effective with single objects and benefits + * from groups of objects from a single slab that the detached freelist + * builds. But once we build the detached freelist, it's wasteful to + * throw it away and put the objects into the array. 
+ * + * XXX: This test could be cache-specific if it was not possible to use + * __GFP_ACCOUNT with caches that are not SLAB_ACCOUNT + */ + if (s && s->cpu_array && !memcg_kmem_online()) { + size_t pca_freed = free_to_pca_bulk(s, size, p); + + if (pca_freed == size) + return; + + p += pca_freed; + size -= pca_freed; + } + do { struct detached_freelist df; @@ -4073,7 +4322,8 @@ static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, void **p) { - int i; + int from_pca = 0; + int allocated = 0; struct obj_cgroup *objcg = NULL; if (!size) @@ -4084,19 +4334,147 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, if (unlikely(!s)) return 0; - i = __kmem_cache_alloc_bulk(s, flags, size, p); + if (s->cpu_array) + from_pca = alloc_from_pca_bulk(s, size, p); + + if (from_pca < size) { + allocated = __kmem_cache_alloc_bulk(s, flags, size-from_pca, + p+from_pca); + if (allocated == 0 && from_pca > 0) { + __kmem_cache_free_bulk(s, from_pca, p); + } + } + + allocated += from_pca; /* * memcg and kmem_cache debug support and memory initialization. * Done outside of the IRQ disabled fastpath loop. */ - if (i != 0) + if (allocated != 0) slab_post_alloc_hook(s, objcg, flags, size, p, slab_want_init_on_alloc(flags, s), s->object_size); - return i; + return allocated; } EXPORT_SYMBOL(kmem_cache_alloc_bulk); +static bool refill_pca(struct kmem_cache *s, unsigned int count, gfp_t gfp) +{ + void *objects[32]; + unsigned int batch, allocated; + unsigned long __maybe_unused UP_flags; + struct slub_percpu_array *pca; + +bulk_alloc: + batch = min(count, 32U); + + allocated = __kmem_cache_alloc_bulk(s, gfp, batch, &objects[0]); + if (!allocated) + return false; + + pcp_trylock_prepare(UP_flags); + pca = pca_spin_trylock(s->cpu_array); + if (!pca) { + pcp_trylock_finish(UP_flags); + return false; + } + + batch = min(allocated, pca->count - pca->used); + + for (unsigned int i = 0; i < batch; i++) { + pca->objects[pca->used++] = objects[i]; + } + + pca_spin_unlock(pca); + pcp_trylock_finish(UP_flags); + + stat_add(s, PCA_REFILL, batch); + + /* + * We could have migrated to a different cpu or somebody else freed to the + * pca while we were bulk allocating, and now we have too many objects + */ + if (batch < allocated) { + __kmem_cache_free_bulk(s, allocated - batch, &objects[batch]); + } else { + count -= batch; + if (count > 0) + goto bulk_alloc; + } + + return true; +} + +static bool flush_pca(struct kmem_cache *s, unsigned int count) +{ + void *objects[32]; + unsigned int batch, remaining; + unsigned long __maybe_unused UP_flags; + struct slub_percpu_array *pca; + +next_batch: + batch = min(count, 32); + + pcp_trylock_prepare(UP_flags); + pca = pca_spin_trylock(s->cpu_array); + if (!pca) { + pcp_trylock_finish(UP_flags); + return false; + } + + batch = min(batch, pca->used); + + for (unsigned int i = 0; i < batch; i++) { + objects[i] = pca->objects[--pca->used]; + } + + remaining = pca->used; + + pca_spin_unlock(pca); + pcp_trylock_finish(UP_flags); + + __kmem_cache_free_bulk(s, batch, &objects[0]); + + stat_add(s, PCA_FLUSH, batch); + + if (batch < count && remaining > 0) { + count -= batch; + goto next_batch; + } + + return true; +} + +/* Do not call from irq handler nor with irqs disabled */ +int kmem_cache_prefill_percpu_array(struct kmem_cache *s, unsigned int count, + gfp_t gfp) +{ + struct slub_percpu_array *pca; + unsigned int used; + + lockdep_assert_no_hardirq(); + + if (!s->cpu_array) + return 
-EINVAL; + + /* racy but we don't care */ + pca = raw_cpu_ptr(s->cpu_array); + + used = READ_ONCE(pca->used); + + if (used >= count) + return 0; + + if (pca->count < count) + return -EINVAL; + + count -= used; + + if (!refill_pca(s, count, gfp)) + return -ENOMEM; + + return 0; +} /* * Object placement in a slab is made very easy because we always start at @@ -5167,6 +5545,65 @@ int __kmem_cache_create(struct kmem_cache *s, slab_flags_t flags) return 0; } +/** + * kmem_cache_setup_percpu_array - Create a per-cpu array cache for the cache + * @s: The cache to add per-cpu array. Must be created with SLAB_NO_MERGE flag. + * @count: Size of the per-cpu array. + * + * After this call, allocations from the cache go through a percpu array. When + * it becomes empty, half is refilled with a bulk allocation. When it becomes + * full, half is flushed with a bulk free operation. + * + * Using the array cache is not guaranteed, i.e. it can be bypassed if its lock + * cannot be obtained. The array cache also does not distinguish NUMA nodes, so + * allocations via kmem_cache_alloc_node() with a node specified other than + * NUMA_NO_NODE will bypass the cache. + * + * Bulk allocation and free operations also try to use the array. + * + * kmem_cache_prefill_percpu_array() can be used to pre-fill the array cache + * before e.g. entering a restricted context. It is however not guaranteed that + * the caller will be able to subsequently consume the prefilled cache. Such + * failures should be however sufficiently rare so after the prefill, + * allocations using GFP_ATOMIC | __GFP_NOFAIL are acceptable for objects up to + * the prefilled amount. + * + * Limitations: when slub_debug is enabled for the cache, all relevant actions + * (i.e. poisoning, obtaining stacktraces) and checks happen when objects move + * between the array cache and slab pages, which may result in e.g. not + * detecting a use-after-free while the object is in the array cache, and the + * stacktraces may be less useful. + * + * Return: 0 if OK, -EINVAL on caches without SLAB_NO_MERGE or with the array + * already created, -ENOMEM when the per-cpu array creation fails. 
+ */ +int kmem_cache_setup_percpu_array(struct kmem_cache *s, unsigned int count) +{ + int cpu; + + if (WARN_ON_ONCE(!(s->flags & SLAB_NO_MERGE))) + return -EINVAL; + + if (s->cpu_array) + return -EINVAL; + + s->cpu_array = __alloc_percpu(struct_size(s->cpu_array, objects, count), + sizeof(void *)); + + if (!s->cpu_array) + return -ENOMEM; + + for_each_possible_cpu(cpu) { + struct slub_percpu_array *pca = per_cpu_ptr(s->cpu_array, cpu); + + spin_lock_init(&pca->lock); + pca->count = count; + pca->used = 0; + } + + return 0; +} + #ifdef SLAB_SUPPORTS_SYSFS static int count_inuse(struct slab *slab) { @@ -5944,8 +6381,10 @@ static ssize_t text##_store(struct kmem_cache *s, \ } \ SLAB_ATTR(text); \ +STAT_ATTR(ALLOC_PCA, alloc_cpu_cache); STAT_ATTR(ALLOC_FASTPATH, alloc_fastpath); STAT_ATTR(ALLOC_SLOWPATH, alloc_slowpath); +STAT_ATTR(FREE_PCA, free_cpu_cache); STAT_ATTR(FREE_FASTPATH, free_fastpath); STAT_ATTR(FREE_SLOWPATH, free_slowpath); STAT_ATTR(FREE_FROZEN, free_frozen); @@ -5970,6 +6409,8 @@ STAT_ATTR(CPU_PARTIAL_ALLOC, cpu_partial_alloc); STAT_ATTR(CPU_PARTIAL_FREE, cpu_partial_free); STAT_ATTR(CPU_PARTIAL_NODE, cpu_partial_node); STAT_ATTR(CPU_PARTIAL_DRAIN, cpu_partial_drain); +STAT_ATTR(PCA_REFILL, cpu_cache_refill); +STAT_ATTR(PCA_FLUSH, cpu_cache_flush); #endif /* CONFIG_SLUB_STATS */ #ifdef CONFIG_KFENCE @@ -6031,8 +6472,10 @@ static struct attribute *slab_attrs[] = { &remote_node_defrag_ratio_attr.attr, #endif #ifdef CONFIG_SLUB_STATS + &alloc_cpu_cache_attr.attr, &alloc_fastpath_attr.attr, &alloc_slowpath_attr.attr, + &free_cpu_cache_attr.attr, &free_fastpath_attr.attr, &free_slowpath_attr.attr, &free_frozen_attr.attr, @@ -6057,6 +6500,8 @@ static struct attribute *slab_attrs[] = { &cpu_partial_free_attr.attr, &cpu_partial_node_attr.attr, &cpu_partial_drain_attr.attr, + &cpu_cache_refill_attr.attr, + &cpu_cache_flush_attr.attr, #endif #ifdef CONFIG_FAILSLAB &failslab_attr.attr,
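To make the intended usage of the new interface concrete, a minimal sketch follows. It is not part of the patch; the cache name, object type and counts are made up, but the flow matches the kmem_cache_setup_percpu_array() documentation above: create an unmergeable cache, attach the per-cpu array, prefill it before entering a context that cannot block, then allocate with GFP_ATOMIC.

/* Illustrative only; "foo" and the sizes are made-up examples. */
struct foo {
	unsigned long payload;
};

static struct kmem_cache *foo_cache;

static int __init foo_init(void)
{
	int ret;

	/* The array cache requires an unmergeable cache (SLAB_NO_MERGE). */
	foo_cache = kmem_cache_create("foo", sizeof(struct foo), 0,
				      SLAB_NO_MERGE, NULL);
	if (!foo_cache)
		return -ENOMEM;

	/* 32-entry per-cpu array; refills and flushes operate on halves. */
	ret = kmem_cache_setup_percpu_array(foo_cache, 32);
	if (ret)
		pr_warn("foo: percpu array not available (%d)\n", ret);

	return 0;
}

static int foo_prepare(void)
{
	/* May block; call before taking locks or disabling preemption. */
	return kmem_cache_prefill_percpu_array(foo_cache, 8, GFP_KERNEL);
}

static void *foo_alloc_locked(void)
{
	/* Typically served from the prefilled array. */
	return kmem_cache_alloc(foo_cache, GFP_ATOMIC);
}

Note that the array is best-effort: a failed trylock or an empty array simply falls back to the regular slab paths, so callers keep their normal error handling.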
From patchwork Wed Nov 29 09:53:31 2023
From: Vlastimil Babka
Date: Wed, 29 Nov 2023 10:53:31 +0100
Subject: [PATCH RFC v3 6/9] tools: Add SLUB percpu array functions for testing
Message-Id: <20231129-slub-percpu-caches-v3-6-6bcf536772bc@suse.cz>
References: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
In-Reply-To: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
Howlett" Cc: Andrew Morton , Roman Gushchin , Hyeonggon Yoo <42.hyeyoo@gmail.com>, Alexander Potapenko , Marco Elver , Dmitry Vyukov , linux-mm@kvack.org, linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org, kasan-dev@googlegroups.com, Vlastimil Babka X-Mailer: b4 0.12.4 X-Rspam-User: X-Stat-Signature: tnj1psxo7dz6e9uk7fntr1c6urgexfyh X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: D5FE9140009 X-HE-Tag: 1701251619-151661 X-HE-Meta: U2FsdGVkX19zQf0r6/iyo7wgFubhFJcxpzE+cnSnYSpeY0+hFZKJyOiUadaSXimbJd+MU2CgTS2vrbboIV9HE/VO35TZDmxpNNzGdQ4Z4aL0unfOcNgZJtroA3GJ1S3chRv4HZUugP2WQcTs8KpXMa1WvTwI/+hPvVmwByqorwR8FHx48vE7q00Nv6+5tNyFji0v0P368He1nwYfFSva3QyVQFlo+PxEPNQzR8/OXuBbEfxgBYi+QpcHU2KHRfdSiU+Oaz+S9j1vtauRAt5G7fBHvleDxU1wLDvfaJU8hJRW/HQxyTkcxGYyujN49kWgfGfzkPgTF+XfMk4RAuSy2T6isWs35rLfvBK8huzwnnygGTGFd8m7gl7lMUdct75t/UO3Fd788QBCnaDJA/k+Q9rdyY4ClnWBtsffRZBKCejHkZ766sAZIhqPDMaFS8ye5YAglCiowFlMT43FYhbLRouJbEKpWbPnu7x6XTdF4UTwthfgbLWP5+G1vYOiPKGlZ+ccHq1mBGy5NvBm34fo/RmG5v2kJk8BbMLwXkOSL6XS8JH8g0e25bWjZ2aw1vpcoe0kGmPz4VB8aoyfRf2iAgkVJcE/2scOxF1FzjWGgC/DGQN6gBUYCRfMuxYEGAuqAjbCVwLYNMwHIuneskYiRTW8YqcMFav7mL9QLcjt4RbB5JpV6weyfTbTKpKQT9Qwk590B7yfKr65le1O6E09aKMSg+VlG0m7VtGBdWceA2QtdJPN3Qdc6FeWL2Y2HLoCi3vnPLmiO1mf2/UOcTAIC0FQWANUKkgklQkYm8wK6ybSRLsXF1pk6kKZqbjdJtejXStbP0+Vi/fU0em+VniHf1bOPcZf3hz5gQkF+NlByCYxcAs3tP8lLDPUmamCPMnTCCghqV079kEYhYjKDicnLakiIo64ll2klHPzZBzhjIY06d0Rjlk0/b/cNSi82oh+tvo6eXPjJIlrPJ4H5FF EcwJajRW BNuydWI5yA8ChtlIk83SSSGGTlOQwA790321G6xqyfoeXMe0l2SRXWLRnoxCSS/sf8pquc97swHNlwV6npCsTyY1HTplpUeiU347cELxj0VNIFQunzueKSXLF6vhKKrwaz/xUv5cq8n5xyHnxO9LrUOMl2GmV/bHEO5yVtqJ/DTkTwW2g61eALW8AHTjoOZxeieURioBWdPdze83jFt4ht5qKXg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: "Liam R. Howlett" Support new percpu array functions to the test code so they can be used in the maple tree testing. Signed-off-by: Liam R. 
Signed-off-by: Vlastimil Babka --- tools/include/linux/slab.h | 4 ++++ tools/testing/radix-tree/linux.c | 14 ++++++++++++++ tools/testing/radix-tree/linux/kernel.h | 1 + 3 files changed, 19 insertions(+) diff --git a/tools/include/linux/slab.h b/tools/include/linux/slab.h index 311759ea25e9..1043f9c5ef4e 100644 --- a/tools/include/linux/slab.h +++ b/tools/include/linux/slab.h @@ -7,6 +7,7 @@ #define SLAB_PANIC 2 #define SLAB_RECLAIM_ACCOUNT 0x00020000UL /* Objects are reclaimable */ +#define SLAB_NO_MERGE 0x01000000UL /* Prevent merging with compatible kmem caches */ #define kzalloc_node(size, flags, node) kmalloc(size, flags) @@ -45,4 +46,7 @@ void kmem_cache_free_bulk(struct kmem_cache *cachep, size_t size, void **list); int kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t size, void **list); +int kmem_cache_setup_percpu_array(struct kmem_cache *s, unsigned int count); +int kmem_cache_prefill_percpu_array(struct kmem_cache *s, unsigned int count, + gfp_t gfp); #endif /* _TOOLS_SLAB_H */ diff --git a/tools/testing/radix-tree/linux.c b/tools/testing/radix-tree/linux.c index 61fe2601cb3a..3c9372afe9bc 100644 --- a/tools/testing/radix-tree/linux.c +++ b/tools/testing/radix-tree/linux.c @@ -187,6 +187,20 @@ int kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t size, return size; } +int kmem_cache_setup_percpu_array(struct kmem_cache *s, unsigned int count) +{ + return 0; +} + +int kmem_cache_prefill_percpu_array(struct kmem_cache *s, unsigned int count, + gfp_t gfp) +{ + if (count > s->non_kernel) + return s->non_kernel; + + return count; +} + struct kmem_cache * kmem_cache_create(const char *name, unsigned int size, unsigned int align, unsigned int flags, void (*ctor)(void *)) diff --git a/tools/testing/radix-tree/linux/kernel.h b/tools/testing/radix-tree/linux/kernel.h index c5c9d05f29da..fc75018974de 100644 --- a/tools/testing/radix-tree/linux/kernel.h +++ b/tools/testing/radix-tree/linux/kernel.h @@ -15,6 +15,7 @@ #define printk printf #define pr_err printk +#define pr_warn printk #define pr_info printk #define pr_debug printk #define pr_cont printk
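As a rough illustration of what these stubs enable (hypothetical test code, not part of the patch), a userspace test can now exercise the same calls the kernel-side maple tree code will make, with the prefill stub bounded by the cache's non_kernel budget:

/* Hypothetical userspace test snippet using the stubs above. */
static void test_percpu_array_stubs(void)
{
	struct kmem_cache *cache;

	cache = kmem_cache_create("test_node", 256, 0, SLAB_NO_MERGE, NULL);

	/* The setup stub always reports success. */
	assert(kmem_cache_setup_percpu_array(cache, 32) == 0);

	/* The prefill stub returns at most the cache's non_kernel budget. */
	assert(kmem_cache_prefill_percpu_array(cache, 8, GFP_KERNEL) <= 8);
}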
From patchwork Wed Nov 29 09:53:32 2023
From: Vlastimil Babka
Date: Wed, 29 Nov 2023 10:53:32 +0100
Subject: [PATCH RFC v3 7/9] maple_tree: use slub percpu array
Message-Id: <20231129-slub-percpu-caches-v3-7-6bcf536772bc@suse.cz>
References: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
In-Reply-To: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
Howlett" Cc: Andrew Morton , Roman Gushchin , Hyeonggon Yoo <42.hyeyoo@gmail.com>, Alexander Potapenko , Marco Elver , Dmitry Vyukov , linux-mm@kvack.org, linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org, kasan-dev@googlegroups.com X-Mailer: b4 0.12.4 X-Spamd-Bar: ++++++++++++ X-Rspamd-Queue-Id: E813516001B X-Rspam-User: X-Rspamd-Server: rspam04 X-Stat-Signature: xa9fgc9x3nrdtf59s9fahoppmopy8sk8 X-HE-Tag: 1701251619-478539 X-HE-Meta: U2FsdGVkX19r63jw5BkGIqiI8LxsdCQLgDT5hxhe+RAh7Ksj7kR/5NNNL4H3dvgBdaTrdwWwXttknFDh7EoLokXf0W/amOYJO9Is5KMjmCu+Bk6qIGSTLX4KDVY5/ehOwgzHU0TszIwbK5QhjxuEwkn1LAbc8ZrFiUU3q3tpudlw8e4u8CDGk9z437dQbHty6CBHzIymynBpwHRFNR2VB6ibnU+pUSVcixIsfa2N0twvKrdwVCCAF8udJ3dffKY8KgC7ZU+/XVIIp4n97xVEWf2Na6g9PwWJCIEwwyFmaiyY6tgzX3a0vaDzHZC2l6noe+YuoKG1xUOIqSK0gvmcbbNCrAcKYNslOKFyvbZ0zAT0+po772lSAH1kwylWgiuYLBlo96hyqfEWeMYtROBXdbby9Ji/tau+du+zb+YmvDRTbRKE7V7ro5t4FY79xPVfzMKn+RsT69IUNN9/F2fKYUyP15OxMZf2I2AgMeNieO4mk2J+y4HZj+4F8MG5elYB6Z3h1APDhAznrLufc3I+NvjXsM/Ji0PFTx+C/OAFFuXGIL1Ayr+JIjie2MYpWhMxmr9dclIrINFxubWPT2NW9ERl1CXJfY39DzzcKTVWv82eY+QhUa8e/uPQBp7zPHcBFzFKU0eq/P/x8VdZFzS4eGkmkjt23qozQZ+CWUwGS28WTAOQSm11/s7HjoI4BCt52T7ljWjMkH/0l+m22kpm2qgTrlGpsTF1TKwkr0/MU1Qr0U979Avj7wh5Mns82SdHPQaW22AXA8IfQnxy7rMH5kIQ1BykhvLH3H41yXNxGCtEvuY6bUU3l/V2G/9Z/x/NxsOzesDvt2+eLq+GjxxlVlvMuoh27hKRR2sPnKblW8Tw9VJoSLBvMaqE1rqu52k7+6gOACP2WhqvFQgD/MVxSoOrdR3xioDlShrqyg0zTTdtu+zeKKmQbzJGJEyjuWAQ3ynDvRcxmdU3bYOyZpt 63fCKRO/ 10XwK8bZplRFCBDaWerVmNTzOcVBhm8c4JqeGWolQBL3u0EOEti1Py72mOJxgiMIogDOc03vKKgeDnQQLRe2zuHHpAn0jY0SZHrXGTXDiG+NVul8IeJ7qpgoUtDJS0Ne9IrMoR1l5DJ60jkRRlhJC6i2bco+mNFZXz6m6MOujKKeIHKWycuDb58W4Ld+TiLbqrY5xeFEB+307Xa0CzZ5P+bhX20XejyeIzFMdtc0njLn1Ft8= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Just make sure the maple_node_cache has a percpu array of size 32. Will break with CONFIG_SLAB. 
--- lib/maple_tree.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/lib/maple_tree.c b/lib/maple_tree.c index bb24d84a4922..d9e7088fd9a7 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -6213,9 +6213,16 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp) void __init maple_tree_init(void) { + int ret; + maple_node_cache = kmem_cache_create("maple_node", sizeof(struct maple_node), sizeof(struct maple_node), - SLAB_PANIC, NULL); + SLAB_PANIC | SLAB_NO_MERGE, NULL); + + ret = kmem_cache_setup_percpu_array(maple_node_cache, 32); + + if (ret) + pr_warn("error %d creating percpu_array for maple_node_cache\n", ret); } /**
server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id EEF061F8BD; Wed, 29 Nov 2023 09:53:37 +0000 (UTC) Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 911A113AA0; Wed, 29 Nov 2023 09:53:37 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id YPESIyEKZ2UrfQAAD6G6ig (envelope-from ); Wed, 29 Nov 2023 09:53:37 +0000 From: Vlastimil Babka Date: Wed, 29 Nov 2023 10:53:33 +0100 Subject: [PATCH RFC v3 8/9] maple_tree: Remove MA_STATE_PREALLOC MIME-Version: 1.0 Message-Id: <20231129-slub-percpu-caches-v3-8-6bcf536772bc@suse.cz> References: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz> In-Reply-To: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz> To: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Matthew Wilcox , "Liam R. Howlett" Cc: Andrew Morton , Roman Gushchin , Hyeonggon Yoo <42.hyeyoo@gmail.com>, Alexander Potapenko , Marco Elver , Dmitry Vyukov , linux-mm@kvack.org, linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org, kasan-dev@googlegroups.com, Vlastimil Babka X-Mailer: b4 0.12.4 X-Rspam-User: X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: CA4AB4000D X-Stat-Signature: hj4ubuoqjdx7z1s939k957soxm4a7yhg X-HE-Tag: 1701251619-701416 X-HE-Meta: U2FsdGVkX19KwHzrc8DZVpWCFoqYOANmXX9wQo/06pp2v/yGX0L0UbxBAD1pe/LhkvVyyOfIhphWoWua/PLwWcVudJE6JXjOBu7q2NFrZFgqGuQ8K5oZIZbBlW2NK7Gk1LgoHRBJXyTtbz49jviBT6LCkNvjK/CeePm2E20CiI8pGdMGps6GM05vZ1qeDaboiEaOqDiIpKCScknpLTqrzc6kX7TuEyoYkyzeLTZMBAMEAv3AvygTcrcEiTiFMTMyfX+46MRy6X6NxqWKJs/SZkiY0ZdRVwFuK1xq7IgOHPnV+gm798dpx28HuofwbX/C+3zC08i7yWFg6Gu/HrBI5foxeFCg0I287kuDDyceL6CxcrXMSmlLeuwWS5Vcq/5cWS4I8aOcABkiJOlrIjIqeRn02BF4t9FwR7T1HVNn+qu6Oo5oLV6L0ZA1ajFffKTcjAQo3OUvRvo3EaPez6W021wKb/fniyHQPo+p1eFC93mb9Wyyu2mDjqUYY37ZfEm/y4VA4JWbB8Wvgi4IOyBRaBo+a9cATV4xh/L55XYyT+3uYeIcBx04vY6cpTJo4AMlakIcIU6ilzIwVe9oCj0GHq1ntldlLVC8Jt5f4i3xE9xJCcZsAhYv5W1xUNgytBq49D60f/q0gaXUTpK/RrZ+xPqz6fnXhXOHjiDEfpPSs6TPwGT5GIUZ08xo63akLZdrvS8xv7je80iIvknuhJGWrGAcyCbNI6mb7Y+6WR9erMJBFGyeNiHEN3iZXqM/Ve4gqUBEbdOQy6qK3XfgDQSfdyF4tb8JXAB9KirJIO3SS+fhw0vPAE22kzMHQ6aGoM5eVLl8ZbWmMpOrodw9aD5ixJgTPLWWiXr4HIE0hbsID4UJGKBGFXg0eopN76IbgY49zIGgex4sfz9j4vdCsH+8kWhoZjnCf2TKwcfPTahKV+ZB97kivju6wIHn+z7hdX6s7jKejRItbszCQeRZVQU 4ztalfjt HdRBOxJOB1xVDEGrO/0xlZYRiNo+hfyqH4bo2UuRVBAxjAhkNifTNY7dcQdExDeClhf3HR+eH0kHhjnoAmOrU+ik7YwuiPYRgTgkTEIP3oPeVt3y2XINzAeM4d9LENuntfaV2l6y7AwVBvNcrZu3k8UsK9V3RELGtsEvUHtBpVOrHzUw5x5wvVKcFnQqwY2n3KqPM0neB99NrvYk0PABjVg9N8tN6g4WesSjjCKn/0KJZ/jzLGAbAqYPisEAKDsWR/5+Z X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: "Liam R. Howlett" MA_SATE_PREALLOC was added to catch any writes that try to allocate when the maple state is being used in preallocation mode. This can safely be removed in favour of the percpu array of nodes. Note that mas_expected_entries() still expects no allocations during operation and so MA_STATE_BULK can be used in place of preallocations for this case, which is primarily used for forking. Signed-off-by: Liam R. 
Howlett Signed-off-by: Vlastimil Babka --- lib/maple_tree.c | 20 ++++++-------------- 1 file changed, 6 insertions(+), 14 deletions(-) diff --git a/lib/maple_tree.c b/lib/maple_tree.c index d9e7088fd9a7..f5c0bca2c5d7 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -68,11 +68,9 @@ * Maple state flags * * MA_STATE_BULK - Bulk insert mode * * MA_STATE_REBALANCE - Indicate a rebalance during bulk insert - * * MA_STATE_PREALLOC - Preallocated nodes, WARN_ON allocation */ #define MA_STATE_BULK 1 #define MA_STATE_REBALANCE 2 -#define MA_STATE_PREALLOC 4 #define ma_parent_ptr(x) ((struct maple_pnode *)(x)) #define mas_tree_parent(x) ((unsigned long)(x->tree) | MA_ROOT_PARENT) @@ -1255,11 +1253,8 @@ static inline void mas_alloc_nodes(struct ma_state *mas, gfp_t gfp) return; mas_set_alloc_req(mas, 0); - if (mas->mas_flags & MA_STATE_PREALLOC) { - if (allocated) - return; - WARN_ON(!allocated); - } + if (mas->mas_flags & MA_STATE_BULK) + return; if (!allocated || mas->alloc->node_count == MAPLE_ALLOC_SLOTS) { node = (struct maple_alloc *)mt_alloc_one(gfp); @@ -5518,7 +5513,6 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp) /* node store, slot store needs one node */ ask_now: mas_node_count_gfp(mas, request, gfp); - mas->mas_flags |= MA_STATE_PREALLOC; if (likely(!mas_is_err(mas))) return 0; @@ -5561,7 +5555,7 @@ void mas_destroy(struct ma_state *mas) mas->mas_flags &= ~MA_STATE_REBALANCE; } - mas->mas_flags &= ~(MA_STATE_BULK|MA_STATE_PREALLOC); + mas->mas_flags &= ~MA_STATE_BULK; total = mas_allocated(mas); while (total) { @@ -5610,9 +5604,6 @@ int mas_expected_entries(struct ma_state *mas, unsigned long nr_entries) * of nodes during the operation. */ - /* Optimize splitting for bulk insert in-order */ - mas->mas_flags |= MA_STATE_BULK; - /* * Avoid overflow, assume a gap between each entry and a trailing null. 
* If this is wrong, it just means allocation can happen during @@ -5629,8 +5620,9 @@ int mas_expected_entries(struct ma_state *mas, unsigned long nr_entries) /* Add working room for split (2 nodes) + new parents */ mas_node_count_gfp(mas, nr_nodes + 3, GFP_KERNEL); - /* Detect if allocations run out */ - mas->mas_flags |= MA_STATE_PREALLOC; + /* Optimize splitting for bulk insert in-order */ + mas->mas_flags |= MA_STATE_BULK; + if (!mas_is_err(mas)) return 0;
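For context on the bulk mode mentioned in the changelog, a sketch of the kind of caller that relies on mas_expected_entries() follows; the function, ranges and error handling are illustrative only and not taken from the patch:

/* Hypothetical bulk-insert user; names and ranges are made up. */
int copy_entries(struct maple_tree *mt, void **entries, unsigned long nr)
{
	MA_STATE(mas, mt, 0, 0);
	int ret;

	/* Reserve nodes up front; MA_STATE_BULK optimizes in-order splits. */
	ret = mas_expected_entries(&mas, nr);
	if (ret)
		return ret;

	for (unsigned long i = 0; i < nr; i++) {
		mas_set_range(&mas, i * 10, i * 10 + 9);
		mas_store(&mas, entries[i]);
	}

	/* Rebalances the last nodes and frees any unused allocations. */
	mas_destroy(&mas);
	return 0;
}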
From patchwork Wed Nov 29 09:53:34 2023
From: Vlastimil Babka
Date: Wed, 29 Nov 2023 10:53:34 +0100
Subject: [PATCH RFC v3 9/9] maple_tree: replace preallocation with slub percpu array prefill
Message-Id: <20231129-slub-percpu-caches-v3-9-6bcf536772bc@suse.cz>
References: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
In-Reply-To: <20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz>
Howlett" Cc: Andrew Morton , Roman Gushchin , Hyeonggon Yoo <42.hyeyoo@gmail.com>, Alexander Potapenko , Marco Elver , Dmitry Vyukov , linux-mm@kvack.org, linux-kernel@vger.kernel.org, maple-tree@lists.infradead.org, kasan-dev@googlegroups.com X-Mailer: b4 0.12.4 X-Rspamd-Queue-Id: E19BEA0012 X-Rspam-User: X-Rspamd-Server: rspam04 X-Stat-Signature: 13fxjt8uowhnbdgir1uod8osqztnzedb X-HE-Tag: 1701251619-871978 X-HE-Meta: U2FsdGVkX19BvpYvS/64SFFIzj6I7rs9LVae8WcMAYZ7oJHSiEh3Us8Sjwjblh+GwTExK/TlpwK177uvNsQVWEA3XJl59H9xvq2OxTqvqZx8FH1tQHg92hRfQG8sW0x6gnD9346fuo5yZlKaXHuROoEt4Sn6/58rtGpTrVp1Cenu7i2B/UVBIantTCTOwxuWXz9MlIzN+os2Lcy6VPEO702QN0uPVwXpoJ1XPiCBZcSLM9kGJyej7YZBQVJy/7lTNZeMxrIH2WSpEQQ5RQsMsp7zfXtivA8AWYP8B0qmOtB97hUWkosGX1XrP3eXiZJjCbQrqrGKDqoM/bbbyRgUYCqxj8qyIzdmcp9w82skGmvRbTFK8vDT7XmNi6sKvc1Mk6XDwnr34IQ1TkwTr+/MYGg1y5L2j2VR5tJpV36g9nOFeL+7ypPS9MXeshUL8wue5Pe29tuMIjdZDJIUtA57qXbQ9AJShxXI+xyDLmjS2dddv9gdJM4qT659s3zdkyRJE12b/anvNRVIvA346h0KqJijjNInBeSx0aA9bPpjcqjKaq+Y/YB9qmN5tnNOGu0HqjJhaxdPiZ9nRxreffBJrnGHyXSpLPkFrksO6dYsp1JlBVXmH7wjc+oAm3dTm1QGEu3X1gSFgtzdB/1Hqx/FiNa25OhenNtVPi4ipLLoy5dLGXcArktkbdG+gbN0tTrXw5TODZdejVSG3gavapxB0J5ElcxJfPGOv6ntmA98Xf+53IR5F554NE9egiGgRP619vwOAPHy7mvLUTdCGyGhAtqIxk0WU0Izt7H89gG9ENdJCqqaXkc/klu3scOAZynA6bkAVBo/+SNMk+Y38hOK+ATfRu8KuRHsPN2WKD7CMOqYOIFLeeE9wOSSZJUFMT8Y8SkdJ4B9NUsbAtqyr0RIQsR0ldvXCHOSgPUfGQd0S6+CuZ6mVs/PmnZp6+1cLTrmWfj+oWo+rdTfmzsPjZW A4YCwQDn fVvn+Zabtmm9+DO7hS5saVaUI3EylPI9r/96eXCuklnAtv4EpZgNmpWHn7KQNtdfDKtEapMkMWK2klGz5SR85Elx29ttn395yB2GPS1zjHU5zjxxRqeLCN4/gkOvBd0q9LdpHBahtu97UbdiD6XmTWKXYv+JEGHL8FlKBKadGgQZEOGepY2WSZUgMPOkuxd1OqveZXbmRbe/z4di4Q0SJ0JKgTmxsl2bzO2r0Q6DWJBrwAmm+8Dq4ogR2q40arlzYiig2Lk5N087M34t4htfIsoLJ0d/ujSZzKwsjuk8WGnwMJCOxZGI0WOVn/HOjltXI83lE9hwh+VCzvhhaNvXhZQ8QtA== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: With the percpu array we can try not doing the preallocations in maple tree, and instead make sure the percpu array is prefilled, and using GFP_ATOMIC in places that relied on the preallocation (in case we miss or fail trylock on the array), i.e. mas_store_prealloc(). For now simply add __GFP_NOFAIL there as well. --- lib/maple_tree.c | 17 ++++++----------- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/lib/maple_tree.c b/lib/maple_tree.c index f5c0bca2c5d7..d84a0c0fe83b 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -5452,7 +5452,12 @@ void mas_store_prealloc(struct ma_state *mas, void *entry) mas_wr_store_setup(&wr_mas); trace_ma_write(__func__, mas, 0, entry); + +retry: mas_wr_store_entry(&wr_mas); + if (unlikely(mas_nomem(mas, GFP_ATOMIC | __GFP_NOFAIL))) + goto retry; + MAS_WR_BUG_ON(&wr_mas, mas_is_err(mas)); mas_destroy(mas); } @@ -5471,8 +5476,6 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp) MA_WR_STATE(wr_mas, mas, entry); unsigned char node_size; int request = 1; - int ret; - if (unlikely(!mas->index && mas->last == ULONG_MAX)) goto ask_now; @@ -5512,16 +5515,8 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp) /* node store, slot store needs one node */ ask_now: - mas_node_count_gfp(mas, request, gfp); - if (likely(!mas_is_err(mas))) - return 0; + return kmem_cache_prefill_percpu_array(maple_node_cache, request, gfp); - mas_set_alloc_req(mas, 0); - ret = xa_err(mas->node); - mas_reset(mas); - mas_destroy(mas); - mas_reset(mas); - return ret; } EXPORT_SYMBOL_GPL(mas_preallocate);