From patchwork Fri Sep 6 16:04:36 2024
X-Patchwork-Submitter: Christoph Lameter via B4 Relay
X-Patchwork-Id: 13794408
From: Christoph Lameter via B4 Relay
Date: Fri, 06 Sep 2024 09:04:36 -0700
Subject: [PATCH v2] SLUB: Add support for per object memory policies
Message-Id: <20240906-strict_numa-v2-1-f104e6de6d1e@gentwo.org>
To: Vlastimil Babka, Pekka Enberg, David Rientjes, Joonsoo Kim,
 Andrew Morton, Roman Gushchin, Hyeonggon Yoo <42.hyeyoo@gmail.com>,
 Yang Shi, Christoph Lameter
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, Huang Shijie,
 "Christoph Lameter (Ampere)"
Reply-To: cl@gentwo.org
From: Christoph Lameter

The old SLAB allocator used to support memory policies on a per
allocation basis. In SLUB, memory policies are applied on a per page
frame / folio basis instead. Doing so avoids having to check memory
policies in the critical code paths of kmalloc and friends.

This works generally well on Intel/AMD/PowerPC because the interconnect
technology is mature and can minimize latencies through intelligent
caching, even if a small object is not placed optimally.

However, on ARM there is an emergence of new NUMA interconnect
technology based more on embedded devices. Caching of remote content
can currently be ineffective with the standard building blocks / mesh
available on those platforms. Such architectures benefit if each slab
object is individually placed according to memory policies and other
restrictions.

This patch adds a new kernel parameter, slab_strict_numa. If it is set,
a static branch is activated that causes the hot paths of the allocator
to evaluate the current memory allocation policy. Each object is then
properly placed, at the price of extra processing, and SLUB no longer
defers to the page allocator to apply memory policies at the folio
level.

This patch improves the performance of memcached running on an Ampere
Altra 2P system (ARM Neoverse N1 processor) by 3.6% due to the accurate
placement of small kernel objects.

Tested-by: Huang Shijie
Signed-off-by: Christoph Lameter (Ampere)
Signed-off-by: Christoph Lameter
---
Changes in v2:
- Fix various issues
- Testing
---
 mm/slub.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)
---
base-commit: b831f83e40a24f07c8dcba5be408d93beedc820f
change-id: 20240819-strict_numa-fc59b33123a2

Best regards,

diff --git a/mm/slub.c b/mm/slub.c
index a77f354f8325..2fa7c35e076a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -218,6 +218,10 @@ DEFINE_STATIC_KEY_FALSE(slub_debug_enabled);
 #endif
 #endif /* CONFIG_SLUB_DEBUG */
 
+#ifdef CONFIG_NUMA
+DEFINE_STATIC_KEY_FALSE(strict_numa);
+#endif
+
 /* Structure holding parameters for get_partial() call chain */
 struct partial_context {
 	gfp_t flags;
@@ -3865,6 +3869,28 @@ static __always_inline void *__slab_alloc_node(struct kmem_cache *s,
 	object = c->freelist;
 	slab = c->slab;
 
+#ifdef CONFIG_NUMA
+	if (static_branch_unlikely(&strict_numa) &&
+			node == NUMA_NO_NODE) {
+
+		struct mempolicy *mpol = current->mempolicy;
+
+		if (mpol) {
+			/*
+			 * Special BIND rule support. If existing slab
+			 * is in permitted set then do not redirect
+			 * to a particular node.
+			 * Otherwise we apply the memory policy to get
+			 * the node we need to allocate on.
+			 */
+			if (mpol->mode != MPOL_BIND || !slab ||
+				!node_isset(slab_nid(slab), mpol->nodes))
+
+				node = mempolicy_slab_node();
+		}
+	}
+#endif
+
 	if (!USE_LOCKLESS_FAST_PATH() ||
 	    unlikely(!object || !slab || !node_match(slab, node))) {
 		object = __slab_alloc(s, gfpflags, node, addr, c, orig_size);
@@ -5527,6 +5553,22 @@ static int __init setup_slub_min_objects(char *str)
 __setup("slab_min_objects=", setup_slub_min_objects);
 __setup_param("slub_min_objects=", slub_min_objects, setup_slub_min_objects, 0);
 
+#ifdef CONFIG_NUMA
+static int __init setup_slab_strict_numa(char *str)
+{
+	if (nr_node_ids > 1) {
+		static_branch_enable(&strict_numa);
+		printk(KERN_INFO "SLUB: Strict NUMA enabled.\n");
+	} else
+		printk(KERN_WARNING "slab_strict_numa parameter set on non NUMA system.\n");
+
+	return 1;
+}
+
+__setup("slab_strict_numa", setup_slab_strict_numa);
+#endif
+
+
 #ifdef CONFIG_HARDENED_USERCOPY
 /*
  * Rejects incorrectly sized objects and objects that are to be copied
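
Usage note: on a machine with more than one NUMA node, the behavior is
enabled by booting with slab_strict_numa on the kernel command line; per
the printk above, the kernel then logs "SLUB: Strict NUMA enabled." On a
single-node system the parameter only produces the warning shown.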
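
To illustrate the effect from userspace, here is a minimal sketch (not
part of the patch) of a task installing a BIND memory policy. It assumes
libnuma's <numaif.h> wrapper for the set_mempolicy() syscall, and node 0
is an arbitrary example. With slab_strict_numa active, such a policy
steers per-object slab placement for kernel allocations made on the
task's behalf, instead of only folio-level placement:

/* bind0.c - hypothetical example, not part of this patch */
#include <numaif.h>	/* set_mempolicy(), MPOL_BIND (libnuma) */
#include <stdio.h>

int main(void)
{
	/* Nodemask with only bit 0 set: restrict this task to node 0. */
	unsigned long nodemask = 1UL;

	if (set_mempolicy(MPOL_BIND, &nodemask, sizeof(nodemask) * 8)) {
		perror("set_mempolicy");
		return 1;
	}

	/*
	 * From here on, kernel objects allocated on behalf of this
	 * task (e.g. by kmalloc during its system calls) are placed
	 * per object according to the BIND policy when the
	 * strict_numa static branch is enabled, rather than
	 * inheriting whatever node the current slab folio came from.
	 */
	return 0;
}

Build with something like "cc -o bind0 bind0.c -lnuma" (the file name is
arbitrary).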