From patchwork Thu Feb 22 12:05:36 2024
X-Patchwork-Submitter: "Huang, Rulin" <rulin.huang@intel.com>
X-Patchwork-Id: 13567187
From: rulinhuang <rulin.huang@intel.com>
To: akpm@linux-foundation.org, urezki@gmail.com
Cc: colin.king@intel.com, hch@infradead.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, lstoakes@gmail.com, rulin.huang@intel.com,
	tianyou.li@intel.com, tim.c.chen@intel.com, wangyang.guo@intel.com,
	zhiguo.zhou@intel.com
Subject: [PATCH v4] mm/vmalloc: lock contention optimization under multi-threading
Date: Thu, 22 Feb 2024 07:05:36 -0500
Message-ID: <20240222120536.216166-1-rulin.huang@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240207033059.1565623-1-rulin.huang@intel.com>
References: <20240207033059.1565623-1-rulin.huang@intel.com>

When allocating a new memory area where the mapping address range is
known, it is observed that the vmap_area lock is acquired twice. The
first acquisition occurs in alloc_vmap_area(), when the vm area is
inserted into the vm mapping red-black tree. The second occurs in
setup_vmalloc_vm(), when the properties of the vm, such as its flags
and address, are updated.

Combine these two operations so that the lock only needs to be
acquired once: defer the insertion into the busy tree until after the
vm has been set up. This improves scalability when the vmap_area lock
is contended.

With the above change, tested on an Intel Icelake platform (160
vCPUs, kernel v6.7), a 6% performance improvement and a 7% reduction
in the overall spinlock hotspot were measured with stress-ng/pthread
(https://github.com/ColinIanKing/stress-ng), a stress test of thread
creation.
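To make the locking change concrete, here is a simplified before/after
sketch of the hot path, condensed from the diff below (illustrative
only, not the literal kernel code):

	/* Before: two acquisitions of the per-node busy lock. */
	spin_lock(&vn->busy.lock);			/* in alloc_vmap_area() */
	insert_vmap_area(va, &vn->busy.root, &vn->busy.head);
	spin_unlock(&vn->busy.lock);
	...
	spin_lock(&vn->busy.lock);			/* in setup_vmalloc_vm() */
	setup_vmalloc_vm_locked(vm, va, flags, caller);
	spin_unlock(&vn->busy.lock);

	/*
	 * After: setup runs lockless, since the vmap_area is not yet
	 * visible to other threads; then a single insertion under the lock.
	 */
	setup_vmalloc_vm(area, va, flags, caller);
	insert_vmap_area_locked(va);			/* one lock acquisition */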
Reviewed-by: Chen Tim C <tim.c.chen@intel.com>
Reviewed-by: King Colin <colin.king@intel.com>
Signed-off-by: rulinhuang <rulin.huang@intel.com>
---
V1 -> V2: Avoided the partial initialization issue of vm and
separated insert_vmap_area() from alloc_vmap_area()
V2 -> V3: Rebased on 6.8-rc5
V3 -> V4: Rebased on mm-unstable branch
---
 mm/vmalloc.c | 43 +++++++++++++++++++++++--------------------
 1 file changed, 23 insertions(+), 20 deletions(-)

base-commit: 9d193b36872d153e02e80c26203de4ee15127b58

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 25a8df497255..ce126e7bc3d8 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1851,7 +1851,6 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 		int node, gfp_t gfp_mask,
 		unsigned long va_flags)
 {
-	struct vmap_node *vn;
 	struct vmap_area *va;
 	unsigned long freed;
 	unsigned long addr;
@@ -1912,19 +1911,18 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	va->vm = NULL;
 	va->flags = (va_flags | vn_id);
 
-	vn = addr_to_node(va->va_start);
-
-	spin_lock(&vn->busy.lock);
-	insert_vmap_area(va, &vn->busy.root, &vn->busy.head);
-	spin_unlock(&vn->busy.lock);
-
 	BUG_ON(!IS_ALIGNED(va->va_start, align));
 	BUG_ON(va->va_start < vstart);
 	BUG_ON(va->va_end > vend);
 
 	ret = kasan_populate_vmalloc(addr, size);
 	if (ret) {
-		free_vmap_area(va);
+		/*
+		 * Insert/Merge it back to the free tree/list.
+		 */
+		spin_lock(&free_vmap_area_lock);
+		merge_or_add_vmap_area_augment(va, &free_vmap_area_root, &free_vmap_area_list);
+		spin_unlock(&free_vmap_area_lock);
 		return ERR_PTR(ret);
 	}
 
@@ -1953,6 +1951,15 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	return ERR_PTR(-EBUSY);
 }
 
+static inline void insert_vmap_area_locked(struct vmap_area *va)
+{
+	struct vmap_node *vn = addr_to_node(va->va_start);
+
+	spin_lock(&vn->busy.lock);
+	insert_vmap_area(va, &vn->busy.root, &vn->busy.head);
+	spin_unlock(&vn->busy.lock);
+}
+
 int register_vmap_purge_notifier(struct notifier_block *nb)
 {
 	return blocking_notifier_chain_register(&vmap_notify_list, nb);
@@ -2492,6 +2499,8 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
 		return ERR_CAST(va);
 	}
 
+	insert_vmap_area_locked(va);
+
 	vaddr = vmap_block_vaddr(va->va_start, 0);
 	spin_lock_init(&vb->lock);
 	vb->va = va;
@@ -2847,6 +2856,8 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node)
 		if (IS_ERR(va))
 			return NULL;
 
+		insert_vmap_area_locked(va);
+
 		addr = va->va_start;
 		mem = (void *)addr;
 	}
@@ -2946,7 +2957,7 @@ void __init vm_area_register_early(struct vm_struct *vm, size_t align)
 	kasan_populate_early_vm_area_shadow(vm->addr, vm->size);
 }
 
-static inline void setup_vmalloc_vm_locked(struct vm_struct *vm,
+static inline void setup_vmalloc_vm(struct vm_struct *vm,
 	struct vmap_area *va, unsigned long flags, const void *caller)
 {
 	vm->flags = flags;
@@ -2956,16 +2967,6 @@ static inline void setup_vmalloc_vm_locked(struct vm_struct *vm,
 	va->vm = vm;
 }
 
-static void setup_vmalloc_vm(struct vm_struct *vm, struct vmap_area *va,
-	unsigned long flags, const void *caller)
-{
-	struct vmap_node *vn = addr_to_node(va->va_start);
-
-	spin_lock(&vn->busy.lock);
-	setup_vmalloc_vm_locked(vm, va, flags, caller);
-	spin_unlock(&vn->busy.lock);
-}
-
 static void clear_vm_uninitialized_flag(struct vm_struct *vm)
 {
 	/*
@@ -3010,6 +3011,8 @@ static struct vm_struct *__get_vm_area_node(unsigned long size,
 
 	setup_vmalloc_vm(area, va, flags, caller);
 
+	insert_vmap_area_locked(va);
+
 	/*
 	 * Mark pages for non-VM_ALLOC mappings as accessible. Do it now as a
 	 * best-effort approach, as they can be mapped outside of vmalloc code.
@@ -4584,7 +4587,7 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
 
 		spin_lock(&vn->busy.lock);
 		insert_vmap_area(vas[area], &vn->busy.root, &vn->busy.head);
-		setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
+		setup_vmalloc_vm(vms[area], vas[area], VM_ALLOC,
 				 pcpu_get_vm_areas);
 		spin_unlock(&vn->busy.lock);
 	}
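As a design note, setup_vmalloc_vm() can run without the busy lock
because of ordering: the vmap_area is published to the busy tree only
after it is fully initialized, which also avoids the partial
initialization issue mentioned in the V1 -> V2 changelog. A condensed
sketch of the resulting caller flow in __get_vm_area_node()
(illustrative paraphrase of the hunks above, not the complete
function):

	va = alloc_vmap_area(size, align, start, end, node, gfp_mask, 0);
	if (IS_ERR(va)) {
		kfree(area);
		return NULL;
	}

	/* Fully initialize the vm before anyone can look it up. */
	setup_vmalloc_vm(area, va, flags, caller);

	/* Publish it: the only vn->busy.lock acquisition on this path. */
	insert_vmap_area_locked(va);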