From patchwork Thu Dec 22 19:00:20 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uladzislau Rezki
X-Patchwork-Id: 13080236
From: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
To: Andrew Morton
Cc: linux-mm@kvack.org, LKML, Baoquan He, Lorenzo Stoakes,
 Christoph Hellwig, Matthew Wilcox, Nicholas Piggin,
 Uladzislau Rezki, Oleksiy Avramchenko, Roman Gushchin
Subject: [PATCH v3 1/3] mm: vmalloc: Avoid calling __find_vmap_area() twice in __vunmap()
Date: Thu, 22 Dec 2022 20:00:20 +0100
Message-Id: <20221222190022.134380-1-urezki@gmail.com>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
Currently the __vunmap() path calls __find_vmap_area() twice: once on
entry to check that the area exists, and then inside the
remove_vm_area() function, which performs a new search for the VA.

In order to improve it from a performance point of view, we split
remove_vm_area() into two new parts:
- find_unlink_vmap_area() that does a search and unlink from the tree;
- __remove_vm_area() that removes without searching.
In this case there is no functional change for remove_vm_area(),
whereas vm_remove_mappings(), where the second search used to happen,
switches to the __remove_vm_area() variant, to which the already
detached VA is passed as a parameter, so there is no need to find it
again.

Performance-wise, I used test_vmalloc.sh with 32 threads doing
alloc/free on a 64-CPUs-x86_64-box:

perf without this patch:

-   31.41%     0.50%  vmalloc_test/10  [kernel.vmlinux]  [k] __vunmap
   - 30.92% __vunmap
      - 17.67% _raw_spin_lock
           native_queued_spin_lock_slowpath
      - 12.33% remove_vm_area
         - 11.79% free_vmap_area_noflush
            - 11.18% _raw_spin_lock
                 native_queued_spin_lock_slowpath
        0.76% free_unref_page

perf with this patch:

-   11.35%     0.13%  vmalloc_test/14  [kernel.vmlinux]  [k] __vunmap
   - 11.23% __vunmap
      - 8.28% find_unlink_vmap_area
         - 7.95% _raw_spin_lock
              7.44% native_queued_spin_lock_slowpath
      - 1.93% free_vmap_area_noflush
         - 0.56% _raw_spin_lock
              0.53% native_queued_spin_lock_slowpath
        0.60% __vunmap_range_noflush

__vunmap() consumes around 20% fewer CPU cycles on this test.

v2 -> v3:
- update the commit message;
- rename vm_remove_mappings() to va_remove_mappings();
- move VA unlinking to the callers so that free_vmap_area_noflush()
  now expects a VA that has already been disconnected;
- eliminate a local variable in remove_vm_area().

Reported-by: Roman Gushchin
Signed-off-by: Uladzislau Rezki (Sony)
Reviewed-by: Lorenzo Stoakes
Reviewed-by: Christoph Hellwig
---
 mm/vmalloc.c | 77 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 47 insertions(+), 30 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 9e30f0b39203..eb91ecaa7277 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1815,9 +1815,9 @@ static void drain_vmap_area_work(struct work_struct *work)
 }
 
 /*
- * Free a vmap area, caller ensuring that the area has been unmapped
- * and flush_cache_vunmap had been called for the correct range
- * previously.
+ * Free a vmap area, caller ensuring that the area has been unmapped,
+ * unlinked and flush_cache_vunmap had been called for the correct
+ * range previously.
  */
 static void free_vmap_area_noflush(struct vmap_area *va)
 {
@@ -1825,9 +1825,8 @@ static void free_vmap_area_noflush(struct vmap_area *va)
 	unsigned long va_start = va->va_start;
 	unsigned long nr_lazy;
 
-	spin_lock(&vmap_area_lock);
-	unlink_va(va, &vmap_area_root);
-	spin_unlock(&vmap_area_lock);
+	if (WARN_ON_ONCE(!list_empty(&va->list)))
+		return;
 
 	nr_lazy = atomic_long_add_return((va->va_end - va->va_start) >>
 		PAGE_SHIFT, &vmap_lazy_nr);
@@ -1871,6 +1870,19 @@ struct vmap_area *find_vmap_area(unsigned long addr)
 	return va;
 }
 
+static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
+{
+	struct vmap_area *va;
+
+	spin_lock(&vmap_area_lock);
+	va = __find_vmap_area(addr, &vmap_area_root);
+	if (va)
+		unlink_va(va, &vmap_area_root);
+	spin_unlock(&vmap_area_lock);
+
+	return va;
+}
+
 /*** Per cpu kva allocator ***/
 
 /*
@@ -2015,6 +2027,10 @@ static void free_vmap_block(struct vmap_block *vb)
 	tmp = xa_erase(&vmap_blocks, addr_to_vb_idx(vb->va->va_start));
 	BUG_ON(tmp != vb);
 
+	spin_lock(&vmap_area_lock);
+	unlink_va(vb->va, &vmap_area_root);
+	spin_unlock(&vmap_area_lock);
+
 	free_vmap_area_noflush(vb->va);
 	kfree_rcu(vb, rcu_head);
 }
@@ -2591,6 +2607,20 @@ struct vm_struct *find_vm_area(const void *addr)
 	return va->vm;
 }
 
+static struct vm_struct *__remove_vm_area(struct vmap_area *va)
+{
+	struct vm_struct *vm;
+
+	if (!va || !va->vm)
+		return NULL;
+
+	vm = va->vm;
+	kasan_free_module_shadow(vm);
+	free_unmap_vmap_area(va);
+
+	return vm;
+}
+
 /**
  * remove_vm_area - find and remove a continuous kernel virtual area
  * @addr: base address
@@ -2603,26 +2633,10 @@ struct vm_struct *find_vm_area(const void *addr)
  */
 struct vm_struct *remove_vm_area(const void *addr)
 {
-	struct vmap_area *va;
-
 	might_sleep();
 
-	spin_lock(&vmap_area_lock);
-	va = __find_vmap_area((unsigned long)addr, &vmap_area_root);
-	if (va && va->vm) {
-		struct vm_struct *vm = va->vm;
-
-		va->vm = NULL;
-		spin_unlock(&vmap_area_lock);
-
-		kasan_free_module_shadow(vm);
-		free_unmap_vmap_area(va);
-
-		return vm;
-	}
-
-	spin_unlock(&vmap_area_lock);
-	return NULL;
+	return __remove_vm_area(
+		find_unlink_vmap_area((unsigned long) addr));
 }
 
 static inline void set_area_direct_map(const struct vm_struct *area,
@@ -2636,16 +2650,17 @@ static inline void set_area_direct_map(const struct vm_struct *area,
 		set_direct_map(area->pages[i]);
 }
 
-/* Handle removing and resetting vm mappings related to the vm_struct. */
-static void vm_remove_mappings(struct vm_struct *area, int deallocate_pages)
+/* Handle removing and resetting vm mappings related to the VA's vm_struct. */
+static void va_remove_mappings(struct vmap_area *va, int deallocate_pages)
 {
+	struct vm_struct *area = va->vm;
 	unsigned long start = ULONG_MAX, end = 0;
 	unsigned int page_order = vm_area_page_order(area);
 	int flush_reset = area->flags & VM_FLUSH_RESET_PERMS;
 	int flush_dmap = 0;
 	int i;
 
-	remove_vm_area(area->addr);
+	__remove_vm_area(va);
 
 	/* If this is not VM_FLUSH_RESET_PERMS memory, no need for the below.
 	 */
 	if (!flush_reset)
@@ -2690,6 +2705,7 @@ static void vm_remove_mappings(struct vm_struct *area, int deallocate_pages)
 static void __vunmap(const void *addr, int deallocate_pages)
 {
 	struct vm_struct *area;
+	struct vmap_area *va;
 
 	if (!addr)
 		return;
@@ -2698,19 +2714,20 @@ static void __vunmap(const void *addr, int deallocate_pages)
 			addr))
 		return;
 
-	area = find_vm_area(addr);
-	if (unlikely(!area)) {
+	va = find_unlink_vmap_area((unsigned long)addr);
+	if (unlikely(!va)) {
 		WARN(1, KERN_ERR "Trying to vfree() nonexistent vm area (%p)\n",
 				addr);
 		return;
 	}
+	area = va->vm;
 
 	debug_check_no_locks_freed(area->addr, get_vm_area_size(area));
 	debug_check_no_obj_freed(area->addr, get_vm_area_size(area));
 	kasan_poison_vmalloc(area->addr, get_vm_area_size(area));
 
-	vm_remove_mappings(area, deallocate_pages);
+	va_remove_mappings(va, deallocate_pages);
 
 	if (deallocate_pages) {
 		int i;