From patchwork Mon Aug 29 21:25:03 2022
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 12958424
Date: Mon, 29 Aug 2022 21:25:03 +0000
Message-ID: <20220829212531.3184856-1-surenb@google.com>
Subject: [RFC PATCH 00/28] per-VMA locks proposal
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com,
 vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net,
 dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com,
 peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com,
 paulmck@kernel.org, riel@surriel.com, luto@kernel.org,
 songliubraving@fb.com, peterx@redhat.com, david@redhat.com,
 dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de,
 kent.overstreet@linux.dev, rientjes@google.com, axelrasmussen@google.com,
 joelaf@google.com, minchan@google.com, surenb@google.com,
 kernel-team@android.com, linux-mm@kvack.org,
 linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
 x86@kernel.org, linux-kernel@vger.kernel.org

This is a proof of concept for the per-VMA locks idea that was discussed
during the SPF (speculative page faults) [1] discussion at LSF/MM this
year [2], which concluded with the suggestion that “a reader/writer
semaphore could be put into the VMA itself; that would have the effect of
using the VMA as a sort of range lock. There would still be contention at
the VMA level, but it would be an improvement.” This patchset implements
the suggested approach.

When handling a page fault, we look up the VMA that contains the faulting
page under RCU protection and try to acquire its lock. If that fails, we
fall back to using mmap_lock, similar to how SPF handled this situation.

One notable way the implementation deviates from the proposal is the way
VMAs are marked as locked. During some mm updates (e.g. vma_merge,
split_vma), multiple VMAs need to stay locked until the end of the update.
Tracking all the locked VMAs, avoiding recursive locks, and handling other
complications would make the code more complex. Therefore we provide a way
to "mark" VMAs as locked and then unmark all locked VMAs at once. This is
done using two sequence numbers: one in the vm_area_struct and one in the
mm_struct. A VMA is considered locked when these sequence numbers are
equal. To mark a VMA as locked, we set the sequence number in
vm_area_struct equal to the sequence number in mm_struct. To unlock all
VMAs, we increment mm_struct's sequence number. This gives an efficient
way to track locked VMAs and to drop the locks on all of them at the end
of the update.

The patchset implements per-VMA locking only for anonymous pages which are
not in swap.
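To make the sequence-number marking described above concrete, here is a minimal single-threaded userspace sketch in C. It is not the code from this patchset: the struct names (mm_model, vma_model), the field name lock_seq, and every helper are hypothetical stand-ins chosen for illustration, and the real locking, memory ordering, and counter wraparound handling are deliberately left out. The choose_fault_path() helper additionally assumes that a fault takes the fallback path whenever the VMA is marked locked, which is an inference from the description rather than something the text states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for mm_struct. */
struct mm_model {
	unsigned long lock_seq;	/* incremented to unmark every VMA at once */
};

/* Hypothetical, simplified stand-in for vm_area_struct. */
struct vma_model {
	struct mm_model *mm;
	unsigned long lock_seq;	/* equals mm->lock_seq while marked locked */
};

/* Start a VMA in the unmarked state: its number differs from the mm's. */
static inline void vma_init(struct vma_model *vma, struct mm_model *mm)
{
	assert(vma != NULL && mm != NULL);
	vma->mm = mm;
	vma->lock_seq = mm->lock_seq - 1;	/* unsigned wrap is well defined */
}

/* A VMA counts as locked when the two sequence numbers are equal. */
static inline bool vma_is_marked_locked(const struct vma_model *vma)
{
	return vma->lock_seq == vma->mm->lock_seq;
}

/* Mark one VMA as locked by copying the mm's sequence number into it. */
static inline void vma_mark_locked(struct vma_model *vma)
{
	vma->lock_seq = vma->mm->lock_seq;
}

/* Unmark every VMA of this mm in O(1) by bumping the mm's number. */
static inline void mm_unlock_all_vmas(struct mm_model *mm)
{
	mm->lock_seq++;
}

enum fault_path {
	FAULT_VIA_VMA_LOCK,	/* handle the fault under the per-VMA lock */
	FAULT_VIA_MMAP_LOCK,	/* fall back to mmap_lock */
};

/* Assumption: a VMA that is marked locked forces the fallback path. */
static inline enum fault_path choose_fault_path(const struct vma_model *vma)
{
	return vma_is_marked_locked(vma) ? FAULT_VIA_MMAP_LOCK
					 : FAULT_VIA_VMA_LOCK;
}
```

In this model an updater would call vma_mark_locked() on each VMA it touches and finish with a single mm_unlock_all_vmas() call, so it never needs to keep a list of what it locked, and marking the same VMA twice is harmless.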
If the initial proposal is considered acceptable, then support for
swapped and file-backed page faults will be added.

Performance benchmarks show benefits similar to, although slightly smaller
than, those of the SPF patchset (~75% of SPF benefits). Still, given its
lower complexity, this approach might be more desirable.

The patchset applies cleanly over 6.0-rc3.
The tree for testing is posted at [3].

[1] https://lore.kernel.org/all/20220128131006.67712-1-michel@lespinasse.org/
[2] https://lwn.net/Articles/893906/
[3] https://github.com/surenbaghdasaryan/linux/tree/per_vma_lock_rfc

Laurent Dufour (2):
  powerc/mm: try VMA lock-based page fault handling first
  powerpc/mm: define ARCH_SUPPORTS_PER_VMA_LOCK

Michel Lespinasse (1):
  mm: rcu safe VMA freeing

Suren Baghdasaryan (25):
  mm: introduce CONFIG_PER_VMA_LOCK
  mm: introduce __find_vma to be used without mmap_lock protection
  mm: move mmap_lock assert function definitions
  mm: add per-VMA lock and helper functions to control it
  mm: mark VMA as locked whenever vma->vm_flags are modified
  kernel/fork: mark VMAs as locked before copying pages during fork
  mm/khugepaged: mark VMA as locked while collapsing a hugepage
  mm/mempolicy: mark VMA as locked when changing protection policy
  mm/mmap: mark VMAs as locked in vma_adjust
  mm/mmap: mark VMAs as locked before merging or splitting them
  mm/mremap: mark VMA as locked while remapping it to a new address range
  mm: conditionally mark VMA as locked in free_pgtables and
    unmap_page_range
  mm: mark VMAs as locked before isolating them
  mm/mmap: mark adjacent VMAs as locked if they can grow into unmapped
    area
  kernel/fork: assert no VMA readers during its destruction
  mm/mmap: prevent pagefault handler from racing with mmu_notifier
    registration
  mm: add FAULT_FLAG_VMA_LOCK flag
  mm: disallow do_swap_page to handle page faults under VMA lock
  mm: introduce per-VMA lock statistics
  mm: introduce find_and_lock_anon_vma to be used from arch-specific code
  x86/mm: try VMA lock-based page fault handling first
  x86/mm: define ARCH_SUPPORTS_PER_VMA_LOCK
  arm64/mm: try VMA lock-based page fault handling first
  arm64/mm: define ARCH_SUPPORTS_PER_VMA_LOCK
  kernel/fork: throttle call_rcu() calls in vm_area_free

 arch/arm64/Kconfig                     |   1 +
 arch/arm64/mm/fault.c                  |  36 +++++++++
 arch/powerpc/mm/fault.c                |  41 ++++++++++
 arch/powerpc/platforms/powernv/Kconfig |   1 +
 arch/powerpc/platforms/pseries/Kconfig |   1 +
 arch/x86/Kconfig                       |   1 +
 arch/x86/mm/fault.c                    |  36 +++++++++
 drivers/gpu/drm/i915/i915_gpu_error.c  |   4 +-
 fs/proc/task_mmu.c                     |   1 +
 fs/userfaultfd.c                       |   6 ++
 include/linux/mm.h                     | 104 ++++++++++++++++++++++++-
 include/linux/mm_types.h               |  33 ++++++--
 include/linux/mmap_lock.h              |  37 ++++++---
 include/linux/vm_event_item.h          |   6 ++
 include/linux/vmstat.h                 |   6 ++
 kernel/fork.c                          |  75 +++++++++++++++++-
 mm/Kconfig                             |  13 ++++
 mm/Kconfig.debug                       |   8 ++
 mm/init-mm.c                           |   6 ++
 mm/internal.h                          |   4 +-
 mm/khugepaged.c                        |   1 +
 mm/madvise.c                           |   1 +
 mm/memory.c                            |  82 ++++++++++++++++---
 mm/mempolicy.c                         |   6 +-
 mm/mlock.c                             |   2 +
 mm/mmap.c                              |  60 ++++++++++----
 mm/mprotect.c                          |   1 +
 mm/mremap.c                            |   1 +
 mm/nommu.c                             |   2 +
 mm/oom_kill.c                          |   3 +-
 mm/vmstat.c                            |   6 ++
 31 files changed, 531 insertions(+), 54 deletions(-)