From patchwork Mon Jul 31 17:12:30 2023
X-Patchwork-Submitter: Suren Baghdasaryan <surenb@google.com>
X-Patchwork-Id: 13335318
Date: Mon, 31 Jul 2023 10:12:30 -0700
In-Reply-To: <20230731171233.1098105-1-surenb@google.com>
References: <20230731171233.1098105-1-surenb@google.com>
X-Mailer: git-send-email 2.41.0.487.g6d72f3e995-goog
Message-ID: <20230731171233.1098105-5-surenb@google.com>
Subject: [PATCH 4/6] mm: lock vma explicitly before doing vm_flags_reset and
 vm_flags_reset_once
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org, jannh@google.com, willy@infradead.org,
 liam.howlett@oracle.com, david@redhat.com, peterx@redhat.com,
 ldufour@linux.ibm.com, vbabka@suse.cz, michel@lespinasse.org,
 jglisse@google.com, mhocko@suse.com, hannes@cmpxchg.org, dave@stgolabs.net,
 hughd@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 stable@vger.kernel.org, Suren Baghdasaryan <surenb@google.com>,
 Linus Torvalds <torvalds@linux-foundation.org>

Implicit vma locking inside vm_flags_reset() and vm_flags_reset_once() is
not obvious and makes it hard to understand where vma locking is happening.
Also, in some cases (like in dup_userfaultfd()) the vma should be locked
earlier than the vm_flags modification. To make locking more visible,
change these functions to assert that the vma write lock is taken and
explicitly lock the vma beforehand. Fix the userfaultfd functions which
should lock the vma earlier.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 arch/powerpc/kvm/book3s_hv_uvmem.c    |  1 +
 drivers/infiniband/hw/hfi1/file_ops.c |  1 +
 fs/userfaultfd.c                      |  6 ++++++
 include/linux/mm.h                    | 10 +++++++---
 mm/madvise.c                          |  5 ++---
 mm/mlock.c                            |  3 ++-
 mm/mprotect.c                         |  1 +
 7 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 709ebd578394..e2d6f9327f77 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -410,6 +410,7 @@ static int kvmppc_memslot_page_merge(struct kvm *kvm,
 			ret = H_STATE;
 			break;
 		}
+		vma_start_write(vma);
 		/* Copy vm_flags to avoid partial modifications in ksm_madvise */
 		vm_flags = vma->vm_flags;
 		ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index a5ab22cedd41..5920bfc1e1c5 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -344,6 +344,7 @@ static int hfi1_file_mmap(struct file *fp, struct vm_area_struct *vma)
 		goto done;
 	}
 
+	vma_start_write(vma);
 	/*
 	 * vm_pgoff is used as a buffer selector cookie. Always mmap from
 	 * the beginning.
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 7cecd49e078b..6cde95533dcd 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -667,6 +667,7 @@ static void userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx,
 	mmap_write_lock(mm);
 	for_each_vma(vmi, vma) {
 		if (vma->vm_userfaultfd_ctx.ctx == release_new_ctx) {
+			vma_start_write(vma);
 			vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
 			userfaultfd_set_vm_flags(vma,
 						 vma->vm_flags & ~__VM_UFFD_FLAGS);
@@ -702,6 +703,7 @@ int dup_userfaultfd(struct vm_area_struct *vma, struct list_head *fcs)
 
 	octx = vma->vm_userfaultfd_ctx.ctx;
 	if (!octx || !(octx->features & UFFD_FEATURE_EVENT_FORK)) {
+		vma_start_write(vma);
 		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
 		userfaultfd_set_vm_flags(vma, vma->vm_flags & ~__VM_UFFD_FLAGS);
 		return 0;
@@ -783,6 +785,7 @@ void mremap_userfaultfd_prep(struct vm_area_struct *vma,
 		atomic_inc(&ctx->mmap_changing);
 	} else {
 		/* Drop uffd context if remap feature not enabled */
+		vma_start_write(vma);
 		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
 		userfaultfd_set_vm_flags(vma, vma->vm_flags & ~__VM_UFFD_FLAGS);
 	}
@@ -940,6 +943,7 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
 			prev = vma;
 		}
 
+		vma_start_write(vma);
 		userfaultfd_set_vm_flags(vma, new_flags);
 		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
 	}
@@ -1502,6 +1506,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 		 * the next vma was merged into the current one and
 		 * the current one has not been updated yet.
 		 */
+		vma_start_write(vma);
 		userfaultfd_set_vm_flags(vma, new_flags);
 		vma->vm_userfaultfd_ctx.ctx = ctx;
 
@@ -1685,6 +1690,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 		 * the next vma was merged into the current one and
 		 * the current one has not been updated yet.
 		 */
+		vma_start_write(vma);
 		userfaultfd_set_vm_flags(vma, new_flags);
 		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 262b5f44101d..2c720c9bb1ae 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -780,18 +780,22 @@ static inline void vm_flags_init(struct vm_area_struct *vma,
 	ACCESS_PRIVATE(vma, __vm_flags) = flags;
 }
 
-/* Use when VMA is part of the VMA tree and modifications need coordination */
+/*
+ * Use when VMA is part of the VMA tree and modifications need coordination
+ * Note: vm_flags_reset and vm_flags_reset_once do not lock the vma and
+ * it should be locked explicitly beforehand.
+ */
 static inline void vm_flags_reset(struct vm_area_struct *vma,
 				  vm_flags_t flags)
 {
-	vma_start_write(vma);
+	vma_assert_write_locked(vma);
 	vm_flags_init(vma, flags);
 }
 
 static inline void vm_flags_reset_once(struct vm_area_struct *vma,
 					vm_flags_t flags)
 {
-	vma_start_write(vma);
+	vma_assert_write_locked(vma);
 	WRITE_ONCE(ACCESS_PRIVATE(vma, __vm_flags), flags);
 }
 
diff --git a/mm/madvise.c b/mm/madvise.c
index 0e484111a1d2..54628f4ca217 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -173,9 +173,8 @@ static int madvise_update_vma(struct vm_area_struct *vma,
 	}
 
 success:
-	/*
-	 * vm_flags is protected by the mmap_lock held in write mode.
-	 */
+	/* vm_flags is protected by the mmap_lock held in write mode. */
+	vma_start_write(vma);
 	vm_flags_reset(vma, new_flags);
 	if (!vma->vm_file || vma_is_anon_shmem(vma)) {
 		error = replace_anon_vma_name(vma, anon_name);
diff --git a/mm/mlock.c b/mm/mlock.c
index 3634de0b28e3..f0f5125188ba 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -386,6 +386,7 @@ static void mlock_vma_pages_range(struct vm_area_struct *vma,
 	 */
 	if (newflags & VM_LOCKED)
 		newflags |= VM_IO;
+	vma_start_write(vma);
 	vm_flags_reset_once(vma, newflags);
 
 	lru_add_drain();
@@ -460,9 +461,9 @@ static int mlock_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	 * It's okay if try_to_unmap_one unmaps a page just after we
 	 * set VM_LOCKED, populate_vma_page_range will bring it back.
 	 */
-
 	if ((newflags & VM_LOCKED) && (oldflags & VM_LOCKED)) {
 		/* No work to do, and mlocking twice would be wrong */
+		vma_start_write(vma);
 		vm_flags_reset(vma, newflags);
 	} else {
 		mlock_vma_pages_range(vma, start, end, newflags);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index f781f709c39d..0eab019914db 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -656,6 +656,7 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
 	 * vm_flags and vm_page_prot are protected by the mmap_lock
 	 * held in write mode.
 	 */
+	vma_start_write(vma);
 	vm_flags_reset(vma, newflags);
 	if (vma_wants_manual_pte_write_upgrade(vma))
 		mm_cp_flags |= MM_CP_TRY_CHANGE_WRITABLE;
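
For readers following the change: the locking responsibility is inverted.
vm_flags_reset() and vm_flags_reset_once() no longer take the vma write
lock themselves; they only assert that it is held, so every caller must
write-lock the vma first (with the mmap_lock already held for writing).
A minimal sketch of the new calling convention; example_set_flags() is a
hypothetical caller used purely for illustration, not part of this patch:

	/*
	 * Illustrative only. After this patch, the caller, not
	 * vm_flags_reset(), is responsible for write-locking the vma.
	 */
	static void example_set_flags(struct vm_area_struct *vma,
				      vm_flags_t new_flags)
	{
		/* mmap_lock must already be held for writing here. */
		vma_start_write(vma);		/* explicit per-VMA write lock */
		vm_flags_reset(vma, new_flags);	/* now only asserts the lock */
	}

This is also why dup_userfaultfd() and friends gain an explicit
vma_start_write() before clearing vm_userfaultfd_ctx: the lock must cover
the context update as well, not just the later vm_flags write.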