From patchwork Thu Sep 14 15:26:11 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13385625
Date: Thu, 14 Sep 2023 08:26:11 -0700
In-Reply-To: <20230914152620.2743033-1-surenb@google.com>
References: <20230914152620.2743033-1-surenb@google.com>
X-Mailer: git-send-email 2.42.0.283.g2d96d420d3-goog
Message-ID: <20230914152620.2743033-2-surenb@google.com>
Subject: [PATCH 1/3] userfaultfd: UFFDIO_REMAP: rmap preparation
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: viro@zeniv.linux.org.uk, brauner@kernel.org, shuah@kernel.org,
 aarcange@redhat.com, lokeshgidra@google.com, peterx@redhat.com,
 david@redhat.com, hughd@google.com, mhocko@suse.com,
 axelrasmussen@google.com, rppt@kernel.org, willy@infradead.org,
 Liam.Howlett@oracle.com, jannh@google.com, zhangpeng362@huawei.com,
 bgeffon@google.com, kaleshsingh@google.com, ngeoffray@google.com,
 jdduke@google.com, surenb@google.com, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, kernel-team@android.com
Sender: owner-linux-mm@kvack.org

From: Andrea Arcangeli

As far as the rmap code is concerned, UFFDIO_REMAP only alters the
page->mapping and page->index. It does so while holding the page lock.
However, folio_referenced() does rmap walks without taking the folio
lock first, so folio_lock_anon_vma_read() must be updated to re-check
that the folio->mapping didn't change after we obtained the anon_vma
read lock.

UFFDIO_REMAP takes the anon_vma lock for writing before altering the
folio->mapping, so if the folio->mapping is still the same after
obtaining the anon_vma read lock (without the folio lock), the rmap
walks can go ahead safely (and UFFDIO_REMAP will wait for the rmap walk
to complete before proceeding).

UFFDIO_REMAP serializes against itself with the folio lock.

All other places that take the anon_vma lock while holding the
mmap_lock for writing don't need to check whether the folio->mapping
has changed after taking the anon_vma lock, regardless of the folio
lock, because UFFDIO_REMAP holds the mmap_lock for reading.

There's one constraint enforced to allow this simplification: the
source pages passed to UFFDIO_REMAP must be mapped in only one vma.
This constraint is an acceptable tradeoff for UFFDIO_REMAP users.

The source addresses passed to UFFDIO_REMAP can be set as VM_DONTCOPY
with MADV_DONTFORK to avoid any risk of the mapcount of the pages
increasing if some thread of the process calls fork() before
UFFDIO_REMAP runs.

Signed-off-by: Andrea Arcangeli
Signed-off-by: Suren Baghdasaryan
---
 mm/rmap.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/mm/rmap.c b/mm/rmap.c
index ec7f8e6c9e48..c1ebbd23fa61 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -542,6 +542,7 @@ struct anon_vma *folio_lock_anon_vma_read(struct folio *folio,
 	struct anon_vma *root_anon_vma;
 	unsigned long anon_mapping;
 
+repeat:
 	rcu_read_lock();
 	anon_mapping = (unsigned long)READ_ONCE(folio->mapping);
 	if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
@@ -586,6 +587,18 @@ struct anon_vma *folio_lock_anon_vma_read(struct folio *folio,
 	rcu_read_unlock();
 	anon_vma_lock_read(anon_vma);
 
+	/*
+	 * Check if UFFDIO_REMAP changed the anon_vma. This is needed
+	 * because we don't assume the folio was locked.
+	 */
+	if (unlikely((unsigned long) READ_ONCE(folio->mapping) !=
+		     anon_mapping)) {
+		anon_vma_unlock_read(anon_vma);
+		put_anon_vma(anon_vma);
+		anon_vma = NULL;
+		goto repeat;
+	}
+
 	if (atomic_dec_and_test(&anon_vma->refcount)) {
 		/*
 		 * Oops, we held the last refcount, release the lock