From patchwork Wed Jul 14 22:24:41 2021
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12377965
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrew Morton, Mike Kravetz, Axel Rasmussen, Miaohe Lin,
 "Kirill A . Shutemov", Hugh Dickins, Jason Gunthorpe, Alistair Popple,
 Matthew Wilcox, peterx@redhat.com, Jerome Glisse, Andrea Arcangeli,
 Mike Rapoport, Nadav Amit, David Hildenbrand
Subject: [PATCH v4 13/26] shmem/userfaultfd: Handle the left-overed special swap ptes
Date: Wed, 14 Jul 2021 18:24:41 -0400
Message-Id: <20210714222441.48737-1-peterx@redhat.com>
In-Reply-To: <20210714222117.47648-1-peterx@redhat.com>
References: <20210714222117.47648-1-peterx@redhat.com>

Note that the special uffd-wp swap pte can be left over even if the page under
the pte got evicted. Normally, when evicting a page, we unmap the ptes by
walking through the reverse mapping. However, we never tracked such information
for the special swap ptes because they're not real mappings but just markers.
So we need to take care of the case where we see a marker that is actually
meaningless (the page behind it got evicted).

We have already taken care of that in e.g. alloc_set_pte(), where we treat the
special swap pte as pte_none() when necessary. However, we also need to teach
userfaultfd itself about this, both in UFFDIO_COPY and when handling page
faults, so that everything still works as expected.

Signed-off-by: Peter Xu
---
 fs/userfaultfd.c | 15 +++++++++++++++
 mm/userfaultfd.c | 13 ++++++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index f6e0f0c0d0e5..e1c1cbc7bcc8 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -329,6 +329,21 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	 */
 	if (pte_none(*pte))
 		ret = true;
+	/*
+	 * We also treat the swap special uffd-wp pte as the pte_none() here.
+	 * This should in most cases be a missing event, as we never handle
+	 * wr-protect upon a special uffd-wp swap pte - it should first be
+	 * converted into a normal read request before handling wp. It just
+	 * means the page/swap cache that backing this pte is gone, so this
+	 * special pte is leftover.
+	 *
+	 * We can't simply replace it with a none pte because we're not with
+	 * the pgtable lock here. Instead of taking it and clearing the pte,
+	 * the easy way is to let UFFDIO_COPY understand this pte too when
+	 * trying to install a new page onto it.
+	 */
+	if (pte_swp_uffd_wp_special(*pte))
+		ret = true;
 	if (!pte_write(*pte) && (reason & VM_UFFD_WP))
 		ret = true;
 	pte_unmap(pte);
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 2a9c9e6eb876..0c7212dfb95d 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -100,7 +100,18 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
 	}
 
 	ret = -EEXIST;
-	if (!pte_none(*dst_pte))
+	/*
+	 * Besides the none pte, we also allow UFFDIO_COPY to install a pte
+	 * onto the uffd-wp swap special pte, because that pte should be the
+	 * same as a pte_none() just in that it contains wr-protect information
+	 * (which could only be dropped when unmap the memory).
+	 *
+	 * It's safe to drop that marker because we know this is part of a
+	 * MISSING fault, and the caller is very clear about this page missing
+	 * rather than wr-protected. Then we're sure the wr-protect bit is
+	 * just a leftover so it's useless already and is the same as none pte.
+	 */
+	if (!pte_none(*dst_pte) && !pte_swp_uffd_wp_special(*dst_pte))
 		goto out_unlock;
 
 	if (page_in_cache)
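
For readers who want to reason about the two hunks outside the kernel, the
decision logic can be modelled in plain user-space C. This is only a rough
sketch under stated assumptions: enum model_pte, the MODEL_REASON_* flags,
model_must_wait() and model_install_check() are hypothetical names invented
for illustration and do not exist in the kernel; they merely stand in for
pte_none(), pte_swp_uffd_wp_special(), pte_write() and the checks touched by
this patch.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of a pte state. None of these names are
 * kernel symbols; they only mirror the cases discussed in the patch.
 */
enum model_pte {
	MODEL_PTE_NONE,           /* stands in for pte_none() */
	MODEL_PTE_UFFD_WP_MARKER, /* stands in for pte_swp_uffd_wp_special() */
	MODEL_PTE_PRESENT_RO,     /* mapped, not writable */
	MODEL_PTE_PRESENT_RW,     /* mapped, writable */
};

/* Hypothetical fault reasons, loosely modelled on VM_UFFD_MISSING/VM_UFFD_WP. */
#define MODEL_REASON_MISSING 0x1u
#define MODEL_REASON_WP      0x2u

/* Mirrors the three checks in the userfaultfd_must_wait() hunk. */
static bool model_must_wait(enum model_pte pte, unsigned int reason)
{
	bool ret = false;

	if (pte == MODEL_PTE_NONE)
		ret = true;
	/* The leftover marker is treated exactly like a none pte. */
	if (pte == MODEL_PTE_UFFD_WP_MARKER)
		ret = true;
	/* Anything that is not writable must wait on a wr-protect fault. */
	if (pte != MODEL_PTE_PRESENT_RW && (reason & MODEL_REASON_WP))
		ret = true;
	return ret;
}

/* Mirrors the -EEXIST check in the mfill_atomic_install_pte() hunk. */
static int model_install_check(enum model_pte dst)
{
	if (dst != MODEL_PTE_NONE && dst != MODEL_PTE_UFFD_WP_MARKER)
		return -EEXIST;
	return 0;
}
```

In this model a leftover marker makes the faulting thread wait for a MISSING
event, and a later install onto the same slot is accepted instead of failing
with -EEXIST, which is the pairing the two hunks rely on.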