From patchwork Wed Jul 14 22:24:33 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12377961
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrew Morton, Mike Kravetz, Axel Rasmussen, Miaohe Lin, "Kirill A . Shutemov", Hugh Dickins, Jason Gunthorpe, Alistair Popple, Matthew Wilcox, peterx@redhat.com, Jerome Glisse, Andrea Arcangeli, Mike Rapoport, Nadav Amit, David Hildenbrand
Subject: [PATCH v4 11/26] shmem/userfaultfd: Allow wr-protect none pte for file-backed mem
Date: Wed, 14 Jul 2021 18:24:33 -0400
Message-Id: <20210714222433.48637-1-peterx@redhat.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20210714222117.47648-1-peterx@redhat.com>
References: <20210714222117.47648-1-peterx@redhat.com>

File-backed memory differs from anonymous memory in that even if the pte is missing,
the data could still reside either in the file or in the page/swap cache.
So when wr-protecting a pte, we need to consider none ptes too.

We do that by installing the uffd-wp special swap pte as a marker. When
there is a future write to the pte, the fault handler takes the special
path: it first faults in the page as read-only, then reports to the
userfaultfd server with the wr-protect message.

On the other hand, when unprotecting a page, it is also possible that the
pte got unmapped and replaced by the special uffd-wp marker. In that case
we need to be able to turn the uffd-wp special swap pte back into a none
pte, so that the next access to the page faults it in as usual instead of
sending a uffd-wp message.

Special care needs to be taken throughout the change_protection_range()
process. Since we now allow the user to wr-protect a none pte, we need to
pre-populate the page table entries when we see a !anonymous &&
MM_CP_UFFD_WP request; otherwise change_protection_range() will always
skip the range when the pgtable entry does not exist.

Note that this patch only covers small pages (pte level) and does not yet
cover transparent huge pages. It will serve as a base for thps too.

Signed-off-by: Peter Xu
---
 mm/mprotect.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 4b743394afbe..8ec85b276975 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -29,6 +29,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -186,6 +187,32 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				set_pte_at(vma->vm_mm, addr, pte, newpte);
 				pages++;
 			}
+		} else if (unlikely(is_swap_special_pte(oldpte))) {
+			if (uffd_wp_resolve && !vma_is_anonymous(vma) &&
+			    pte_swp_uffd_wp_special(oldpte)) {
+				/*
+				 * This is uffd-wp special pte and we'd like to
+				 * unprotect it.  What we need to do is simply
+				 * recover the pte into a none pte; the next
+				 * page fault will fault in the page.
+				 */
+				pte_clear(vma->vm_mm, addr, pte);
+				pages++;
+			}
+		} else {
+			/* It must be a none page, or what else?.. */
+			WARN_ON_ONCE(!pte_none(oldpte));
+			if (unlikely(uffd_wp && !vma_is_anonymous(vma))) {
+				/*
+				 * For file-backed mem, we need to be able to
+				 * wr-protect even for a none pte!  Because
+				 * even if the pte is null, the page/swap cache
+				 * could exist.
+				 */
+				set_pte_at(vma->vm_mm, addr, pte,
+					   pte_swp_mkuffd_wp_special(vma));
+				pages++;
+			}
 		}
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 	arch_leave_lazy_mmu_mode();
@@ -219,6 +246,25 @@ static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd)
 	return 0;
 }
 
+/*
+ * File-backed vma allows uffd wr-protect upon none ptes, because even if pte
+ * is missing, page/swap cache could exist.  When that happens, the wr-protect
+ * information will be stored in the page table entries with the marker (e.g.,
+ * PTE_SWP_UFFD_WP_SPECIAL).  Prepare for that by always populating the page
+ * tables to pte level, so that we'll install the markers in change_pte_range()
+ * where necessary.
+ *
+ * Note that we only need to do this in pmd level, because if pmd does not
+ * exist, it means the whole range covered by the pmd entry (of a pud) does not
+ * contain any valid data but all zeros.  Then nothing to wr-protect.
+ */
+#define  change_protection_prepare(vma, pmd, addr, cp_flags)		\
+	do {								\
+		if (unlikely((cp_flags & MM_CP_UFFD_WP) && pmd_none(*pmd) && \
+			     !vma_is_anonymous(vma)))			\
+			WARN_ON_ONCE(pte_alloc(vma->vm_mm, pmd));	\
+	} while (0)
+
 static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		pud_t *pud, unsigned long addr, unsigned long end,
 		pgprot_t newprot, unsigned long cp_flags)
@@ -237,6 +283,8 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 
 		next = pmd_addr_end(addr, end);
 
+		change_protection_prepare(vma, pmd, addr, cp_flags);
+
 		/*
 		 * Automatic NUMA balancing walks the tables with mmap_lock
 		 * held for read. It's possible a parallel update to occur