From patchwork Tue Apr 27 16:12:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12226873 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B5D5AC433ED for ; Tue, 27 Apr 2021 16:13:35 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5571561151 for ; Tue, 27 Apr 2021 16:13:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5571561151 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id AAA246B0072; Tue, 27 Apr 2021 12:13:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A808F6B0073; Tue, 27 Apr 2021 12:13:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8ACDA6B0074; Tue, 27 Apr 2021 12:13:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0182.hostedemail.com [216.40.44.182]) by kanga.kvack.org (Postfix) with ESMTP id 5FAE46B0072 for ; Tue, 27 Apr 2021 12:13:33 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 1093E181AF5D7 for ; Tue, 27 Apr 2021 16:13:33 +0000 (UTC) X-FDA: 78078642306.07.58577A5 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by imf10.hostedemail.com (Postfix) with ESMTP id 6889740001DE for ; Tue, 27 Apr 2021 16:13:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619540012; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yHI+Q0evUhnkGEC1JxzgrefD9XzCM+vPDN1MAUFX8eY=; b=cNrNJC+piQjUbwYXavAZwG6FNwKNp9vW1Rm7epfGZpT4Y1yECx8p1qh38BiPzGrHJROAPv LEbE0WRfN3XV+qUbO3sx5bMrPGRormBxxCHVZuk/dGtvwBarKR4q9+WdBRceQ2aTttK9vR ll3fqB7ujyJR19ZkGVpv1wkALggvVKE= Received: from mail-qv1-f69.google.com (mail-qv1-f69.google.com [209.85.219.69]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-153-HAWVwzgXNj-1nSQNDVnTmQ-1; Tue, 27 Apr 2021 12:13:27 -0400 X-MC-Unique: HAWVwzgXNj-1nSQNDVnTmQ-1 Received: by mail-qv1-f69.google.com with SMTP id s13-20020a0cdc0d0000b02901bbc03198caso3770861qvk.22 for ; Tue, 27 Apr 2021 09:13:26 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=yHI+Q0evUhnkGEC1JxzgrefD9XzCM+vPDN1MAUFX8eY=; b=OI2MIpgsOfzWsHIIK0xfniYiezUu75slqTMa3HTnezmY5niZaXY53o6bCtEYezmwMC OwGYzEDxmWx097+vSHG5uvcmJ7hzDdmlO7bAm5k2oGy7cjLVrTx4v+FmoWJY0BjqhGRj 7mVESanDm6aB1N3SyrHq9eiawjzte3oi44r0/zAx/QvXzlHotsDzrKdFSYA6C0KBFgVQ 1mJuOklA4CVZdA43JV7qBZ2bjuVvkuKS2Jbzc7Aa8SASs8sNPHxwxmlQt0cN1vZAxArU HxymFj8Cbf0PNf5WYymb2K6YcaBWnp2qC5UAm3twivsAY91wsVTO0cU3svzTm8J3p5dU 43mw== X-Gm-Message-State: AOAM532Bf9Ghf5JzUTX29WWwlVjneVkDdhoev5CdWLeJpbsd+2QHE9/J Fbibe5Bh1A1KDeU5FRofc5yMghtiaLCBFHWqdZIDkYIvmV/BHuh16M44M95ODr1uanj2rqlHTbW X2odOAZr1gUbouXZ3sRKemFPzTWXZtQorOBqUS55/sI0eqZruQPP7LJBbHxHP X-Received: by 2002:a37:8744:: with SMTP id j65mr25195912qkd.304.1619540005774; Tue, 27 Apr 2021 09:13:25 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyz8mXLkD2L73o3l8180wYyAYZ2+Gx9rvoIyRgA/dN9pjw0lpVdb5eO//oDua/vL2UX+K6Lkw== X-Received: by 2002:a37:8744:: with SMTP id j65mr25195859qkd.304.1619540005386; Tue, 27 Apr 2021 09:13:25 -0700 (PDT) Received: from xz-x1.redhat.com (bras-base-toroon474qw-grc-77-184-145-104-227.dsl.bell.ca. [184.145.104.227]) by smtp.gmail.com with ESMTPSA id v66sm3103621qkd.113.2021.04.27.09.13.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Apr 2021 09:13:24 -0700 (PDT) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Nadav Amit , Miaohe Lin , Mike Rapoport , Andrea Arcangeli , Hugh Dickins , peterx@redhat.com, Jerome Glisse , Mike Kravetz , Jason Gunthorpe , Matthew Wilcox , Andrew Morton , Axel Rasmussen , "Kirill A . Shutemov" Subject: [PATCH v2 03/24] mm/userfaultfd: Introduce special pte for unmapped file-backed mem Date: Tue, 27 Apr 2021 12:12:56 -0400 Message-Id: <20210427161317.50682-4-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210427161317.50682-1-peterx@redhat.com> References: <20210427161317.50682-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 6889740001DE X-Stat-Signature: aixz8cxm7ajpm77pedux4bgw7n9gjani Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf10; identity=mailfrom; envelope-from=""; helo=us-smtp-delivery-124.mimecast.com; client-ip=216.205.24.124 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1619540002-764250 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This patch introduces a very special swap-like pte for file-backed memories. Currently it's only defined for x86_64 only, but as long as any arch that can properly define the UFFD_WP_SWP_PTE_SPECIAL value as requested, it should conceptually work too. We will use this special pte to arm the ptes that got either unmapped or swapped out for a file-backed region that was previously wr-protected. This special pte could trigger a page fault just like swap entries, and as long as the page fault will satisfy pte_none()==false && pte_present()==false. Then we can revive the special pte into a normal pte backed by the page cache. This idea is greatly inspired by Hugh and Andrea in the discussion, which is referenced in the links below. The other idea (from Hugh) is that we use swp_type==1 and swp_offset=0 as the special pte. The current solution (as pointed out by Andrea) is slightly preferred in that we don't even need swp_entry_t knowledge at all in trapping these accesses. Meanwhile, we also reuse _PAGE_SWP_UFFD_WP from the anonymous swp entries. This patch only introduces the special pte and its operators. It's not yet applied to have any functional difference. Link: https://lore.kernel.org/lkml/20201126222359.8120-1-peterx@redhat.com/ Link: https://lore.kernel.org/lkml/20201130230603.46187-1-peterx@redhat.com/ Suggested-by: Andrea Arcangeli Suggested-by: Hugh Dickins Signed-off-by: Peter Xu --- arch/x86/include/asm/pgtable.h | 28 ++++++++++++++++++++++++++++ include/asm-generic/pgtable_uffd.h | 3 +++ include/linux/userfaultfd_k.h | 21 +++++++++++++++++++++ 3 files changed, 52 insertions(+) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index a02c67291cfcb..379bae343dd16 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1329,6 +1329,34 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) #endif #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP + +/* + * This is a very special swap-like pte that marks this pte as "wr-protected" + * by userfaultfd-wp. It should only exist for file-backed memory where the + * page (previously got wr-protected) has been unmapped or swapped out. + * + * For anonymous memories, the userfaultfd-wp _PAGE_SWP_UFFD_WP bit is kept + * along with a real swp entry instead. + * + * Let's make some rules for this special pte: + * + * (1) pte_none()==false, so that it'll not trigger a missing page fault. + * + * (2) pte_present()==false, so that it's recognized as swap (is_swap_pte). + * + * (3) pte_swp_uffd_wp()==true, so it can be tested just like a swap pte that + * contains a valid swap entry, so that we can check a swap pte always + * using "is_swap_pte() && pte_swp_uffd_wp()" without caring about whether + * there's one swap entry inside of the pte. + * + * (4) It should not be a valid swap pte anywhere, so that when we see this pte + * we know it does not contain a swap entry. + * + * For x86, the simplest special pte which satisfies all of above should be the + * pte with only _PAGE_SWP_UFFD_WP bit set (where swp_type==swp_offset==0). + */ +#define UFFD_WP_SWP_PTE_SPECIAL __pte(_PAGE_SWP_UFFD_WP) + static inline pte_t pte_swp_mkuffd_wp(pte_t pte) { return pte_set_flags(pte, _PAGE_SWP_UFFD_WP); diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h index 828966d4c2811..95e9811ce9d1f 100644 --- a/include/asm-generic/pgtable_uffd.h +++ b/include/asm-generic/pgtable_uffd.h @@ -2,6 +2,9 @@ #define _ASM_GENERIC_PGTABLE_UFFD_H #ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP + +#define UFFD_WP_SWP_PTE_SPECIAL __pte(0) + static __always_inline int pte_uffd_wp(pte_t pte) { return 0; diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 794d1538b8bac..bc733512c6905 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -140,6 +140,17 @@ extern int userfaultfd_unmap_prep(struct vm_area_struct *vma, extern void userfaultfd_unmap_complete(struct mm_struct *mm, struct list_head *uf); +static inline pte_t pte_swp_mkuffd_wp_special(struct vm_area_struct *vma) +{ + WARN_ON_ONCE(vma_is_anonymous(vma)); + return UFFD_WP_SWP_PTE_SPECIAL; +} + +static inline bool pte_swp_uffd_wp_special(pte_t pte) +{ + return pte_same(pte, UFFD_WP_SWP_PTE_SPECIAL); +} + #else /* CONFIG_USERFAULTFD */ /* mm helpers */ @@ -229,6 +240,16 @@ static inline void userfaultfd_unmap_complete(struct mm_struct *mm, { } +static inline pte_t pte_swp_mkuffd_wp_special(struct vm_area_struct *vma) +{ + return __pte(0); +} + +static inline bool pte_swp_uffd_wp_special(pte_t pte) +{ + return false; +} + #endif /* CONFIG_USERFAULTFD */ #endif /* _LINUX_USERFAULTFD_K_H */