From patchwork Fri Mar 4 05:16:46 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768456
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand, Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport, Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A . Shutemov", Andrea Arcangeli
Subject: [PATCH v7 01/23] mm: Introduce PTE_MARKER swap entry
Date: Fri, 4 Mar 2022 13:16:46 +0800
Message-Id: <20220304051708.86193-2-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

This patch introduces a new swap entry type called PTE_MARKER.  It can be
installed for any pte that maps file-backed memory when the pte is
temporarily zapped, so as to maintain per-pte information.

The information kept in the pte is called a "marker".  Here we define the
marker as "unsigned long" just to match pgoff_t, but it will only work if
it still fits in swp_offset(), which is e.g. currently 58 bits on x86_64.

A new config CONFIG_PTE_MARKER is introduced too; it is off by default.
A bunch of helpers are defined altogether to service the rest of the pte
marker code.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/asm-generic/hugetlb.h |  9 ++++
 include/linux/swap.h          | 15 ++++++-
 include/linux/swapops.h       | 78 +++++++++++++++++++++++++++++++++++
 mm/Kconfig                    |  7 ++++
 4 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 8e1e6244a89d..f39cad20ffc6 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -2,6 +2,9 @@
 #ifndef _ASM_GENERIC_HUGETLB_H
 #define _ASM_GENERIC_HUGETLB_H
 
+#include <linux/swap.h>
+#include <linux/swapops.h>
+
 static inline pte_t mk_huge_pte(struct page *page, pgprot_t pgprot)
 {
 	return mk_pte(page, pgprot);
@@ -80,6 +83,12 @@ static inline int huge_pte_none(pte_t pte)
 }
 #endif
 
+/* Please refer to comments above pte_none_mostly() for the usage */
+static inline int huge_pte_none_mostly(pte_t pte)
+{
+	return huge_pte_none(pte) || is_pte_marker(pte);
+}
+
 #ifndef __HAVE_ARCH_HUGE_PTE_WRPROTECT
 static inline pte_t huge_pte_wrprotect(pte_t pte)
 {
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 42ebe2d6078d..20b4aceed920 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -55,6 +55,19 @@ static inline int current_is_kswapd(void)
  * actions on faults.
  */
 
+/*
+ * PTE markers are used to persist information onto PTEs that are mapped with
+ * file-backed memories.  As its name "PTE" hints, it should only be applied
+ * to the leaves of pgtables.
+ */
+#ifdef CONFIG_PTE_MARKER
+#define SWP_PTE_MARKER_NUM 1
+#define SWP_PTE_MARKER     (MAX_SWAPFILES + SWP_HWPOISON_NUM + \
+			    SWP_MIGRATION_NUM + SWP_DEVICE_NUM)
+#else
+#define SWP_PTE_MARKER_NUM 0
+#endif
+
 /*
  * Unaddressable device memory support. See include/linux/hmm.h and
  * Documentation/vm/hmm.rst. Short description is we need struct pages for
@@ -100,7 +113,7 @@ static inline int current_is_kswapd(void)
 
 #define MAX_SWAPFILES \
 	((1 << MAX_SWAPFILES_SHIFT) - SWP_DEVICE_NUM - \
	SWP_MIGRATION_NUM - SWP_HWPOISON_NUM)
+	SWP_MIGRATION_NUM - SWP_HWPOISON_NUM - SWP_PTE_MARKER_NUM)
 
 /*
  * Magic header for a swap area. The first part of the union is
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index d356ab4047f7..5103d2a4ae38 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -247,6 +247,84 @@ static inline int is_writable_migration_entry(swp_entry_t entry)
 
 #endif
 
+typedef unsigned long pte_marker;
+
+#define  PTE_MARKER_MASK     (0)
+
+#ifdef CONFIG_PTE_MARKER
+
+static inline swp_entry_t make_pte_marker_entry(pte_marker marker)
+{
+	return swp_entry(SWP_PTE_MARKER, marker);
+}
+
+static inline bool is_pte_marker_entry(swp_entry_t entry)
+{
+	return swp_type(entry) == SWP_PTE_MARKER;
+}
+
+static inline pte_marker pte_marker_get(swp_entry_t entry)
+{
+	return swp_offset(entry) & PTE_MARKER_MASK;
+}
+
+static inline bool is_pte_marker(pte_t pte)
+{
+	return is_swap_pte(pte) && is_pte_marker_entry(pte_to_swp_entry(pte));
+}
+
+#else /* CONFIG_PTE_MARKER */
+
+static inline swp_entry_t make_pte_marker_entry(pte_marker marker)
+{
+	/* This should never be called if !CONFIG_PTE_MARKER */
+	WARN_ON_ONCE(1);
+	return swp_entry(0, 0);
+}
+
+static inline bool is_pte_marker_entry(swp_entry_t entry)
+{
+	return false;
+}
+
+static inline pte_marker pte_marker_get(swp_entry_t entry)
+{
+	return 0;
+}
+
+static inline bool is_pte_marker(pte_t pte)
+{
+	return false;
+}
+
+#endif /* CONFIG_PTE_MARKER */
+
+static inline pte_t make_pte_marker(pte_marker marker)
+{
+	return swp_entry_to_pte(make_pte_marker_entry(marker));
+}
+
+/*
+ * This is a special version of pte_none() just to cover the case when the
+ * pte is a pte marker.  It exists because in many cases the pte marker
+ * should be seen as a none pte; it's just that we have stored some
+ * information onto the none pte so it becomes not-none any more.
+ *
+ * It should be used when the pte is file-backed, ram-based and backing
+ * userspace pages, like shmem.  It is not needed upon pgtables that do not
+ * support pte markers at all.  For example, it's not needed on anonymous
+ * memory, kernel-only memory (including when the system is during-boot),
+ * non-ram based generic file-system.  It's fine to be used even there, but
+ * the extra pte marker check will be pure overhead.
+ *
+ * For systems configured with !CONFIG_PTE_MARKER this will be automatically
+ * optimized to pte_none().
+ */
+static inline int pte_none_mostly(pte_t pte)
+{
+	return pte_none(pte) || is_pte_marker(pte);
+}
+
 static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry)
 {
 	struct page *p = pfn_to_page(swp_offset(entry));
diff --git a/mm/Kconfig b/mm/Kconfig
index c313bad5167a..25bcbb89f8e5 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -900,6 +900,13 @@ config ANON_VMA_NAME
 	  area from being merged with adjacent virtual memory areas due to the
 	  difference in their name.
 
+config PTE_MARKER
+	def_bool n
+	bool "Marker PTEs support"
+
+	help
+	  Allows to create marker PTEs for file-backed memory.
+
 source "mm/damon/Kconfig"
 endmenu

From patchwork Fri Mar 4 05:16:47 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768457
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand, Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport, Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A . Shutemov", Andrea Arcangeli
Subject: [PATCH v7 02/23] mm: Teach core mm about pte markers
Date: Fri, 4 Mar 2022 13:16:47 +0800
Message-Id: <20220304051708.86193-3-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

This patch still does not use pte markers in any way, but it teaches the
core mm about the pte marker idea.  For example, handle_pte_marker() is
introduced; it will parse and handle all the pte marker faults.

Many of the changes are comments, so that we know where a pte marker may
show up and why we don't need special code for those cases.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 fs/userfaultfd.c | 10 ++++++----
 mm/filemap.c     |  5 +++++
 mm/hmm.c         |  2 +-
 mm/memcontrol.c  |  8 ++++++--
 mm/memory.c      | 23 +++++++++++++++++++++
 mm/mincore.c     |  3 ++-
 mm/mprotect.c    |  3 +++
 7 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index aa0c47cb0d16..8b4a94f5a238 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -249,9 +249,10 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx,
 
 	/*
 	 * Lockless access: we're in a wait_event so it's ok if it
-	 * changes under us.
+	 * changes under us.  PTE markers should be handled the same as none
+	 * ptes here.
 	 */
-	if (huge_pte_none(pte))
+	if (huge_pte_none_mostly(pte))
 		ret = true;
 	if (!huge_pte_write(pte) && (reason & VM_UFFD_WP))
 		ret = true;
@@ -330,9 +331,10 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	pte = pte_offset_map(pmd, address);
 	/*
 	 * Lockless access: we're in a wait_event so it's ok if it
-	 * changes under us.
+	 * changes under us.  PTE markers should be handled the same as none
+	 * ptes here.
 	 */
-	if (pte_none(*pte))
+	if (pte_none_mostly(*pte))
 		ret = true;
 	if (!pte_write(*pte) && (reason & VM_UFFD_WP))
 		ret = true;
diff --git a/mm/filemap.c b/mm/filemap.c
index 8f7e6088ee2a..464b8f0f111a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3379,6 +3379,11 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 		vmf->pte += xas.xa_index - last_pgoff;
 		last_pgoff = xas.xa_index;
 
+		/*
+		 * NOTE: If there're PTE markers, we'll leave them to be
+		 * handled in the specific fault path, and it'll prohibit the
+		 * fault-around logic.
+		 */
 		if (!pte_none(*vmf->pte))
 			goto unlock;
diff --git a/mm/hmm.c b/mm/hmm.c
index af71aac3140e..3fd3242c5e50 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -239,7 +239,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	pte_t pte = *ptep;
 	uint64_t pfn_req_flags = *hmm_pfn;
 
-	if (pte_none(pte)) {
+	if (pte_none_mostly(pte)) {
 		required_fault =
 			hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0);
 		if (required_fault)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f79bb3f25ce4..bba3b7e9f699 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5636,10 +5636,14 @@ static enum mc_target_type get_mctgt_type(struct vm_area_struct *vma,
 
 	if (pte_present(ptent))
 		page = mc_handle_present_pte(vma, addr, ptent);
+	else if (pte_none_mostly(ptent))
+		/*
+		 * PTE markers should be treated as a none pte here, separated
+		 * from other swap handling below.
+		 */
+		page = mc_handle_file_pte(vma, addr, ptent);
 	else if (is_swap_pte(ptent))
 		page = mc_handle_swap_pte(vma, ptent, &ent);
-	else if (pte_none(ptent))
-		page = mc_handle_file_pte(vma, addr, ptent);
 
 	if (!page && !ent.val)
 		return ret;
diff --git a/mm/memory.c b/mm/memory.c
index a0ca84756159..22d24ea7b87d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -99,6 +99,8 @@ struct page *mem_map;
 EXPORT_SYMBOL(mem_map);
 #endif
 
+static vm_fault_t do_fault(struct vm_fault *vmf);
+
 /*
  * A number of key systems in x86 including ioremap() rely on the assumption
  * that high_memory defines the upper bound on direct map memory, then end
@@ -1419,6 +1421,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			if (!should_zap_page(details, page))
 				continue;
 			rss[mm_counter(page)]--;
+		} else if (is_pte_marker_entry(entry)) {
+			/* By default, simply drop all pte markers when zap */
 		} else if (is_hwpoison_entry(entry)) {
 			if (!should_zap_cows(details))
 				continue;
@@ -3508,6 +3512,23 @@ static inline bool should_try_to_free_swap(struct page *page,
 		page_count(page) == 2;
 }
 
+static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
+{
+	swp_entry_t entry = pte_to_swp_entry(vmf->orig_pte);
+	unsigned long marker = pte_marker_get(entry);
+
+	/*
+	 * PTE markers should always be with file-backed memories, and the
+	 * marker should never be empty.  If anything weird happened, the best
+	 * thing to do is to kill the process along with its mm.
+	 */
+	if (WARN_ON_ONCE(vma_is_anonymous(vmf->vma) || !marker))
+		return VM_FAULT_SIGBUS;
+
+	/* TODO: handle pte markers */
+	return 0;
+}
+
 /*
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
@@ -3544,6 +3565,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 			ret = vmf->page->pgmap->ops->migrate_to_ram(vmf);
 		} else if (is_hwpoison_entry(entry)) {
 			ret = VM_FAULT_HWPOISON;
+		} else if (is_pte_marker_entry(entry)) {
+			ret = handle_pte_marker(vmf);
 		} else {
 			print_bad_pte(vma, vmf->address, vmf->orig_pte, NULL);
 			ret = VM_FAULT_SIGBUS;
diff --git a/mm/mincore.c b/mm/mincore.c
index 9122676b54d6..736869f4b409 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -121,7 +121,8 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	for (; addr != end; ptep++, addr += PAGE_SIZE) {
 		pte_t pte = *ptep;
 
-		if (pte_none(pte))
+		/* We need to do cache lookup too for pte markers */
+		if (pte_none_mostly(pte))
 			__mincore_unmapped_range(addr, addr + PAGE_SIZE,
 						 vma, vec);
 		else if (pte_present(pte))
diff --git a/mm/mprotect.c b/mm/mprotect.c
index b69ce7a7b2b7..6d179c720089 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -184,6 +184,9 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				newpte = pte_swp_mksoft_dirty(newpte);
 				if (pte_swp_uffd_wp(oldpte))
 					newpte = pte_swp_mkuffd_wp(newpte);
+			} else if (is_pte_marker_entry(entry)) {
+				/* Skip it, the same as none pte */
+				continue;
 			} else {
 				newpte = oldpte;
 			}

From patchwork Fri Mar 4 05:16:48 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768458
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand, Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport, Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A . Shutemov", Andrea Arcangeli
Subject: [PATCH v7 03/23] mm: Check against orig_pte for finish_fault()
Date: Fri, 4 Mar 2022 13:16:48 +0800
Message-Id: <20220304051708.86193-4-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

We used to check against a none pte in finish_fault(), with the assumption
that orig_pte is always a none pte.  This change prepares us to be able to
call do_fault() on !none ptes.  For example, we should allow that to
happen for pte markers, so that we can restore information out of the
markers.

Let's change the "pte_none" check into detecting changes since we fetched
orig_pte.  One trivial thing to take care of here: when pmd==NULL for the
pgtable, we may not initialize orig_pte at all in handle_pte_fault().  By
default orig_pte will be all zeros, but the problem is that not all
architectures use all-zeros for a none pte.  pte_clear() is the right
thing to use here, so that we'll always have a valid orig_pte value for
the whole handle_pte_fault() call.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/memory.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 22d24ea7b87d..cdd0d108d3ee 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4135,7 +4135,7 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 				      vmf->address, &vmf->ptl);
 	ret = 0;
 	/* Re-check under ptl */
-	if (likely(pte_none(*vmf->pte)))
+	if (likely(pte_same(*vmf->pte, vmf->orig_pte)))
 		do_set_pte(vmf, page, vmf->address);
 	else
 		ret = VM_FAULT_NOPAGE;
@@ -4600,6 +4600,13 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 		 * concurrent faults and from rmap lookups.
 		 */
 		vmf->pte = NULL;
+		/*
+		 * Always initialize orig_pte.  This matches with the below
+		 * code to have orig_pte be the none pte if pte==NULL.  This
+		 * makes the rest of the code always safe to reference it,
+		 * e.g. in finish_fault() we'll detect pte changes.
+		 */
+		pte_clear(vmf->vma->vm_mm, vmf->address, &vmf->orig_pte);
 	} else {
 		/*
 		 * If a huge pmd materialized under us just retry later.  Use

From patchwork Fri Mar 4 05:16:49 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768459
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand, Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport, Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A . Shutemov", Andrea Arcangeli
Subject: [PATCH v7 04/23] mm/uffd: PTE_MARKER_UFFD_WP
Date: Fri, 4 Mar 2022 13:16:49 +0800
Message-Id: <20220304051708.86193-5-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

This patch introduces the first user of the pte marker: the uffd-wp
marker.

When the pte marker is installed with the uffd-wp bit set, it means this
pte was wr-protected by uffd.  We will use this special pte to arm the
ptes that got either unmapped or swapped out for a file-backed region that
was previously wr-protected.  This special pte can trigger a page fault
just like swap entries.

This idea is greatly inspired by Hugh and Andrea in the discussion, which
is referenced in the links below.

Some helpers are introduced to detect whether a swap pte is uffd
wr-protected.
After the pte marker is introduced, one swap pte can be wr-protected in two
forms: either it is a normal swap pte and it has _PAGE_SWP_UFFD_WP set, or
it's a pte marker that has PTE_MARKER_UFFD_WP set.

Link: https://lore.kernel.org/lkml/20201126222359.8120-1-peterx@redhat.com/
Link: https://lore.kernel.org/lkml/20201130230603.46187-1-peterx@redhat.com/
Suggested-by: Andrea Arcangeli
Suggested-by: Hugh Dickins
Signed-off-by: Peter Xu
---
 include/linux/swapops.h       |  3 ++-
 include/linux/userfaultfd_k.h | 43 +++++++++++++++++++++++++++++++++++
 mm/Kconfig                    |  9 ++++++++
 3 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 5103d2a4ae38..2cec3ef355a7 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -249,7 +249,8 @@ static inline int is_writable_migration_entry(swp_entry_t entry)

 typedef unsigned long pte_marker;

-#define  PTE_MARKER_MASK     (0)
+#define  PTE_MARKER_UFFD_WP  BIT(0)
+#define  PTE_MARKER_MASK     (PTE_MARKER_UFFD_WP)

 #ifdef CONFIG_PTE_MARKER

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 33cea484d1ad..bd09c3c89b59 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -15,6 +15,8 @@
 #include
 #include
+#include
+#include
 #include

 /* The set of all possible UFFD-related VM flags. */
@@ -236,4 +238,45 @@ static inline void userfaultfd_unmap_complete(struct mm_struct *mm,

 #endif /* CONFIG_USERFAULTFD */

+static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry)
+{
+	return is_pte_marker_entry(entry) &&
+	    (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
+}
+
+static inline bool pte_marker_uffd_wp(pte_t pte)
+{
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
+	swp_entry_t entry;
+
+	if (!is_swap_pte(pte))
+		return false;
+
+	entry = pte_to_swp_entry(pte);
+
+	return pte_marker_entry_uffd_wp(entry);
+#else
+	return false;
+#endif
+}
+
+/*
+ * Returns true if this is a swap pte and was uffd-wp wr-protected in either
+ * forms (pte marker or a normal swap pte), false otherwise.
+ */
+static inline bool pte_swp_uffd_wp_any(pte_t pte)
+{
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
+	if (!is_swap_pte(pte))
+		return false;
+
+	if (pte_swp_uffd_wp(pte))
+		return true;
+
+	if (pte_marker_uffd_wp(pte))
+		return true;
+#endif
+	return false;
+}
+
 #endif /* _LINUX_USERFAULTFD_K_H */
diff --git a/mm/Kconfig b/mm/Kconfig
index 25bcbb89f8e5..a80ea8721885 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -907,6 +907,15 @@ config PTE_MARKER
	help
	  Allows to create marker PTEs for file-backed memory.

+config PTE_MARKER_UFFD_WP
+	bool "Marker PTEs support for userfaultfd write protection"
+	depends on PTE_MARKER && HAVE_ARCH_USERFAULTFD_WP
+
+	help
+	  Allows to create marker PTEs for userfaultfd write protection
+	  purposes.  It is required to enable userfaultfd write protection on
+	  file-backed memory types like shmem and hugetlbfs.
+
 source "mm/damon/Kconfig"

 endmenu

From patchwork Fri Mar 4 05:16:50 2022
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand, Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport, Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A . Shutemov", Andrea Arcangeli
Subject: [PATCH v7 05/23] mm/shmem: Take care of UFFDIO_COPY_MODE_WP
Date: Fri, 4 Mar 2022 13:16:50 +0800
Message-Id: <20220304051708.86193-6-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

Pass wp_copy into shmem_mfill_atomic_pte() through the stack, then apply the
UFFD_WP bit properly when the UFFDIO_COPY on shmem is with
UFFDIO_COPY_MODE_WP.  wp_copy finally lands in mfill_atomic_install_pte().

Note: we must do pte_wrprotect() if !writable in mfill_atomic_install_pte(),
as mk_pte() could return a writable pte (e.g., when VM_SHARED on a shmem
file).
Signed-off-by: Peter Xu
---
 include/linux/shmem_fs.h |  4 ++--
 mm/shmem.c               |  4 ++--
 mm/userfaultfd.c         | 23 ++++++++++++++++++-----
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index ab51d3cd39bd..02d23ce5f979 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -145,11 +145,11 @@ extern int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
				  pmd_t *dst_pmd,
				  struct vm_area_struct *dst_vma,
				  unsigned long dst_addr,
				  unsigned long src_addr,
-				  bool zeropage,
+				  bool zeropage, bool wp_copy,
				  struct page **pagep);
 #else /* !CONFIG_SHMEM */
 #define shmem_mfill_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, \
-			       src_addr, zeropage, pagep) ({ BUG(); 0; })
+			       src_addr, zeropage, wp_copy, pagep) ({ BUG(); 0; })
 #endif /* CONFIG_SHMEM */

 #endif /* CONFIG_USERFAULTFD */
diff --git a/mm/shmem.c b/mm/shmem.c
index 81a69bd247b4..3f0332c1c1e9 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2317,7 +2317,7 @@ int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
			   struct vm_area_struct *dst_vma,
			   unsigned long dst_addr,
			   unsigned long src_addr,
-			   bool zeropage,
+			   bool zeropage, bool wp_copy,
			   struct page **pagep)
 {
	struct inode *inode = file_inode(dst_vma->vm_file);
@@ -2390,7 +2390,7 @@ int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
		goto out_release;

	ret = mfill_atomic_install_pte(dst_mm, dst_pmd, dst_vma, dst_addr,
-				       page, true, false);
+				       page, true, wp_copy);
	if (ret)
		goto out_delete_from_cache;
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index e9bb6db002aa..ef418a48b121 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -77,10 +77,19 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
	 * Always mark a PTE as write-protected when needed, regardless of
	 * VM_WRITE, which the user might change.
	 */
-	if (wp_copy)
+	if (wp_copy) {
		_dst_pte = pte_mkuffd_wp(_dst_pte);
-	else if (writable)
+		writable = false;
+	}
+
+	if (writable)
		_dst_pte = pte_mkwrite(_dst_pte);
+	else
+		/*
+		 * We need this to make sure write bit removed; as mk_pte()
+		 * could return a pte with write bit set.
+		 */
+		_dst_pte = pte_wrprotect(_dst_pte);

	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
@@ -95,7 +104,12 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
	}

	ret = -EEXIST;
-	if (!pte_none(*dst_pte))
+	/*
+	 * We allow to overwrite a pte marker: consider when both MISSING|WP
+	 * registered, we firstly wr-protect a none pte which has no page cache
+	 * page backing it, then access the page.
+	 */
+	if (!pte_none_mostly(*dst_pte))
		goto out_unlock;

	if (page_in_cache) {
@@ -479,11 +493,10 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
		err = mfill_zeropage_pte(dst_mm, dst_pmd,
					 dst_vma, dst_addr);
	} else {
-		VM_WARN_ON_ONCE(wp_copy);
		err = shmem_mfill_atomic_pte(dst_mm, dst_pmd,
					     dst_vma, dst_addr,
					     src_addr,
					     mode != MCOPY_ATOMIC_NORMAL,
-					     page);
+					     wp_copy, page);
	}

	return err;

From patchwork Fri Mar 4 05:16:51 2022
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand, Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport, Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A . Shutemov", Andrea Arcangeli
Subject: [PATCH v7 06/23] mm/shmem: Handle uffd-wp special pte in page fault handler
Date: Fri, 4 Mar 2022 13:16:51 +0800
Message-Id: <20220304051708.86193-7-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

File-backed memories are prone to unmap/swap so the ptes are always unstable,
because they can be easily faulted back later using the page cache.  This
could lead to uffd-wp getting lost when unmapping or swapping out such
memory.  One example is shmem.  PTE markers are needed to store that
information.

This patch prepares for that by teaching the page fault handler to recognize
uffd-wp pte markers; installing the markers happens elsewhere in later
patches.

The handling of uffd-wp pte markers is similar to missing fault: we handle
this "missing fault" when we see the pte markers, meanwhile making sure the
marker information is kept while processing the fault.

This is a slow path of uffd-wp handling, because zapping of wr-protected
shmem ptes should be rare.
So far it should only trigger in two conditions:

  (1) When trying to punch holes in shmem_fallocate(), there is an
      optimization to zap the pgtables before evicting the page.

  (2) When swapping out shmem pages.

Because of this, the page fault handling is simplified too by not sending the
wr-protect message in the 1st page fault; instead the page will be installed
read-only, so the uffd-wp message will be generated in the next fault, which
will trigger the do_wp_page() path of general uffd-wp handling.

Disable fault-around for all uffd-wp registered ranges for extra safety just
like uffd-minor fault, and clean the code up.

Signed-off-by: Peter Xu
---
 include/linux/userfaultfd_k.h | 17 +++++++++
 mm/memory.c                   | 67 ++++++++++++++++++++++++++++++-----
 2 files changed, 75 insertions(+), 9 deletions(-)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index bd09c3c89b59..827e38b7be65 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -96,6 +96,18 @@ static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
	return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
 }

+/*
+ * Don't do fault around for either WP or MINOR registered uffd range.  For
+ * MINOR registered range, fault around will be a total disaster and ptes can
+ * be installed without notifications; for WP it should mostly be fine as long
+ * as the fault around checks for pte_none() before the installation, however
+ * to be super safe we just forbid it.
+ */
+static inline bool uffd_disable_fault_around(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
+}
+
 static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 {
	return vma->vm_flags & VM_UFFD_MISSING;
@@ -236,6 +248,11 @@ static inline void userfaultfd_unmap_complete(struct mm_struct *mm,
 {
 }

+static inline bool uffd_disable_fault_around(struct vm_area_struct *vma)
+{
+	return false;
+}
+
 #endif /* CONFIG_USERFAULTFD */

 static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry)
diff --git a/mm/memory.c b/mm/memory.c
index cdd0d108d3ee..f509ddf2ad39 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3512,6 +3512,39 @@ static inline bool should_try_to_free_swap(struct page *page,
		page_count(page) == 2;
 }

+static vm_fault_t pte_marker_clear(struct vm_fault *vmf)
+{
+	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
+				       vmf->address, &vmf->ptl);
+	/*
+	 * Be careful so that we will only recover a special uffd-wp pte into a
+	 * none pte.  Otherwise it means the pte could have changed, so retry.
+	 */
+	if (is_pte_marker(*vmf->pte))
+		pte_clear(vmf->vma->vm_mm, vmf->address, vmf->pte);
+	pte_unmap_unlock(vmf->pte, vmf->ptl);
+	return 0;
+}
+
+/*
+ * This is actually a page-missing access, but with uffd-wp special pte
+ * installed.  It means this pte was wr-protected before being unmapped.
+ */
+static vm_fault_t pte_marker_handle_uffd_wp(struct vm_fault *vmf)
+{
+	/*
+	 * Just in case there're leftover special ptes even after the region
+	 * got unregistered - we can simply clear them.  We can also do that
+	 * proactively when e.g. when we do UFFDIO_UNREGISTER upon some uffd-wp
+	 * ranges, but it should be more efficient to be done lazily here.
+	 */
+	if (unlikely(!userfaultfd_wp(vmf->vma) || vma_is_anonymous(vmf->vma)))
+		return pte_marker_clear(vmf);
+
+	/* do_fault() can handle pte markers too like none pte */
+	return do_fault(vmf);
+}
+
 static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
 {
	swp_entry_t entry = pte_to_swp_entry(vmf->orig_pte);
@@ -3525,8 +3558,11 @@ static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
	if (WARN_ON_ONCE(vma_is_anonymous(vmf->vma) || !marker))
		return VM_FAULT_SIGBUS;

-	/* TODO: handle pte markers */
-	return 0;
+	if (pte_marker_entry_uffd_wp(entry))
+		return pte_marker_handle_uffd_wp(vmf);
+
+	/* This is an unknown pte marker */
+	return VM_FAULT_SIGBUS;
 }

 /*
@@ -4051,6 +4087,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
 void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
 {
	struct vm_area_struct *vma = vmf->vma;
+	bool uffd_wp = pte_marker_uffd_wp(vmf->orig_pte);
	bool write = vmf->flags & FAULT_FLAG_WRITE;
	bool prefault = vmf->address != addr;
	pte_t entry;
@@ -4065,6 +4102,8 @@ void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)

	if (write)
		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+	if (unlikely(uffd_wp))
+		entry = pte_mkuffd_wp(pte_wrprotect(entry));
	/* copy-on-write page */
	if (write && !(vma->vm_flags & VM_SHARED)) {
		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
@@ -4238,9 +4277,21 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
	return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
 }

+/* Return true if we should do read fault-around, false otherwise */
+static inline bool should_fault_around(struct vm_fault *vmf)
+{
+	/* No ->map_pages?  No way to fault around... */
+	if (!vmf->vma->vm_ops->map_pages)
+		return false;
+
+	if (uffd_disable_fault_around(vmf->vma))
+		return false;
+
+	return fault_around_bytes >> PAGE_SHIFT > 1;
+}
+
 static vm_fault_t do_read_fault(struct vm_fault *vmf)
 {
-	struct vm_area_struct *vma = vmf->vma;
	vm_fault_t ret = 0;

	/*
@@ -4248,12 +4299,10 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf)
	 * if page by the offset is not ready to be mapped (cold cache or
	 * something).
	 */
-	if (vma->vm_ops->map_pages && fault_around_bytes >> PAGE_SHIFT > 1) {
-		if (likely(!userfaultfd_minor(vmf->vma))) {
-			ret = do_fault_around(vmf);
-			if (ret)
-				return ret;
-		}
+	if (should_fault_around(vmf)) {
+		ret = do_fault_around(vmf);
+		if (ret)
+			return ret;
	}

	ret = __do_fault(vmf);

From patchwork Fri Mar 4 05:16:52 2022
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand, Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport, Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A . Shutemov", Andrea Arcangeli
Subject: [PATCH v7 07/23] mm/shmem: Persist uffd-wp bit across zapping for file-backed
Date: Fri, 4 Mar 2022 13:16:52 +0800
Message-Id: <20220304051708.86193-8-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

File-backed memory is prone to being unmapped at any time.  It means all
information in the pte will be dropped, including the uffd-wp flag.

To persist the uffd-wp flag, we'll use the pte markers.  This patch teaches
the zap code to understand uffd-wp and know when to keep or drop the uffd-wp
bit.

Add a new flag ZAP_FLAG_DROP_MARKER and set it in zap_details when we don't
want to persist such information, for example, when destroying the whole vma,
or punching a hole in a shmem file.  For the other cases we should never drop
the uffd-wp bit, or the wr-protect information will get lost.

The new ZAP_FLAG_DROP_MARKER needs to be put into mm.h rather than memory.c
because it'll be further referenced in hugetlb files later.
Signed-off-by: Peter Xu --- include/linux/mm.h | 10 ++++++++ include/linux/mm_inline.h | 43 ++++++++++++++++++++++++++++++++++ mm/memory.c | 49 ++++++++++++++++++++++++++++++++++++--- mm/rmap.c | 8 +++++++ 4 files changed, 107 insertions(+), 3 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 0b9a0334d0f8..cdefbb078a73 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3385,4 +3385,14 @@ madvise_set_anon_name(struct mm_struct *mm, unsigned long start, } #endif +typedef unsigned int __bitwise zap_flags_t; + +/* + * Whether to drop the pte markers, for example, the uffd-wp information for + * file-backed memory. This should only be specified when we will completely + * drop the page in the mm, either by truncation or unmapping of the vma. By + * default, the flag is not set. + */ +#define ZAP_FLAG_DROP_MARKER ((__force zap_flags_t) BIT(0)) + #endif /* _LINUX_MM_H */ diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index ac32125745ab..70e72ce85b25 100644 --- a/include/linux/mm_inline.h +++ b/include/linux/mm_inline.h @@ -6,6 +6,8 @@ #include #include #include +#include +#include /** * folio_is_file_lru - Should the folio be on a file LRU or anon LRU? @@ -316,5 +318,46 @@ static inline bool mm_tlb_flush_nested(struct mm_struct *mm) return atomic_read(&mm->tlb_flush_pending) > 1; } +/* + * If this pte is wr-protected by uffd-wp in any form, arm the special pte to + * replace a none pte. NOTE! This should only be called when *pte is already + * cleared so we will never accidentally replace something valuable. Meanwhile + * none pte also means we are not demoting the pte so tlb flushed is not needed. + * E.g., when pte cleared the caller should have taken care of the tlb flush. + * + * Must be called with pgtable lock held so that no thread will see the none + * pte, and if they see it, they'll fault and serialize at the pgtable lock. + * + * This function is a no-op if PTE_MARKER_UFFD_WP is not enabled. 
+ */
+static inline void
+pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
+			      pte_t *pte, pte_t pteval)
+{
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
+	bool arm_uffd_pte = false;
+
+	/* The current status of the pte should be "cleared" before calling */
+	WARN_ON_ONCE(!pte_none(*pte));
+
+	if (vma_is_anonymous(vma))
+		return;
+
+	/* A uffd-wp wr-protected normal pte */
+	if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval)))
+		arm_uffd_pte = true;
+
+	/*
+	 * A uffd-wp wr-protected swap pte.  Note: this should even cover an
+	 * existing pte marker with uffd-wp bit set.
+	 */
+	if (unlikely(pte_swp_uffd_wp_any(pteval)))
+		arm_uffd_pte = true;
+
+	if (unlikely(arm_uffd_pte))
+		set_pte_at(vma->vm_mm, addr, pte,
+			   make_pte_marker(PTE_MARKER_UFFD_WP));
+#endif
+}
 #endif

diff --git a/mm/memory.c b/mm/memory.c
index f509ddf2ad39..e3e67e32eb8a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -74,6 +74,7 @@
 #include
 #include
 #include
+#include

 #include

@@ -1310,6 +1311,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 struct zap_details {
 	struct folio *single_folio;	/* Locked folio to be unmapped */
 	bool even_cows;			/* Zap COWed private pages too? */
+	zap_flags_t zap_flags;		/* Extra flags for zapping */
 };

 /* Whether we should zap all COWed (private) pages too */
@@ -1338,6 +1340,29 @@ static inline bool should_zap_page(struct zap_details *details, struct page *page)
 	return !PageAnon(page);
 }

+static inline bool zap_drop_file_uffd_wp(struct zap_details *details)
+{
+	if (!details)
+		return false;
+
+	return details->zap_flags & ZAP_FLAG_DROP_MARKER;
+}
+
+/*
+ * This function makes sure that we'll replace the none pte with an uffd-wp
+ * swap special pte marker when necessary.  Must be called with the pgtable
+ * lock held.
+ */
+static inline void
+zap_install_uffd_wp_if_needed(struct vm_area_struct *vma,
+			      unsigned long addr, pte_t *pte,
+			      struct zap_details *details, pte_t pteval)
+{
+	if (zap_drop_file_uffd_wp(details))
+		return;
+
+	pte_install_uffd_wp_if_needed(vma, addr, pte, pteval);
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
@@ -1375,6 +1400,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			ptent = ptep_get_and_clear_full(mm, addr, pte,
 							tlb->fullmm);
 			tlb_remove_tlb_entry(tlb, pte, addr);
+			zap_install_uffd_wp_if_needed(vma, addr, pte, details,
+						      ptent);
 			if (unlikely(!page))
 				continue;
@@ -1405,6 +1432,13 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			page = pfn_swap_entry_to_page(entry);
 			if (unlikely(!should_zap_page(details, page)))
 				continue;
+			/*
+			 * Both device private/exclusive mappings should only
+			 * work with anonymous pages so far, so we don't need
+			 * to consider the uffd-wp bit when zapping.  For more
+			 * information, see zap_install_uffd_wp_if_needed().
+			 */
+			WARN_ON_ONCE(!vma_is_anonymous(vma));
 			rss[mm_counter(page)]--;
 			if (is_device_private_entry(entry))
 				page_remove_rmap(page, vma, false);
@@ -1421,8 +1455,10 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			if (!should_zap_page(details, page))
 				continue;
 			rss[mm_counter(page)]--;
-		} else if (is_pte_marker_entry(entry)) {
-			/* By default, simply drop all pte markers when zap */
+		} else if (pte_marker_entry_uffd_wp(entry)) {
+			/* Only drop the uffd-wp marker if explicitly requested */
+			if (!zap_drop_file_uffd_wp(details))
+				continue;
 		} else if (is_hwpoison_entry(entry)) {
 			if (!should_zap_cows(details))
 				continue;
@@ -1431,6 +1467,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			WARN_ON_ONCE(1);
 		}
 		pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
+		zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
 	} while (pte++, addr += PAGE_SIZE, addr != end);

 	add_mm_rss_vec(mm, rss);
@@ -1641,12 +1678,17 @@ void unmap_vmas(struct mmu_gather *tlb,
 		unsigned long end_addr)
 {
 	struct mmu_notifier_range range;
+	struct zap_details details = {
+		.zap_flags = ZAP_FLAG_DROP_MARKER,
+		/* Careful - we need to zap private pages too!
+		 */
+		.even_cows = true,
+	};

 	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
 				start_addr, end_addr);
 	mmu_notifier_invalidate_range_start(&range);
 	for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next)
-		unmap_single_vma(tlb, vma, start_addr, end_addr, NULL);
+		unmap_single_vma(tlb, vma, start_addr, end_addr, &details);
 	mmu_notifier_invalidate_range_end(&range);
 }

@@ -3391,6 +3433,7 @@ void unmap_mapping_folio(struct folio *folio)

 	details.even_cows = false;
 	details.single_folio = folio;
+	details.zap_flags = ZAP_FLAG_DROP_MARKER;

 	i_mmap_lock_write(mapping);
 	if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root)))

diff --git a/mm/rmap.c b/mm/rmap.c
index 3d288a7c8c32..f83d812d0a5e 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -73,6 +73,7 @@
 #include
 #include
 #include
+#include

 #include

@@ -1526,6 +1527,13 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			pteval = ptep_clear_flush(vma, address, pvmw.pte);
 		}

+		/*
+		 * Now the pte is cleared.  If this is a uffd-wp armed pte, we
+		 * may want to replace a none pte with a marker pte if it's
+		 * file-backed, so we don't lose the tracking information.
+		 */
+		pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval);
+
 		/* Set the dirty flag on the folio now the pte is gone.
		 */
		if (pte_dirty(pteval))
			folio_mark_dirty(folio);

From patchwork Fri Mar 4 05:16:53 2022
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand, Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport, Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A. Shutemov", Andrea Arcangeli
Subject: [PATCH v7 08/23] mm/shmem: Allow uffd wr-protect none pte for file-backed mem
Date: Fri, 4 Mar 2022 13:16:53 +0800
Message-Id: <20220304051708.86193-9-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

File-backed memory differs from anonymous memory in that even if the pte is
missing, the data could still reside either in the file or in the page/swap
cache.  So when wr-protecting a pte, we need to consider none ptes too.

We do that by installing the uffd-wp pte markers when necessary.  So when
there's a future write to the pte, the fault handler will go through the
special path to first fault in the page as read-only, then report to the
userfaultfd server with the wr-protect message.
On the other hand, when unprotecting a page, it's also possible that the pte
got unmapped but replaced by the special uffd-wp marker.  Then we'll need to
be able to recover from a uffd-wp pte marker into a none pte, so that the
next access to the page will fault in correctly as usual.

Special care needs to be taken throughout the change_protection_range()
process.  Since we now allow userspace to wr-protect a none pte, we need to
be able to pre-populate the page table entries if we see
(!anonymous && MM_CP_UFFD_WP) requests, otherwise change_protection_range()
will always skip when the pgtable entry does not exist.

For example, the pgtable can be missing for a whole chunk of a 2M pmd, but
the page cache can exist for the whole 2M range.  When we want to
wr-protect one 4K page within that 2M pmd range, we need to pre-populate
the pgtable and install the pte marker showing that we want to get a
message and block the thread when the page cache of that 4K page is
written.  Without pre-populating the pmd, change_protection() will simply
skip that whole pmd.

Note that this patch only covers the small pages (pte level) and does not
yet cover transparent huge pages.  That will be done later, and this patch
will be a preparation for it too.
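The pte-level state machine described above can be summarized in a short userspace sketch.  It is purely illustrative: the enum and function are invented here, and only the uffd-wp transitions on small ptes are modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* A toy pte: none, a present pte, or a uffd-wp pte marker. */
enum pte_model { PTE_NONE, PTE_PRESENT, PTE_UFFD_WP_MARKER };

/*
 * Models what the patched change_pte_range() does with uffd-wp:
 * wr-protecting a none pte of a file-backed vma installs a marker, and
 * resolving (unprotecting) a marker returns the pte to none so the next
 * access faults in normally.
 */
static enum pte_model change_pte_model(enum pte_model pte, bool vma_is_anon,
				       bool uffd_wp, bool uffd_wp_resolve)
{
	if (pte == PTE_NONE && uffd_wp && !vma_is_anon)
		return PTE_UFFD_WP_MARKER;	/* pre-populate + mark */
	if (pte == PTE_UFFD_WP_MARKER && uffd_wp_resolve)
		return PTE_NONE;		/* drop the marker */
	return pte;
}
```

Note how the anonymous case leaves a none pte untouched: for anonymous memory there is no page cache behind a none pte, so there is nothing to track.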
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/mprotect.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 62 insertions(+), 2 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 6d179c720089..4878b6b99df9 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -184,8 +185,16 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				newpte = pte_swp_mksoft_dirty(newpte);
 			if (pte_swp_uffd_wp(oldpte))
 				newpte = pte_swp_mkuffd_wp(newpte);
-		} else if (is_pte_marker_entry(entry)) {
-			/* Skip it, the same as none pte */
+		} else if (pte_marker_entry_uffd_wp(entry)) {
+			/*
+			 * If this is a uffd-wp pte marker and we'd like
+			 * to unprotect it, drop it; the next page
+			 * fault will trigger without uffd trapping.
+			 */
+			if (uffd_wp_resolve) {
+				pte_clear(vma->vm_mm, addr, pte);
+				pages++;
+			}
 			continue;
 		} else {
 			newpte = oldpte;
@@ -200,6 +209,20 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				set_pte_at(vma->vm_mm, addr, pte, newpte);
 				pages++;
 			}
+		} else {
+			/* It must be a none pte, or what else?.. */
+			WARN_ON_ONCE(!pte_none(oldpte));
+			if (unlikely(uffd_wp && !vma_is_anonymous(vma))) {
+				/*
+				 * For file-backed mem, we need to be able to
+				 * wr-protect a none pte, because even if the
+				 * pte is none, the page/swap cache could
+				 * exist.  Do that by installing a marker.
+				 */
+				set_pte_at(vma->vm_mm, addr, pte,
+					   make_pte_marker(PTE_MARKER_UFFD_WP));
+				pages++;
+			}
 		}
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 	arch_leave_lazy_mmu_mode();
@@ -233,6 +256,39 @@ static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd)
 	return 0;
 }

+/* Return true if we're uffd wr-protecting file-backed memory, or false */
+static inline bool
+uffd_wp_protect_file(struct vm_area_struct *vma, unsigned long cp_flags)
+{
+	return (cp_flags & MM_CP_UFFD_WP) && !vma_is_anonymous(vma);
+}
+
+/*
+ * If wr-protecting the range for file-backed, populate the pgtable for the
+ * case when the pgtable is empty but the page cache exists.  If
+ * {pte|pmd|...}_alloc() fails it means we are out of memory; we have no
+ * better option but to stop.
+ */
+#define  change_pmd_prepare(vma, pmd, cp_flags)				\
+	do {								\
+		if (unlikely(uffd_wp_protect_file(vma, cp_flags))) {	\
+			if (WARN_ON_ONCE(pte_alloc(vma->vm_mm, pmd)))	\
+				break;					\
+		}							\
+	} while (0)
+/*
+ * This is the general pud/p4d/pgd version of change_pmd_prepare().  We need
+ * a separate change_pmd_prepare() because pte_alloc() returns 0 on success,
+ * while {pmd|pud|p4d}_alloc() return the valid pointer on success.
+ */
+#define  change_prepare(vma, high, low, addr, cp_flags)			\
+	do {								\
+		if (unlikely(uffd_wp_protect_file(vma, cp_flags))) {	\
+			low##_t *p = low##_alloc(vma->vm_mm, high, addr); \
+			if (WARN_ON_ONCE(p == NULL))			\
+				break;					\
+		}							\
+	} while (0)
+
 static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		pud_t *pud, unsigned long addr, unsigned long end,
 		pgprot_t newprot, unsigned long cp_flags)
@@ -251,6 +307,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,

 		next = pmd_addr_end(addr, end);

+		change_pmd_prepare(vma, pmd, cp_flags);
 		/*
 		 * Automatic NUMA balancing walks the tables with mmap_lock
 		 * held for read.
 		 * It's possible a parallel update to occur
@@ -316,6 +373,7 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma,
 	pud = pud_offset(p4d, addr);
 	do {
 		next = pud_addr_end(addr, end);
+		change_prepare(vma, pud, pmd, addr, cp_flags);
 		if (pud_none_or_clear_bad(pud))
 			continue;
 		pages += change_pmd_range(vma, pud, addr, next, newprot,
@@ -336,6 +394,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma,
 	p4d = p4d_offset(pgd, addr);
 	do {
 		next = p4d_addr_end(addr, end);
+		change_prepare(vma, p4d, pud, addr, cp_flags);
 		if (p4d_none_or_clear_bad(p4d))
 			continue;
 		pages += change_pud_range(vma, p4d, addr, next, newprot,
@@ -361,6 +420,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
 	inc_tlb_flush_pending(mm);
 	do {
 		next = pgd_addr_end(addr, end);
+		change_prepare(vma, pgd, p4d, addr, cp_flags);
 		if (pgd_none_or_clear_bad(pgd))
 			continue;
 		pages += change_p4d_range(vma, pgd, addr, next, newprot,

From patchwork Fri Mar 4 05:16:54 2022
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand, Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport, Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A. Shutemov", Andrea Arcangeli
Subject: [PATCH v7 09/23] mm/shmem: Allows file-back mem to be uffd wr-protected on thps
Date: Fri, 4 Mar 2022 13:16:54 +0800
Message-Id: <20220304051708.86193-10-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

We don't have a "huge" version of pte markers; instead, when necessary, we
split the thp.

However, splitting the thp is not enough, because file-backed thps are
handled totally differently compared to anonymous thps: rather than doing a
real split, the thp pmd will simply get cleared in
__split_huge_pmd_locked().  That is not enough if, e.g., a thp covers the
range [0, 2M) but we want to wr-protect a small page residing in [4K, 8K),
because after __split_huge_pmd() returns there will be a none pmd, and
change_pmd_range() will just skip it right after the split.

Here we leverage the previously introduced change_pmd_prepare() macro so
that we'll populate the pmd with a pgtable page after the pmd split (during
which the pmd will be cleared for cases like shmem).  Then
change_pte_range() will do all the rest for us by installing the uffd-wp
pte marker at any none pte that we'd like to wr-protect.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/mprotect.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 4878b6b99df9..95b307d4766d 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -329,8 +329,15 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		}

 		if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) {
-			if (next - addr != HPAGE_PMD_SIZE) {
+			if ((next - addr != HPAGE_PMD_SIZE) ||
+			    uffd_wp_protect_file(vma, cp_flags)) {
 				__split_huge_pmd(vma, pmd, addr, false, NULL);
+				/*
+				 * For file-backed, the pmd could have been
+				 * cleared; make sure the pmd is populated if
+				 * necessary, then fall through to pte level.
+				 */
+				change_pmd_prepare(vma, pmd, cp_flags);
 			} else {
 				int nr_ptes = change_huge_pmd(vma, pmd, addr,
 							      newprot, cp_flags);

From patchwork Fri Mar 4 05:16:55 2022
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand, Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport, Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A. Shutemov", Andrea Arcangeli
Subject: [PATCH v7 10/23] mm/shmem: Handle uffd-wp during fork()
Date: Fri, 4 Mar 2022 13:16:55 +0800
Message-Id: <20220304051708.86193-11-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

Normally we skip copying pages during fork() for VM_SHARED shmem, but we
can't skip them anymore if uffd-wp is enabled on the dst vma.  This should
only happen when the src uffd has UFFD_FEATURE_EVENT_FORK enabled on a
uffd-wp shmem vma, so that VM_UFFD_WP will be propagated onto the dst vma
too; then we should copy the pgtables with the uffd-wp bit and pte markers,
because this information will be lost otherwise.

Since the condition checks will become even more complicated for deciding
"whether a vma needs to copy the pgtable during fork()", introduce a helper
vma_needs_copy() for it, so everything will be clearer.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/memory.c | 49 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 41 insertions(+), 8 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index e3e67e32eb8a..e9e335ecb5dc 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -857,6 +857,14 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		if (try_restore_exclusive_pte(src_pte, src_vma, addr))
 			return -EBUSY;
 		return -ENOENT;
+	} else if (is_pte_marker_entry(entry)) {
+		/*
+		 * We should only be copying the pgtable because dst_vma has
+		 * uffd-wp enabled; do a sanity check.
+		 */
+		WARN_ON_ONCE(!userfaultfd_wp(dst_vma));
+		set_pte_at(dst_mm, addr, dst_pte, pte);
+		return 0;
 	}
 	if (!userfaultfd_wp(dst_vma))
 		pte = pte_swp_clear_uffd_wp(pte);
@@ -1225,6 +1233,38 @@ copy_p4d_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
 	return 0;
 }

+/*
+ * Return true if the vma needs to copy the pgtable during this fork().
+ * Return false when we can speed up fork() by allowing lazy page faults
+ * later until the child accesses the memory range.
+ */
+bool
+vma_needs_copy(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
+{
+	/*
+	 * Always copy pgtables when dst_vma has uffd-wp enabled even if it's
+	 * file-backed (e.g. shmem).  When uffd-wp is enabled, the pgtable
+	 * contains uffd-wp protection information that we can't retrieve
+	 * from the page cache, and skipping the copy would lose that info.
+	 */
+	if (userfaultfd_wp(dst_vma))
+		return true;
+
+	if (src_vma->vm_flags & (VM_HUGETLB | VM_PFNMAP | VM_MIXEDMAP))
+		return true;
+
+	if (src_vma->anon_vma)
+		return true;
+
+	/*
+	 * Don't copy ptes where a page fault will fill them correctly.  Fork
+	 * becomes much lighter when there are big shared or private readonly
+	 * mappings.  The tradeoff is that copy_page_range is more efficient
+	 * than faulting.
+	 */
+	return false;
+}
+
 int
 copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 {
@@ -1238,14 +1278,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 	bool is_cow;
 	int ret;

-	/*
-	 * Don't copy ptes where a page fault will fill them correctly.
-	 * Fork becomes much lighter when there are big shared or private
-	 * readonly mappings. The tradeoff is that copy_page_range is more
-	 * efficient than faulting.
-	 */
-	if (!(src_vma->vm_flags & (VM_HUGETLB | VM_PFNMAP | VM_MIXEDMAP)) &&
-	    !src_vma->anon_vma)
+	if (!vma_needs_copy(dst_vma, src_vma))
 		return 0;

 	if (is_vm_hugetlb_page(src_vma))

From patchwork Fri Mar 4 05:16:56 2022
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand,
 Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport,
 Andrew Morton, Jerome Glisse, Mike Kravetz, Kirill A. Shutemov,
 Andrea Arcangeli
Subject: [PATCH v7 11/23] mm/hugetlb: Introduce huge pte version of uffd-wp helpers
Date: Fri, 4 Mar 2022 13:16:56 +0800
Message-Id: <20220304051708.86193-12-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

They will be used in follow-up patches to check/set/clear the uffd-wp bit of
a huge pte.  So far they reuse the small-pte helpers directly; archs can
override these versions when necessary (with the __HAVE_ARCH_HUGE_PTE_UFFD_WP*
macros) in the future.

Signed-off-by: Peter Xu
---
 arch/s390/include/asm/hugetlb.h | 15 +++++++++++++++
 include/asm-generic/hugetlb.h   | 15 +++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h
index 60f9241e5e4a..19c4b4431d27 100644
--- a/arch/s390/include/asm/hugetlb.h
+++ b/arch/s390/include/asm/hugetlb.h
@@ -115,6 +115,21 @@ static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot)
 	return pte_modify(pte, newprot);
 }

+static inline pte_t huge_pte_mkuffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static inline pte_t huge_pte_clear_uffd_wp(pte_t pte)
+{
+	return pte;
+}
+
+static inline int huge_pte_uffd_wp(pte_t pte)
+{
+	return 0;
+}
+
 static inline bool gigantic_page_runtime_supported(void)
 {
 	return true;
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index f39cad20ffc6..896f341f614d 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -35,6 +35,21 @@ static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot)
 	return pte_modify(pte, newprot);
 }

+static inline pte_t huge_pte_mkuffd_wp(pte_t pte)
+{
+	return pte_mkuffd_wp(pte);
+}
+
+static inline pte_t huge_pte_clear_uffd_wp(pte_t pte)
+{
+	return pte_clear_uffd_wp(pte);
+}
+
+static inline int huge_pte_uffd_wp(pte_t pte)
+{
+	return pte_uffd_wp(pte);
+}
+
 #ifndef __HAVE_ARCH_HUGE_PTE_CLEAR
 static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 				  pte_t *ptep, unsigned long sz)

From patchwork Fri Mar 4 05:16:57 2022
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand,
 Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport,
 Andrew Morton, Jerome Glisse, Mike Kravetz, Kirill A. Shutemov,
 Andrea Arcangeli
Subject: [PATCH v7 12/23] mm/hugetlb: Hook page faults for uffd write protection
Date: Fri, 4 Mar 2022 13:16:57 +0800
Message-Id: <20220304051708.86193-13-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

Hook up hugetlbfs_fault() with the capability to handle userfaultfd-wp
faults.  We do this slightly earlier than hugetlb_cow() so that we can
avoid taking some extra locks that we definitely don't need.
Reviewed-by: Mike Kravetz
Signed-off-by: Peter Xu
---
 mm/hugetlb.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b34f50156f7e..d2539e2fe066 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5680,6 +5680,26 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	if (unlikely(!pte_same(entry, huge_ptep_get(ptep))))
 		goto out_ptl;

+	/* Handle userfault-wp first, before trying to lock more pages */
+	if (userfaultfd_wp(vma) && huge_pte_uffd_wp(huge_ptep_get(ptep)) &&
+	    (flags & FAULT_FLAG_WRITE) && !huge_pte_write(entry)) {
+		struct vm_fault vmf = {
+			.vma = vma,
+			.address = haddr,
+			.real_address = address,
+			.flags = flags,
+		};
+
+		spin_unlock(ptl);
+		if (pagecache_page) {
+			unlock_page(pagecache_page);
+			put_page(pagecache_page);
+		}
+		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+		i_mmap_unlock_read(mapping);
+		return handle_userfault(&vmf, VM_UFFD_WP);
+	}
+
 	/*
 	 * hugetlb_cow() requires page locks of pte_page(entry) and
 	 * pagecache_page, so here we need take the former one

From patchwork Fri Mar 4 05:16:58 2022
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand,
 Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport,
 Andrew Morton, Jerome Glisse, Mike Kravetz, Kirill A. Shutemov,
 Andrea Arcangeli
Subject: [PATCH v7 13/23] mm/hugetlb: Take care of UFFDIO_COPY_MODE_WP
Date: Fri, 4 Mar 2022 13:16:58 +0800
Message-Id: <20220304051708.86193-14-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

Pass the wp_copy variable into hugetlb_mcopy_atomic_pte() throughout the
stack.  Apply the uffd-wp bit if UFFDIO_COPY_MODE_WP is set with UFFDIO_COPY.

Hugetlb pages are only managed by hugetlbfs, so we're safe even without
setting the dirty bit in the huge pte if the page is installed read-only.
However we'd better still keep the dirty bit set for a read-only UFFDIO_COPY
pte (when the UFFDIO_COPY_MODE_WP bit is set), not only to match what we do
with shmem, but also because the page does contain dirty data that the
kernel just copied from userspace.
Signed-off-by: Peter Xu
---
 include/linux/hugetlb.h |  6 ++++--
 mm/hugetlb.c            | 29 +++++++++++++++++++++++------
 mm/userfaultfd.c        | 14 +++++++++-----
 3 files changed, 36 insertions(+), 13 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 53c1b6082a4c..6347298778b6 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -160,7 +160,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, pte_t *dst_pte,
 				unsigned long dst_addr,
 				unsigned long src_addr,
 				enum mcopy_atomic_mode mode,
-				struct page **pagep);
+				struct page **pagep,
+				bool wp_copy);
 #endif /* CONFIG_USERFAULTFD */
 bool hugetlb_reserve_pages(struct inode *inode, long from, long to,
 						struct vm_area_struct *vma,
@@ -355,7 +356,8 @@ static inline int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 						unsigned long dst_addr,
 						unsigned long src_addr,
 						enum mcopy_atomic_mode mode,
-						struct page **pagep)
+						struct page **pagep,
+						bool wp_copy)
 {
 	BUG();
 	return 0;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d2539e2fe066..b094359255f7 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5763,7 +5763,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 			    unsigned long dst_addr,
 			    unsigned long src_addr,
 			    enum mcopy_atomic_mode mode,
-			    struct page **pagep)
+			    struct page **pagep,
+			    bool wp_copy)
 {
 	bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE);
 	struct hstate *h = hstate_vma(dst_vma);
@@ -5893,7 +5894,12 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		goto out_release_unlock;

 	ret = -EEXIST;
-	if (!huge_pte_none(huge_ptep_get(dst_pte)))
+	/*
+	 * We allow overwriting a pte marker: consider the case where both
+	 * MISSING and WP are registered — we first write-protect a none pte
+	 * (which has no page cache page backing it), then access the page.
+	 */
+	if (!huge_pte_none_mostly(huge_ptep_get(dst_pte)))
 		goto out_release_unlock;

 	if (vm_shared) {
@@ -5903,17 +5909,28 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		hugepage_add_new_anon_rmap(page, dst_vma, dst_addr);
 	}

-	/* For CONTINUE on a non-shared VMA, don't set VM_WRITE for CoW. */
-	if (is_continue && !vm_shared)
+	/*
+	 * For either: (1) CONTINUE on a non-shared VMA, or (2) UFFDIO_COPY
+	 * with wp flag set, don't set pte write bit.
+	 */
+	if (wp_copy || (is_continue && !vm_shared))
 		writable = 0;
 	else
 		writable = dst_vma->vm_flags & VM_WRITE;

 	_dst_pte = make_huge_pte(dst_vma, page, writable);
-	if (writable)
-		_dst_pte = huge_pte_mkdirty(_dst_pte);
+	/*
+	 * Always mark the UFFDIO_COPY page dirty; note that this may not be
+	 * extremely important for hugetlbfs for now since swapping is not
+	 * supported, but we should still be clear that this page cannot be
+	 * thrown away at will, even if the write bit is not set.
+	 */
+	_dst_pte = huge_pte_mkdirty(_dst_pte);
 	_dst_pte = pte_mkyoung(_dst_pte);

+	if (wp_copy)
+		_dst_pte = huge_pte_mkuffd_wp(_dst_pte);
+
 	set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);

 	(void)huge_ptep_set_access_flags(dst_vma, dst_addr, dst_pte, _dst_pte,
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index ef418a48b121..54e58f0d93e4 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -304,7 +304,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 					      unsigned long dst_start,
 					      unsigned long src_start,
 					      unsigned long len,
-					      enum mcopy_atomic_mode mode)
+					      enum mcopy_atomic_mode mode,
+					      bool wp_copy)
 {
 	int vm_shared = dst_vma->vm_flags & VM_SHARED;
 	ssize_t err;
@@ -392,7 +393,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 		}

 		if (mode != MCOPY_ATOMIC_CONTINUE &&
-		    !huge_pte_none(huge_ptep_get(dst_pte))) {
+		    !huge_pte_none_mostly(huge_ptep_get(dst_pte))) {
 			err = -EEXIST;
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 			i_mmap_unlock_read(mapping);
@@ -400,7 +401,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 		}

 		err = hugetlb_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma,
-					       dst_addr, src_addr, mode, &page);
+					       dst_addr, src_addr, mode, &page,
+					       wp_copy);

 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 		i_mmap_unlock_read(mapping);
@@ -455,7 +457,8 @@ extern ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 				      unsigned long dst_start,
 				      unsigned long src_start,
 				      unsigned long len,
-				      enum mcopy_atomic_mode mode);
+				      enum mcopy_atomic_mode mode,
+				      bool wp_copy);
 #endif /* CONFIG_HUGETLB_PAGE */

 static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
@@ -575,7 +578,8 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
 	 */
 	if (is_vm_hugetlb_page(dst_vma))
 		return  __mcopy_atomic_hugetlb(dst_mm, dst_vma, dst_start,
-					       src_start, len, mcopy_mode);
+					       src_start, len, mcopy_mode,
+					       wp_copy);

 	if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
 		goto out_unlock;

From patchwork Fri Mar 4 05:16:59 2022
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand,
 Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport,
 Andrew Morton, Jerome Glisse, Mike Kravetz, Kirill A. Shutemov,
 Andrea Arcangeli
Subject: [PATCH v7 14/23] mm/hugetlb: Handle UFFDIO_WRITEPROTECT
Date: Fri, 4 Mar 2022 13:16:59 +0800
Message-Id: <20220304051708.86193-15-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

This starts by passing cp_flags into hugetlb_change_protection() so that
hugetlb will be able to handle MM_CP_UFFD_WP[_RESOLVE] requests.

huge_pte_clear_uffd_wp() is introduced to handle the case where
UFFDIO_WRITEPROTECT is requested upon migrating huge page entries.
Reviewed-by: Mike Kravetz Signed-off-by: Peter Xu --- include/linux/hugetlb.h | 6 ++++-- mm/hugetlb.c | 13 ++++++++++++- mm/mprotect.c | 3 ++- mm/userfaultfd.c | 8 ++++++++ 4 files changed, 26 insertions(+), 4 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 6347298778b6..38c5ac28b787 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -210,7 +210,8 @@ struct page *follow_huge_pgd(struct mm_struct *mm, unsigned long address, int pmd_huge(pmd_t pmd); int pud_huge(pud_t pud); unsigned long hugetlb_change_protection(struct vm_area_struct *vma, - unsigned long address, unsigned long end, pgprot_t newprot); + unsigned long address, unsigned long end, pgprot_t newprot, + unsigned long cp_flags); bool is_hugetlb_entry_migration(pte_t pte); void hugetlb_unshare_all_pmds(struct vm_area_struct *vma); @@ -391,7 +392,8 @@ static inline void move_hugetlb_state(struct page *oldpage, static inline unsigned long hugetlb_change_protection( struct vm_area_struct *vma, unsigned long address, - unsigned long end, pgprot_t newprot) + unsigned long end, pgprot_t newprot, + unsigned long cp_flags) { return 0; } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index b094359255f7..396d5a516d05 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6151,7 +6151,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, } unsigned long hugetlb_change_protection(struct vm_area_struct *vma, - unsigned long address, unsigned long end, pgprot_t newprot) + unsigned long address, unsigned long end, + pgprot_t newprot, unsigned long cp_flags) { struct mm_struct *mm = vma->vm_mm; unsigned long start = address; @@ -6161,6 +6162,8 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, unsigned long pages = 0; bool shared_pmd = false; struct mmu_notifier_range range; + bool uffd_wp = cp_flags & MM_CP_UFFD_WP; + bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE; /* * In the case of shared PMDs, the area to flush could be 
beyond @@ -6202,6 +6205,10 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, entry = make_readable_migration_entry( swp_offset(entry)); newpte = swp_entry_to_pte(entry); + if (uffd_wp) + newpte = pte_swp_mkuffd_wp(newpte); + else if (uffd_wp_resolve) + newpte = pte_swp_clear_uffd_wp(newpte); set_huge_swap_pte_at(mm, address, ptep, newpte, huge_page_size(h)); pages++; @@ -6216,6 +6223,10 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, old_pte = huge_ptep_modify_prot_start(vma, address, ptep); pte = huge_pte_modify(old_pte, newprot); pte = arch_make_huge_pte(pte, shift, vma->vm_flags); + if (uffd_wp) + pte = huge_pte_mkuffd_wp(huge_pte_wrprotect(pte)); + else if (uffd_wp_resolve) + pte = huge_pte_clear_uffd_wp(pte); huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte); pages++; } diff --git a/mm/mprotect.c b/mm/mprotect.c index 95b307d4766d..1b98e29316b6 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -451,7 +451,8 @@ unsigned long change_protection(struct vm_area_struct *vma, unsigned long start, BUG_ON((cp_flags & MM_CP_UFFD_WP_ALL) == MM_CP_UFFD_WP_ALL); if (is_vm_hugetlb_page(vma)) - pages = hugetlb_change_protection(vma, start, end, newprot); + pages = hugetlb_change_protection(vma, start, end, newprot, + cp_flags); else pages = change_protection_range(vma, start, end, newprot, cp_flags); diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 54e58f0d93e4..441728732033 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -704,6 +704,7 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start, atomic_t *mmap_changing) { struct vm_area_struct *dst_vma; + unsigned long page_mask; pgprot_t newprot; int err; @@ -740,6 +741,13 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start, if (!vma_is_anonymous(dst_vma)) goto out_unlock; + if (is_vm_hugetlb_page(dst_vma)) { + err = -EINVAL; + page_mask = vma_kernel_pagesize(dst_vma) - 1; + if ((start & page_mask) || (len & page_mask)) 
+			goto out_unlock;
+	}
+
 	if (enable_wp)
 		newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE));
 	else

From patchwork Fri Mar 4 05:17:00 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768470
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand,
    Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport,
    Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A . Shutemov",
    Andrea Arcangeli
Subject: [PATCH v7 15/23] mm/hugetlb: Handle pte markers in page faults
Date: Fri, 4 Mar 2022 13:17:00 +0800
Message-Id: <20220304051708.86193-16-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

Allow hugetlb code to handle pte markers just like none ptes. It's mostly
there already; we just need to make sure we don't assume hugetlb_no_page()
only handles none ptes, so when detecting a pte change we should use
pte_same() rather than pte_none(). We need to pass in the old_pte to do the
comparison.

Check the original pte to see whether it's a pte marker; if it is, recover
the uffd-wp bit on the new pte to be installed, so that the next write will
be trapped by uffd.
Signed-off-by: Peter Xu
---
 mm/hugetlb.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 396d5a516d05..afd3d93cfe9a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5383,7 +5383,8 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_area_struct *vma,
 static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			struct vm_area_struct *vma,
 			struct address_space *mapping, pgoff_t idx,
-			unsigned long address, pte_t *ptep, unsigned int flags)
+			unsigned long address, pte_t *ptep,
+			pte_t old_pte, unsigned int flags)
 {
 	struct hstate *h = hstate_vma(vma);
 	vm_fault_t ret = VM_FAULT_SIGBUS;
@@ -5509,7 +5510,8 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	ptl = huge_pte_lock(h, mm, ptep);
 	ret = 0;
-	if (!huge_pte_none(huge_ptep_get(ptep)))
+	/* If pte changed from under us, retry */
+	if (!pte_same(huge_ptep_get(ptep), old_pte))
 		goto backout;

 	if (anon_rmap) {
@@ -5519,6 +5521,12 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 		page_dup_rmap(page, true);
 	new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE)
 				&& (vma->vm_flags & VM_SHARED)));
+	/*
+	 * If this pte was previously wr-protected, keep it wr-protected even
+	 * if populated.
+	 */
+	if (unlikely(pte_marker_uffd_wp(old_pte)))
+		new_pte = huge_pte_wrprotect(huge_pte_mkuffd_wp(new_pte));
 	set_huge_pte_at(mm, haddr, ptep, new_pte);

 	hugetlb_count_add(pages_per_huge_page(h), mm);
@@ -5636,8 +5644,10 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	mutex_lock(&hugetlb_fault_mutex_table[hash]);

 	entry = huge_ptep_get(ptep);
-	if (huge_pte_none(entry)) {
-		ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep, flags);
+	/* PTE markers should be handled the same way as none pte */
+	if (huge_pte_none_mostly(entry)) {
+		ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep,
+				      entry, flags);
 		goto out_mutex;
 	}

From patchwork Fri Mar 4 05:17:01 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768471
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v7 16/23] mm/hugetlb: Allow uffd wr-protect none ptes
Date: Fri, 4 Mar 2022 13:17:01 +0800
Message-Id: <20220304051708.86193-17-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>
Teach hugetlbfs code to wr-protect none ptes just in case the page cache
exists for that pte. Meanwhile we also need to be able to recognize a
uffd-wp marker pte and remove it for uffd_wp_resolve.

While at it, introduce a variable "psize" to replace all references to the
huge page size fetcher.

Reviewed-by: Mike Kravetz
Signed-off-by: Peter Xu
---
 mm/hugetlb.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index afd3d93cfe9a..1a20be29ac3a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6169,7 +6169,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	pte_t *ptep;
 	pte_t pte;
 	struct hstate *h = hstate_vma(vma);
-	unsigned long pages = 0;
+	unsigned long pages = 0, psize = huge_page_size(h);
 	bool shared_pmd = false;
 	struct mmu_notifier_range range;
 	bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
@@ -6189,13 +6189,19 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	mmu_notifier_invalidate_range_start(&range);
 	i_mmap_lock_write(vma->vm_file->f_mapping);
-	for (; address < end; address += huge_page_size(h)) {
+	for (; address < end; address += psize) {
 		spinlock_t *ptl;
-		ptep = huge_pte_offset(mm, address, huge_page_size(h));
+		ptep = huge_pte_offset(mm, address, psize);
 		if (!ptep)
 			continue;
 		ptl = huge_pte_lock(h, mm, ptep);
 		if (huge_pmd_unshare(mm, vma, &address, ptep)) {
+			/*
+			 * When uffd-wp is enabled on the vma, unshare
+			 * shouldn't happen at all.  Warn about it if it
+			 * happened due to some reason.
+			 */
+			WARN_ON_ONCE(uffd_wp || uffd_wp_resolve);
 			pages++;
 			spin_unlock(ptl);
 			shared_pmd = true;
@@ -6220,12 +6226,20 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 				else if (uffd_wp_resolve)
 					newpte = pte_swp_clear_uffd_wp(newpte);
 				set_huge_swap_pte_at(mm, address, ptep,
-						     newpte, huge_page_size(h));
+						     newpte, psize);
 				pages++;
 			}
 			spin_unlock(ptl);
 			continue;
 		}
+		if (unlikely(pte_marker_uffd_wp(pte))) {
+			/*
+			 * This is changing a non-present pte into a none pte,
+			 * no need for huge_ptep_modify_prot_start/commit().
+			 */
+			if (uffd_wp_resolve)
+				huge_pte_clear(mm, address, ptep, psize);
+		}
 		if (!huge_pte_none(pte)) {
 			pte_t old_pte;
 			unsigned int shift = huge_page_shift(hstate_vma(vma));
@@ -6239,6 +6253,12 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 				pte = huge_pte_clear_uffd_wp(pte);
 			huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte);
 			pages++;
+		} else {
+			/* None pte */
+			if (unlikely(uffd_wp))
+				/* Safe to modify directly (none->non-present). */
+				set_huge_pte_at(mm, address, ptep,
+						make_pte_marker(PTE_MARKER_UFFD_WP));
 		}
 		spin_unlock(ptl);
 	}

From patchwork Fri Mar 4 05:17:02 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768472
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v7 17/23] mm/hugetlb: Only drop uffd-wp special pte if required
Date: Fri, 4 Mar 2022 13:17:02 +0800
Message-Id: <20220304051708.86193-18-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

As with shmem uffd-wp special ptes, only drop the uffd-wp special swap pte
if unmapping an entire vma or synchronized such that faults can not race
with the unmap operation. This requires passing zap_flags all the way to
the lowest level hugetlb unmap routine: __unmap_hugepage_range.

In general, unmap calls originating in hugetlbfs code will pass the
ZAP_FLAG_DROP_MARKER flag, as synchronization is in place to prevent
faults. The exception is hole punch, which will first unmap without any
synchronization.
Later, when hole punch actually removes the page from the file, it will
check to see if there was a subsequent fault and, if so, take the hugetlb
fault mutex while unmapping again. This second unmap will pass in
ZAP_FLAG_DROP_MARKER.

The justification for "whether to apply ZAP_FLAG_DROP_MARKER when
unmapping a hugetlb range" is (IMHO): we should never reach a state where
a page fault could erroneously fault in a page-cache page, writable, that
was wr-protected, even for an extremely short period. That could happen
if e.g. we passed ZAP_FLAG_DROP_MARKER when hugetlbfs_punch_hole() calls
hugetlb_vmdelete_list(): if a page faults after that call and before
remove_inode_hugepages() is executed, the page cache can be mapped
writable again in that small racy window, which can cause unexpected data
to be overwritten.

Reviewed-by: Mike Kravetz
Signed-off-by: Peter Xu
---
 fs/hugetlbfs/inode.c    | 15 +++++++++------
 include/linux/hugetlb.h |  8 +++++---
 mm/hugetlb.c            | 33 +++++++++++++++++++++++++--------
 mm/memory.c             |  5 ++++-
 4 files changed, 43 insertions(+), 18 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 171212bdaae6..d017c674f1b8 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -404,7 +404,8 @@ static void remove_huge_page(struct page *page)
 }

 static void
-hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end)
+hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
+		      unsigned long zap_flags)
 {
 	struct vm_area_struct *vma;
@@ -438,7 +439,7 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end)
 		}

 		unmap_hugepage_range(vma, vma->vm_start + v_offset, v_end,
-				     NULL);
+				     NULL, zap_flags);
 	}
 }
@@ -516,7 +517,8 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 			mutex_lock(&hugetlb_fault_mutex_table[hash]);
 			hugetlb_vmdelete_list(&mapping->i_mmap,
 				index * pages_per_huge_page(h),
-				(index + 1) * pages_per_huge_page(h));
+				(index + 1) * pages_per_huge_page(h),
+				ZAP_FLAG_DROP_MARKER);
 			i_mmap_unlock_write(mapping);
 		}
@@ -582,7 +584,8 @@ static void hugetlb_vmtruncate(struct inode *inode, loff_t offset)
 	i_mmap_lock_write(mapping);
 	i_size_write(inode, offset);
 	if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))
-		hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0);
+		hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0,
+				      ZAP_FLAG_DROP_MARKER);
 	i_mmap_unlock_write(mapping);
 	remove_inode_hugepages(inode, offset, LLONG_MAX);
 }
@@ -615,8 +618,8 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 		i_mmap_lock_write(mapping);
 		if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))
 			hugetlb_vmdelete_list(&mapping->i_mmap,
-					      hole_start >> PAGE_SHIFT,
-					      hole_end >> PAGE_SHIFT);
+					      hole_start >> PAGE_SHIFT,
+					      hole_end >> PAGE_SHIFT, 0);
 		i_mmap_unlock_write(mapping);
 		remove_inode_hugepages(inode, hole_start, hole_end);
 		inode_unlock(inode);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 38c5ac28b787..ab48b3bbb0e6 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -143,11 +143,12 @@ long follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *,
 			 unsigned long *, unsigned long *, long, unsigned int,
 			 int *);
 void unmap_hugepage_range(struct vm_area_struct *,
-			  unsigned long, unsigned long, struct page *);
+			  unsigned long, unsigned long, struct page *,
+			  unsigned long);
 void __unmap_hugepage_range_final(struct mmu_gather *tlb,
 			  struct vm_area_struct *vma,
 			  unsigned long start, unsigned long end,
-			  struct page *ref_page);
+			  struct page *ref_page, unsigned long zap_flags);
 void hugetlb_report_meminfo(struct seq_file *);
 int hugetlb_report_node_meminfo(char *buf, int len, int nid);
 void hugetlb_show_meminfo(void);
@@ -400,7 +401,8 @@ static inline unsigned long hugetlb_change_protection(

 static inline void __unmap_hugepage_range_final(struct mmu_gather *tlb,
 			struct vm_area_struct *vma, unsigned long start,
-			unsigned long end, struct page *ref_page)
+			unsigned long end, struct page *ref_page,
+			unsigned long zap_flags)
 {
 	BUG();
 }
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1a20be29ac3a..994d7a3ee871 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4931,7 +4931,7 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,

 static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 				   unsigned long start, unsigned long end,
-				   struct page *ref_page)
+				   struct page *ref_page, unsigned long zap_flags)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long address;
@@ -4987,7 +4987,18 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
 		 * unmapped and its refcount is dropped, so just clear pte here.
 		 */
 		if (unlikely(!pte_present(pte))) {
-			huge_pte_clear(mm, address, ptep, sz);
+			/*
+			 * If the pte was wr-protected by uffd-wp in any of the
+			 * swap forms, meanwhile the caller does not want to
+			 * drop the uffd-wp bit in this zap, then replace the
+			 * pte with a marker.
+			 */
+			if (pte_swp_uffd_wp_any(pte) &&
+			    !(zap_flags & ZAP_FLAG_DROP_MARKER))
+				set_huge_pte_at(mm, address, ptep,
+						make_pte_marker(PTE_MARKER_UFFD_WP));
+			else
+				huge_pte_clear(mm, address, ptep, sz);
 			spin_unlock(ptl);
 			continue;
 		}
@@ -5015,7 +5026,11 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
 		tlb_remove_huge_tlb_entry(h, tlb, ptep, address);
 		if (huge_pte_dirty(pte))
 			set_page_dirty(page);
-
+		/* Leave a uffd-wp pte marker if needed */
+		if (huge_pte_uffd_wp(pte) &&
+		    !(zap_flags & ZAP_FLAG_DROP_MARKER))
+			set_huge_pte_at(mm, address, ptep,
+					make_pte_marker(PTE_MARKER_UFFD_WP));
 		hugetlb_count_sub(pages_per_huge_page(h), mm);
 		page_remove_rmap(page, vma, true);
@@ -5049,9 +5064,10 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct

 void __unmap_hugepage_range_final(struct mmu_gather *tlb,
 			  struct vm_area_struct *vma, unsigned long start,
-			  unsigned long end, struct page *ref_page)
+			  unsigned long end, struct page *ref_page,
+			  unsigned long zap_flags)
 {
-	__unmap_hugepage_range(tlb, vma, start, end, ref_page);
+	__unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags);

 	/*
 	 * Clear this flag so that x86's huge_pmd_share page_table_shareable
@@ -5067,12 +5083,13 @@ void __unmap_hugepage_range_final(struct mmu_gather *tlb,
 }

 void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
-			  unsigned long end, struct page *ref_page)
+			  unsigned long end, struct page *ref_page,
+			  unsigned long zap_flags)
 {
 	struct mmu_gather tlb;

 	tlb_gather_mmu(&tlb, vma->vm_mm);
-	__unmap_hugepage_range(&tlb, vma, start, end, ref_page);
+	__unmap_hugepage_range(&tlb, vma, start, end, ref_page, zap_flags);
 	tlb_finish_mmu(&tlb);
 }
@@ -5127,7 +5144,7 @@ static void unmap_ref_private(struct mm_struct *mm, struct vm_area_struct *vma,
 		 */
 		if (!is_vma_resv_set(iter_vma, HPAGE_RESV_OWNER))
 			unmap_hugepage_range(iter_vma, address,
-					     address + huge_page_size(h), page);
+					     address + huge_page_size(h), page, 0);
 	}
 	i_mmap_unlock_write(mapping);
 }
diff --git a/mm/memory.c b/mm/memory.c
index e9e335ecb5dc..43ab8d6c768e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1679,8 +1679,11 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 		 * safe to do nothing in this case.
 		 */
 		if (vma->vm_file) {
+			unsigned long zap_flags = details ?
+				details->zap_flags : 0;
 			i_mmap_lock_write(vma->vm_file->f_mapping);
-			__unmap_hugepage_range_final(tlb, vma, start, end, NULL);
+			__unmap_hugepage_range_final(tlb, vma, start, end,
+						     NULL, zap_flags);
 			i_mmap_unlock_write(vma->vm_file->f_mapping);
 		}
 	} else

From patchwork Fri Mar 4 05:17:03 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768473
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand,
    Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport,
    Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A. Shutemov",
    Andrea Arcangeli
Subject: [PATCH v7 18/23] mm/hugetlb: Handle uffd-wp during fork()
Date: Fri,  4 Mar 2022 13:17:03 +0800
Message-Id: <20220304051708.86193-19-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

Firstly, we'll need to pass dst_vma into copy_hugetlb_page_range(),
because for uffd-wp it's the dst vma that matters when deciding how we
should treat uffd-wp protected ptes.

We should recognize pte markers during fork and do the pte copy if
needed.
Signed-off-by: Peter Xu
---
 include/linux/hugetlb.h |  7 +++++--
 mm/hugetlb.c            | 41 +++++++++++++++++++++++++++--------------
 mm/memory.c             |  2 +-
 3 files changed, 33 insertions(+), 17 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ab48b3bbb0e6..6df51d23b7ee 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -137,7 +137,8 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 			     struct vm_area_struct *new_vma,
 			     unsigned long old_addr, unsigned long new_addr,
 			     unsigned long len);
-int copy_hugetlb_page_range(struct mm_struct *, struct mm_struct *, struct vm_area_struct *);
+int copy_hugetlb_page_range(struct mm_struct *, struct mm_struct *,
+			    struct vm_area_struct *, struct vm_area_struct *);
 long follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *,
 			 struct page **, struct vm_area_struct **,
 			 unsigned long *, unsigned long *, long, unsigned int,
@@ -268,7 +269,9 @@ static inline struct page *follow_huge_addr(struct mm_struct *mm,
 }

 static inline int copy_hugetlb_page_range(struct mm_struct *dst,
-					  struct mm_struct *src, struct vm_area_struct *vma)
+					  struct mm_struct *src,
+					  struct vm_area_struct *dst_vma,
+					  struct vm_area_struct *src_vma)
 {
 	BUG();
 	return 0;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 994d7a3ee871..f2508620f197 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4696,23 +4696,24 @@ hugetlb_install_page(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr
 }

 int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
-			    struct vm_area_struct *vma)
+			    struct vm_area_struct *dst_vma,
+			    struct vm_area_struct *src_vma)
 {
 	pte_t *src_pte, *dst_pte, entry, dst_entry;
 	struct page *ptepage;
 	unsigned long addr;
-	bool cow = is_cow_mapping(vma->vm_flags);
-	struct hstate *h = hstate_vma(vma);
+	bool cow = is_cow_mapping(src_vma->vm_flags);
+	struct hstate *h = hstate_vma(src_vma);
 	unsigned long sz = huge_page_size(h);
 	unsigned long npages = pages_per_huge_page(h);
-	struct address_space *mapping = vma->vm_file->f_mapping;
+	struct address_space *mapping = src_vma->vm_file->f_mapping;
 	struct mmu_notifier_range range;
 	int ret = 0;

 	if (cow) {
-		mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, src,
-					vma->vm_start,
-					vma->vm_end);
+		mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, src_vma, src,
+					src_vma->vm_start,
+					src_vma->vm_end);
 		mmu_notifier_invalidate_range_start(&range);
 	} else {
 		/*
@@ -4724,12 +4725,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 		i_mmap_lock_read(mapping);
 	}

-	for (addr = vma->vm_start; addr < vma->vm_end; addr += sz) {
+	for (addr = src_vma->vm_start; addr < src_vma->vm_end; addr += sz) {
 		spinlock_t *src_ptl, *dst_ptl;
 		src_pte = huge_pte_offset(src, addr, sz);
 		if (!src_pte)
 			continue;
-		dst_pte = huge_pte_alloc(dst, vma, addr, sz);
+		dst_pte = huge_pte_alloc(dst, dst_vma, addr, sz);
 		if (!dst_pte) {
 			ret = -ENOMEM;
 			break;
@@ -4764,6 +4765,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 		} else if (unlikely(is_hugetlb_entry_migration(entry) ||
 				    is_hugetlb_entry_hwpoisoned(entry))) {
 			swp_entry_t swp_entry = pte_to_swp_entry(entry);
+			bool uffd_wp = huge_pte_uffd_wp(entry);

 			if (is_writable_migration_entry(swp_entry) && cow) {
 				/*
@@ -4773,10 +4775,21 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				swp_entry = make_readable_migration_entry(
 							swp_offset(swp_entry));
 				entry = swp_entry_to_pte(swp_entry);
+				if (userfaultfd_wp(src_vma) && uffd_wp)
+					entry = huge_pte_mkuffd_wp(entry);
 				set_huge_swap_pte_at(src, addr, src_pte,
 						     entry, sz);
 			}
+			if (!userfaultfd_wp(dst_vma) && uffd_wp)
+				entry = huge_pte_clear_uffd_wp(entry);
 			set_huge_swap_pte_at(dst, addr, dst_pte, entry, sz);
+		} else if (unlikely(is_pte_marker(entry))) {
+			/*
+			 * We copy the pte marker only if the dst vma has
+			 * uffd-wp enabled.
+			 */
+			if (userfaultfd_wp(dst_vma))
+				set_huge_pte_at(dst, addr, dst_pte, entry);
 		} else {
 			entry = huge_ptep_get(src_pte);
 			ptepage = pte_page(entry);
@@ -4791,20 +4804,20 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 			 * need to be without the pgtable locks since we could
 			 * sleep during the process.
 			 */
-			if (unlikely(page_needs_cow_for_dma(vma, ptepage))) {
+			if (unlikely(page_needs_cow_for_dma(src_vma, ptepage))) {
 				pte_t src_pte_old = entry;
 				struct page *new;

 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
 				/* Do not use reserve as it's private owned */
-				new = alloc_huge_page(vma, addr, 1);
+				new = alloc_huge_page(dst_vma, addr, 1);
 				if (IS_ERR(new)) {
 					put_page(ptepage);
 					ret = PTR_ERR(new);
 					break;
 				}
-				copy_user_huge_page(new, ptepage, addr, vma,
+				copy_user_huge_page(new, ptepage, addr, dst_vma,
 						    npages);
 				put_page(ptepage);

@@ -4814,13 +4827,13 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
 				entry = huge_ptep_get(src_pte);
 				if (!pte_same(src_pte_old, entry)) {
-					restore_reserve_on_error(h, vma, addr,
+					restore_reserve_on_error(h, dst_vma, addr,
 								 new);
 					put_page(new);
 					/* dst_entry won't change as in child */
 					goto again;
 				}
-				hugetlb_install_page(vma, dst_pte, addr, new);
+				hugetlb_install_page(dst_vma, dst_pte, addr, new);
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
 				continue;
diff --git a/mm/memory.c b/mm/memory.c
index 43ab8d6c768e..66c9890b7678 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1282,7 +1282,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 		return 0;

 	if (is_vm_hugetlb_page(src_vma))
-		return copy_hugetlb_page_range(dst_mm, src_mm, src_vma);
+		return copy_hugetlb_page_range(dst_mm, src_mm, dst_vma, src_vma);

 	if (unlikely(src_vma->vm_flags & VM_PFNMAP)) {
 		/*

From patchwork Fri Mar  4 05:17:04 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768474
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand,
    Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport,
    Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A. Shutemov",
    Andrea Arcangeli
Subject: [PATCH v7 19/23] mm/khugepaged: Don't recycle vma pgtable if uffd-wp registered
Date: Fri,  4 Mar 2022 13:17:04 +0800
Message-Id: <20220304051708.86193-20-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

When we're trying to collapse a 2M huge shmem page, don't retract the
pgtable pmd page if it's registered with uffd-wp, because that pgtable
could have pte markers installed.  Recycling that pgtable means we'll
lose the pte markers, which could cause data loss for an uffd-wp enabled
application on shmem.

Instead of disabling khugepaged on these files, simply skip retracting
these special VMAs; the page cache can still be merged into a huge thp,
and other mm/vma can still map the range of the file with a huge thp
when appropriate.
Note that checking VM_UFFD_WP needs to be done with mmap_sem held for
write, which avoids a race like:

             khugepaged                    user thread
             ==========                    ===========
     check VM_UFFD_WP, not set
                                  UFFDIO_REGISTER with uffd-wp on shmem
                                  wr-protect some pages (install markers)
     take mmap_sem write lock
     erase pmd and free pmd page
       --> pte markers are dropped unnoticed!

Signed-off-by: Peter Xu
---
 mm/khugepaged.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a4e5eaf3eb01..87d88d6725af 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1456,6 +1456,10 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
 	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE))
 		return;

+	/* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
+	if (userfaultfd_wp(vma))
+		return;
+
 	hpage = find_lock_page(vma->vm_file->f_mapping,
 			       linear_page_index(vma, haddr));
 	if (!hpage)
@@ -1591,7 +1595,15 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		 * reverse order.  Trylock is a way to avoid deadlock.
 		 */
 		if (mmap_write_trylock(mm)) {
-			if (!khugepaged_test_exit(mm))
+			/*
+			 * When a vma is registered with uffd-wp, we can't
+			 * recycle the pmd pgtable because there can be pte
+			 * markers installed.  Skip it only, so the rest mm/vma
+			 * can still have the same file mapped hugely, however
+			 * it'll always be mapped in small page size for uffd-wp
+			 * registered ranges.
+			 */
+			if (!khugepaged_test_exit(mm) && !userfaultfd_wp(vma))
				collapse_and_free_pmd(mm, vma, addr, pmd);
 			mmap_write_unlock(mm);
 		} else {

From patchwork Fri Mar  4 05:17:05 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768475
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand,
    Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport,
    Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A. Shutemov",
    Andrea Arcangeli
Subject: [PATCH v7 20/23] mm/pagemap: Recognize uffd-wp bit for shmem/hugetlbfs
Date: Fri,  4 Mar 2022 13:17:05 +0800
Message-Id: <20220304051708.86193-21-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

This requires the pagemap code to be able to recognize the newly
introduced swap special pte for uffd-wp, as well as the general hugetlb
case that we recently started to support.  It should make pagemap
uffd-wp support complete.
Signed-off-by: Peter Xu
---
 fs/proc/task_mmu.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 2c48b1eaaa9c..46e5896b82d2 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1421,6 +1421,8 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
 		migration = is_migration_entry(entry);
 		if (is_pfn_swap_entry(entry))
 			page = pfn_swap_entry_to_page(entry);
+		if (pte_marker_entry_uffd_wp(entry))
+			flags |= PM_UFFD_WP;
 	}

 	if (page && !PageAnon(page))
@@ -1556,10 +1558,15 @@ static int pagemap_hugetlb_range(pte_t *ptep, unsigned long hmask,
 		if (page_mapcount(page) == 1)
 			flags |= PM_MMAP_EXCLUSIVE;

+		if (huge_pte_uffd_wp(pte))
+			flags |= PM_UFFD_WP;
+
 		flags |= PM_PRESENT;
 		if (pm->show_pfn)
 			frame = pte_pfn(pte) +
 				((addr & ~hmask) >> PAGE_SHIFT);
+	} else if (pte_swp_uffd_wp_any(pte)) {
+		flags |= PM_UFFD_WP;
 	}

 	for (; addr != end; addr += PAGE_SIZE) {

From patchwork Fri Mar  4 05:17:06 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768476
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand,
    Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport,
    Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A. Shutemov",
    Andrea Arcangeli
Subject: [PATCH v7 21/23] mm/uffd: Enable write protection for shmem & hugetlbfs
Date: Fri,  4 Mar 2022 13:17:06 +0800
Message-Id: <20220304051708.86193-22-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

We've had all the necessary changes ready for both shmem and hugetlbfs.
Turn on all the shmem/hugetlbfs switches for userfaultfd-wp.

We can expand UFFD_API_RANGE_IOCTLS_BASIC with _UFFDIO_WRITEPROTECT too,
because all existing types now support write protection mode.

Since vma_can_userfault() will be used elsewhere, move it into
userfaultfd_k.h.
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c                 | 21 ++-------------------
 include/linux/userfaultfd_k.h    | 11 +++++++++++
 include/uapi/linux/userfaultfd.h | 10 ++++++++--
 mm/userfaultfd.c                 |  9 +++------
 4 files changed, 24 insertions(+), 27 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 8b4a94f5a238..cd19083123fe 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1257,24 +1257,6 @@ static __always_inline int validate_range(struct mm_struct *mm,
 	return 0;
 }

-static inline bool vma_can_userfault(struct vm_area_struct *vma,
-				     unsigned long vm_flags)
-{
-	/* FIXME: add WP support to hugetlbfs and shmem */
-	if (vm_flags & VM_UFFD_WP) {
-		if (is_vm_hugetlb_page(vma) || vma_is_shmem(vma))
-			return false;
-	}
-
-	if (vm_flags & VM_UFFD_MINOR) {
-		if (!(is_vm_hugetlb_page(vma) || vma_is_shmem(vma)))
-			return false;
-	}
-
-	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
-	       vma_is_shmem(vma);
-}
-
 static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 				unsigned long arg)
 {
@@ -1954,7 +1936,8 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx,
 		~(UFFD_FEATURE_MINOR_HUGETLBFS | UFFD_FEATURE_MINOR_SHMEM);
 #endif
 #ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-	uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP;
+	uffdio_api.features &=
+	    ~(UFFD_FEATURE_PAGEFAULT_FLAG_WP | UFFD_FEATURE_WP_HUGETLBFS_SHMEM);
 #endif
 	uffdio_api.ioctls = UFFD_API_IOCTLS;
 	ret = -EFAULT;
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 827e38b7be65..69b174807d8f 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include

 /* The set of all possible UFFD-related VM flags. */
 #define __VM_UFFD_FLAGS (VM_UFFD_MISSING | VM_UFFD_WP | VM_UFFD_MINOR)

@@ -140,6 +141,16 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 	return vma->vm_flags & __VM_UFFD_FLAGS;
 }

+static inline bool vma_can_userfault(struct vm_area_struct *vma,
+				     unsigned long vm_flags)
+{
+	if (vm_flags & VM_UFFD_MINOR)
+		return is_vm_hugetlb_page(vma) || vma_is_shmem(vma);
+
+	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
+	       vma_is_shmem(vma);
+}
+
 extern int dup_userfaultfd(struct vm_area_struct *, struct list_head *);
 extern void dup_userfaultfd_complete(struct list_head *);

diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index ef739054cb1c..7d32b1e797fb 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -33,7 +33,8 @@
 			   UFFD_FEATURE_THREAD_ID |		\
 			   UFFD_FEATURE_MINOR_HUGETLBFS |	\
 			   UFFD_FEATURE_MINOR_SHMEM |		\
-			   UFFD_FEATURE_EXACT_ADDRESS)
+			   UFFD_FEATURE_EXACT_ADDRESS |		\
+			   UFFD_FEATURE_WP_HUGETLBFS_SHMEM)
 #define UFFD_API_IOCTLS				\
 	((__u64)1 << _UFFDIO_REGISTER |		\
 	 (__u64)1 << _UFFDIO_UNREGISTER |	\
@@ -47,7 +48,8 @@
 #define UFFD_API_RANGE_IOCTLS_BASIC		\
 	((__u64)1 << _UFFDIO_WAKE |		\
 	 (__u64)1 << _UFFDIO_COPY |		\
-	 (__u64)1 << _UFFDIO_CONTINUE)
+	 (__u64)1 << _UFFDIO_CONTINUE |		\
+	 (__u64)1 << _UFFDIO_WRITEPROTECT)

 /*
  * Valid ioctl command number range with this API is from 0x00 to
@@ -194,6 +196,9 @@ struct uffdio_api {
	 * UFFD_FEATURE_EXACT_ADDRESS indicates that the exact address of page
	 * faults would be provided and the offset within the page would not be
	 * masked.
+	 *
+	 * UFFD_FEATURE_WP_HUGETLBFS_SHMEM indicates that userfaultfd
+	 * write-protection mode is supported on both shmem and hugetlbfs.
	 */
 #define UFFD_FEATURE_PAGEFAULT_FLAG_WP		(1<<0)
 #define UFFD_FEATURE_EVENT_FORK			(1<<1)
@@ -207,6 +212,7 @@ struct uffdio_api {
 #define UFFD_FEATURE_MINOR_HUGETLBFS		(1<<9)
 #define UFFD_FEATURE_MINOR_SHMEM		(1<<10)
 #define UFFD_FEATURE_EXACT_ADDRESS		(1<<11)
+#define UFFD_FEATURE_WP_HUGETLBFS_SHMEM		(1<<12)
 	__u64 features;

 	__u64 ioctls;
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 441728732033..b70167a563f8 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -730,15 +730,12 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,

 	err = -ENOENT;
 	dst_vma = find_dst_vma(dst_mm, start, len);
-	/*
-	 * Make sure the vma is not shared, that the dst range is
-	 * both valid and fully within a single existing vma.
-	 */
-	if (!dst_vma || (dst_vma->vm_flags & VM_SHARED))
+
+	if (!dst_vma)
 		goto out_unlock;
 	if (!userfaultfd_wp(dst_vma))
 		goto out_unlock;
-	if (!vma_is_anonymous(dst_vma))
+	if (!vma_can_userfault(dst_vma, dst_vma->vm_flags))
 		goto out_unlock;

 	if (is_vm_hugetlb_page(dst_vma)) {

From patchwork Fri Mar  4 05:17:07 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768477
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand, Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport, Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A. Shutemov", Andrea Arcangeli
Subject: [PATCH v7 22/23] mm: Enable PTE markers by default
Date: Fri, 4 Mar 2022 13:17:07 +0800
Message-Id: <20220304051708.86193-23-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

Enable PTE markers by default. On x86_64 this also auto-enables
PTE_MARKER_UFFD_WP.

Signed-off-by: Peter Xu
---
 mm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index a80ea8721885..93e90efc4ab7 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -901,7 +901,7 @@ config ANON_VMA_NAME
 	  difference in their name.
 config PTE_MARKER
-	def_bool n
+	def_bool y
 	bool "Marker PTEs support"
 	help

From patchwork Fri Mar 4 05:17:08 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12768478
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Nadav Amit, Hugh Dickins, David Hildenbrand, Axel Rasmussen, Matthew Wilcox, Alistair Popple, Mike Rapoport, Andrew Morton, Jerome Glisse, Mike Kravetz, "Kirill A. Shutemov", Andrea Arcangeli
Subject: [PATCH v7 23/23] selftests/uffd: Enable uffd-wp for shmem/hugetlbfs
Date: Fri, 4 Mar 2022 13:17:08 +0800
Message-Id: <20220304051708.86193-24-peterx@redhat.com>
In-Reply-To: <20220304051708.86193-1-peterx@redhat.com>
References: <20220304051708.86193-1-peterx@redhat.com>

Now that shmem and hugetlbfs are supported, the uffd-wp test can always
be turned on.
Signed-off-by: Peter Xu
---
 tools/testing/selftests/vm/userfaultfd.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index fe404398c65a..d91668df8135 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -82,7 +82,7 @@ static int test_type;
 static volatile bool test_uffdio_copy_eexist = true;
 static volatile bool test_uffdio_zeropage_eexist = true;
 /* Whether to test uffd write-protection */
-static bool test_uffdio_wp = false;
+static bool test_uffdio_wp = true;
 /* Whether to test uffd minor faults */
 static bool test_uffdio_minor = false;
 
@@ -1597,8 +1597,6 @@ static void set_test_type(const char *type)
 	if (!strcmp(type, "anon")) {
 		test_type = TEST_ANON;
 		uffd_test_ops = &anon_uffd_test_ops;
-		/* Only enable write-protect test for anonymous test */
-		test_uffdio_wp = true;
 	} else if (!strcmp(type, "hugetlb")) {
 		test_type = TEST_HUGETLB;
 		uffd_test_ops = &hugetlb_uffd_test_ops;